Distributed storage research: Ceph (shelved) + MinIO guide + DedupS3 design

Warren
2026-06-25 00:43:57 +08:00
parent f3b75fae3d
commit f492a96077
4 changed files with 1363 additions and 0 deletions

docs/MINIO_INTEGRATION.md Normal file

@@ -0,0 +1,382 @@
# MinIO Integration Guide for MarkBase
**Date**: 2026-06-25
**Status**: Ready for deployment
**Backend**: S3Vfs (already implemented; no code changes required)
---
## Executive Summary
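MinIO is a high-performance, S3-compatible object storage service and an excellent fit for MarkBase's positioning:
- ✅ Cross-platform support (macOS/Linux/Windows)
- ✅ Lightweight deployment (a single node is enough)
- ✅ Already supported by S3Vfs (no code changes required)
- ✅ High performance (erasure coding + distributed scaling)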
---
## MinIO vs Ceph RADOS Comparison
| Aspect | MinIO | Ceph RADOS |
|--------|-------|------------|
| **Platform** | ✅ All platforms | ❌ Linux-only |
| **Deployment** | ⚠️⚠️ A single node is enough | ⚠️⚠️⚠️⚠️⚠️ Requires a full cluster |
| **API** | ✅ S3-compatible HTTP | ❌ librados FFI |
| **Code change** | ✅ 0 lines (S3Vfs already exists) | ❌ ~1350 lines |
| **Positioning** | ⭐⭐⭐⭐⭐ Full match | ❌ Does not fit the Lightweight positioning |
---
## MinIO Deployment
### macOS single-node deployment
```bash
# Install MinIO
brew install minio/stable/minio
# Start the MinIO server
minio server /path/to/data --console-address ":9001"
# Output:
# Endpoint: http://192.168.1.100:9000 http://127.0.0.1:9000
# Console: http://192.168.1.100:9001 http://127.0.0.1:9001
# AccessKey: minioadmin
# SecretKey: minioadmin
```
### Linux production deployment
```bash
# Docker single node
docker run -d \
--name minio \
-p 9000:9000 \
-p 9001:9001 \
-v /data/minio:/data \
minio/minio server /data --console-address ":9001"
# Distributed cluster (4 nodes): run the same command on every node.
# node1..node4 must resolve to the four hosts; {1...4} is MinIO's own expansion syntax.
docker run -d \
--name minio \
--net=host \
-v /data1:/data1 \
-v /data2:/data2 \
minio/minio server "http://node{1...4}:9000/data{1...2}" --console-address ":9001"
```
### Kubernetes deployment (recommended for production)
```yaml
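# minio-deployment.yaml
# A StatefulSet (rather than a Deployment) gives the pods the stable names
# minio-0 .. minio-3 used in the server arguments. Assumes a headless Service
# named "minio" in the "default" namespace.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: minio
spec:
  serviceName: minio
  replicas: 4
  selector:
    matchLabels:
      app: minio
  template:
    metadata:
      labels:
        app: minio
    spec:
      containers:
        - name: minio
          image: minio/minio:latest
          args:
            - server
            - http://minio-{0...3}.minio.default.svc.cluster.local/data
            - --console-address
            - ":9001"
          ports:
            - containerPort: 9000
            - containerPort: 9001
          volumeMounts:
            - name: data
              mountPath: /data
  # Persistent storage per pod; an emptyDir volume would lose its data
  # whenever a pod is rescheduled.
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi  # placeholder size, adjust as needed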
```
---
## MarkBase S3Vfs Integration
### Configuration
**Environment variables**:
```bash
export MB_S3_ENDPOINT=http://localhost:9000
export MB_S3_REGION=us-east-1
export MB_S3_BUCKET=markbase
export MB_S3_ACCESS_KEY=minioadmin
export MB_S3_SECRET_KEY=minioadmin
```
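For illustration, the variables above could be resolved into a typed configuration roughly as follows. This is a minimal Rust sketch, not MarkBase's actual loader: `S3Config`, `s3_config_from`, `s3_config_from_env`, and the `us-east-1` fallback for the region are assumptions made for this example.

```rust
/// Connection settings for the S3 backend (hypothetical struct, for illustration only).
#[derive(Debug, PartialEq)]
pub struct S3Config {
    pub endpoint: String,
    pub region: String,
    pub bucket: String,
    pub access_key: String,
    pub secret_key: String,
}

/// Build an `S3Config` from a variable lookup function.
/// The error names the first required variable that is missing.
pub fn s3_config_from<F>(lookup: F) -> Result<S3Config, String>
where
    F: Fn(&str) -> Option<String>,
{
    let get = |name: &str| lookup(name).ok_or_else(|| format!("missing {}", name));
    Ok(S3Config {
        endpoint: get("MB_S3_ENDPOINT")?,
        // Assumption: the region falls back to us-east-1, the value used in this guide.
        region: lookup("MB_S3_REGION").unwrap_or_else(|| "us-east-1".to_string()),
        bucket: get("MB_S3_BUCKET")?,
        access_key: get("MB_S3_ACCESS_KEY")?,
        secret_key: get("MB_S3_SECRET_KEY")?,
    })
}

/// Read the settings from the real process environment.
pub fn s3_config_from_env() -> Result<S3Config, String> {
    s3_config_from(|name| std::env::var(name).ok())
}
```

Taking the lookup as a closure keeps the parsing logic independent of the process environment, so it can be exercised with a plain map in tests.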
**Configuration file** (`config/s3.toml`):
```toml
[s3]
enabled = true
endpoint = "http://localhost:9000"
region = "us-east-1"
bucket = "markbase"
access_key = "minioadmin"
secret_key = "minioadmin"
[s3.webdav]
# WebDAV uses the S3 backend
enabled = true
user = "demo"
root_prefix = "webdav/"
```
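The `root_prefix` setting implies that every client-visible path is stored under a key prefix in the bucket. One plausible way to build such keys is sketched below; `object_key` is a hypothetical helper, and the choice to drop `.` and `..` segments is an assumption of this example rather than documented S3Vfs behavior.

```rust
/// Join a configured `root_prefix` (e.g. "webdav/") and a client-visible path
/// into an S3 object key. Hypothetical helper; S3Vfs may do this differently.
pub fn object_key(root_prefix: &str, vfs_path: &str) -> String {
    let prefix = root_prefix.trim_matches('/');
    // Drop empty, "." and ".." segments so a client path cannot escape the prefix.
    let segments: Vec<&str> = vfs_path
        .split('/')
        .filter(|s| !s.is_empty() && *s != "." && *s != "..")
        .collect();
    let rest = segments.join("/");
    match (prefix.is_empty(), rest.is_empty()) {
        (true, _) => rest,
        (false, true) => format!("{}/", prefix),
        (false, false) => format!("{}/{}", prefix, rest),
    }
}
```

With `root_prefix = "webdav/"`, the WebDAV path `/docs/a.md` would map to the key `webdav/docs/a.md`.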
### S3Vfs usage examples
**WebDAV + MinIO**:
```bash
# Start the WebDAV server (using the MinIO backend)
cargo run -- webdav-start \
--user demo \
--port 8002 \
--s3 \
--s3-endpoint http://localhost:9000 \
--s3-bucket markbase \
--s3-access-key minioadmin \
--s3-secret-key minioadmin \
--s3-region us-east-1 \
--root webdav/
```
**SMB + MinIO** (via the VFS backend):
```bash
# Start the SMB server (using the MinIO backend)
cargo run --features smb-server -- smb-start \
--port 4445 \
--share-name files \
--s3 \
--s3-endpoint http://localhost:9000 \
--s3-bucket markbase \
--s3-access-key minioadmin \
--s3-secret-key minioadmin \
--s3-region us-east-1 \
--root smb/
```
---
## MinIO Bucket Management
### Create a bucket
```bash
# Using the MinIO client (mc)
mc alias set myminio http://localhost:9000 minioadmin minioadmin
mc mb myminio/markbase
# Using the AWS CLI
aws --endpoint-url http://localhost:9000 s3 mb s3://markbase
```
### Set the bucket policy
```bash
# Public-read policy (for public shares)
mc anonymous set download myminio/markbase/public
# Private policy (default)
mc anonymous set none myminio/markbase/private
```
### Set a bucket quota
```bash
# Set a quota (MinIO enterprise feature)
# Older mc releases use: mc admin bucket quota myminio/markbase --hard 10GB
mc quota set myminio/markbase --size 10GB
```
---
## MinIO Features Relevant to MarkBase
| Feature | Description | MarkBase Use Case |
|---------|-------------|-------------------|
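| **Erasure Coding** | Data redundancy (default EC:2) | Automatic fault tolerance, similar to RAID |
| **Versioning** | Object version control | Can replace the Snapshot feature |
| **Bucket Policy** | ACL management | User permission control |
| **Lifecycle Rules** | Automatic expiry | Cleanup of old backups |
| **Object Lock** | WORM mode | Compliance protection for backups |
| **Replication** | Cross-site replication | Disaster recovery |
### Versioning (Snapshot replacement)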
```bash
# Enable versioning
mc version enable myminio/markbase
# List object versions
mc ls --versions myminio/markbase/file.txt
# Restore an old version (VERSION_ID is taken from the listing above)
mc cp --version-id "$VERSION_ID" myminio/markbase/file.txt myminio/markbase/file.txt
```
### Lifecycle rules (backup cleanup)
```bash
# Delete objects automatically after 30 days
mc ilm rule add myminio/markbase --expire-days 30
```
---
## Performance Optimization
### MinIO performance parameters
```bash
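# High-performance configuration
# NOTE: --parallel and --cache may not be accepted by current `minio server`
# releases; check `minio server --help` for your version before relying on them.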
minio server /data \
--console-address ":9001" \
--parallel 8 \
--cache /cache:1000
```
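### S3Vfs performance optimization
**Concurrent upload** (already implemented in S3Vfs):
- Multipart upload (objects larger than 5MB are split into parts automatically)
- Parts are uploaded concurrently (4 concurrent uploads by default)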
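The splitting rule in the two bullets above can be sketched as a small planning function. This is an illustration only, not the S3Vfs code: the names `plan_parts`, `upload_batches`, and `Part` are hypothetical, "5MB" is read as 5 MiB, and the part size is assumed to equal the threshold (S3 itself requires every part except the last to be at least 5 MiB).

```rust
/// Threshold above which an upload is split (assumed to mean 5 MiB).
pub const MULTIPART_THRESHOLD: u64 = 5 * 1024 * 1024;
/// Assumed part size; S3 requires every part except the last to be >= 5 MiB.
pub const PART_SIZE: u64 = 5 * 1024 * 1024;

/// One part of a multipart upload: 1-based part number and half-open byte range.
#[derive(Debug, PartialEq, Clone, Copy)]
pub struct Part {
    pub number: u32,
    pub start: u64,
    pub end: u64,
}

/// Returns `None` when a single PutObject is enough, otherwise the list of parts.
pub fn plan_parts(object_size: u64) -> Option<Vec<Part>> {
    if object_size <= MULTIPART_THRESHOLD {
        return None;
    }
    let mut parts = Vec::new();
    let mut start: u64 = 0;
    while start < object_size {
        let end = std::cmp::min(start + PART_SIZE, object_size);
        let number = parts.len() as u32 + 1;
        parts.push(Part { number, start, end });
        start = end;
    }
    Some(parts)
}

/// Group parts into batches that could be uploaded concurrently
/// (the guide states a default of 4 concurrent uploads).
pub fn upload_batches(parts: &[Part], concurrency: usize) -> Vec<Vec<Part>> {
    parts.chunks(concurrency.max(1)).map(|c| c.to_vec()).collect()
}
```

Under these assumptions a 12 MiB object yields three parts (5 MiB, 5 MiB, 2 MiB), all of which fit into a single batch of four.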
**Caching**:
- ReadCache: 64MB, 64KB blocks, 5min TTL (already implemented in cache.rs)
- WriteCache: 32MB (already implemented in cache.rs)
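To make the ReadCache parameters concrete, here is a rough sketch of a block-granular cache with a TTL. It is not the code in `cache.rs`: the `ReadCache` layout, the explicit `now` argument, and the "clear everything when full" eviction are simplifying assumptions of this example.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

pub const BLOCK_SIZE: u64 = 64 * 1024; // 64KB blocks
pub const CAPACITY_BYTES: u64 = 64 * 1024 * 1024; // 64MB in total
pub const TTL: Duration = Duration::from_secs(5 * 60); // 5min TTL

/// Block-granular read cache keyed by (object key, block index).
pub struct ReadCache {
    blocks: HashMap<(String, u64), (Instant, Vec<u8>)>,
    used: u64,
}

impl ReadCache {
    pub fn new() -> Self {
        ReadCache { blocks: HashMap::new(), used: 0 }
    }

    /// Index of the block that contains byte `offset`.
    pub fn block_index(offset: u64) -> u64 {
        offset / BLOCK_SIZE
    }

    /// Store a block. When the capacity would be exceeded the whole cache is
    /// cleared first (placeholder policy; a real cache would evict selectively).
    pub fn put(&mut self, key: &str, index: u64, data: Vec<u8>, now: Instant) {
        let len = data.len() as u64;
        if self.used + len > CAPACITY_BYTES {
            self.blocks.clear();
            self.used = 0;
        }
        if let Some((_, old)) = self.blocks.insert((key.to_string(), index), (now, data)) {
            self.used -= old.len() as u64;
        }
        self.used += len;
    }

    /// Return the cached block if it is present and younger than the TTL;
    /// an expired block is removed.
    pub fn get(&mut self, key: &str, index: u64, now: Instant) -> Option<&[u8]> {
        let id = (key.to_string(), index);
        let expired = match self.blocks.get(&id) {
            Some((stored, _)) => now.duration_since(*stored) >= TTL,
            None => return None,
        };
        if expired {
            if let Some((_, old)) = self.blocks.remove(&id) {
                self.used -= old.len() as u64;
            }
            return None;
        }
        self.blocks.get(&id).map(|(_, data)| data.as_slice())
    }
}
```

Passing `now` explicitly keeps the TTL logic deterministic, which makes expiry easy to test without sleeping.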
---
## Docker Compose Example
```yaml
version: '3'
services:
  minio:
    image: minio/minio:latest
    command: server /data --console-address ":9001"
    ports:
      - "9000:9000"
      - "9001:9001"
    volumes:
      - minio-data:/data
    environment:
      - MINIO_ROOT_USER=minioadmin
      - MINIO_ROOT_PASSWORD=minioadmin
  markbase-webdav:
    build: .
    command: webdav-start --user demo --port 8002 --s3 --s3-endpoint http://minio:9000 --s3-bucket markbase --s3-access-key minioadmin --s3-secret-key minioadmin
    ports:
      - "8002:8002"
    environment:
      - MB_S3_ENDPOINT=http://minio:9000
    depends_on:
      - minio
volumes:
  minio-data:
```
---
## Integration Checklist
| Task | Status | Notes |
|------|--------|-------|
| **MinIO deployment** | ⏳ User action | macOS/Linux/Docker |
| **Create bucket** | ⏳ User action | `mc mb myminio/markbase` |
| **S3Vfs configuration** | ✅ Supported | No code changes required |
| **WebDAV + S3** | ✅ Supported | CLI flags implemented |
| **SMB + S3** | ✅ Supported | CLI flags implemented |
| **SFTP + S3** | ⏳ Not yet implemented | Needs an SFTP S3 backend |
| **Backup to S3** | ✅ Supported | BackupManifest + S3Vfs |
---
## Troubleshooting
### MinIO connection problems
```bash
# Check MinIO status
mc admin info myminio
# Check endpoint connectivity
curl -I http://localhost:9000/minio/health/live
```
### S3Vfs errors
**Common errors**:
- `VfsError::NotFound` → the bucket or object does not exist
- `VfsError::PermissionDenied` → the access key or secret key is wrong
- `VfsError::Io("S3 PUT failed: 403")` → the bucket policy denies the write
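The list above amounts to a small triage table, which could be encoded as follows. The enum here is a minimal stand-in with only the three variants named in this guide; the real `VfsError` in MarkBase may have more variants and different payloads, and `troubleshooting_hint` is a hypothetical helper.

```rust
/// Minimal stand-in for the error variants listed in this guide.
#[derive(Debug, PartialEq)]
pub enum VfsError {
    NotFound,
    PermissionDenied,
    Io(String),
}

/// Map an error to the first thing worth checking on the MinIO side.
pub fn troubleshooting_hint(err: &VfsError) -> &'static str {
    match err {
        VfsError::NotFound => "check that the bucket and object exist (mc ls myminio/markbase/)",
        VfsError::PermissionDenied => "check the access key and secret key",
        // A 403 inside an Io error points at the bucket policy, per the list above.
        VfsError::Io(msg) if msg.contains("403") => "check that the bucket policy allows writes",
        VfsError::Io(_) => "check connectivity (mc admin info myminio) and the MinIO logs",
    }
}
```

A caller could log the hint next to the error so that operators see the suggested check immediately.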
**Debugging**:
```bash
# View the MinIO logs
docker logs minio
# Test with mc
mc cp test.txt myminio/markbase/test.txt
mc ls myminio/markbase/
```
---
## MinIO vs S3Vfs Feature Mapping
| VfsBackend Method | MinIO S3 API | Status |
|-------------------|--------------|--------|
| `read_dir()` | ListObjectsV2 | ✅ |
| `open_file()` | GetObject / PutObject | ✅ |
| `stat()` | HeadObject | ✅ |
| `create_dir()` | PutObject (0-byte) | ✅ |
| `remove_dir()` | DeleteObject | ✅ |
| `remove_file()` | DeleteObject | ✅ |
| `rename()` | CopyObject + DeleteObject | ✅ |
| `exists()` | HeadObject | ✅ |
| `copy()` | CopyObject | ✅ |
| `hard_link()` | CopyObject | ✅ |
| `create_snapshot()` | Versioning | ⚠️ Requires versioning to be enabled |
| `list_snapshots()` | ListObjectVersions | ⚠️ Not yet implemented |
| `set_quota()` | Bucket quota | ⚠️ MinIO enterprise edition |
| `set_acl()` | Bucket policy | ⚠️ Not yet implemented |
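A few rows of the mapping can be illustrated against an in-memory stand-in for a bucket. This is a sketch under stated assumptions, not the S3Vfs implementation: `FakeBucket` and `dir_prefix` are hypothetical, no real S3 client is involved, and the trailing `/` on directory marker keys is a common convention assumed here rather than something the table specifies.

```rust
use std::collections::BTreeMap;

/// In-memory stand-in for a bucket: object key -> bytes. Not a real S3 client.
#[derive(Default)]
pub struct FakeBucket {
    objects: BTreeMap<String, Vec<u8>>,
}

/// "docs" or "docs/" -> "docs/"; the bucket root maps to the empty prefix.
fn dir_prefix(dir: &str) -> String {
    let d = dir.trim_matches('/');
    if d.is_empty() { String::new() } else { format!("{}/", d) }
}

impl FakeBucket {
    /// PutObject.
    pub fn put(&mut self, key: &str, body: &[u8]) {
        self.objects.insert(key.to_string(), body.to_vec());
    }

    /// HeadObject: `exists()` and `stat()` both reduce to this lookup.
    pub fn exists(&self, key: &str) -> bool {
        self.objects.contains_key(key)
    }

    /// create_dir(): a 0-byte object whose key ends in "/" marks the directory.
    pub fn create_dir(&mut self, dir: &str) {
        let key = dir_prefix(dir);
        self.put(&key, &[]);
    }

    /// rename(): CopyObject to the new key, then DeleteObject on the old one.
    /// Returns false when the source does not exist.
    pub fn rename(&mut self, from: &str, to: &str) -> bool {
        if from == to {
            return self.exists(from);
        }
        match self.objects.get(from).cloned() {
            Some(body) => {
                self.objects.insert(to.to_string(), body); // CopyObject
                self.objects.remove(from); // DeleteObject
                true
            }
            None => false,
        }
    }

    /// read_dir(): ListObjectsV2 with the directory as prefix and "/" as delimiter.
    /// Returns direct children only; sub-directories carry a trailing "/".
    pub fn read_dir(&self, dir: &str) -> Vec<String> {
        let prefix = dir_prefix(dir);
        let mut names: Vec<String> = Vec::new();
        for key in self.objects.keys().filter(|k| k.starts_with(&prefix)) {
            let rest = &key[prefix.len()..];
            if rest.is_empty() {
                continue; // the directory marker itself
            }
            let name = match rest.find('/') {
                Some(i) => rest[..=i].to_string(), // common prefix (sub-directory)
                None => rest.to_string(),
            };
            if !names.contains(&name) {
                names.push(name);
            }
        }
        names
    }
}
```

Because S3 has no native rename, the copy-then-delete sequence is not atomic; a failure between the two calls leaves both keys in place.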
---
## Next Steps
1. **Deploy MinIO** (user action)
   - macOS: `brew install minio/stable/minio && minio server /data`
   - Docker: `docker run -p 9000:9000 minio/minio server /data`
2. **Create the bucket** (user action)
   - `mc alias set myminio http://localhost:9000 minioadmin minioadmin`
   - `mc mb myminio/markbase`
3. **Configure MarkBase**
   - Set the `MB_S3_*` environment variables
   - Or pass the CLI flags `--s3 --s3-endpoint ...`
4. **Test the connection**
   - WebDAV: `curl -X PROPFIND http://localhost:8002/webdav/`
   - SMB: `smbclient -p 4445 -L localhost`
---
**Document created**: 2026-06-25
**Last updated**: 2026-06-25