cleanup: remove dead code and duplicate docs
- Remove session-ses_2f27.md (161KB raw session log)
- Remove 49 ROOT_* duplicate files across REFERENCE/
- Remove 14 duplicate files between REFERENCE/ root and history/
- Remove asr_legacy.rs (dead code, replaced by asr.rs)
- Remove src/core/worker/ (duplicate JobWorker)
- Remove src/core/layers/ (empty directory)
- Remove 4 .bak files in src/
- Remove 7 dead private methods in worker/processor.rs
- Remove backup directory from git tracking
@@ -1,284 +0,0 @@
# UUID Length Issue Analysis Report

> Date: 2026-04-28 20:15
> Issue: 16-char vs 32-char UUID formats

---

## Problem Discovery

### Actual Data Statistics

```sql
SELECT LENGTH(uuid) AS len, COUNT(*) FROM dev.videos GROUP BY LENGTH(uuid);

-- Result:
--  len | count
-- -----+-------
--   16 |    20   (legacy format, SHA256[0:16])
--   32 |  7479   (new format, SHA256[0:32])
```
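
The same length breakdown can be reproduced on the application side, for example when auditing UUIDs that were already loaded into memory. The sketch below is only an illustration: `count_by_length` is a hypothetical helper, not part of the repository, and it assumes the UUIDs are ASCII hex strings so that the byte length matches SQL `LENGTH`.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper: groups UUID strings by length, mirroring
/// `GROUP BY LENGTH(uuid)`. `str::len` counts bytes, which equals the
/// character count only because hex-encoded UUIDs are pure ASCII.
pub fn count_by_length<'a, I>(uuids: I) -> BTreeMap<usize, usize>
where
    I: IntoIterator<Item = &'a str>,
{
    let mut counts = BTreeMap::new();
    for uuid in uuids {
        // Increment the counter for this length, starting from zero.
        *counts.entry(uuid.len()).or_insert(0) += 1;
    }
    counts
}
```

A `BTreeMap` keeps the lengths sorted, so iterating over the result lists the 16-character bucket before the 32-character one, like the query output.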

### Status of 384b0ff44aaaa1f14cb2cd63b3fea966

| Field | Value |
|-------|-------|
| **uuid** | 384b0ff44aaaa1f14cb2cd63b3fea966 (16 characters) |
| **birth_registration** | NULL |
| **file_name** | Old_Time_Movie_Show_-_Charade_1963.HD.mov |

---

## UUID Format Comparison

| Format | Length | Generator | Purpose |
|--------|--------|-----------|---------|
| **Legacy** | 16 | `compute_uuid()` | Early video registration |
| **New** | 32 | `compute_birth_uuid()` | Birth UUID (privacy protection) |

### Generation Logic

**Legacy format (16 characters)**:

```rust
// src/core/storage/uuid.rs:6-11
// (excerpt; relies on `sha2::{Digest, Sha256}` and the `hex` crate)
pub fn compute_uuid(user_path: &str, filename: &str) -> String {
    let key = format!("{}/{}", user_path, filename);
    let hash = Sha256::digest(key.as_bytes());
    hex::encode(hash)[0..16].to_string() // keep only the first 16 hex characters
}
```

**New format (32 characters)**:

```rust
// src/core/storage/uuid.rs:82-91
// (excerpt; relies on `sha2::{Digest, Sha256}` and the `hex` crate)
pub fn compute_birth_uuid(
    mac_address: &str,
    timestamp: &str,
    username: &str,
    filename: &str,
) -> String {
    let key = format!("{}|{}|{}|{}", mac_address, timestamp, username, filename);
    let hash = Sha256::digest(key.as_bytes());
    hex::encode(hash)[0..32].to_string() // keep the first 32 hex characters
}
```

---

## Difference Analysis

### Privacy Protection

| Property | Legacy format | New format |
|----------|---------------|------------|
| **Inputs** | Path + Filename | MAC + Time + User + Filename |
| **Privacy risk** | Path exposed | Everything hashed |
| **Uniqueness** | Depends on the relative path | MAC + Time ensure uniqueness |
| **Immutability** | Changes on migration | Records the original registration info |
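
The "changes on migration" row follows directly from what goes into the hash. The sketch below isolates just the key strings that the two generators build before hashing, using the same `format!` patterns as the excerpts from `uuid.rs`. The helper names `legacy_key` and `birth_key` are hypothetical and exist only for this illustration; the hashing step is left out.

```rust
/// Hypothetical helper: the hash input used by the legacy format.
/// It embeds the storage path, so moving the file changes the key
/// and therefore the resulting UUID.
pub fn legacy_key(user_path: &str, filename: &str) -> String {
    format!("{}/{}", user_path, filename)
}

/// Hypothetical helper: the hash input used by the Birth UUID format.
/// It is built from registration-time facts only, so a later move of
/// the file does not affect it.
pub fn birth_key(mac_address: &str, timestamp: &str, username: &str, filename: &str) -> String {
    format!("{}|{}|{}|{}", mac_address, timestamp, username, filename)
}
```

With these, `legacy_key("/old/path", "a.mov")` and `legacy_key("/new/path", "a.mov")` differ, while `birth_key` takes no path at all and yields the same key wherever the file lives.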

### Format Check in Code

```rust
// src/core/storage/uuid.rs:94-96
pub fn is_birth_uuid(uuid: &str) -> bool {
    uuid.len() == 32 && !uuid.contains('_')
}
```
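
`is_birth_uuid()` only looks at the length and the absence of `_`, so any other 32-character string would also pass. If a stricter, three-way classification is ever wanted, it could look roughly like the sketch below. This is not code from the repository: `UuidFormat` and `classify_uuid` are hypothetical names, and the hex check assumes every stored UUID comes from `hex::encode` as in the generators above.

```rust
/// Hypothetical classification of the formats described in this report.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum UuidFormat {
    /// 16 hex characters, produced by `compute_uuid()`.
    Legacy,
    /// 32 hex characters, produced by `compute_birth_uuid()`.
    Birth,
    /// Anything else (wrong length or non-hex characters).
    Unknown,
}

/// Classifies a UUID string by length, additionally requiring that
/// every character is an ASCII hex digit.
pub fn classify_uuid(uuid: &str) -> UuidFormat {
    let is_hex = !uuid.is_empty() && uuid.bytes().all(|b| b.is_ascii_hexdigit());
    match (is_hex, uuid.len()) {
        (true, 16) => UuidFormat::Legacy,
        (true, 32) => UuidFormat::Birth,
        _ => UuidFormat::Unknown,
    }
}
```

The explicit `Unknown` variant keeps malformed values from silently falling into the legacy branch of an `if`/`else`.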

---

## Database Column Definitions

| Table | Column | Type | Fits |
|-------|--------|------|------|
| **videos** | uuid | VARCHAR(32) | ✅ 16 + 32 |
| **face_detections** | file_uuid | VARCHAR(255) | ✅ 16 + 32 |
| **chunks** | uuid | VARCHAR(32) | ✅ 16 + 32 |
| **jobs** | asset_uuid | VARCHAR(32) | ✅ 16 + 32 |

---

## Scope of Impact

### Affected Tables

```sql
-- Count the rows in each table that use a 16-character UUID
SELECT 'face_detections' AS table_name, COUNT(*)
FROM dev.face_detections WHERE LENGTH(file_uuid) = 16;

SELECT 'chunks' AS table_name, COUNT(*)
FROM dev.chunks WHERE LENGTH(uuid) = 16;

SELECT 'jobs' AS table_name, COUNT(*)
FROM dev.jobs WHERE LENGTH(asset_uuid) = 16;
```

### Dependencies

```
videos.uuid (16-char)
    ↓
face_detections.file_uuid (16-char)  -- FK
chunks.uuid (16-char)
jobs.asset_uuid (16-char)            -- FK
```

---

## Solutions

### Option A: Keep Compatibility Mode (Recommended)

**Pros**:
- Does not break existing data
- No large-scale migration needed
- VARCHAR(32) fits both formats

**Cons**:
- Code must handle both formats
- `is_birth_uuid()` returns false for 16-char UUIDs
- Some features are limited (e.g. Birth UUID queries)

**Implementation**:
1. Leave existing data unchanged
2. New videos use the 32-character format
3. Code stays compatible with both formats

### Option B: Forced Migration (Destructive)

**Pros**:
- A single, unified format
- Birth UUID features enabled everywhere
- Simpler code

**Cons**:
- Breaking change
- Every related table must be updated
- Foreign-key relationships must be rebuilt

**Implementation steps**:

```sql
-- 1. Generate the new UUIDs
--    (compute_birth_uuid is the Rust function; calling it here assumes
--    an equivalent database function exists)
UPDATE dev.videos
SET uuid = (
    SELECT compute_birth_uuid(...)
)
WHERE LENGTH(uuid) = 16;

-- 2. Update the related tables
--    (old_uuid / new_uuid assume the old-to-new mapping is kept on videos)
UPDATE dev.face_detections AS f
SET file_uuid = (
    SELECT v.new_uuid FROM dev.videos AS v WHERE v.old_uuid = f.file_uuid
)
WHERE LENGTH(f.file_uuid) = 16;

-- 3. Update chunks
--    (the alias keeps c.uuid from being resolved as videos.uuid)
UPDATE dev.chunks AS c
SET uuid = (
    SELECT v.new_uuid FROM dev.videos AS v WHERE v.old_uuid = c.uuid
)
WHERE LENGTH(c.uuid) = 16;
```
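
Steps 2 and 3 only work if the old UUID is still known after step 1 has overwritten `videos.uuid`. One way to reason about this is to build the old-to-new mapping first and then apply it to every referencing table. The in-memory sketch below illustrates that ordering only: `ChildRow` and `remap_children` are hypothetical stand-ins and do not talk to the database.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory stand-in for a row in a table that
/// references `videos.uuid` (e.g. face_detections or chunks).
#[derive(Debug, PartialEq)]
pub struct ChildRow {
    pub file_uuid: String,
}

/// Rewrites every child reference found in `mapping` (old UUID -> new UUID)
/// and returns the number of rows changed. Rows whose UUID is not in the
/// mapping, such as rows already on the 32-character format, are untouched.
pub fn remap_children(rows: &mut [ChildRow], mapping: &HashMap<String, String>) -> usize {
    let mut changed = 0;
    for row in rows.iter_mut() {
        if let Some(new_uuid) = mapping.get(&row.file_uuid) {
            row.file_uuid = new_uuid.clone();
            changed += 1;
        }
    }
    changed
}
```

Comparing the returned count with the number of 16-character rows found beforehand gives a simple check that no reference was left pointing at a UUID that no longer exists.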

---

## Recommendations

### Current Stage: Compatibility Mode

**Rationale**:
1. 384b0ff44aaaa1f14cb2cd63b3fea966 already has a large amount of related data:
   - face_detections: 78 rows
   - chunks: 5683 rows
   - jobs: 2 rows

2. Migration risk is high:
   - Foreign-key relationships are complex
   - Existing functionality may break

3. VARCHAR(32) is sufficient:
   - Fits both formats
   - No column definition changes needed

### Long-Term Plan: Gradual Migration

1. **Phase 1**: Stay compatible
2. **Phase 2**: New registrations use 32 characters
3. **Phase 3**: Gradually migrate legacy data (optional)

---

## Using the Check Function

```rust
// Check whether this is a Birth UUID
if is_birth_uuid(&uuid) {
    // Enable Birth UUID features
} else {
    // Fall back to the legacy-format logic
}
```

### Affected Features

| Feature | Depends on | Impact |
|---------|------------|--------|
| **Birth UUID query** | `is_birth_uuid()` | 16-char UUIDs cannot be queried |
| **Cross-device sync** | Birth UUID | Not supported for 16-char UUIDs |
| **Privacy protection** | Birth UUID | 16-char UUIDs are not protected |
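
The table can be read as a single gate: each listed feature is available only when the asset carries a Birth UUID. A minimal sketch of that gate follows. `is_birth_uuid` is copied from the excerpt quoted in this report, while the `Feature` enum and `feature_available` are hypothetical names introduced only for illustration.

```rust
/// Copied from the excerpt of src/core/storage/uuid.rs quoted in this report.
pub fn is_birth_uuid(uuid: &str) -> bool {
    uuid.len() == 32 && !uuid.contains('_')
}

/// Hypothetical feature list mirroring the rows of the table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Feature {
    BirthUuidQuery,
    CrossDeviceSync,
    PrivacyProtection,
}

/// Returns whether `feature` can be used for the asset identified by `uuid`.
/// All three features depend on the Birth UUID format, so they share one arm;
/// a feature with a different rule would get its own arm.
pub fn feature_available(feature: Feature, uuid: &str) -> bool {
    match feature {
        Feature::BirthUuidQuery
        | Feature::CrossDeviceSync
        | Feature::PrivacyProtection => is_birth_uuid(uuid),
    }
}
```

Matching on the enum rather than calling `is_birth_uuid()` at each call site means the compiler flags every gate that needs a decision when a new feature is added.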

---

## Code Checklist

### Code That Must Stay Compatible

| Location | Item | Status |
|----------|------|--------|
| `uuid.rs:95` | is_birth_uuid() | ✅ Checked |
| `uuid.rs:263` | Test cases | ✅ Verified |
| `postgres_db.rs:620` | videos table | ✅ VARCHAR(32) |
| `rule1_ingest.rs:9` | execute_rule1() | ✅ Uses the file_uuid parameter |
| `face_recognition.rs` | face_detections | ✅ VARCHAR(255) |

### Code That Needs No Changes

- All parameter types use `&str` or `String`
- VARCHAR definitions are sufficient
- Foreign-key relationships are intact

---

## Conclusion

### Root Cause

384b0ff44aaaa1f14cb2cd63b3fea966 uses the legacy UUID format (16 characters) because:
1. It was registered early (before Birth UUID was enabled)
2. `compute_uuid()` keeps only the first 16 characters of the SHA256 hex digest
3. birth_registration = NULL

### Current State

- **videos table**: VARCHAR(32) ✅ fits both formats
- **Code logic**: `is_birth_uuid()` distinguishes the two formats ✅
- **Foreign keys**: intact ✅

### Recommendation

**Now**: Stay in compatibility mode; do not force a migration
**Future**: New videos use the 32-character Birth UUID

---

## Related Files

| File | Description |
|------|-------------|
| `src/core/storage/uuid.rs` | UUID generation logic |
| `migrations/019_add_birth_registration.sql` | Birth UUID table schema |
| `migrations/025_rename_video_uuid_to_file_uuid.sql` | Column rename migration |