cleanup: remove dead code and duplicate docs
- Remove session-ses_2f27.md (161KB raw session log)
- Remove 49 ROOT_* duplicate files across REFERENCE/
- Remove 14 duplicate files between REFERENCE/ root and history/
- Remove asr_legacy.rs (dead code, replaced by asr.rs)
- Remove src/core/worker/ (duplicate JobWorker)
- Remove src/core/layers/ (empty directory)
- Remove 4 .bak files in src/
- Remove 7 dead private methods in worker/processor.rs
- Remove backup directory from git tracking
@@ -1,328 +0,0 @@
# Processor Status Analysis Report

> Date: 2026-04-28 21:00
> Video UUID: 384b0ff44aaaa1f14cb2cd63b3fea966 (Charade 1963)

---

## Output File Status

| Processor | Output file | File size | Content summary |
|-----------|-------------|-----------|-----------------|
| **OCR** | `384b0ff44aaaa1f14cb2cd63b3fea966.ocr.json` | 13 MB (607K lines) | 13728 frames |
| **Probe** | `384b0ff44aaaa1f14cb2cd63b3fea966.probe.json` | 558 B | Metadata |
| **Face** | ❌ Missing | - | - |
| **YOLO** | ❌ Missing | - | - |
| **ASRX** | ❌ Missing | - | - |

---

## processor_results Status

| Processor | status | chunks_produced | error_message | Actual state |
|-----------|--------|-----------------|---------------|--------------|
| **ASR** | completed | 3664 | - | ✅ Succeeded |
| **CUT** | completed | 1332 | - | ✅ Succeeded |
| **OCR** | failed | 0 | Failed to run... | ⚠️ **Contradiction** (output exists) |
| **Face** | failed | 0 | Failed to read FACE output | ⚠️ **Contradiction** (face_detections has 78 rows) |
| **YOLO** | failed | 0 | Failed to run yolo_processor.py | ❌ Genuine failure |
| **ASRX** | **No record** | - | - | ❌ Not run |

---

## Data Contradiction Analysis

### OCR Status Contradiction

**processor_results**: failed, chunks_produced = 0
**Actual output**: 13 MB JSON, 13728 frames with detected text, `frame_count` = 412343

**Suspected cause**:
1. The OCR processor ran successfully
2. The processor_results record is wrong (possibly a failed write)
3. chunks_produced was never counted

**Impact**: the OCR data is usable, but the processor_results record is inaccurate

---

### Face Status Contradiction

**processor_results**: failed, chunks_produced = 0
**face_detections**: 78 rows (frames 1798-88102)

**Suspected cause**:
1. The Face processor ran and wrote to face_detections
2. processor_results recorded a failure (possibly because reading the output failed)
3. The output file is missing (the JSON may never have been generated)

**Impact**: the Face data is usable (via face_detections), but the output file is missing

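
The two contradictions above share one pattern: `processor_results` says `failed` while usable data exists somewhere else. The following is a minimal sketch of a check that flags such rows. It assumes hypothetical inputs (the recorded status, whether an output file exists, and a row count from a table such as `face_detections`); the function name and the returned labels are illustrative and are not taken from the codebase.

```python
from typing import Optional


def classify_result(status: Optional[str], output_exists: bool, db_rows: int = 0) -> str:
    """Compare a processor_results status with the data actually found.

    Hypothetical helper: only the status values come from the report.
    """
    has_data = output_exists or db_rows > 0
    if status is None:
        return "not_run"        # no processor_results row at all
    if status == "failed" and has_data:
        return "contradiction"  # marked failed, yet data exists
    if status == "completed" and not has_data:
        return "contradiction"  # marked completed, yet nothing was produced
    return "consistent"


# Inputs taken from the tables in this report.
print("OCR ", classify_result("failed", output_exists=True))
print("Face", classify_result("failed", output_exists=False, db_rows=78))
print("YOLO", classify_result("failed", output_exists=False))
print("ASRX", classify_result(None, output_exists=False))
```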
---

### YOLO Failure Cause

**error_message**: `Failed to run "/Users/accusys/momentry_core_0.1/scripts/yolo_processor.py"`

**Checks**:
- Script exists: ✅ `/Users/accusys/momentry_core_0.1/scripts/yolo_processor.py`
- Permissions: ✅ `-rwxr-xr-x`
- Python environment: still to be checked

**Possible causes**:
1. Python environment problem
2. Missing YOLO model file
3. Video file path problem

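
The checks above confirm only that the script exists and is executable; the recorded error message does not say why it failed. One way to narrow down the three possible causes is to run the command by hand and keep its exit code and stderr. The sketch below assumes nothing about the script's arguments (the report does not list them), so it uses a stand-in command that fails with a missing import, which is one way a Python environment problem would show up. `run_and_capture` is a hypothetical helper name. To diagnose the real failure, the stand-in command would be replaced with the script path and whatever arguments the worker passes.

```python
import subprocess
import sys
from typing import List, Tuple


def run_and_capture(cmd: List[str], timeout: float = 60.0) -> Tuple[int, str]:
    """Run a command and return (exit_code, stderr_text)."""
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except OSError as exc:
        # The process could not be started at all (missing file, no permission, ...).
        return 127, str(exc)
    except subprocess.TimeoutExpired:
        return 124, f"timed out after {timeout} s"
    return proc.returncode, proc.stderr.strip()


# Stand-in for the real invocation: a Python process that hits a missing module.
code, err = run_and_capture([sys.executable, "-c", "import a_module_that_is_not_installed"])
print("exit code:", code)
print("last stderr line:", err.splitlines()[-1] if err else "(empty)")
```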
---

### Why ASRX Did Not Run

**processor_results**: no record

**Possible causes**:
1. The ASRX processor is not in processor_list
2. The Job Worker never triggered ASRX
3. An ASRX dependency was not satisfied

---

## OCR Output Structure

```json
{
  "frame_count": 412343,
  "fps": 59.94,
  "frames": [
    {
      "frame": 29,
      "timestamp": 0.484,
      "texts": [
        {
          "text": "1",
          "x": 1840,
          "y": 366,
          "width": 86,
          "height": 168,
          "confidence": 0.579
        }
      ]
    }
  ]
}
```

**Statistics**:
- Total frames: 412343
- Frames with OCR detections: 13728 (3.3%)
- FPS: 59.94

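
The statistics above can be reproduced from the fields of the output file. The sketch below assumes the structure shown in the JSON sample, and that `frames` lists only frames in which text was detected. The helper names are hypothetical, and a one-frame inline sample stands in for the 13 MB file.

```python
import json


def ocr_coverage(doc: dict) -> float:
    """Fraction of all frames that have an entry in `frames`."""
    return len(doc["frames"]) / doc["frame_count"]


def frame_to_seconds(frame: int, fps: float) -> float:
    """Convert a frame index to a timestamp in seconds."""
    return frame / fps


sample = json.loads("""
{"frame_count": 412343, "fps": 59.94,
 "frames": [{"frame": 29, "timestamp": 0.484, "texts": []}]}
""")

# The sample frame's stored timestamp agrees with frame / fps.
first = sample["frames"][0]
print(round(frame_to_seconds(first["frame"], sample["fps"]), 3))  # 0.484

# Coverage using the counts quoted in the report: 13728 of 412343 frames.
print(f"{13728 / 412343:.1%}")  # 3.3%
```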
---

## Face Data Verification

### face_detections table

```sql
SELECT file_uuid, COUNT(*), MIN(frame_number), MAX(frame_number)
FROM dev.face_detections
WHERE file_uuid = '384b0ff44aaaa1f14cb2cd63b3fea966'
GROUP BY file_uuid;  -- needed because file_uuid is selected next to aggregates

-- Result:
--   file_uuid:   384b0ff44aaaa1f14cb2cd63b3fea966
--   count:       78
--   frame_range: 1798 - 88102
```

**Analysis**:
- Frames with detections: 78 (0.09% of the 88102 frames up to the last detection; the video has 412343 frames in total)
- Sparse distribution (possibly limited to specific scenes)

### Face data origin

**Possible origins**:
1. An older Face processor version (writing directly to face_detections)
2. Manual import
3. The Face processor ran but did not generate JSON output

**Verification**: check face_detections.created_at

```sql
SELECT MIN(created_at), MAX(created_at)
FROM dev.face_detections
WHERE file_uuid = '384b0ff44aaaa1f14cb2cd63b3fea966';

-- Result: not yet queried
```

---

## Worker Status

### Running processes

```bash
ps aux | grep momentry

# Found:
#   PID 309:   target/release/momentry worker --max-concurrent 2
#   PID 24478: target/release/momentry server --port 3002
```

**Status**: the worker is running ✅

### Jobs queue

```sql
SELECT id, status, rule FROM dev.jobs WHERE asset_uuid = '384b0ff44aaaa1f14cb2cd63b3fea966';

-- Result:
--   2 jobs QUEUED (rule1)
```

**Problem**: the Rule 1 jobs are still queued and have not been executed

---

## Root Cause Analysis

### 1. processor_results records are inaccurate

**Symptoms**:
- OCR: `failed`, but the output exists
- Face: `failed`, but face_detections contains data

**Likely causes**:
- A flaw in the processor_results write logic
- Inaccurate error capture
- chunks_produced is not counted

---

### 2. Face data write path is inconsistent

**Symptoms**:
- The Face processor writes directly to face_detections
- No JSON output file is generated
- processor_results records a failure

**Impact**:
- Rule 1 can read face_detections ✅
- Reprocessing is not possible (no output file)

---

### 3. YOLO/ASRX processors did not succeed

**YOLO**: script execution failed
**ASRX**: not in processor_list

**Impact**:
- Rule 1 lacks YOLO objects
- Rule 1 lacks Speaker ID

---

## Solutions

### Short-term

**1. Use the existing data**
- ASR: ✅ usable (3664 chunks)
- Face: ✅ usable (78 rows in face_detections)
- OCR: ✅ usable (13728 frames)

**2. Run Rule 1**
- Face data source is already fixed (read from face_detections)
- YOLO objects = []
- Speaker ID = "UNKNOWN"

**3. Run ASRX manually**
- Start the ASRX processor
- Re-run Rule 1 once it completes

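
Step 2 above runs Rule 1 with placeholders for the two missing inputs. A minimal sketch of that fallback logic follows. The field names (`faces`, `objects`, `speaker_id`) and the function name are hypothetical and do not reflect the actual schema in `src/core/chunk/rule1_ingest.rs`; only the two default values (`[]` and `"UNKNOWN"`) come from this report.

```python
from typing import Any, Dict, List, Optional


def build_chunk_metadata(
    faces: List[Dict[str, Any]],
    yolo_objects: Optional[List[Dict[str, Any]]] = None,
    speaker_id: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble per-chunk metadata, substituting fallbacks for missing inputs."""
    return {
        "faces": faces,
        # YOLO failed, so no objects are available yet.
        "objects": yolo_objects if yolo_objects is not None else [],
        # ASRX has not run, so the speaker is unknown.
        "speaker_id": speaker_id if speaker_id is not None else "UNKNOWN",
    }


meta = build_chunk_metadata(faces=[{"frame_number": 1798}])
print(meta["objects"], meta["speaker_id"])
```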
---

### Mid-term

**1. Fix processor_results recording**
- Review error capture in the OCR and Face processors
- Update the chunks_produced count

**2. Fix the Face output file**
- The Face processor should generate JSON output
- Unify the write path

**3. Fix the YOLO processor**
- Check the Python environment
- Check the YOLO model

---

### Long-term

**1. Standardize processor output**
- Every processor generates JSON output
- Unified output path
- chunks_produced counted correctly

**2. Monitor processor status**
- Periodically check the accuracy of processor_results
- Automatically repair contradictory records

---

## Next Steps

### Do immediately

1. **Test Rule 1**
   - Run Rule 1 processing
   - Verify the chunks metadata (Face data)

2. **Run ASRX manually**
   - Check whether the ASRX processor can be run manually
   - Update Rule 1 once it completes

---

### Investigation tasks

1. **Face data origin**
   - Query face_detections.created_at
   - Determine when the rows were written

2. **YOLO failure cause**
   - Check the Python environment
   - Run yolo_processor.py manually

3. **Why ASRX did not run**
   - Check the processor_list configuration
   - Confirm the ASRX trigger conditions

---

## Related Files

| File | Description |
|------|-------------|
| `docs_v1.0/RULE1_FACE_DATA_SOURCE_FIX.md` | Face data source fix |
| `docs_v1.0/RULE1_CHUNK_INGESTION_CHECK.md` | Rule 1 problem analysis |
| `docs_v1.0/RULE1_TRIGGER_MECHANISM.md` | Rule 1 trigger mechanism |
| `src/core/chunk/rule1_ingest.rs` | Face data source already fixed |

---

## Conclusion

**Available data**:
- ✅ ASR (3664 segments)
- ✅ CUT (1332 segments)
- ✅ Face (78 detections, data source fixed)
- ⚠️ OCR (13728 frames, processor_results status contradictory)

**Missing data**:
- ❌ YOLO (processor failed)
- ❌ ASRX (not run)

**Recommendation**: run Rule 1 first to test the Face data fix, then resolve the YOLO/ASRX issues.