---
title: Charade Full Movie Pipeline Checklist
version: 1.0
date: 2026-05-27
author: M5Max48
status: in_progress
---

# Charade Full Movie Pipeline Checklist

**File UUID**: `c3c635e3641da80dde10cc555ffcdda5`
**File Name**: `Charade (1963) Cary Grant & Audrey Hepburn | Comedy Mystery Romance Thriller | Full Movie.mp4`
**Duration**: 6785 seconds (113 minutes)
**Total Frames**: 169,625

---

## P0: Processor Outputs

### Purpose

Raw processor output files, stored in `/Users/accusys/momentry/output_dev/`. They are the data source for the later ingestion stages.

### Processor Details

| Processor | Expected Output | Size Estimate | Purpose | Status |
|-----------|-----------------|---------------|---------|--------|
| CUT | `c3c635e3641da80dde10cc555ffcdda5.cut.json` | ~170KB | Scene boundary detection; the cut points feed Rule 3 chunking | ✅ Done |
| YOLO | `c3c635e3641da80dde10cc555ffcdda5.yolo.json` | ~50-80MB | Object detection: object class and position per frame | 🔄 Running |
| Face | `c3c635e3641da80dde10cc555ffcdda5.face.json` | ~1.5GB | Face detection + 512-dim embedding (FaceNet CoreML) | 🔄 44% |
| Face Traced | `c3c635e3641da80dde10cc555ffcdda5.face_traced.json` | ~1.2GB | Face tracking: consecutive appearances of the same person → `trace_id` | ⏳ Pending (after Face) |
| OCR | `c3c635e3641da80dde10cc555ffcdda5.ocr.json` | ~50KB | Text recognition from frames | ❌ Skipped |
| Pose | `c3c635e3641da80dde10cc555ffcdda5.pose.json` | ~20MB | Body pose estimation | 🔄 Running |
| ASRX | `c3c635e3641da80dde10cc555ffcdda5.asrx.json` | ~8MB | Speaker diarization (speaker segmentation) | ✅ Done (reused from public) |
| Visual Chunk | `c3c635e3641da80dde10cc555ffcdda5.visual_chunk.json` | ~60KB | Visual scene chunk metadata | ✅ Done |
| Scene | `c3c635e3641da80dde10cc555ffcdda5.scene.json` | ~300B | Scene list from CUT | ✅ Done |
| Scene Meta | `c3c635e3641da80dde10cc555ffcdda5.scene_meta.json` | ~50KB | Heuristic scene metadata (person + object statistics) | ⏳ Pending |
| Story LLM | `c3c635e3641da80dde10cc555ffcdda5.story_llm.json` | ~800KB | LLM-generated story summaries per chunk | ✅ Done |
| Story Story | `c3c635e3641da80dde10cc555ffcdda5.story_story.json` | ~800KB | Story parent-child relationships | ✅ Done |
| TMDb | `c3c635e3641da80dde10cc555ffcdda5.tmdb.json` | ~5KB | TMDb cast list with face embeddings | ⏳ Pending |
| 5W1H | `c3c635e3641da80dde10cc555ffcdda5.5w1h.json` | ~500KB | 5W1H agent output (who/when/where/what/why/how) | ✅ Done |

### Key Dependencies

- Face Traced can run only after Face completes (`face_traced.json` = `face.json` + tracking)
- Scene Meta requires both Face and YOLO to complete
- TMDb matching runs only after Face Traced completes
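
The dependencies listed above form a small directed graph, so a valid run order can be derived with a topological sort. The following is a minimal sketch using Python's standard `graphlib`; the lowercase processor keys and the `run_order` helper are illustrative names, and treating every unlisted processor as independent is an assumption.

```python
from graphlib import TopologicalSorter

# Processor -> set of processors that must finish first.
# Only the three dependencies listed above are encoded; treating
# every other processor as independent is an assumption.
DEPENDENCIES = {
    "face_traced": {"face"},
    "scene_meta": {"face", "yolo"},
    "tmdb": {"face_traced"},
}


def run_order(dependencies):
    """Return one valid execution order (prerequisites come first)."""
    return list(TopologicalSorter(dependencies).static_order())


order = run_order(DEPENDENCIES)
print(order)
```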

---

## P1: Database Records

### Purpose

Store the processor outputs in PostgreSQL so that API queries can read them.

### Table Details

| Table | Expected Records | Purpose | Verification Query | Status |
|-------|------------------|---------|-------------------|--------|
| `dev.videos` | 1 row | Video metadata (duration, fps, status) | `SELECT file_uuid, status FROM dev.videos WHERE file_uuid = 'c3c635e3641da80dde10cc555ffcdda5'` | ✅ Registered |
| `dev.monitor_jobs` | 1 row | Processing job state machine | `SELECT uuid, status, completed_processors FROM dev.monitor_jobs WHERE uuid = 'c3c635e3641da80dde10cc555ffcdda5'` | 🔄 Running |
| `dev.pre_chunks` | ~7,000 rows | Raw processor outputs (ASR sentences, YOLO objects, etc.) | `SELECT COUNT(*) FROM dev.pre_chunks WHERE file_uuid = 'c3c635e3641da80dde10cc555ffcdda5'` | ⏳ Pending |
| `dev.face_detections` | ~70,000 rows | Face detection records (one row per face per frame) | `SELECT COUNT(*) FROM dev.face_detections WHERE file_uuid = 'c3c635e3641da80dde10cc555ffcdda5'` | ⏳ Pending |
| `dev.face_detections.embedding` | ~70,000 non-NULL | 512-dim FaceNet embedding (used for identity matching) | `SELECT COUNT(embedding) FROM dev.face_detections WHERE file_uuid = 'c3c635e3641da80dde10cc555ffcdda5'` | ⏳ Pending |
| `dev.face_detections.trace_id` | ~70,000 non-NULL | Face tracking ID (the same person across consecutive frames) | `SELECT COUNT(trace_id) FROM dev.face_detections WHERE file_uuid = 'c3c635e3641da80dde10cc555ffcdda5'` | ⏳ Pending |
| `dev.face_detections.identity_id` | ~50,000 non-NULL | TMDb identity binding (Audrey, Cary, etc.) | `SELECT COUNT(identity_id) FROM dev.face_detections WHERE file_uuid = 'c3c635e3641da80dde10cc555ffcdda5'` | ⏳ Pending |

### Key Points

- `embedding` must be non-NULL before TMDb matching can run (see the earlier `store_traced_faces.py` bug fix)
- `trace_id` is computed by `store_traced_faces.py` from `face_traced.json`
- `identity_id` is computed by `match_faces_to_tmdb.py` (cosine similarity > 0.5)
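
The per-column verification queries rely on `COUNT(column)` counting only non-NULL values, while `COUNT(*)` counts every row. The sketch below demonstrates that difference with an in-memory SQLite database standing in for PostgreSQL (an assumption made so the example runs anywhere); the three-column table is a simplified stand-in for `dev.face_detections`, not its real schema.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL (assumption for the demo);
# COUNT(*) vs COUNT(column) follows the same SQL semantics in both.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE face_detections (file_uuid TEXT, embedding BLOB, trace_id INTEGER)"
)
uuid = "c3c635e3641da80dde10cc555ffcdda5"
rows = [
    (uuid, b"\x01", 1),     # fully populated row
    (uuid, b"\x02", None),  # embedding stored, not yet traced
    (uuid, None, None),     # embedding missing: cannot be matched
]
conn.executemany("INSERT INTO face_detections VALUES (?, ?, ?)", rows)

total, with_embedding, with_trace = conn.execute(
    "SELECT COUNT(*), COUNT(embedding), COUNT(trace_id) "
    "FROM face_detections WHERE file_uuid = ?",
    (uuid,),
).fetchone()
missing_embeddings = total - with_embedding
print(total, with_embedding, with_trace, missing_embeddings)
```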

---

## P2: Chunk Ingestion

### Purpose

Convert the raw processor outputs into searchable chunks for RAG queries.

### Chunk Types

| Chunk Type | Expected Count | Purpose | Source | Verification Query | Status |
|------------|----------------|---------|--------|-------------------|--------|
| sentence (Rule 1) | ~1,700 | Sentence-level chunks for text search | ASR output → sentence split | `SELECT COUNT(*) FROM dev.chunk WHERE file_uuid = 'c3c635e3641da80dde10cc555ffcdda5' AND chunk_type = 'sentence'` | ⏳ Pending |
| llm_parent | ~800 | LLM-generated summary parent chunks | Story LLM output | `SELECT COUNT(*) FROM dev.chunk WHERE file_uuid = 'c3c635e3641da80dde10cc555ffcdda5' AND chunk_type = 'llm_parent'` | ⏳ Pending |
| story_parent | ~800 | Story parent chunks (narrative segments) | Story processor | `SELECT COUNT(*) FROM dev.chunk WHERE file_uuid = 'c3c635e3641da80dde10cc555ffcdda5' AND chunk_type = 'story_parent'` | ⏳ Pending |
| story_child | ~1,700 | Story child chunks (linked to sentence) | Story processor | `SELECT COUNT(*) FROM dev.chunk WHERE file_uuid = 'c3c635e3641da80dde10cc555ffcdda5' AND chunk_type = 'story_child'` | ⏳ Pending |
| cut (Rule 3) | ~500 | Scene-level chunks for scene search | CUT output → scene boundaries | `SELECT COUNT(*) FROM dev.chunk WHERE file_uuid = 'c3c635e3641da80dde10cc555ffcdda5' AND chunk_type = 'cut'` | ⏳ Pending |
| trace | ~3,600 | Face trace chunks (identity-centric) | Face Traced output | `SELECT COUNT(*) FROM dev.chunk WHERE file_uuid = 'c3c635e3641da80dde10cc555ffcdda5' AND chunk_type = 'trace'` | ⏳ Pending |

### Ingestion Pipeline

1. **Rule 1**: ASR → sentence split → chunk + embedding → Qdrant
2. **Rule 3**: CUT + ASR → scene chunks → chunk + embedding → Qdrant
3. **Trace**: Face Traced → trace chunks → TKG nodes → Qdrant
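
Rule 3 combines CUT boundaries with ASR sentences. One plausible way to do that grouping is to assign each sentence to the scene whose start time most recently precedes it. This is only a sketch: the input shapes (a sorted list of cut times in seconds and `(start_time, text)` tuples) and the function name are assumptions, since the real `cut.json` and ASR formats are not shown here, and two of the sample sentences are invented.

```python
from bisect import bisect_right
from collections import defaultdict


def group_sentences_by_scene(cut_times, sentences):
    """Group (start_time, text) sentences into scenes.

    cut_times: sorted scene start times in seconds, first entry 0.0.
    Returns {scene_index: "joined sentence text"}.
    """
    scenes = defaultdict(list)
    for start_time, text in sentences:
        # Index of the last cut at or before the sentence start.
        scene_index = bisect_right(cut_times, start_time) - 1
        scenes[scene_index].append(text)
    return {idx: " ".join(texts) for idx, texts in sorted(scenes.items())}


# Hypothetical sample data, for illustration only.
cuts = [0.0, 120.0, 355.0]
asr = [(10.0, "Hello."), (130.5, "Who are you?"), (355.0, "Oh, mine's Regina Lampert.")]
chunks = group_sentences_by_scene(cuts, asr)
print(chunks)
```

With `bisect_right`, a sentence that starts exactly on a cut is assigned to the new scene rather than the previous one.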

### Key Points

- `start_frame` / `end_frame` must be computed correctly (earlier bug: frame=0)
- Chunks must have an `embedding` before they can be searched
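
The frame fields follow from chunk timestamps and the video frame rate. The header values (169,625 frames over 6785 seconds) imply exactly 25 fps. Below is a minimal sketch of the conversion; the helper names, the rounding convention, and the clamping to the last frame are assumptions, not the pipeline's confirmed behavior.

```python
TOTAL_FRAMES = 169_625
DURATION_SECONDS = 6785
FPS = TOTAL_FRAMES / DURATION_SECONDS  # 25.0 for this file


def time_to_frame(seconds, fps=FPS):
    """Convert a timestamp in seconds to a zero-based frame index."""
    return int(round(seconds * fps))


def chunk_frames(start_time, end_time, fps=FPS):
    """Return (start_frame, end_frame), clamped to the video length."""
    start = max(0, time_to_frame(start_time, fps))
    end = min(TOTAL_FRAMES - 1, time_to_frame(end_time, fps))
    return start, end


# The 357.4 s end time is a hypothetical value for illustration.
print(chunk_frames(355.0, 357.4))
```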

---

## P3: Vector Embeddings

### Purpose

Convert chunk text into 768-dim embeddings and store them in PostgreSQL and Qdrant for semantic search.

### Embedding Targets

| Target | Expected Count | Model | Purpose | Verification | Status |
|--------|----------------|-------|---------|--------------|--------|
| PostgreSQL `dev.chunk.embedding` | ~5,000 | Gemma-2-9B (768-dim) | Text semantic search | `SELECT COUNT(embedding) FROM dev.chunk WHERE file_uuid = 'c3c635e3641da80dde10cc555ffcdda5'` | ⏳ Pending |
| Qdrant `momentry_dev_rule1_v2` | ~5,000 points | Gemma-2-9B | Fast vector similarity search | `curl -H "api-key: Test3200Test3200Test3200" "http://localhost:6333/collections/momentry_dev_rule1_v2"` | ⏳ Pending |
| Qdrant `_face` collection | ~70,000 points | FaceNet-512 (512-dim) | Face identity search | Face embeddings sync via `sync_face_embeddings()` | ⏳ Pending |

### Embedding Pipeline

1. **Text chunks**: `embeddinggemma_server.py` (port 11436) → 768-dim embedding
2. **Face embeddings**: FaceNet CoreML (from face.json) → 512-dim embedding (already produced in P0)
3. **Sync to Qdrant**: `sync_face_embeddings()` function in Rust

### Key Points

- Text embeddings use Gemma-2-9B (local LLM server)
- Face embeddings use FaceNet-512 (CoreML, ANE accelerated)
- Qdrant provides fast similarity search (cosine similarity)
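
For reference, cosine similarity search can be written as a brute-force scan. Qdrant itself uses an approximate index rather than a full scan, so the sketch below only illustrates what is being ranked, not how Qdrant computes it. The 2-dim vectors and point IDs are toy values; real embeddings here are 768-dim or 512-dim.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, points, k=5):
    """Return the k (point_id, score) pairs most similar to the query."""
    scored = [(pid, cosine_similarity(query, vec)) for pid, vec in points.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:k]


# Toy vectors for illustration only.
points = {"a": [1.0, 0.0], "b": [0.6, 0.8], "c": [0.0, 1.0]}
hits = top_k([1.0, 0.0], points, k=2)
print(hits)
```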

---

## P4: Identity Binding

### Purpose

Bind detected faces to TMDb identities (Audrey Hepburn, Cary Grant, etc.) for identity_text search.

### Identity Matching Pipeline

| Step | Expected Result | Method | Verification | Status |
|------|-----------------|--------|--------------|--------|
| TMDb seeds loaded | 23 identities | `tmdb_embed_extractor.py` → TMDb profile face embeddings | `SELECT COUNT(*) FROM dev.identities WHERE source = 'tmdb' AND face_embedding IS NOT NULL` | ✅ Done |
| Face matching | ~50,000 bindings | `match_faces_to_tmdb.py` → cosine similarity > 0.5 | `SELECT COUNT(identity_id) FROM dev.face_detections WHERE file_uuid = 'c3c635e3641da80dde10cc555ffcdda5' AND identity_id IS NOT NULL` | ⏳ Pending |
| Audrey Hepburn faces | ~16,000 | Highest similarity match | `SELECT COUNT(*) FROM dev.face_detections fd JOIN dev.identities i ON fd.identity_id = i.id WHERE fd.file_uuid = 'c3c635e3641da80dde10cc555ffcdda5' AND i.name = 'Audrey Hepburn'` | ⏳ Pending |
| Cary Grant faces | ~5,000 | Second highest match | Same query with `i.name = 'Cary Grant'` | ⏳ Pending |
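### Matching Algorithm

```python
# match_faces_to_tmdb.py (simplified core loop; the data shapes are assumed)
import numpy as np

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def match_traces(traces, tmdb_identities, threshold=0.5):
    """traces: {trace_id: [embedding]}; tmdb_identities: {identity_id: embedding}."""
    matches = {}
    for trace_id, trace_faces in traces.items():
        for face_embedding in trace_faces:
            for identity_id, tmdb_embedding in tmdb_identities.items():
                if cosine_similarity(face_embedding, tmdb_embedding) > threshold:
                    matches[trace_id] = identity_id  # bind trace_id -> TMDb identity
    return matches
```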

### Key Points

- TMDb seeds need a `face_embedding` (previously verified: 23 identities with embeddings)
- Face `embedding` must be non-NULL (see the earlier `store_traced_faces.py` bug fix)
- Threshold: 0.5 (tunable)
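
The pipeline table in this section describes binding by "Highest similarity match". One way to get a single best identity per trace is to average the trace's face embeddings and keep the top-scoring identity only if it clears the threshold. This is a sketch of that idea, not a confirmed description of what `match_faces_to_tmdb.py` does; `mean_embedding`, `best_identity`, and the 2-dim toy vectors are all illustrative.

```python
import math

THRESHOLD = 0.5  # tunable cutoff, as noted in the key points


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def mean_embedding(embeddings):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    count = len(embeddings)
    return [sum(values) / count for values in zip(*embeddings)]


def best_identity(trace_faces, tmdb_identities, threshold=THRESHOLD):
    """Return (identity_name, score) for the best match, or None below threshold."""
    centroid = mean_embedding(trace_faces)
    name, score = max(
        ((n, cosine_similarity(centroid, emb)) for n, emb in tmdb_identities.items()),
        key=lambda pair: pair[1],
    )
    return (name, score) if score > threshold else None


# Toy 2-dim vectors; the real embeddings are 512-dim.
identities = {"Audrey Hepburn": [1.0, 0.0], "Cary Grant": [0.0, 1.0]}
match = best_identity([[0.9, 0.1], [0.8, 0.3]], identities)
no_match = best_identity([[-1.0, -1.0]], identities)
print(match, no_match)
```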

---

## P5: API Endpoints

### Purpose

Verify that the API endpoints correctly return identity_text search results.

### API Tests

| Endpoint | Purpose | Expected Response | Test Command | Status |
|----------|---------|-------------------|--------------|--------|
| `/api/v1/search/identity_text` | Search chunk text → identities | Results with `identity_name`, `trace_id`, `identity_source` | `curl "http://localhost:3003/api/v1/search/identity_text?file_uuid=c3c635e3641da80dde10cc555ffcdda5&q=Regina&limit=5"` | ⏳ Pending |
| `/api/v1/identities` | List identities with TMDb | Identity list with `tmdb_id`, `face_embedding` | `curl "http://localhost:3003/api/v1/identities?name=Audrey"` | ⏳ Pending |
| `/api/v1/progress/:file_uuid` | Check processing progress | JSON with `status`, `completed_processors` | `curl "http://localhost:3003/api/v1/progress/c3c635e3641da80dde10cc555ffcdda5"` | ⏳ Pending |

### Expected API Response Example

```json
{
  "success": true,
  "total": 5,
  "results": [
    {
      "chunk_id": "sentence_123",
      "start_time": 355.0,
      "text_content": "Oh, mine's Regina Lampert.",
      "identity_id": 9,
      "identity_name": "Audrey Hepburn",
      "identity_source": "tmdb",
      "trace_id": 169
    }
  ]
}
```
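
A response like the one above can be checked automatically for the three fields the API test table expects. The sketch below is a hypothetical helper (`missing_fields` is not part of the project) that reports which results lack `identity_name`, `trace_id`, or `identity_source`.

```python
import json

# Fields the identity_text test expects on every result.
REQUIRED_FIELDS = {"identity_name", "trace_id", "identity_source"}


def missing_fields(response_text):
    """Return {result index: sorted missing field names} for a search response."""
    payload = json.loads(response_text)
    problems = {}
    for index, result in enumerate(payload.get("results", [])):
        missing = REQUIRED_FIELDS - set(result)
        if missing:
            problems[index] = sorted(missing)
    return problems


sample = '''{"success": true, "total": 1, "results": [
  {"chunk_id": "sentence_123", "identity_name": "Audrey Hepburn",
   "identity_source": "tmdb", "trace_id": 169}]}'''
broken = '{"success": true, "total": 1, "results": [{"chunk_id": "sentence_123"}]}'
print(missing_fields(sample), missing_fields(broken))
```

An empty dictionary means every result carries the expected identity fields; in practice the input would be the body returned by the `curl` command in the table.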

### Key Points

- The `identity_text` API needs correct `chunk.start_frame` / `chunk.end_frame` values (earlier bug: frame=0)
- `identity_id` must be non-NULL for `identity_name` to be returned
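
To see why both conditions matter, consider how a chunk could be linked to identities through its frame range. The sketch assumes the link is a frame-range overlap between a chunk and face detections; the actual join used by the API is not shown in this checklist, and the `frame` key, the helper name, and the sample detections are hypothetical.

```python
from collections import Counter


def identities_in_chunk(chunk, detections):
    """Count identity names among detections inside the chunk's frame range.

    chunk: dict with start_frame and end_frame (inclusive).
    detections: iterable of dicts with frame and identity_name (None if unbound).
    """
    counts = Counter(
        d["identity_name"]
        for d in detections
        if chunk["start_frame"] <= d["frame"] <= chunk["end_frame"]
        and d["identity_name"] is not None
    )
    return counts.most_common()


# Hypothetical detections for illustration.
detections = [
    {"frame": 8880, "identity_name": "Audrey Hepburn"},
    {"frame": 8883, "identity_name": "Audrey Hepburn"},
    {"frame": 8886, "identity_name": None},  # detection with no identity binding
    {"frame": 9500, "identity_name": "Cary Grant"},
]
good_chunk = {"start_frame": 8875, "end_frame": 8935}
buggy_chunk = {"start_frame": 0, "end_frame": 0}  # the earlier frame=0 bug
print(identities_in_chunk(good_chunk, detections))
print(identities_in_chunk(buggy_chunk, detections))
```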

---

## P6: Completion Criteria

### Purpose

Verify that the pipeline has fully completed and that every ingestion step succeeded.

### Final Verification Checklist

| Criteria | Purpose | Check Command | Expected Result | Status |
|----------|---------|---------------|-----------------|--------|
| All processor outputs exist | Confirm that all processor JSON files were generated | `ls -la output_dev/c3c635e3641da80dde10cc555ffcdda5.*` | 14+ files with size > 0 | ⏳ Pending |
| Job status = completed | Confirm that the worker finished the job | `SELECT status FROM dev.monitor_jobs WHERE uuid = 'c3c635e3641da80dde10cc555ffcdda5'` | `completed` | ⏳ Pending |
| Video status = completed | Confirm that the video state was updated | `SELECT status FROM dev.videos WHERE file_uuid = 'c3c635e3641da80dde10cc555ffcdda5'` | `completed` | ⏳ Pending |
| All chunks have embeddings | Confirm that text embeddings are complete | `SELECT COUNT(*) = COUNT(embedding) FROM dev.chunk WHERE file_uuid = 'c3c635e3641da80dde10cc555ffcdda5'` | `true` (all chunks have embedding) | ⏳ Pending |
| Face traces assigned | Confirm that face tracking is complete | `SELECT COUNT(*) = COUNT(trace_id) FROM dev.face_detections WHERE file_uuid = 'c3c635e3641da80dde10cc555ffcdda5'` | `true` (all faces have trace_id) | ⏳ Pending |
| TMDb matching done | Confirm that identity binding is complete | `SELECT COUNT(identity_id) > 40000 FROM dev.face_detections WHERE file_uuid = 'c3c635e3641da80dde10cc555ffcdda5'` | `true` (> 40K identity bindings) | ⏳ Pending |
| Qdrant synced | Confirm that vector search is ready | Check Qdrant points count | Points increased by ~5,000 | ⏳ Pending |

### Success Thresholds

- **Face detections**: ~70,000 (169,625 frames sampled every 3rd frame is about 56,500 sampled frames, and a frame can contribute more than one face)
- **Identity bindings**: > 40,000 (roughly a 60% match rate)
- **Chunks with embeddings**: > 4,000 (all chunk types)
- **Qdrant points**: > 90,000 (current) → > 95,000 (after Charade)
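
The thresholds above lend themselves to a simple automated check. The sketch below is illustrative: the metric names and the `failed_checks` helper are assumptions, the sample counts are hypothetical, and the real counts would come from the SQL and Qdrant checks in the checklist table.

```python
# Minimum values each metric must exceed (taken from the thresholds above).
THRESHOLDS = {
    "identity_bindings": 40_000,
    "chunks_with_embeddings": 4_000,
    "qdrant_points": 95_000,
}


def sampled_frames(total_frames, sample_interval):
    """Number of frames inspected when every Nth frame is sampled."""
    return -(-total_frames // sample_interval)  # ceiling division


def failed_checks(counts, thresholds=THRESHOLDS):
    """Return the names of metrics that do not exceed their threshold."""
    return [name for name, minimum in thresholds.items() if counts.get(name, 0) <= minimum]


frames = sampled_frames(169_625, 3)
# Hypothetical counts for illustration.
result = failed_checks(
    {"identity_bindings": 50_000, "chunks_with_embeddings": 3_900, "qdrant_points": 95_500}
)
print(frames, result)
```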

---

## Verification Script

```bash
# Run after completion
./scripts/verify_charade_pipeline.sh c3c635e3641da80dde10cc555ffcdda5
```

---

## Notes

- OCR processor failed, skipped
- Face detection using SwiftFace (ANE accelerated)
- TMDb matching using `scripts/match_faces_to_tmdb.py`
- Expected total processing time: ~2-3 hours

---

## Version History

| Version | Date | Author | Changes |
|---------|------|--------|---------|
| 1.0 | 2026-05-27 | M5Max48 | Initial checklist |