Video Processing Pipeline - Processing Flow

Item	Content
Created by	Warren
Created on	2026-03-22
Document version	V1.2

Version History

Version	Date	Purpose	Operator	Tool/Model
V1.0	2026-03-22	Create document	Warren	OpenCode
V1.1	2026-03-26	Update flowchart text (media_url→file_path)	OpenCode	deepseek-reasoner
V1.2	2026-04-27	Add processing_status field description	OpenCode	GLM-5

Pipeline Architecture

┌─────────────────────────────────────────────────────────────────────────────┐
│                          Video Processing Pipeline                          │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  ┌─────────────────────────────────────────────────────────────────────┐  │
│  │  Stage 1: JSON generation (Process)                                   │  │
│  │                                                                       │  │
│  │  video.mp4 ──→ [ASR]  ──→ asr.json   (speech recognition)             │  │
│  │            ──→ [CUT]  ──→ cut.json   (scene detection)                │  │
│  │            ──→ [ASRX] ──→ asrx.json  (speaker diarization)            │  │
│  │            ──→ [YOLO] ──→ yolo.json  (object detection)               │  │
│  │            ──→ [OCR]  ──→ ocr.json   (text recognition)               │  │
│  │            ──→ [Face] ──→ face.json  (face detection)                 │  │
│  │            ──→ [Pose] ──→ pose.json  (pose estimation)                │  │
│  └─────────────────────────────────────────────────────────────────────┘  │
│                                      ↓                                      │
│  ┌─────────────────────────────────────────────────────────────────────┐  │
│  │  Stage 2: Database import (Import)                                    │  │
│  │                                                                       │  │
│  │  .json files ──→ PostgreSQL (fs_json = true)                          │  │
│  │                      ↓                                                │  │
│  │                 pre_chunks table (from ASR, CUT)                      │  │
│  │                 frames table (from YOLO, OCR, Face, Pose)             │  │
│  └─────────────────────────────────────────────────────────────────────┘  │
│                                      ↓                                      │
│  ┌─────────────────────────────────────────────────────────────────────┐  │
│  │  Stage 3: Chunk generation (Chunk)                                    │  │
│  │                                                                       │  │
│  │  pre_chunks ──→ [Chunk Rule] ──→ chunks table                         │  │
│  │                      ↓                                                │  │
│  │              cleaning → plain text                                    │  │
│  └─────────────────────────────────────────────────────────────────────┘  │
│                                      ↓                                      │
│  ┌─────────────────────────────────────────────────────────────────────┐  │
│  │  Stage 4: Vectorization (Vectorize)                                   │  │
│  │                                                                       │  │
│  │  chunks ──→ [Embedding Model] ──→ vectors                             │  │
│  │                            ↓                                          │  │
│  │                     Qdrant (primary vector store)                     │  │
│  │                     PGVector (backup vector store)                    │  │
│  └─────────────────────────────────────────────────────────────────────┘  │
│                                      ↓                                      │
│  ┌─────────────────────────────────────────────────────────────────────┐  │
│  │  Stage 5: Search (Search)                                             │  │
│  │                                                                       │  │
│  │  Natural Language Query ──→ [Embedding] ──→ [Qdrant Search]           │  │
│  │                                    ↓                                  │  │
│  │                           results include file_path                   │  │
│  └─────────────────────────────────────────────────────────────────────┘  │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
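
The stage ordering in the diagram can be sketched as a small Rust enum. This is an illustration only: the `Stage` type and its methods are hypothetical names, not part of the Momentry codebase, and the output descriptions simply restate what the diagram shows.

```rust
/// The five pipeline stages, in the order shown in the diagram.
/// Hypothetical type for illustration; not taken from the Momentry source.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Stage {
    Process,
    Import,
    Chunk,
    Vectorize,
    Search,
}

impl Stage {
    /// Returns the stage that follows this one, or `None` after `Search`.
    pub fn next(self) -> Option<Stage> {
        match self {
            Stage::Process => Some(Stage::Import),
            Stage::Import => Some(Stage::Chunk),
            Stage::Chunk => Some(Stage::Vectorize),
            Stage::Vectorize => Some(Stage::Search),
            Stage::Search => None,
        }
    }

    /// Names what each stage produces, as drawn in the diagram.
    pub fn output(self) -> &'static str {
        match self {
            Stage::Process => "per-module .json files",
            Stage::Import => "pre_chunks and frames rows in PostgreSQL",
            Stage::Chunk => "chunks rows (cleaned plain text)",
            Stage::Vectorize => "vectors in Qdrant and PGVector",
            Stage::Search => "results including file_path",
        }
    }
}
```

Walking `Stage::Process.next()` repeatedly visits the stages in diagram order and ends with `None` after `Search`.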

CLI Commands

Stage 1: JSON Generation (Process)

# Basic usage
cargo run --bin momentry -- process <uuid_or_path>

# Process only specific modules
cargo run --bin momentry -- process <uuid> --modules asr,cut

# Force reprocessing (skip the completeness check)
cargo run --bin momentry -- process <uuid> --force

# Resume from the point of interruption
cargo run --bin momentry -- process <uuid> --resume

# Run a module in the cloud
cargo run --bin momentry -- process <uuid> --modules yolo,face --cloud yolo

# Full example
cargo run --bin momentry -- process /path/to/video.mp4 \
    --modules asr,cut,yolo,ocr \
    --cloud yolo
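
When the process command is launched from another Rust program, the argument list can be assembled programmatically. The sketch below uses only the flags shown above; `process_args` is a hypothetical helper, and joining several `--cloud` values with commas is an assumption (the examples above pass a single module to `--cloud`).

```rust
/// Builds the argument list for `cargo run --bin momentry -- process ...`.
/// Hypothetical helper; only flags documented above are emitted.
pub fn process_args(
    target: &str,      // a UUID or a path to a video file
    modules: &[&str],  // e.g. ["asr", "cut"]; empty means no --modules flag
    cloud: &[&str],    // modules to run in the cloud (comma-joining is an assumption)
    force: bool,
    resume: bool,
) -> Vec<String> {
    let mut args: Vec<String> = ["run", "--bin", "momentry", "--", "process", target]
        .iter()
        .map(|s| s.to_string())
        .collect();
    if !modules.is_empty() {
        args.push("--modules".to_string());
        args.push(modules.join(","));
    }
    if !cloud.is_empty() {
        args.push("--cloud".to_string());
        args.push(cloud.join(","));
    }
    if force {
        args.push("--force".to_string());
    }
    if resume {
        args.push("--resume".to_string());
    }
    args
}
```

The returned vector can be passed to `std::process::Command::new("cargo").args(&args)` to start the run.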

Stage 2: Database Import (Import)

# Import currently runs automatically after process completes
# A standalone import command is planned
# cargo run --bin momentry -- import <uuid>

Stage 3: Chunk Generation

# Generate chunks
cargo run --bin momentry -- chunk <uuid>

Stage 4: Vectorization

# Vectorize chunks (uses the default model nomic-embed-text-v2-moe:latest)
cargo run --bin momentry -- vectorize <uuid>

# Specify the model explicitly
cargo run --bin momentry -- vectorize <uuid> --model nomic-embed-text-v2-moe:latest

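Processing Mode Options

--force (force reprocessing)

Deletes the existing JSON files
Processes from scratch
Use when: processing failed, the model was updated, or the output must be regenerated

# Force reprocessing of YOLO
cargo run --bin momentry -- process <uuid> --modules yolo --force

--resume (resume)

Checks the progress recorded in the existing JSON
Continues from the point of interruption
Use when: processing was interrupted or the system crashed

# Continue from the last interruption point
cargo run --bin momentry -- process <uuid> --resume

Default Behavior (Smart Mode)

If the JSON is complete: skip
If the JSON is incomplete: warn and skip (requires --resume or --force)
If the JSON does not exist: process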

Output:
ASR: ✓ Already complete, skipping

⚠️  Found incomplete JSON file: /path/to/yolo.json
   Progress: 73800/412343 (17.9%)
   Use --resume to continue from checkpoint
   Use --force to reprocess from scratch
YOLO: ✓ Already complete, skipping
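
The Smart Mode rules and the two flags can be summarized as one decision function. This is a minimal sketch, not the actual implementation: the type and function names are hypothetical, and two behaviors are assumptions that the text does not state, namely that `--force` wins when both flags are given and that `--resume` on an already complete JSON still skips.

```rust
/// State of a module's JSON output on disk (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum JsonState {
    Missing,
    Incomplete,
    Complete,
}

/// What the pipeline does for one module (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Action {
    ProcessFromScratch,
    ResumeFromCheckpoint,
    SkipComplete,
    WarnAndSkip,
}

/// Decides the action for one module from its JSON state and the CLI flags.
pub fn decide(state: JsonState, force: bool, resume: bool) -> Action {
    // Assumption: --force takes precedence and always reprocesses.
    if force {
        return Action::ProcessFromScratch;
    }
    match (state, resume) {
        (JsonState::Missing, _) => Action::ProcessFromScratch,
        // Assumption: a complete JSON is skipped even with --resume.
        (JsonState::Complete, _) => Action::SkipComplete,
        (JsonState::Incomplete, true) => Action::ResumeFromCheckpoint,
        (JsonState::Incomplete, false) => Action::WarnAndSkip,
    }
}
```

With no flags, an incomplete `yolo.json` yields `WarnAndSkip`, which matches the warning shown in the sample output.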

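Available Modules

Module	Function	Output	Purpose
asr	Automatic speech recognition	asr.json	Speech to text
cut	Scene detection	cut.json	Video segmentation
asrx	Speaker diarization	asrx.json	Multi-speaker conversation analysis
yolo	Object detection	yolo.json	Object recognition
ocr	Text recognition	ocr.json	On-screen text
face	Face detection	face.json	Face recognition
pose	Pose estimation	pose.json	Human pose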
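
A value such as `asr,cut` passed to `--modules` maps naturally onto the table above. The following sketch shows one way to parse it and look up each module's output file; the `Module` enum and `parse_modules` function are hypothetical names, and trimming whitespace and ignoring case are assumptions about how lenient the parser should be.

```rust
use std::str::FromStr;

/// The seven processing modules listed in the table (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Module {
    Asr,
    Cut,
    Asrx,
    Yolo,
    Ocr,
    Face,
    Pose,
}

impl Module {
    /// Output file name for the module, as listed in the table.
    pub fn output_file(self) -> &'static str {
        match self {
            Module::Asr => "asr.json",
            Module::Cut => "cut.json",
            Module::Asrx => "asrx.json",
            Module::Yolo => "yolo.json",
            Module::Ocr => "ocr.json",
            Module::Face => "face.json",
            Module::Pose => "pose.json",
        }
    }
}

impl FromStr for Module {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // Assumption: names are matched case-insensitively after trimming.
        match s.trim().to_ascii_lowercase().as_str() {
            "asr" => Ok(Module::Asr),
            "cut" => Ok(Module::Cut),
            "asrx" => Ok(Module::Asrx),
            "yolo" => Ok(Module::Yolo),
            "ocr" => Ok(Module::Ocr),
            "face" => Ok(Module::Face),
            "pose" => Ok(Module::Pose),
            other => Err(format!("unknown module: {other}")),
        }
    }
}

/// Parses a comma-separated module list; fails on the first unknown name.
pub fn parse_modules(arg: &str) -> Result<Vec<Module>, String> {
    arg.split(',').map(str::parse::<Module>).collect()
}
```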

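Embedding Model Selection

Dedicated Embedding Model

Momentry Core uses nomic-embed-text-v2-moe:latest as the single embedding model for all rules:

# Single model (used by Rules 1/2/3)
--model nomic-embed-text-v2-moe:latest

Model Characteristics

Characteristic	Description
Model name	`nomic-embed-text-v2-moe:latest`
Vector dimensions	768
Multilingual support	✅ Full support (English, Chinese, Japanese, Korean, and others)
Model architecture	Mixture of Experts (MoE)
Inference speed	Fast; suitable for real-time applications

Usage

// Usage in Rust code
let embedder = Embedder::new("nomic-embed-text-v2-moe:latest".to_string());

// Document embedding (for storage)
let document_vector = embedder.embed_document("document text").await?;

// Query embedding (for search)
let query_vector = embedder.embed_query("search query").await?;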

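Database Storage

PostgreSQL (primary relational database)

Video information
Chunks data
Pre-chunks data
Frames data
User data

Qdrant (primary vector database)

Chunk vectors
Similarity search

PGVector (backup vector database)

Chunk vector copies
Failover mechanism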
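
Similarity search over chunk vectors compares a query vector against stored vectors with a distance metric. The sketch below uses cosine similarity purely as an illustration; which metric the Qdrant collection is actually configured with is not stated in this document, and `cosine_similarity` is a hypothetical helper rather than a Qdrant or Momentry API.

```rust
/// Cosine similarity between two vectors of equal length.
/// Returns `None` for empty, mismatched, or zero-length (zero-norm) inputs.
/// Illustrative only; the metric used by the real collection is an assumption.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.is_empty() || a.len() != b.len() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}
```

In practice both inputs would be 768-dimensional vectors produced by the embedding model, one from a stored chunk and one from the query.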

Pipeline Status Tracking

PostgreSQL Status Columns

-- Video processing status (basic status)
videos.status: 'pending' | 'processing' | 'completed' | 'failed'

-- Video processing status (detailed status)
videos.processing_status: 'REGISTERED' | 'PENDING' | 'PROBING' | 'ASR' | 'OCR' | 'YOLO' | 'FACE' | 'POSE' | 'CUT' | 'ASRX' | 'COMPLETED' | 'FAILED' | 'PAUSED' | 'RESUMING'

-- Notes:
-- status: basic status, used for API query filtering (is_processed=true → status='completed')
-- processing_status: detailed status, used for Portal display and job tracking

-- File processing status
videos.fs_json: true/false
videos.fs_chunks: true/false
videos.fs_vectors: true/false

-- pre_chunks status
pre_chunks.imported: true/false

-- frames status
frames.imported: true/false

-- chunks status
chunks.cleaned: true/false
chunks.vectorized: true/false
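
The three `videos.fs_*` flags are enough to tell which CLI stage a video still needs. The following is a minimal sketch under the assumption that the flags are set strictly in pipeline order (`fs_json`, then `fs_chunks`, then `fs_vectors`); `FileStatus`, `NextStep`, and `next_step` are hypothetical names.

```rust
/// Mirror of the three file-status columns on the videos table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct FileStatus {
    pub fs_json: bool,
    pub fs_chunks: bool,
    pub fs_vectors: bool,
}

/// The CLI stage that should run next (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NextStep {
    Process,
    Chunk,
    Vectorize,
    Done,
}

/// Picks the first stage whose flag is still false.
/// Assumption: flags are set in pipeline order, so the first false flag wins.
pub fn next_step(status: &FileStatus) -> NextStep {
    if !status.fs_json {
        NextStep::Process
    } else if !status.fs_chunks {
        NextStep::Chunk
    } else if !status.fs_vectors {
        NextStep::Vectorize
    } else {
        NextStep::Done
    }
}
```

Because import currently runs automatically after process, a false `fs_json` maps to the process command rather than to a separate import step.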

Progress Query API

# Query processing progress
curl http://localhost:3002/api/v1/progress/{uuid}

# Example response
{
  "uuid": "a1b10138a6bbb0cd",
  "file_name": "video.mp4",
  "overall_progress": 65,
  "cpu_percent": 45.2,
  "gpu_percent": 98.5,
  "memory_mb": 8500,
  "processors": [
    {"name": "asr", "status": "complete", "progress": 100},
    {"name": "cut", "status": "complete", "progress": 100},
    {"name": "yolo", "status": "progress", "progress": 45},
    {"name": "ocr", "status": "pending", "progress": 0}
  ]
}

Agent Progress Tracking (since V1.2)

Since V1.2, agent tasks are tracked through the agents field of the processing_status JSONB.

Agent Progress Fields

Agent	JSONB path	Description
5W1H	`processing_status->agents->5w1h`	Scene summary agent
Translation	`processing_status->agents->translation`	Translation agent

Agent State Structure

{
  "agents": {
    "5w1h": {
      "status": "running",
      "scenes_processed": 5,
      "scenes_total": 1332,
      "progress_pct": 0.4,
      "started_at": "2026-04-27T05:45:00Z"
    }
  }
}
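
The `progress_pct` value in the example is consistent with the scene counts: 5 of 1332 scenes is about 0.375%, which rounds to 0.4. A sketch of that calculation follows; rounding to one decimal place is an assumption inferred from the example, and `progress_pct` here is a hypothetical helper function, not the agent's actual code.

```rust
/// Percentage of scenes processed, rounded to one decimal place.
/// Returns 0.0 when the total is zero to avoid dividing by zero.
/// Assumption: one-decimal rounding, inferred from the example value.
pub fn progress_pct(scenes_processed: u64, scenes_total: u64) -> f64 {
    if scenes_total == 0 {
        return 0.0;
    }
    let pct = scenes_processed as f64 / scenes_total as f64 * 100.0;
    (pct * 10.0).round() / 10.0
}
```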

Querying Agent Progress with SQL

SELECT 
  uuid,
  processing_status->'agents'->'5w1h'->>'status' as status,
  processing_status->'agents'->'5w1h'->>'scenes_processed' as processed
FROM videos 
WHERE processing_status->'agents'->'5w1h'->>'status' = 'running';

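For the detailed specification, see: REFERENCE/PROCESSING_STATUS_JSONB_SPEC.md

Next Steps

API endpoints - support the --modules and --cloud parameters
Standalone import command - separate the import step
Standalone chunk command - separate chunk generation
Standalone vectorize command - separate vectorization
Model management - add, select, and preview models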
