docs: update docs_v1.0/ documentation

- Fix markdown lint issues (MD030, MD047, MD051, MD028, MD005)
- Update AI agents, architecture, implementation docs
- Add new identity, face recognition, and API documentation
- Remove deprecated face/person API guides
Warren
2026-04-30 15:10:41 +08:00
parent 8f05a7c188
commit 4d75b2e251
185 changed files with 21071 additions and 1605 deletions
@@ -465,16 +465,16 @@ class UnifiedAudioProcessor:
 ```python
 # Mac Studio: run multiple processors in parallel
 class ParallelVideoProcessor:
-    def process_all(self, video_uuid):
+    def process_all(self, file_uuid):
         # Run all processors concurrently
         with ThreadPoolExecutor(max_workers=8) as executor:
             futures = {
-                "audio": executor.submit(self.run_asrx, video_uuid),
-                "ocr": executor.submit(self.run_ocr, video_uuid),
-                "yolo": executor.submit(self.run_yolo, video_uuid),
-                "face": executor.submit(self.run_face, video_uuid),
-                "pose": executor.submit(self.run_pose, video_uuid),
-                "scene": executor.submit(self.run_scene, video_uuid)
+                "audio": executor.submit(self.run_asrx, file_uuid),
+                "ocr": executor.submit(self.run_ocr, file_uuid),
+                "yolo": executor.submit(self.run_yolo, file_uuid),
+                "face": executor.submit(self.run_face, file_uuid),
+                "pose": executor.submit(self.run_pose, file_uuid),
+                "scene": executor.submit(self.run_scene, file_uuid)
             }
             return {k: f.result() for k, f in futures.items()}
@@ -486,7 +486,7 @@ class ParallelVideoProcessor:
 # New API endpoint
 POST /api/v1/process
 {
-  "video_uuid": "...",
+  "file_uuid": "...",
   "processors": ["audio"],  # always uses ASRX large
   "mode": "auto"  # or "fast" / "professional"
 }
@@ -494,7 +494,7 @@ POST /api/v1/process
 # Backward compatible
 POST /api/v1/process
 {
-  "video_uuid": "...",
+  "file_uuid": "...",
   "processors": ["asr"]  # automatically mapped to the "standard" profile
 }
 ```
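
The `process_all` hunk above fans a single `file_uuid` out to several processors on a thread pool and collects the results by name. A minimal, self-contained sketch of that pattern follows. The `run_*` bodies and the `_stub` helper are hypothetical stand-ins, since the diff only shows the method names; only three of the six processors are included to keep it short.

```python
from concurrent.futures import ThreadPoolExecutor


class ParallelVideoProcessor:
    """Sketch of the fan-out pattern; the run_* bodies are placeholders."""

    def _stub(self, name, file_uuid):
        # Hypothetical stand-in for a real processor call (ASRX, OCR, YOLO, ...).
        return {"processor": name, "file_uuid": file_uuid, "status": "done"}

    def run_asrx(self, file_uuid):
        return self._stub("asrx", file_uuid)

    def run_ocr(self, file_uuid):
        return self._stub("ocr", file_uuid)

    def run_yolo(self, file_uuid):
        return self._stub("yolo", file_uuid)

    def process_all(self, file_uuid):
        runners = {
            "audio": self.run_asrx,
            "ocr": self.run_ocr,
            "yolo": self.run_yolo,
        }
        with ThreadPoolExecutor(max_workers=8) as executor:
            # Submit every processor first so they can run concurrently.
            futures = {key: executor.submit(fn, file_uuid) for key, fn in runners.items()}
            # result() blocks until the task finishes and re-raises its exception, if any.
            return {key: future.result() for key, future in futures.items()}


results = ParallelVideoProcessor().process_all("example-uuid")
```

Because `result()` re-raises, one failing processor aborts the whole call in this sketch; a caller that wants partial results would need to catch exceptions per future.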
@@ -618,4 +618,4 @@ python3 scripts/test_stability_24h.py
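 - Processing speed: 3-5x faster
 - User wait time: reduced by 70-80%
 - Maintenance cost: reduced by 50%
-- Feature completeness: 100% (all features enabled)
+- Feature completeness: 100% (all features enabled)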
@@ -162,7 +162,7 @@ ai_query_hints:
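 ## 💡 Usage recommendations
-### Use the in-house ASRX implementation if
+### Use the in-house ASRX implementation if
 - ✅ You need fast processing (96x real time)
 - ✅ You do not want to configure a HuggingFace token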
@@ -172,7 +172,7 @@ ai_query_hints:
 ---
-### Use pyannote.audio if
+### Use pyannote.audio if
 - ✅ You need the highest accuracy (90-95%)
 - ✅ You need to handle overlapping speech
@@ -917,4 +917,4 @@ python -c "from speechbrain.inference.speaker import EncoderClassifier; print('S
 ---
 **Changelog:**
-- 2026-04-06: V1.0 initial release; full analysis of ASR/ASRX and voiceprint models
+- 2026-04-06: V1.0 initial release; full analysis of ASR/ASRX and voiceprint models
@@ -222,4 +222,4 @@ class ASRProcessor:
 *Document Version: 1.0*
 *Last Updated: 2026-03-27*
-*Next Review: 2026-04-27*
+*Next Review: 2026-04-27*
@@ -182,4 +182,4 @@ def get_whisper_model(model_name="base"):
 *Last Updated: 2026-03-27*
 *Status: Planning Phase*
-*Owner: Warren (Technical Lead)*
+*Owner: Warren (Technical Lead)*
@@ -345,4 +345,4 @@ ASR (tiny) < ASR (base) < ASRX < ASRX (large)
 - [faster-whisper](https://github.com/SYSTRAN/faster-whisper)
 - [whisperx](https://github.com/m-bain/whisperX)
-- [pyannote.audio](https://github.com/pyannote/pyannote-audio)
+- [pyannote.audio](https://github.com/pyannote/pyannote-audio)
@@ -526,7 +526,7 @@ config/audio_profiles.json
 # API endpoint
 POST /api/v1/process
 {
-  "video_uuid": "...",
+  "file_uuid": "...",
   "processors": ["audio"],
   "audio_config": {
     "profile": "diarized"  # or a custom configuration
@@ -536,7 +536,7 @@ POST /api/v1/process
 # Backward compatible
 POST /api/v1/process
 {
-  "video_uuid": "...",
+  "file_uuid": "...",
   "processors": ["asr"]  # automatically uses the "standard" profile
 }
```
@@ -783,4 +783,4 @@ def main():
 if __name__ == "__main__":
     main()
-```
+```
@@ -422,28 +422,28 @@ impl VideoProcessor {
 # Fast transcription (default)
 POST /api/v1/process
 {
-  "video_uuid": "...",
+  "file_uuid": "...",
   "processors": ["asr"]  # uses ASR tiny
 }
 # Accurate transcription
 POST /api/v1/process
 {
-  "video_uuid": "...",
+  "file_uuid": "...",
   "processors": ["asr:medium"]
 }
 # Speaker diarization
 POST /api/v1/process
 {
-  "video_uuid": "...",
+  "file_uuid": "...",
   "processors": ["asrx"]  # uses ASRX base
 }
 # Full analysis
 POST /api/v1/process
 {
-  "video_uuid": "...",
+  "file_uuid": "...",
   "processors": ["asrx:large"]
 }
```
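
The four requests above suggest a `name[:size]` convention for processor specs, with `asr` defaulting to the tiny model and `asrx` to base. One way such specs could be parsed is sketched below. The function name `parse_processor_spec` and the `DEFAULT_SIZES` table are assumptions for illustration, inferred only from the comments in the hunk, not taken from the project's code.

```python
# Assumed defaults, read off the request comments ("ASR tiny", "ASRX base").
DEFAULT_SIZES = {"asr": "tiny", "asrx": "base"}


def parse_processor_spec(spec):
    """Split a spec such as "asrx:large" into (processor, model_size)."""
    name, sep, size = spec.partition(":")
    if name not in DEFAULT_SIZES:
        raise ValueError(f"unknown processor: {name!r}")
    if sep and not size:
        # "asr:" has a separator but no size; treat it as a client error.
        raise ValueError(f"missing model size in spec: {spec!r}")
    return name, size or DEFAULT_SIZES[name]


fast = parse_processor_spec("asr")
accurate = parse_processor_spec("asr:medium")
diarized = parse_processor_spec("asrx")
full = parse_processor_spec("asrx:large")
```

The sketch does not validate the size itself, because the diff does not list which model sizes each processor accepts.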
@@ -501,4 +501,4 @@ AudioProcessor
 - Unified API
 - Flexible configuration
 - Backward compatible
-- Lower maintenance complexity
+- Lower maintenance complexity
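
The backward-compatibility behaviour shown in the API hunks (a legacy `"processors": ["asr"]` request falling back to the "standard" audio profile) could be handled by normalizing the request body before dispatch. The sketch below is one possible shape, not the project's implementation: `normalize_request` is a hypothetical name, and accepting the pre-rename `video_uuid` key as an alias for `file_uuid` is purely an assumption, since the diff only shows the rename in the docs.

```python
def normalize_request(payload):
    """Return a copy of the request body in the new format."""
    body = dict(payload)

    # Assumption: old clients may still send "video_uuid"; map it to "file_uuid".
    if "file_uuid" not in body and "video_uuid" in body:
        body["file_uuid"] = body["video_uuid"]
    body.pop("video_uuid", None)

    processors = list(body.get("processors", []))
    if "asr" in processors:
        # Legacy "asr" is routed to the unified audio processor ...
        processors = ["audio" if p == "asr" else p for p in processors]
        # ... and gets the "standard" profile unless the caller chose one.
        body.setdefault("audio_config", {"profile": "standard"})
    body["processors"] = processors
    return body


legacy = normalize_request({"video_uuid": "abc", "processors": ["asr"]})
current = normalize_request({
    "file_uuid": "abc",
    "processors": ["audio"],
    "audio_config": {"profile": "diarized"},
})
```

Requests already in the new format pass through unchanged, so the normalizer can sit in front of every call to `POST /api/v1/process`.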