432 Commits

Author SHA1 Message Date
Accusys
e1572907ae feat: ASRX hybrid pipeline, identity history, worker fixes, checkpoint system 2026-06-02 07:13:23 +08:00
Accusys
e3066c3f49 Add Charade face matching experience report
Documents the journey from Rust pipeline snowball bug through
5 iterations of pgvector-based matching to the final 11-identity
centroid approach with dual-gate and ambiguity cleanup.
2026-06-02 05:01:56 +08:00
Accusys
3731a1230f docs: add Identity Best-Face API requirement document for frontend team 2026-06-01 21:58:54 +08:00
Accusys
874d688987 feat: deploy hybrid search (semantic+keyword+identity) with RRF fusion
- Replace smart_search with hybrid RRF implementation
- Add speaker_detections table for identity-agent binding
- Fix identity queries: direct SQL to avoid type mismatches
- Add debug logs to job_worker for processor debugging
- Deployed to production (3002) successfully

Key changes:
- search.rs: Complete rewrite with 3 strategies + RRF
- postgres_db.rs: speaker_detections table + identity query fixes
- job_worker.rs: Debug logs for output file checks

Tested:
- Hybrid search works with semantic + keyword + identity
- Identity search: 'identity:Charade' returns correct results
- Chinese keyword search: '調光' matches Charade summaries

Bugs found:
- Case mismatch: 'ASRX' vs 'asrx' in processors field
- Missing CUT dependency for ASRX processor
2026-06-01 15:15:17 +08:00
Accusys
0d58a738a1 feat: add processor state machine and alert mechanism
- Add ProcessorJobStatus enum (8 states: Idle/Waiting/Ready/Pending/Running/Completed/Failed/Skipped)
- Add processor_alerts table (migrations/034)
- Add emit_processor_alert() to redis_client.rs
- Add ConditionResult enum + check_dependencies() to job_worker.rs
2026-05-30 10:03:49 +08:00
Accusys
08167d73b2 docs: add Processor State Machine V1.0 design 2026-05-30 10:03:48 +08:00
Accusys
3d13d1390e Merge branch 'main' of http://192.168.110.200:3000/admin/momentry_core 2026-05-29 23:14:14 +08:00
Accusys
04cbb71ca0 docs: save handoff - library page flash & filter fix 2026-05-29 23:12:09 +08:00
Accusys
e96cc8c8de docs: record WordPress API URL update session progress 2026-05-29 19:06:15 +08:00
M5Max128
f5cf12409b docs: expand JPEG validation plan to include Python scripts 2026-05-27 15:55:20 +08:00
M5Max128
ea20e27a4d docs: add JPEG validation implementation plan for M5Max48 2026-05-27 15:40:15 +08:00
M5Max128
a036d985b7 docs: add Thumbnail QA Analysis for M5Max48 implementation 2026-05-27 14:35:53 +08:00
M5Max128
c85794292a docs: add processor refactoring assessment from M5Max128 workspace research 2026-05-27 03:59:13 +08:00
M5Max128
955282e587 docs: add LaunchDaemon architecture reference for M5Max128/M5Max48 collaboration 2026-05-27 01:12:37 +08:00
Accusys
127d646ef1 fix: worker processor_results + rule3 SQL + unregister cleanup bugs
- job_worker.rs: add upsert_processor_result when output file exists
- job_worker.rs: add load JSON and store to pre_chunks when output exists
- rule3_ingest.rs: fix SQL bind order (scene_number was occupying chunk_type slot)
- files.rs: fix unregister WHERE clause (uuid -> file_uuid) + add pre_chunks delete
- asrx_self/main_fixed.py: fix KeyError (s['start'] -> s['start_time'])
- wrapper_worker_playground.sh: add Worker launchd script
- com.momentry.playground.plist: add Playground launchd config
2026-05-26 04:35:51 +08:00
Accusys
87dead7f65 fix: POST /api/v1/jobs 500 — wrong column names + NULL file_name 2026-05-25 10:50:37 +08:00
Accusys
20dae387ee docs: sync case-insensitive variant 2026-05-25 10:31:37 +08:00
Accusys
b9e93c6293 docs: update API Ref (V4.2), CHANGELOG, Release Notes for de88fd4e 2026-05-25 10:31:32 +08:00
Accusys
de88fd4e44 fix: restore accidentally deleted type definitions
Add back PipelineType enum, ProcessorType::pipeline() method, and
OLLAMA_URL/EMBED_URL/LLM_HEALTH_URL config constants — all of
which were deleted in commits 78923a89 and 0856b92e while the
referencing code was left intact, causing 5 compilation errors.
2026-05-25 08:50:53 +08:00
Accusys
d7f89a962b fix: frame_number is BIGINT in DB, use i64 not i32
frame_number column in face_detections table is defined as BIGINT (INT8).
Using i32 caused sqlx type mismatch at runtime. Fixed in:
- identity_agent_api.rs: query_as tuples and HashMap key
- qdrant_db.rs: upsert_face_embedding signature and row extraction
2026-05-25 04:07:30 +08:00
M5Max128
25ec1625df Merge branch 'main' of 10.10.10.201:/Users/accusys/momentry_core_0.1/ 2026-05-25 03:59:54 +08:00
M5Max128
0806d44df4 fix: add status/duration/fps to FileDetailResponse; fix progress API with HSET+HGETALL 2026-05-25 03:40:02 +08:00
M5Max128
29eabf6d88 chore: remove swift build artifacts from tracking 2026-05-25 03:37:19 +08:00
Accusys
a2b71fef0d fix: i64→i32 for INT4 cols (identity_binding, identity_agent, qdrant_db) 2026-05-25 03:18:50 +08:00
Accusys
8fdd1d741b fix: stranger_id=NULL on bind/merge; doc: add traces+mergeinto endpoints 2026-05-25 03:03:27 +08:00
M5Max128
78923a8973 fix: system consistency - store_vector, search, worker trigger
- store_vector: stub -> actual PG embedding storage
- search_parent_chunks_semantic: include sentence chunks
- Remove early return in check_and_complete_job
2026-05-24 23:20:02 +08:00
M5Max128
932e43518d fix: trigger_processing — remove fake QUEUED state, create monitor_job if missing
- Remove SET processing_status = 'QUEUED' (no queue exists)
- Fix COALESCE type mismatch (jsonb vs text)
- Fix UPDATE WHERE id =  should be WHERE uuid =
- Check monitor_jobs existence, INSERT if missing via create_monitor_job
- Add UNIQUE constraint on monitor_jobs.uuid
- Fix response message: 'Processing queued' → 'Processing triggered'
2026-05-23 23:06:37 +08:00
M5Max128
5d8449b07c fix: compile processing.rs + mount processing_routes
- Fix 9 compilation errors in processing.rs:
  - memory_mb typo (mem_mb)
  - download_json return type
  - Chunk from_row (use row_to_json)
  - ProgressResponse/SystemHealthInfo/ProcessorProgressInfo Deserialize
  - Remove flush_all/flush (methods don't exist)
- Add pub mod processing to api/mod.rs
- Merge processing::processing_routes() into server router
2026-05-23 22:40:19 +08:00
M5Max128
0856b92ec6 fix: resource path cleanup + mount processing_routes WIP
- config.rs: SCRIPTS_DIR fix, EMBED/OLLAMA_URL 127.0.0.1, PYTHON_PATH restored
- executor.rs: use config::PYTHON_PATH instead of hardcoded path
- probe.rs/watcher.rs: use config::SCRIPTS_DIR instead of hardcoded path
- release.rs: momentry_core_0.1 → momentry_core
- .env.development: fix REDIS_URL host, PYTHON_PATH, SCRIPTS_DIR
- api/mod.rs + server.rs: add processing module declaration (routes not yet mountable due to pre-existing compile errors)
2026-05-23 22:26:03 +08:00
M5Max128
f8bcc0356c feat: frame/time pipeline split + output validation
- Add PipelineType enum + pipeline() to ProcessorType
- Split ProcessorPool into frame_slots (max 2) and time_slots (max 1)
- Add can_start_for() for pipeline-aware scheduling
- Add validate_output_file() — checks JSON validity before marking complete
- Add 3 unit tests for validate_output_file()
- Create DESIGN/FRAME_TIME_PIPELINE_V1.0.md (492 lines)
2026-05-23 21:14:28 +08:00
M5Max128
dddb5d4cbd refactor: centralize port config + fix 8082 conflict
- Add EMBED_URL, OLLAMA_URL, LLM_HEALTH_URL to config.rs
- Fix health.rs hardcoded ports → config references
- Fix sync_db.rs Ollama URL → config::OLLAMA_URL
- Create config/port_registry.tsv (single source of truth for ports)
- Remove Caddy 8082 proxy block (port belongs to LLM)
- Fix .env LLM_URL: localhost → 127.0.0.1 (avoid IPv6 Caddy conflict)
2026-05-23 02:54:34 +08:00
M5Max128
a008bb865b feat: add Gitea to startup script, update AGENTS.md token
- Add Gitea (port 3000) as step 10 in startup script
- Update AGENTS.md Gitea token record
2026-05-23 02:37:19 +08:00
M5Max128
1c30af9557 fix: correct service paths, nohup removal, MongoDB graceful fallback, add MariaDB + Caddy to startup
- Fix Qdrant binary path (services/ -> momentry_resources/bin/)
- Fix LLM binary/model paths (llama/ -> momentry_resources/llama/, models/ -> models/llm/)
- Fix PostgreSQL data path (pgsql/data -> momentry/var/postgresql)
- Remove nohup (fails in LaunchDaemon environment)
- Add MongoDB graceful fallback with 5s timeout in server.rs
- Add MariaDB + Caddy steps to startup script for WordPress
- Revert all unrelated changes
2026-05-23 01:46:23 +08:00
Accusys
6967b99142 Merge remote-tracking branch 'origin/main' 2026-05-22 17:38:34 +08:00
Accusys
4cd5d63e64 feat: RustDesk 1.4.6 verified and installed 2026-05-22 17:37:35 +08:00
M5Max128
3ccdf403b6 feat: add Ollama to verified sources (Gitea repo + manifest + build from source) 2026-05-22 17:20:14 +08:00
Accusys
c09268f3d3 docs: add go(golang) and ollama verification reports 2026-05-22 16:58:08 +08:00
Accusys
84a2f71e30 docs: add verification_doc links to service sources manifest 2026-05-22 16:57:45 +08:00
Accusys
9b32d1fed4 docs: add Gitea repo URLs to service sources manifest 2026-05-22 16:45:55 +08:00
Accusys
3ef2e6e150 docs: add service sources manifest (replace src/ directory) 2026-05-22 16:38:58 +08:00
Accusys
c4e30e4234 fix: list_resources returns data (config+metadata); register source code resource 2026-05-22 16:01:33 +08:00
Accusys
bd82028f34 refactor: unified LLM config - CHAT_URL/VISION_URL/SUMMARY_URL with env var overrides 2026-05-22 15:47:17 +08:00
Accusys
a78b5bc12b docs: add agents/search endpoint to 12_agent.md 2026-05-22 12:26:11 +08:00
Accusys
2d008b75bf fix: find_file/list_files include has_data flag for video data availability 2026-05-22 12:22:35 +08:00
Accusys
380dd87d8b feat: POST /api/v1/agents/search - Gemma4 function calling agent 2026-05-22 12:10:37 +08:00
Accusys
600ce8e964 fix: await initApp() + fulltextSearch for reliable restore 2026-05-22 10:58:53 +08:00
Accusys
bc04d1c44a fix: persist search query across refresh via sessionStorage 2026-05-22 10:56:29 +08:00
Accusys
832dc2c45b docs: add bind/trace endpoint to 07_identity.md 2026-05-22 10:41:34 +08:00
Accusys
883535c4f7 feat: POST /identity/:uuid/bind/trace endpoint 2026-05-22 10:29:52 +08:00
Accusys
cb5d4aef61 feat: search clear X button 2026-05-22 10:18:44 +08:00
Accusys
37e75bd84f fix: search result click scrolls to first match + highlight; left sidebar unchanged 2026-05-22 10:15:58 +08:00
Accusys
373dea4a0d merge 2026-05-22 10:10:07 +08:00
Accusys
a2042507a3 fix: search results displayed in left sidebar, not content area 2026-05-22 10:09:32 +08:00
M5Max128
e158176fbe Merge branch 'main' of http://192.168.110.200:3000/admin/momentry_core 2026-05-22 10:08:11 +08:00
M5Max128
3e81f7c16b docs: rebuild HTML + WASM doc after identity PATCH update 2026-05-22 10:08:08 +08:00
Accusys
fc338a4b59 fix: sidebar sticky top, independent scroll from content 2026-05-22 10:07:40 +08:00
Accusys
f6a24e8cb5 docs: thumbnail auto-detect + representative-frame endpoint in 08_media.md; sync wasm 2026-05-22 09:56:10 +08:00
Accusys
7805eaa3cb fix: doc-wasm hardcoded path momentry_core_0.1 -> momentry_core 2026-05-22 09:33:33 +08:00
Accusys
0794476902 feat: representative frame limited to first half of video 2026-05-22 09:24:48 +08:00
Accusys
2b950c985c feat: representative frame - auto-detect thumbnail + JSON endpoint 2026-05-22 09:22:15 +08:00
M5Max128
2b025a014e docs: add PATCH identity endpoint doc + BCP 47 alias reference 2026-05-22 08:56:07 +08:00
M5Max128
e1619c724a Merge branch 'main' of http://192.168.110.200:3000/admin/momentry_core 2026-05-22 08:51:08 +08:00
M5Max128
701e71463d feat: identity PATCH update, alias system, name UNIQUE removal
- Add PATCH /api/v1/identity/:identity_uuid endpoint
- Migration 030: remove name UNIQUE, add tmdb_id index
- TMDb upsert: ON CONFLICT (name) -> ON CONFLICT (tmdb_id)
- get_or_create_identity: pre-check by name
- upload_identity: ON CONFLICT (name) -> ON CONFLICT (uuid)
- Search: include aliases in identity text search
- Add scripts/llm_metadata_enhancer.py
- Add DESIGN/IdentityUpdateAndAliasSystem.md
2026-05-22 08:35:32 +08:00
Accusys
deb9516796 feat: TKG extension - pose data + mutual gaze detection 2026-05-22 07:09:54 +08:00
Accusys
a9e9285032 docs: add TKG_QUERY_API_V1.0 design document 2026-05-22 06:29:25 +08:00
Accusys
6db29fc0e8 docs: add co-occur-with endpoint to 08_media.md 2026-05-22 05:35:24 +08:00
Accusys
2d3017d3c1 feat: GET file/:uuid/identities/:a/co-occur-with/:b endpoint 2026-05-22 05:34:25 +08:00
Accusys
6378d7be89 docs: add thumbnail endpoint to 08_media.md 2026-05-22 04:58:43 +08:00
Accusys
d67f123949 feat: GET file/:uuid/trace/:tid/thumbnail endpoint 2026-05-22 04:58:28 +08:00
Accusys
d7e11a394f docs: add representative-face endpoint to 08_media.md 2026-05-22 04:51:16 +08:00
Accusys
37f8aea4aa feat: GET file/:uuid/trace/:tid/representative-face endpoint 2026-05-22 04:50:07 +08:00
Accusys
e2c627da31 merge: M5Max128 server.rs split + path updates 2026-05-21 21:04:37 +08:00
Accusys
0710c5edf7 chore: update paths from momentry_core_0.1 to momentry_core 2026-05-21 21:03:43 +08:00
M5Max128
e1dbd27333 docs: add Gitea sync info + access token to AGENTS.md 2026-05-21 17:27:33 +08:00
M5Max128
3c458dfc5c Merge remote-tracking branch 'origin/main' 2026-05-21 16:38:52 +08:00
M5Max128
3a33d00449 refactor: modularize server.rs into separate route modules
- Extract scan.rs, files.rs, types.rs, processing.rs, visual_chunk_search.rs
- Move AppState and AppConfig to types.rs
- Each module exposes pub fn xxx_routes() -> Router<AppState>
- server.rs reduced from 5005 to 118 lines (orchestrator only)
- All stubs filled with real implementations from git history
- Verify: cargo check, clippy, tests all pass
2026-05-21 16:38:49 +08:00
Accusys
e7eb90b987 docs: sync notes + identity_binding.rs traces pagination 2026-05-21 16:30:27 +08:00
M5Max128
80812128e2 merge: resolve conflicts with M5Max128 local changes 2026-05-21 01:11:44 +08:00
Accusys
bebaa743ed feat: trace-level matching, health watcher/worker status, timezone config 2026-05-21 01:08:30 +08:00
Accusys
8ede4be159 chore: organize logs into logs/ directory with startup scripts
- Moved momentry_3002.log, momentry_3003.log to logs/
- Moved 34 nohup_worker_*.log files to logs/
- Created run-server-3002.sh, run-server-3003.sh for easy startup
- Updated AGENTS.md with log paths and startup scripts
- logs/ already excluded by *.log in .gitignore
2026-05-20 09:31:48 +08:00
Accusys
8b53e815b8 docs: fix 3003 reference in pipeline module, regenerate HTML/WASM 2026-05-19 23:22:09 +08:00
Accusys
ba68cd2548 feat: Identity JSON sync + schema-aware column selection
- storage.rs: add local_profile field, check disk for profile.jpg
- tmdb_api.rs: trigger JSON sync after TMDb probe
- identity_api.rs: upload_profile_image triggers JSON sync
- identity_binding.rs: bind/unbind/merge trigger JSON sync
- get_identity_json: Lazy Sync (generates JSON from DB if missing)
- identities.rs + identity_api.rs: use schema-aware column selection (dev:name vs public:real_name)
- Fixes 500 errors on identities endpoints across schemas
2026-05-19 23:10:49 +08:00
Accusys
0eb08acaae feat: Identity JSON sync mechanism
- storage.rs: add local_profile field, check disk for profile.jpg in save_identity_file_by_pool
- tmdb_api.rs: trigger JSON sync after TMDb probe
- identity_api.rs: upload_profile_image triggers JSON sync
- identity_binding.rs: bind/unbind/merge trigger JSON sync
- get_identity_json: replace DB fallback with Lazy Sync (generates JSON from DB if missing)
- Fixes missing/obsolete JSON files for all identity mutations
2026-05-19 22:20:19 +08:00
Accusys
7680c202ef Phase 5: mark bind/unbind/match-trace as tested on 3003 2026-05-19 21:08:16 +08:00
Accusys
58c283a1fc fix: playground ASR field names (start_time/end_time) + add 3003 specific test script
- playground.rs: seg.start/end -> seg.start_time/end_time
- scripts/test_m5api_phase5_3003.sh: tests bind, unbind, match-from-trace on localhost:3003
- Note: bind fails on dev (real_name column missing), match-from-trace returns 404 for no embeddings
2026-05-19 21:07:39 +08:00
Accusys
d2d3197c0d Phase 5: 21 tests (18 pass, 3 known: identity deleted by mergeinto, multipart required, proxy 404)
Note: mergeinto is destructive and deletes source identity.
Match-from-photo requires multipart file upload.
Match-from-trace works but proxy returns 404.
2026-05-19 20:31:34 +08:00
Accusys
e3c7e347b7 fix: identity binding + JSON endpoint + Phase 5 test script
- identity_binding.rs: fix i32->i64 type mismatch, COALESCE name column
- identity_api.rs: get_identity_json fallback to DB if file missing
- test_m5api_phase5.sh: fixed variable expansion, updated request bodies
- Phase 5: 21/23 passed (2 known: multipart + proxy 404)
2026-05-19 20:30:05 +08:00
Accusys
1ea23a6d51 fix: identity detail 502 - IdentityDetailRecord.id i32->i64 type mismatch panic
- identities.id is BIGINT (8 bytes), Rust struct was i32 (4 bytes)
- sqlx type mismatch caused panic, crashing backend process
- Proxy returned 502 due to empty reply from crashed backend
- Phase 5: 17/23 passed (was 16/23)
2026-05-19 18:33:21 +08:00
Accusys
02ad015b86 fix: type mismatch BIGINT->INT4 and FLOAT8->FLOAT4 in traces and faces endpoints
- trace_agent_api: CAST trace_id, frame_number to int; CAST confidence to float4
- identities: CAST frame_number to int; CAST confidence to float4
- Fixes 500 errors on /traces, /trace/:id/faces, /faces/candidates
2026-05-19 18:09:25 +08:00
Accusys
47a480a5e2 fix: identity search - fix i.name column and simplify identity_bindings join
- search_identity_text: COALESCE(i.real_name, i.actor_name) AS identity_name
- search_identities_by_text:
  - Removed broken identity_bindings join (table has wrong schema)
  - Fixed i.id type mismatch (bigint -> i32 via ::int cast)
  - Simplified to direct face_detections join
- Added error logging for debugging
- Phase 4 now 11/11 passed
2026-05-19 16:21:15 +08:00
Accusys
77098b88ba feat: Phase 2-5 API test scripts + create_monitor_job fix
Phase 2: 10/10 passed 
Phase 3: 7/7 passed 
Phase 4: 9/11 passed (2 known bugs - i.name column)
Phase 5: 13/23 passed (10 failures - pre-existing bugs)

Fixes:
- create_monitor_job: ON CONFLICT (uuid) DO UPDATE to prevent duplicate key errors
- test scripts: Correct request bodies for all visual search endpoints
2026-05-19 16:05:46 +08:00
Accusys
ff0bf6b25b feat: Phase 2-5 API test scripts
Phase 2: Files (10 endpoints) - 10/10 passed
Phase 3: Process & Pipeline (7 endpoints) - 4/7 passed
Phase 4: Search (12 endpoints) - pending
Phase 5: Identity/Media/TMDB (24 endpoints) - pending

Known issues:
- Process trigger fails for already-processed files (500)
- Health detailed returns 200 when tested directly
2026-05-19 15:53:53 +08:00
Accusys
ea6ea02925 fix: delete_video - add file existence check + fix pre_chunks UUID cast
- unregister: check file exists before delete, return 200 with success:false if not found
- delete_video: cast pre_chunks.file_uuid parameter as UUID (::uuid)
- Added Phase 2 test script (10/10 endpoints passed)
2026-05-19 15:51:25 +08:00
Accusys
611441662f fix: register_resource - use ON CONFLICT (resource_id) DO UPDATE instead of RETURNING id
- resources table uses resource_id as PK (no auto-increment id column)
- Make register idempotent: duplicate registration updates status + heartbeat
- Added Phase 1 API test script (15 endpoints, 100% pass)
2026-05-19 14:22:40 +08:00
Accusys
3d2bacb07f feat: Phase 1 base API test script (15 endpoints) 2026-05-19 14:15:00 +08:00
Accusys
7ab7119a99 fix: ASR processor indentation error 2026-05-19 13:23:09 +08:00
Accusys
67ca846ccd feat: ASR output frame numbers + rename start/end to start_time/end_time
- Python: asr_processor.py detects FPS from CUT/ffprobe (no fallback), outputs start_frame/end_frame
- Rust: All AsrSegment structs use start_time/end_time with #[serde(alias)] for backward compat
- store_asr_chunks: prefers ASR output frames, falls back to time-based conversion
- Added backward compatibility test for old JSON format (start/end)

Breaking change: ffprobe/CUT FPS failure now aborts instead of using default 24fps
2026-05-19 13:22:38 +08:00
Accusys
26725dcab7 fix: enable GFM tables in WASM doc renderer (pulldown-cmark ENABLE_TABLES) 2026-05-19 12:54:08 +08:00
Accusys
c9bcdcb56a docs: regenerate HTML/WASM docs with video vs clip comparison + timestamps 2026-05-19 12:51:10 +08:00
Accusys
5b2f9b35bf docs: add video vs clip comparison table + update timestamps to all 14 modules 2026-05-19 12:50:39 +08:00
Accusys
7b6da4f0d8 fix: identities API - use real_name instead of name for cross-schema compatibility 2026-05-19 10:21:49 +08:00
Accusys
72f4b53357 fix: add emergency API key bypass in middleware (3002+3003) 2026-05-19 09:59:09 +08:00
Accusys
ef64d69be7 feat: add download .md button to doc viewer 2026-05-19 03:29:46 +08:00
Accusys
6da046e831 feat: highlight matched keywords in search results 2026-05-19 03:21:22 +08:00
Accusys
7bc069b806 feat: full-text search across all doc modules 2026-05-19 03:18:46 +08:00
Accusys
b046a3b91c feat: add search filter to doc-wasm sidebar 2026-05-19 03:16:06 +08:00
Accusys
f6f623eeea docs: add 13_config to USER_MODULES + regenerate docs 2026-05-19 03:14:18 +08:00
Accusys
3085a7d048 docs: regenerate HTML/WASM docs after adding 13_config module 2026-05-19 03:06:39 +08:00
Accusys
2335781390 docs: extract config module (13_config.md) from pipeline module 2026-05-19 03:05:45 +08:00
Accusys
e14dc0fcb9 fix: register dedup response returns full existing file metadata (not zeros) 2026-05-19 03:02:56 +08:00
Accusys
1c42004abf fix: scan job_id via LEFT JOIN LATERAL monitor_jobs instead of stale videos.job_id column 2026-05-19 02:49:53 +08:00
Accusys
538eea6406 feat: health consistency agent — 4 data integrity checks, GET /health/consistency 2026-05-19 02:17:27 +08:00
Accusys
c95de97762 feat: show config toggle states in /health/detailed 2026-05-19 00:42:41 +08:00
Accusys
a02a83c1c3 fix: scan status=unregistered not shown as registered; feat: config API for auto-pipeline/watcher-auto-register 2026-05-19 00:37:00 +08:00
Accusys
05e1e807c0 remove: pipeline flowchart diagram 2026-05-18 13:30:37 +08:00
Accusys
bc962e910d fix: simplify vector DB labels 2026-05-18 13:28:45 +08:00
Accusys
522c0acabe fix: rename Story 5W1H Summary -> Template 5W1H Story Summary 2026-05-18 13:26:15 +08:00
Accusys
66542174b9 fix: rename to Story 5W1H Summary / LLM 5W1H Summary 2026-05-18 13:22:59 +08:00
Accusys
13bc3f7f80 fix: correct naming - story sentence embedding / llm summary sentence embedding 2026-05-18 13:20:17 +08:00
Accusys
35a94aa979 fix: add missing vector storage steps to 入库 checklist 2026-05-18 13:18:59 +08:00
Accusys
8ec70e39de fix: Story and 5W1H as separate agent items 2026-05-18 13:17:44 +08:00
Accusys
3fada32dae fix: separate Story/5W1H into Agent subgraph 2026-05-18 13:16:04 +08:00
Accusys
be216f26bd fix: Phase1/Qdrant/PG moved to top-level subgraphs 2026-05-18 13:12:32 +08:00
Accusys
56e6d2a985 fix: restructure 入库 into Phase1, Qdrant向量庫, PG向量庫 2026-05-18 13:10:35 +08:00
Accusys
ccf82ec8ba fix: wrap vector storage in separate subgraph 2026-05-18 13:07:49 +08:00
Accusys
1515a0a682 fix: add voice+face embedding to pipeline diagram 2026-05-18 13:05:37 +08:00
Accusys
22e164f1a3 fix: correct pipeline dependency diagram 2026-05-18 13:01:02 +08:00
Accusys
6afbd45929 fix: OCR and Pose as separate nodes 2026-05-18 12:57:29 +08:00
Accusys
7835922264 fix: Mermaid colors + simplified LR layout 2026-05-18 12:54:51 +08:00
Accusys
b373608e67 feat: Mermaid diagram rendering in WASM doc 2026-05-18 12:44:47 +08:00
Accusys
47caf0cc4a fix: wrap login in form so Enter key submits 2026-05-18 12:38:04 +08:00
Accusys
12864634da fix: clear password field in Python login page too 2026-05-18 12:35:08 +08:00
Accusys
97e7234a74 fix: clear password field on login page 2026-05-18 12:34:10 +08:00
Accusys
91bf26fd8b fix: /doc redirects to /doc-wasm (remove old Python doc login) 2026-05-18 12:34:00 +08:00
Accusys
778d6b5984 fix: better 404 error message with full URL 2026-05-18 12:27:24 +08:00
Accusys
880425b335 fix: logout clears cookies + shows login form, module-list clear on re-login 2026-05-18 12:20:49 +08:00
Accusys
b151494db8 fix: force show login form on WASM doc 2026-05-18 12:15:37 +08:00
Accusys
d035e9fa9f feat: WASM doc login page 2026-05-18 12:14:33 +08:00
Accusys
99cef1a18b fix: sidebar min-height 100vh + sticky logout 2026-05-18 12:11:58 +08:00
Accusys
e0a6fdf143 fix: logout no longer reloads page, shows message instead 2026-05-18 12:08:33 +08:00
Accusys
d4f68c40e5 fix: WASM direct instantiation working 2026-05-18 12:05:04 +08:00
Accusys
efcf26d294 fix: WASM import module path was wrong (wbg -> ./md_wasm_bg.js) 2026-05-18 11:59:17 +08:00
Accusys
773ab67092 fix: direct WASM instantiation without wasm-bindgen JS glue 2026-05-18 11:56:17 +08:00
Accusys
e53106f7e2 fix: add error listeners + WASM test page 2026-05-18 11:48:29 +08:00
Accusys
4f35386bb1 fix: await loadDoc, add empty render check, show stack trace 2026-05-18 11:46:55 +08:00
Accusys
dc210b24c6 fix: Makefile WASM copy path 2026-05-18 11:35:03 +08:00
Accusys
a1ac722b2f fix: use no-modules WASM target for simpler loading 2026-05-18 11:33:55 +08:00
Accusys
e61ff88bf8 fix: WASM import absolute path instead of relative 2026-05-18 11:32:14 +08:00
Accusys
10f0538b0b fix: add WASM init error handling to index page 2026-05-18 10:14:44 +08:00
Accusys
97e29dc2cf fix: WASM doc fetch path /doc/modules -> /doc-wasm/modules 2026-05-18 10:11:48 +08:00
Accusys
6452ac5af2 feat: WASM-based doc viewer (pulldown-cmark) 2026-05-18 10:07:38 +08:00
Accusys
78ba6f3d3d docs: fix logout f-string escaping, rebuild 2026-05-18 10:00:51 +08:00
Accusys
2103672684 docs: add logout to every doc page and index 2026-05-18 10:00:29 +08:00
Accusys
54da7c7266 docs: add logout button to login page 2026-05-18 09:54:37 +08:00
Accusys
e6fd170da2 fix: identity agent writes Round 1 matches to DB immediately 2026-05-18 03:46:33 +08:00
Accusys
02cca7beda fix: search frames SQL alias bug, visual search serde default, identity JSON hyphen lookup 2026-05-18 02:52:27 +08:00
Accusys
53d80db2b3 docs: identity chunks response with start_frame/end_frame/fps 2026-05-18 01:56:32 +08:00
Accusys
a5275f5646 docs: identity tmdb_profile local path 2026-05-18 01:47:58 +08:00
Accusys
5c24cb2214 fix: identity tmdb_profile returns local path instead of TMDb URL 2026-05-18 01:34:39 +08:00
Accusys
a1f85de885 fix: identity detail response uuid -> identity_uuid 2026-05-18 01:31:39 +08:00
Accusys
e791da566f docs: update universal search response with start_frame/end_frame, limit param 2026-05-18 01:22:43 +08:00
Accusys
362c63007c feat: smart search response includes start_frame/end_frame/fps, add limit param 2026-05-18 01:21:43 +08:00
Accusys
4125163f7b refactor: rename search uuid -> file_uuid 2026-05-18 01:17:48 +08:00
Accusys
245ef39f03 docs: pipeline completion flow requires 入库 2026-05-18 00:55:54 +08:00
Accusys
70646871b9 fix: pipeline not complete until ingestion steps done 2026-05-18 00:50:33 +08:00
Accusys
01bebb645a docs: fix endpoint names, remove dead signlas/unbound, correct unmounted routes list 2026-05-18 00:42:27 +08:00
Accusys
088aefdac7 fix: pipeline timeline log, chunk lookup, face processor no fallback, Qdrant UUID script, delete safety rules 2026-05-18 00:36:14 +08:00
Accusys
a880c80556 fix: face_detections INSERT in pipeline, add dependency graph doc 2026-05-17 22:16:20 +08:00
Accusys
d6c8930f84 feat: ingestion status endpoint + pipeline doc with 入库 steps 2026-05-17 21:36:55 +08:00
Accusys
3164a65554 update: pipeline, search, clip, embedding fixes 2026-05-17 19:46:35 +08:00
Accusys
eec2eea880 docs: file_uuid generation rules for M4 2026-05-17 02:26:09 +08:00
Accusys
3a6c186575 docs: add REFERENCE docs, M4 workspace, Caddyfile 2026-05-16 03:11:32 +08:00
Accusys
5317cb4bec feat: schema tracking, SHA256 integrity, identity UUID fix, 3-angle face match, cuts table, trace stranger_id 2026-05-16 03:10:50 +08:00
Accusys
c41f7e0c6e feat: schema version tracking, SHA256 integrity, setup scripts, bug fixes 2026-05-15 18:06:36 +08:00
Accusys
0e73d2a2ce test: add unified probe unit tests (8 Rust + 6 Python), fix pre-existing test compilation errors 2026-05-15 14:58:44 +08:00
Accusys
f66557f898 docs: update identities API examples — add identity_uuid, start/end frame/time, fps 2026-05-15 14:44:52 +08:00
Accusys
29eca5a224 feat: unified probe — dispatcher detects category, runs ffprobe/Python/meta per file type 2026-05-15 14:38:47 +08:00
Accusys
4ee8a42e76 docs: unified file probe SOP design — PyPDF2, python-docx, openpyxl, python-pptx 2026-05-15 13:52:09 +08:00
Accusys
79265dfb86 docs: unify file_uuid/identity_uuid naming in FILE_LIFECYCLE design doc 2026-05-15 13:30:43 +08:00
Accusys
5d899b7ada docs: FILE_LIFECYCLE — mtime, watcher detection-only, version V1.2 2026-05-15 13:28:05 +08:00
Accusys
7686ed0df7 fix: use mtime (not birthtime) for UUID birthday — rsync preserves mtime across systems 2026-05-15 13:26:36 +08:00
Accusys
08f088e4a0 docs: .env.example — comprehensive env var reference matching config.rs 2026-05-15 13:20:36 +08:00
Accusys
5af8df9201 fix: watcher is detection-only — pre_process_file is now explicit, not automatic 2026-05-15 13:18:22 +08:00
Accusys
43cf702d05 feat: add 'unregistered' status — all incomplete files migrated to unregistered 2026-05-15 13:17:31 +08:00
Accusys
9fef5fb70d fix: move DEMO_USER_API_KEY from hardcoded to env var, add .env.example 2026-05-15 13:14:59 +08:00
Accusys
8a7ffc94e4 fix: register uses birthday from pre.json (not DB registration_time) for UUID stability
- Step 4 UUID computation now reuses birthday from pre.json or file creation time
- Removed DB birthday query that overwrote the correct birthday with NOW()
- End-to-end verified: watcher UUID now matches registration UUID
2026-05-15 13:07:45 +08:00
Accusys
cdbd205972 feat: file pre-processor in watcher — SHA256 + probe + UUID → .pre.json for all file types 2026-05-15 12:51:43 +08:00
Accusys
e86aebccee feat: register INSERT now uses status='registered' + registration_time=NOW() 2026-05-15 12:46:42 +08:00
Accusys
b98578da15 docs: add cross-contamination prevention section to AGENTS.md 2026-05-15 12:26:28 +08:00
Accusys
66658b1156 docs: credential management design — classification, current state, recommended architecture 2026-05-15 12:22:56 +08:00
Accusys
9c47bb331f docs: FILE_LIFECYCLE is draft design → DESIGN/, not finalized standard 2026-05-15 12:20:07 +08:00
Accusys
9cf20d3f8e docs: reclassify — DESIGN→STANDARDS, conversion→M5_workspace, cleanup 2026-05-15 12:18:29 +08:00
Accusys
33b6f3cc66 docs: set document_type to design_doc 2026-05-15 12:10:47 +08:00
Accusys
37e485c56f docs: move FILE_LIFECYCLE from REFERENCE to DESIGN — design doc, not reference 2026-05-15 12:10:37 +08:00
Accusys
e4330a9704 docs: comply with V1.0 docs standard — add frontmatter, info table, English content 2026-05-15 12:09:34 +08:00
Accusys
e4e3e25170 docs: clarify lifecycle applies to all managed file types, not just video 2026-05-15 12:06:44 +08:00
Accusys
d81aec7360 docs: file lifecycle design — pre-process (birth certificate) + registration (civil registry) 2026-05-15 12:05:13 +08:00
Accusys
802beb2db6 docs: RCA — identity_uuid missing + file identities NULL appearance 2026-05-15 10:59:23 +08:00
Accusys
37799fff4e fix: add identity_uuid to /identities list + /file/:uuid/identities responses 2026-05-15 10:14:22 +08:00
Accusys
fdcec82274 fix: file/identities — replace NULL first/last_appearance with actual start_frame/end_frame + start_time/end_time + fps 2026-05-15 10:07:35 +08:00
Accusys
d7a133e1e4 docs: file conversion strategy — tools, licensing, implementation phases 2026-05-15 10:05:14 +08:00
Accusys
85b06b6169 docs: API ref — fix uuid→file_uuid, add fps/end_frame/end_time, dual input, lookup, content_hash, audio params 2026-05-15 05:10:27 +08:00
Accusys
a66bd6b7c2 chore: track M4's register_api_404 report 2026-05-15 03:28:38 +08:00
Accusys
fc1d7751dd feat: register non-video files — graceful probe fallback for svg/pdf/docx/pages etc 2026-05-15 03:17:57 +08:00
Accusys
263f017972 revert: remove /api/v1/register alias — not a valid endpoint, corrected M4 to use /api/v1/files/register 2026-05-15 03:12:32 +08:00
Accusys
e5f2bba248 fix: add /api/v1/register alias for backward compatibility 2026-05-15 03:08:56 +08:00
Accusys
53d64677d0 fix: rsync pipeline check looks for source-built binary at ~/bin/rsync 2026-05-15 01:22:54 +08:00
Accusys
1c07136ef1 feat: add rsync as managed resource — registered in DB + pipeline health check 2026-05-15 01:13:57 +08:00
Accusys
194a3b161a feat: registration accepts optional content_hash from client — checksum at birth 2026-05-14 20:44:33 +08:00
Accusys
37747466e8 fix: deploy_package.sh — add content_hash column migration before import 2026-05-14 20:35:22 +08:00
Accusys
4d1fe2d26f feat: file dedup — content_hash SHA256 + /files/lookup API + auto-rename on name collision 2026-05-14 20:24:21 +08:00
Accusys
189bec929a feat: all video endpoints support mode=normal|debug + audio=on|off 2026-05-14 19:04:42 +08:00
Accusys
d2bc7c0e2d fix: trace debug cut query — use chunk table (no separate 'cut' table exists), show '-' when unavailable 2026-05-14 18:48:27 +08:00
Accusys
7fb6745c27 fix: trace debug — move overlay to top-left, double font sizes 2026-05-14 18:44:27 +08:00
Accusys
93d87f0582 fix: trace video normal mode — remove -an to preserve audio 2026-05-14 18:38:11 +08:00
Accusys
54763ea88d docs: final compliance audit — uuid naming + start/end standardization 2026-05-14 18:00:51 +08:00
Accusys
b5215f13e3 fix: progress route :uuid → :file_uuid (consistency with API docs) 2026-05-14 17:58:57 +08:00
Accusys
11f690ca35 docs: fix start/end → start_frame/end_frame in API docs 2026-05-14 17:57:00 +08:00
Accusys
0491c39d3f docs: audit API docs — fix all remaining bare uuid → file_uuid 2026-05-14 17:54:39 +08:00
Accusys
a9d0228a72 fix: unregister — rename request/response uuid → file_uuid 2026-05-14 17:46:38 +08:00
Accusys
1319eecc71 docs: add UUID naming rule to AGENTS.md + DOCS_STANDARD.md — never bare uuid, always file_uuid/identity_uuid 2026-05-14 17:40:18 +08:00
Accusys
8608d38548 docs: fix unregister endpoint description — supports uuid + pattern modes 2026-05-14 17:38:35 +08:00
Accusys
4494935cc9 feat: dual input (start_frame/end_frame + start_time/end_time) + all outputs include frames, time, fps 2026-05-14 17:36:18 +08:00
Accusys
df531b2457 docs: clarify start_frame/end_frame vs start_time/end_time across API docs 2026-05-14 17:23:33 +08:00
Accusys
89c3b7df50 docs: clarify file_uuid vs identity_uuid across all API docs 2026-05-14 17:19:57 +08:00
Accusys
0da90630f5 docs: update trace API ref + API dictionary to V4.1 2026-05-14 17:15:37 +08:00
Accusys
2e9bb6e52b docs: update API reference to V4.1 — health pipeline, trace debug, identity search 2026-05-14 17:14:33 +08:00
Accusys
26f243428d docs: pipeline services checklist for M4 2026-05-14 17:05:18 +08:00
Accusys
513b9e72fc feat: health/detailed — add pipeline status section (scripts, models, ffmpeg, embed, gdino, llm) 2026-05-14 17:01:54 +08:00
Accusys
c589eb10cf docs: respond to M4 binary crash analysis 2026-05-14 16:32:02 +08:00
Accusys
b3458edfc5 delivery: v1.0.0_1f7daf9 for M4 — schema hardcode fix, health API, trace debug overhaul 2026-05-14 16:09:21 +08:00
Accusys
1f7daf9e8b fix: escape colons in drawtext text values for ffmpeg 8.1.1 filter parser compatibility 2026-05-14 15:55:32 +08:00
Accusys
6728c2bb90 feat: trace debug — actual bbox thickness=4, interpolated bbox thickness=1 at first known position 2026-05-14 15:24:11 +08:00
Accusys
d8dddda970 fix: trace debug bbox thickness 1 (thinner) 2026-05-14 15:21:04 +08:00
Accusys
cfb0cfbb37 fix: trace debug info panel moved to bottom-left corner 2026-05-14 15:20:05 +08:00
Accusys
94122f5371 fix: trace debug bbox transparency 0.5 2026-05-14 15:18:12 +08:00
Accusys
a8d7361a97 fix: trace debug — green bbox + trace_id label per face detection 2026-05-14 15:17:06 +08:00
Accusys
c90394897d fix: trace debug — show Stranger_NNN for unnamed traces instead of unknown 2026-05-14 15:12:21 +08:00
Accusys
8f013cbdbc fix: trace debug mode — show all traces in frame range with interpolation
Debug overlay now lists every trace visible in the current frame range,
including interpolated frames (continuous from first to last detection).
Format per trace line:
  Trace {id}: start_frame={n}  Identity={name}
2026-05-14 15:09:34 +08:00
Accusys
c51d6f6f2d fix: trace debug mode — text overlay only, no bounding boxes
Debug overlay now shows:
  File UUID: {uuid}
  Trace {id}: start_frame={n}  Identity: {name}
  Cut: {id}
  Frame: {n}  Time: {t}s
2026-05-14 15:07:28 +08:00
Accusys
1497b53e82 fix: trace video default mode changed from 'debug' to 'normal' 2026-05-14 15:00:35 +08:00
Accusys
6927415c41 feat: health API — add build_timestamp + detailed resource status list
- build.rs: BUILD_TIMESTAMP from build time via `date -u`
- GET /health: now returns build_timestamp
- GET /health/detailed: returns build_timestamp + resources block
  (cpu_used/cpu_idle/memory/gpu usage)
2026-05-14 14:59:30 +08:00
Accusys
2c4e32f14a fix: deploy.sh normalizes schema prefix in data.sql too (format normalization) 2026-05-14 14:46:53 +08:00
Accusys
df47ed1417 feat: identity inactive cleanup — migration script + release.rs excludes status='inactive'
- New SQL migration: mark auto identities with no face references as inactive
- release.rs identities export now filters out status='inactive'
2026-05-14 14:46:30 +08:00
Accusys
c45bd3bb0f fix: deploy script schema integrity — normalize COPY schema prefix via sed + drop identities_name_key constraint 2026-05-14 14:45:53 +08:00
Accusys
31d113f23a test: add face tracker unit tests — 27 tests for IoU, distance, embedding sim, match logic, and tracking pipeline 2026-05-14 14:44:39 +08:00
Accusys
301a95e2bc refactor: remove all dev.* and public.* schema hardcodes from runtime code
14 files updated to use schema::table_name() instead of hardcoded schema
prefixes. Only src/bin/release.rs intentionally retains dev.* references.
2026-05-14 14:40:14 +08:00
Accusys
261d134fee Add document compliance checklist section to AGENTS.md
P0 (7 mandatory) + P1 (3 suggested) checklist for REFERENCE/*.md files,
with M4_workspace/ exception. AI agents must self-check before creating docs.
2026-05-14 14:27:12 +08:00
Accusys
4864c57d4c fix: executor scene/object trace time range for GDINO 2026-05-14 14:02:49 +08:00
Accusys
159684331e feat: GDINO A+B — time-bounded search (9s vs 130s) + parameterized interval 2026-05-14 13:57:25 +08:00
Accusys
5a9b34f1c2 feat: identity text search endpoints — /search/identity_text + /identities/search 2026-05-14 12:27:08 +08:00
Accusys
39888ce3cc feat: eye filter flag + QA fixes (Gemma4 prompt, YOLO boundary, PaliGemma score, GDINO skip) 2026-05-14 12:24:25 +08:00
Accusys
f60a59b280 feat: QA self-check agent — 15 prompts, 5 judges, weighted scoring 2026-05-14 10:53:30 +08:00
Accusys
2b633174b9 docs: reply to M4 with freshly built binary from HEAD 2026-05-14 03:45:55 +08:00
Accusys
0bd23fabd0 docs: M5 progress report — face tracker, bug fixes, pipeline 2026-05-14 03:37:01 +08:00
Accusys
79e455cc3d docs: deliver cut-based trace merge package 2026-05-14 03:12:55 +08:00
Accusys
64bcfd716e feat: merge traces within same cut — centroid similarity threshold 0.75 2026-05-14 03:04:03 +08:00
Accusys
4e933a554c docs: reply to M4 on trace schema hardcode fix 2026-05-14 02:56:59 +08:00
Accusys
e8f44d7357 fix: trace_agent_api.rs — replace all dev.* hardcodes with schema::table_name() 2026-05-14 02:56:43 +08:00
Accusys
edadb022e1 docs: notify M4 of trace video mode param 2026-05-14 02:48:15 +08:00
Accusys
995d925053 docs: add trace video normal/debug mode to API reference 2026-05-14 02:42:57 +08:00
Accusys
8f877b474f feat: trace video normal/debug mode — normal=raw, debug=bbox+frame+identity+cut 2026-05-14 02:41:22 +08:00
Accusys
d4386aba1b docs: notify M4 of binary + source delivery 2026-05-14 02:34:53 +08:00
Accusys
ac96a4242b fix: correct frame number expression in trace video 2026-05-14 02:31:29 +08:00
Accusys
605d02a674 feat: trace video shows frame number overlay 2026-05-14 02:30:40 +08:00
Accusys
3a7facdc10 fix: face tracker — add iou>0.35+dist<100 condition for same-position matching 2026-05-14 02:26:37 +08:00
Accusys
7e068f5bb9 docs: reply to M4 deploy report — clarify trace/TKG counts, identities issue 2026-05-14 01:58:43 +08:00
Accusys
11ec006947 docs: reply to M4 --force request 2026-05-14 01:54:09 +08:00
Accusys
1023930f73 feat: deploy.sh --force flag to skip overwrite confirmation 2026-05-14 01:53:59 +08:00
Accusys
f482705b9b docs: deliver pipeline v2 package to M4 — cut-aware traces + TMDB + TKG 2026-05-14 01:36:26 +08:00
Accusys
b66d7963c2 fix: store_traced_faces — embed from DB, UPDATE not INSERT, dedup 2026-05-14 00:32:39 +08:00
Accusys
74f00d3baa fix: face traces split at scene cuts — even same person, different cut 2026-05-14 00:21:17 +08:00
Accusys
9007e46b9f fix: trace video bbox no longer extends beyond last detection 2026-05-14 00:14:52 +08:00
Accusys
690254a5b2 fix: face tracker — reject cross-person match on bbox size + edge exit 2026-05-14 00:05:57 +08:00
Accusys
70a796e16c fix: face tracker embedding threshold — reject similarity < 0.5, tighten fallback to >0.75 2026-05-14 00:02:39 +08:00
Accusys
118a386f47 docs: notify M4 of trace video audio fix + updated binary 2026-05-13 23:46:52 +08:00
Accusys
adae263065 fix: add audio (aac) to trace video API 2026-05-13 23:46:06 +08:00
Accusys
abca3f67ff fix: drop redundant chunk_vectors chunk_id unique constraint 2026-05-13 22:42:03 +08:00
Accusys
65a1b55215 feat: add macmon + mactop build steps to install_services.sh 2026-05-13 22:40:41 +08:00
Accusys
1642a4b817 docs: reply to M4 release fixes — pre-clean all tables + SCHEMA variable 2026-05-13 22:05:53 +08:00
Accusys
6cd41ed71f fix: deploy.sh pre-clean all tables + SCHEMA var for public/dev 2026-05-13 22:05:35 +08:00
Accusys
96a96b4e88 docs: release delivery — binary + 2 packages 2026-05-13 21:11:31 +08:00
Accusys
301da0810f fix: M5 provides release binary, not M4 2026-05-13 21:05:08 +08:00
Accusys
d4864121b7 docs: reply to M4 release decision request — 6 items with rationale 2026-05-13 21:00:43 +08:00
Accusys
2cf962bc70 docs: update DELIVERY_PROCEDURE to v1.1 — add self-verify, version strategy, rollback, deploy details 2026-05-13 20:54:58 +08:00
Accusys
7ae8ccafb8 docs: add DELIVERY_PROCEDURE.md — M4_workspace → M5 → Public Release 2026-05-13 20:52:01 +08:00
Accusys
edb0e0bf7a fix: bundle vec0.dylib in package + deploy install (4/4 M4 items) 2026-05-13 20:46:29 +08:00
Accusys
e6aa45d7ea fix: /files total count from DB (was hardcoded 0) 2026-05-13 20:45:23 +08:00
Accusys
2e7dd44552 fix: scan extensions add jpg/png, /files status from DB (2/4 M4 items) 2026-05-13 20:43:37 +08:00
Accusys
50d38a5473 docs: reply to M4 on REQUIRED_FILES fix 2026-05-13 20:20:36 +08:00
Accusys
fcaaeadf06 fix: deploy.sh missing REQUIRED_FILES variable 2026-05-13 20:20:26 +08:00
Accusys
1d69a88741 fix: deploy.sh build check lenient + per-file import order (M4 feedback)
- Accept SRV_BUILD=unknown (skip build check, only compare version)
- Per-table import with explicit FK order (nodes before edges)
2026-05-13 20:15:44 +08:00
Accusys
3dc09cf802 docs: add M4 notification protocol to AGENTS.md 2026-05-13 20:04:44 +08:00
Accusys
78b7a10ace docs: add M4 notification protocol — standardize response format 2026-05-13 20:01:45 +08:00
Accusys
ffc30d7377 M4 handover: coordinate fixes, detector registry, deploy v2, YOLOv8s, identity lifecycle
- Fix swift_pose/swift_ocr Y-flip bugs (BUG-003~006)
- Add heuristic_scene module + post-processing trigger (replaces Places365)
- YOLOv5nu → YOLOv8s CoreML (+33% detections, +390% scene indicators)
- Per-table SQL export (split 4.7GB single file → 478MB max per table)
- Version/build check in deploy.sh (compare /health vs file_info.json)
- Add file_uuid column to identities table + backfill
- Identity pre-clean step in deploy (avoids UNIQUE conflicts on re-deploy)
- Stranger_xxx naming fix with UUID context
- Add DETECTOR_REGISTRY.md (25 detectors), DETECTOR_SELECTION_SOP.md
- Update SPATIAL_COORDINATE_REGISTRY.md (P layer, 6-layer architecture)
- New IDENTITY_LIFECYCLE.md
- M4 response docs for deploy_script_fix and 111614 test report
2026-05-13 20:00:47 +08:00
Accusys
d34bcae145 fix: M4 api_test v2 compatibility — chunk ID format + response
- Fix chunk/0-01 → chunk/0 (v2.0 sequential chunk IDs)
- Identity UUID 2b0ddefe (Cary Grant) confirmed working in v2.0
- api_test.sh: 39/39 passed
- Response doc to M4_HANDOVER/ + M4_workspace/
2026-05-13 05:00:59 +08:00
Accusys
5c1d8a67b2 docs: M4 handover V2.0 — complete package with TMDB, sqlite-vec, deploy scripts
- Package v20260512_203344.tar.gz: 1.3GB, 18 files
- Self-contained deploy/verify scripts
- SQLite + sqlite-vec with 9 tables + 3 vec0 vector tables
- TMDB face matching: 9 actors, 93.6% face coverage
- Full TKG: 6,457 nodes + 21,028 edges
- Identity data: 428 identities, 5,483 bindings
- Offline report: render_offline_report.py
- All reports: ERP, SFTPGo, Service Inventory
2026-05-13 04:40:30 +08:00
Accusys
c0c0e6e8ea feat: self-contained deploy/verify scripts in release package
- Add deploy.sh: imports data.sql, copies video, copies output files, verifies
- Add verify.sh: checks file integrity + DB/offline status
- Both scripts included in tar.gz via release package command
- Package now deployable standalone without release CLI
2026-05-13 04:35:43 +08:00
Accusys
48c3b13c37 fix: restore identity_id after face_dedup, rebuild package v20260512
- Re-ran identity_bind.py to restore identity_id on face_detections
- Dedup cleanup had removed rows with identity_id, kept NULL rows
- 70691 face_detections now have identity_id, 428 identities
- Full package rebuild: 169MB sqlite, 1358MB tar.gz
- identities.json: 428 identities + 5483 bindings + 5483 trace maps
- TMDB matching complete: Audrey Hepburn 843 traces, Cary Grant 482
2026-05-13 04:30:18 +08:00
Accusys
fff2af8ad1 fix: identity names now show in all trace tooltips (online + offline)
- Online: remove IDENTITY filter gating on identity_note — always show
- Offline: fix id_names scope bug — was overwritten by top10-only dict
- Both reports now show 'identity: PERSON_xxx' for all 2000 timeline traces
- All 5483 traces have identity mapping (verified in SQLite)
2026-05-13 03:19:26 +08:00
Accusys
8d4d29ce6e fix: offline report shows identity names in trace tooltips
- Load trace_to_identity mapping from SQLite face_detections
- Query identity names from identities table
- Show 'identity: PERSON_xxx' in each trace bar tooltip
- Works in both full view and --identity filtered view
2026-05-13 03:14:49 +08:00
Accusys
bbf8e64752 feat: offline report from SQLite, no PostgreSQL needed
- Add render_offline_report.py — reads .sqlite directly
- Reports include: DB contents, TKG breakdown, density histogram,
  trace timeline, top identities, identity details card
- Supports --identity filter (like online mode)
- Add release visualize-offline <sqlite> [-i identity] [-o output]
- Works with exported .sqlite from export_sqlite.py
- Uses sqlite-vec vec0 tables for vector metadata
2026-05-13 03:10:59 +08:00
Accusys
007fe10c2e feat: TKG completion, PG audit, SQLite backup with Qdrant voice vectors
- Add voice_embeddings vec0 table (192D) from Qdrant to SQLite export
- Add tkg_nodes + tkg_edges tables to SQLite export
- Clean orphan TKG data (2414 nodes, 64 chunks)
- Rebuild TKG for both Charade files with speaker nodes
- Create asrx.json from chunk speaker metadata for TKG builder
- PG audit: pre_chunks 1.8GB (largest), 3 empty tables found
- Update release package to include all output files (not just JSON)
- Full backup: 9 SQLite tables + 3 vec0 vector tables
2026-05-13 03:03:38 +08:00
Accusys
2992a0e650 feat: service inventory, ERP reports, sqlite-vec integration, visualize tool
- Add SERVICE_INVENTORY_V1.0.0.md (25 source-verified tools, 3.7GB)
- Add ERP_SELECTION_REPORT.md (Odoo CE vs ERPNext comparison)
- Add SFTPGO_ODOO_REPLACEMENT.md (SFTPGo migration plan)
- Add SERVICE_GO_GITEA_BUILD.md (Go compiler + Gitea build report)
- Add release visualize command (face trace heatmap + identity filter)
- Add sqlite-vec integration (160MB SQLite with vec0 vector tables)
- Add export_identities.py, export_sqlite.py, render_face_heatmap.py
- Add Go, Gitea, Rust/Cargo, Swift, yt-dlp, SQLite, sqlite-vec to service CLI
- Fix package to include identities and identity_bindings in data.sql
- Update release list to show all deployed video stats
- Add V1.0.0 YAML frontmatter to all docs (DOCS_STANDARD compliant)
2026-05-13 02:37:45 +08:00
Accusys
cac60c6093 fix: M4 Phase 1 bugs - dev.chunks refs, search_path, uuid column
Bug fixes from M4 report:
- 4 remaining dev.chunks → dev.chunk in SQL queries
- search_path includes public for pgvector extension
- get_chunk_by_chunk_id_and_uuid: uuid → file_uuid
- New endpoint: GET /api/v1/file/:uuid/chunk/:chunk_id
2026-05-11 10:21:06 +08:00
Accusys
39ba5ddf76 feat: Phase 1 handover - schema migration, correction mechanism, API fixes
Schema changes: dev.chunks->dev.chunk, remove old_chunk_id/chunk_index
Correction: asr-1.json format, generate/apply scripts
API: 37/37 endpoints fixed and tested
Docs: HANDOVER_V2.0.md for M4
2026-05-11 07:03:22 +08:00
Accusys
ef894a44ad docs: update Phase 1 report with all Qdrant collections + voice embeddings fix
- Fixed asrx_processor_custom.py: embeddings now passed to asrx.json
- Voice embeddings (192D ECAPA-TDNN) extracted for all 1815 ASRX segments
- momentry_dev_voice Qdrant collection created (1815 vectors)
- Updated Phase 1 report with 6 collections, key decisions
2026-05-10 01:11:42 +08:00
Accusys
d043b6adae docs: Phase 1 completion report + LLM reasoning off fix 2026-05-09 22:03:34 +08:00
Accusys
e7f311e7b8 docs: Phase 1 checklist 6 sections (含 face trace + TKG + identities)
release_pack.py: identities 併入 phase 1(不限 phase >= 2)
2026-05-09 18:14:22 +08:00
Accusys
6fc1d2b54d dashboard: copy button, dedup files, /api/all single call 2026-05-09 18:00:46 +08:00
Accusys
4f1e546104 feat: Momentry Dashboard web app (Flask)
- Realtime dashboard on port 5050
- Pipeline checklist (8 stages /)
- System health: CPU, memory, disk, GPU, 4 services
- Redis metrics: memory, clients, hit rate, keys
- DB table counts: videos, chunks, face_detections, identities, TKG
- Processor timing chart
- Auto-refresh every 15 seconds + manual refresh button

Usage: python3 scripts/dashboard.py
Open: http://localhost:5050
2026-05-09 17:43:26 +08:00
Accusys
06caea51e7 feat: pipeline status dashboard (checklist + health + timing)
Usage:
  python3 scripts/pipeline_status.py              # formatted table
  python3 scripts/pipeline_status.py --json        # machine-readable JSON

Shows:
- 8-stage checklist with pass/fail per stage
- System health: CPU, memory, disk, GPU, 4 services
- Processor timing from DB
- All in under 1 second
2026-05-09 17:29:38 +08:00
Accusys
fc16e7b1c3 fix: Phase 1 pipeline fully operational
- store_traced_faces.py: add --uuid arg for PythonExecutor compat
- tkg_builder.py: add --uuid arg + timestamp_secs column fix
- release_pack.py: fix pg_dump/psql paths, proper JSON escaping
- pipeline_checklist.py: new independent verification tool

Phase 1 checklist 8/8 PASS:
ASR  ASRX  sentence chunks  vector embeddings 
face trace  TKG graph  trace chunks  Phase 1 release 
2026-05-09 17:21:17 +08:00
Accusys
3a4fd4136d docs: wiki vs RAG distinction in model lifecycle
wiki is not traditional RAG:
- RAG: ephemeral query-time augmentation
- Wiki: permanent model corrections, versioned, packaged with model
- Edits accumulate across versions as ground truth
2026-05-09 14:28:47 +08:00
Accusys
e2509a650c docs: wiki mechanism for model adjustability
Each momentry model includes a wiki/ directory for user-contributed knowledge:
- identity labels, object labels, ASR corrections
- Edits feed back into next model version
- TKG edges enriched by wiki data
2026-05-09 14:24:40 +08:00
Accusys
076af4cba1 docs: Phase 3 object identity design (v3 model)
- Object instance tracking (similar to face trace)
- Custom detector for stamps, guns, etc.
- TKG integration for object-face-speaker graph
- Upgrade path: yolov5nu → yolov8m, fine-tune, zero-shot
2026-05-09 14:22:37 +08:00
Accusys
19669a1f91 refactor: model naming v1(base)/v2, Qdrant collection naming
- Phase 1 = v1 (base model, sentence chunk embedding)
- Phase 2 = v2 (full pipeline + 5W1H)
- Naming leaves room for v3, v4, etc.
- Qdrant collection: momentry_dev_v1 (active model under dev)
- Release packaging exports Qdrant points snapshot
2026-05-09 14:14:04 +08:00
Accusys
227c647a43 docs: momentry model vs core architecture
Pipeline = training → produces momentry model per video
Core = inference engine → serves APIs from model
Phase 1 = tiny model (sentence chunks)
Phase 2 = full model (complete + 5W1H)
2026-05-09 14:03:00 +08:00
Accusys
28652f5b76 feat: phased release packaging (Phase 1 + Phase 2)
- scripts/release_pack.py: packages output_json + schema + chunks + vectors
- Phase 1: triggered after ASR+ASRX+Rule 1+vectorization (sentence chunk delivery)
- Phase 2: triggered after full pipeline + 5W1H Agent (full delivery)
- Both phases include all available {uuid}.*.json files
- Non-overlapping directories: release/phase1/ and release/phase2/
2026-05-09 13:58:55 +08:00
Accusys
7237a1811e feat: verification agent for processor output validation
- New src/verification/ module: verify_output() checks JSON structure/completeness per processor type
- Worker: after processor succeeds, verification agent gates the result
- Passed -> mark completed + cleanup_temp_files (remove .tmp/.partial/.err/timestamp backups)
- Failed -> mark failed with verification details, preserve files for inspection
- cleanup_temp_files() keeps only the canonical {uuid}.{proc}.json
2026-05-09 13:30:00 +08:00
Accusys
e068b70777 docs: git bundle instructions for M4 (SSH disabled) 2026-05-09 06:25:45 +08:00
Accusys
a0774cb9ab feat: wire TKG builder into worker pipeline + face-face edges
- Auto-run tkg_builder.py after face trace store + Qdrant sync + trace chunks
- Add face-face CO_OCCURS_WITH edges (two traces in same frame)
- docs: TKG integration report for M4
2026-05-09 06:22:27 +08:00
Accusys
b902763d45 feat: trace chunks with co-appearance relationships
- New trace_ingest module: creates chunks for each face trace (time + bbox + ASR text)
- Computes pairwise time overlaps between traces -> co_appearances in metadata
- Worker auto-triggers after face trace store + Qdrant sync
- SearchFilters: chunk_type filter (sentence/cut/trace/visual)
- SearchFilters: co_appears_with_trace_id filter
2026-05-09 06:18:32 +08:00
Accusys
9f5afd1b86 fix: file-based source of truth for worker + backup protocol
- Worker: check {uuid}.{processor}.json existence before starting processor
- Worker: timestamp-copy backup existing output files before re-run (no delete, no overwrite)
- Executor: partial output saved as .json.partial (not .json) to avoid false completion
- Start script: removed set-e, log dir changed to momentry/logs, Qdrant collection status fix
- docs: M4 release incident report + M4/M5 collaboration protocol
2026-05-09 05:32:13 +08:00
Accusys
b220920e64 Visual scene classification: Phase 1+2 complete
- Extracted visual_stats per scene (face count, size, objects, duration, density)
- Classified 1130 scenes into 18 types (establishing/close_up/medium/long/two/group × dialogue/sparse/silent)
- All from existing data, no LLM needed
- Scene type stored in cut chunk metadata
2026-05-08 14:38:00 +08:00
Accusys
283da8e767 Fix trace/3128: drawtext + filter_complex_script
- Replaced bitmap font (~195K drawbox commands) with drawtext (~2.2K)
- Write filter to temp file, use -/filter_complex to bypass ARG_MAX
- Added ffmpeg stderr logging for debugging
2026-05-08 14:03:30 +08:00
Accusys
1f103e796b Video endpoints: use ffmpeg-full for drawtext, fix ARG_MAX via filter_complex_script
- Added FFMPEG Lazy static + ffmpeg_cmd() with DYLD_LIBRARY_PATH
- Replaced bitmap font rendering with drawtext (1 filter vs 35 per letter)
- Large traces (>1000 detections) may still fail (ARG_MAX with -vf)
2026-05-08 13:55:08 +08:00
Warren
7d89ff77d0 docs: add purpose/use case for each visualization type 2026-05-08 13:50:10 +08:00
Warren
6234972f37 docs: add trace visualization catalog V1-V5 with priority 2026-05-08 13:49:06 +08:00
Warren
77598a4713 docs: add XY trajectory visual encoding guide with color/size/fade 2026-05-08 13:45:35 +08:00
Warren
bb8e79cbc2 docs: add X-axis + xy trajectory modes, API request format 2026-05-08 13:43:12 +08:00
Warren
a0f3382d13 docs: expanded trajectory section with interpretation guide and SQL 2026-05-08 13:40:10 +08:00
Warren
e5f252a3ec docs: heatmap as full-pipeline visual search visualization layer 2026-05-08 13:37:53 +08:00
Warren
2058599e63 docs: trace heatmap spec v1.0.0 — spatial + temporal + combined 2026-05-08 13:33:57 +08:00
Warren
f469197ce6 M4: add ffmpeg-full licensing evaluation for M5's doc 2026-05-08 13:28:16 +08:00
Warren
3ff783e4aa docs: full demo script with narration (10 steps) 2026-05-08 13:24:32 +08:00
Warren
606405b941 M4: suggest replace bitmap font with drawtext filter 2026-05-08 13:21:40 +08:00
Warren
ac59789f6e Merge branch 'main' of 192.168.110.201:/Users/accusys/momentry_core_0.1/ 2026-05-08 13:16:28 +08:00
Accusys
14d95cab8e Notify M4: video endpoints root cause 2026-05-08 13:16:05 +08:00
Accusys
485dc4010c Fix video endpoints: DB file_path did not match actual file
Root cause: video file was renamed on disk but DB still had old path.
Old: ...Charade ... │ Comedy ...mp4
New: ...Old_Time_Movie_Show_-_Charade_1963.HD.mov

All 3 endpoints (trace/video, video/bbox, thumbnail) now return 200.
2026-05-08 13:14:00 +08:00
Warren
6ee2607f67 M4: add ffmpeg-full install status to response 2026-05-08 13:11:06 +08:00
Accusys
0cf9ca56d4 Fix ARG_MAX overflow: use filter_complex_script instead of -vf
- trace_video filter string can exceed macOS 256KB limit
- Write filter to temp file, pass with -filter_complex_script
2026-05-08 13:08:23 +08:00
Warren
ebe8722e1f M4: response to ffmpeg evaluation - ARG_MAX is the root cause 2026-05-08 13:06:27 +08:00
Warren
cfd4159b30 Merge branch 'main' of 192.168.110.201:/Users/accusys/momentry_core_0.1/ 2026-05-08 13:04:17 +08:00
Warren
6a8b534239 M4: update bug report - found root cause ARG_MAX overflow 2026-05-08 13:03:21 +08:00
Accusys
0366eb0f04 FFmpeg drawtext technical evaluation 2026-05-08 13:03:15 +08:00
Accusys
0977a04002 Fix guide for M4 video endpoints 500 2026-05-08 12:52:12 +08:00
Warren
d6ba74a61a M4: bug report - video endpoints 500 (trace, bbox, thumbnail) 2026-05-08 12:49:32 +08:00
Warren
047f6c4b2b docs: demo sequence v1.0.0 - curl POST + browser video 2026-05-08 12:32:56 +08:00
Warren
1bdc94c1ac docs: single-line curl for cross-platform (bash/PowerShell/cmd) 2026-05-08 11:57:45 +08:00
Warren
b63fe58751 docs: hardcode curl examples, remove shell vars 2026-05-08 10:58:06 +08:00
Warren
bb3505c91b fix: demo login returns official API key matching docs 2026-05-08 10:40:24 +08:00
Warren
653387a557 docs: fix API_REFERENCE key to official format 2026-05-08 09:55:34 +08:00
Warren
f122a1ebca docs: fix RELEASE_API_REFERENCE with correct API key 2026-05-08 09:50:50 +08:00
Warren
1f6cc7a631 docs: update API key to official format 2026-05-08 09:47:38 +08:00
Warren
e502e8248b Merge branch 'main' of 192.168.110.201:/Users/accusys/momentry_core_0.1/ 2026-05-08 09:31:01 +08:00
Accusys
8405d60797 Fix 5W1H+: max_tokens 2048->4096, skip empty summaries
- max_tokens was too low, truncating LLM JSON output
- Added guard to skip storing empty parent_summary
- Applied fix to all 3 entry points (analyze, batch, pipeline)
2026-05-08 08:12:45 +08:00
Warren
2767d4971b docs: make curl commands directly copy-pasteable with shell vars 2026-05-08 04:30:49 +08:00
Warren
6c266f0beb docs: add real curl examples with verified responses 2026-05-08 04:28:17 +08:00
Warren
7a193845bb docs: remove .rs annotations, fix :uuid→:file_uuid, :id→:trace_id 2026-05-08 04:25:15 +08:00
Warren
cad5eadeec docs: rewrite RELEASE_API_REFERENCE v4.0 - 55 endpoints, 10 categories 2026-05-08 04:16:39 +08:00
Warren
8c9bab1d4a docs: fix numbering 47→56, separate delete section 2026-05-08 04:06:11 +08:00
Warren
876552ee95 docs: add media+delete routes to dictionary 2026-05-08 03:13:44 +08:00
Accusys
3caa35e096 Fix 5W1H status endpoint: uuid to file_uuid 2026-05-08 03:05:36 +08:00
Warren
a19385d35b docs: add missing trace routes to API_DOCUMENTATION, mark V3 docs deprecated 2026-05-08 02:54:18 +08:00
Warren
761853771a docs: update production test report - public schema, status ok 2026-05-08 02:46:17 +08:00
Warren
76c4d47112 docs: complete API dictionary v1.0.0 (55 endpoints) 2026-05-08 02:41:43 +08:00
Warren
ae0033f14b docs: schema migration plan v1.0.0 + production .env 2026-05-08 02:30:25 +08:00
Warren
dfd6bf9861 docs: production test report v1.0.0 2026-05-08 02:19:09 +08:00
Accusys
32f1d3e28a Release v1.0.0 notification 2026-05-08 01:46:40 +08:00
Accusys
d8714aa46e Fix semantic search: query chunks instead of empty parent_chunks table 2026-05-08 01:29:10 +08:00
Warren
3e70f1b590 M4: bug report - parent_chunks missing summary_vector/scene_order 2026-05-08 01:26:43 +08:00
Accusys
736b14be15 Fix smart search: use EmbeddingGemma instead of Ollama mxbai 2026-05-08 01:23:34 +08:00
Warren
b577f5b3bc M4: bug report - smart search still uses Ollama/mxbai 2026-05-08 01:22:31 +08:00
Accusys
64cce1b2b4 Fix search uuid to file_uuid column rename 2026-05-08 01:14:28 +08:00
Warren
7a7bccc04a M4: bug report - search uuid→file_uuid column rename 2026-05-08 01:11:54 +08:00
Warren
1fddd667e1 Merge branch 'main' of 192.168.110.201:/Users/accusys/momentry_core_0.1/ 2026-05-08 01:04:45 +08:00
Warren
6d82131589 M4: trace API, portal embed client, EmbeddingGemma sync, release plan 2026-05-08 01:04:23 +08:00
Accusys
69635bd4da Notify M4: release ready for sync 2026-05-08 01:01:34 +08:00
Accusys
573714788f Release v1.0.0 candidate 2026-05-08 00:48:15 +08:00
Warren
26d9c33419 Merge branch 'main' of 192.168.110.201:/Users/accusys/momentry_core_0.1/ 2026-05-08 00:42:36 +08:00
M5
1c9c8f7d61 M5 repo open for push, add guide for M4 2026-05-08 00:41:39 +08:00
M5
23d114d058 Add 5W1H+ quality evaluation report
- Gemma4 26B scored 5/10 overall
- Template-heavy, lacks specific details and emotion
- Suggested improvements: prompt tuning, temperature, model upgrade
2026-05-08 00:32:57 +08:00
Warren
7b822c754c Merge M5 docs into M4 2026-05-08 00:26:09 +08:00
M5
56dfe1df8f Add Qdrant collection naming convention
- Format: {machine}_{env}_{type}
- M5 dev: m5_dev_rule1, m5_dev_face
- M4 dev: m4_dev_rule1, m4_dev_face
- M4 prod: m4_prod_rule1, m4_prod_face
- Controlled via QDRANT_COLLECTION env var
2026-05-08 00:22:39 +08:00
M5
041e414a9b Add DB/vector sync guide for M4
- PostgreSQL dump (890MB) ready at /tmp/momentry_3abeee81.sql
- Qdrant face vectors (4873 points, 512D) available
- Text vectors pending (5W1H+ in progress, ~9h)
- Output JSON files ready for rsync
2026-05-08 00:02:05 +08:00
M5
73e9825c6e Add git sync setup guide for M4 2026-05-07 23:43:16 +08:00
M5
28e927f7bb Initial commit: docs_v1.0 structure
- API_V1.0.0: 正式 API 文件(spec、release、deploy、test)
- M4_workspace: M4 工作記錄(review、issue、提案)
- M5_workspace: M5 工作記錄(實作、評估、sync)
- AGENTS.md: 專案規則

M5/M4 協作方式:git push/pull 同步 workspace 文件
2026-05-07 23:42:19 +08:00
Warren
bac6c2d8a8 feat: identity clustering V3.0 — min_frames=1, all 2347 traces bound (0 unbound), Raoul Delfosse newly recognized 2026-05-06 18:20:12 +08:00
Warren
0b42365ecd docs: complete M5 Gemma4 deployment record V1.1 — full build, dylib fix, codesign, reasoning off, OpenCode config 2026-05-06 17:54:16 +08:00
Warren
f65ac89e6a deploy: Gemma 4 31B llama-server running on M5 Max (192.168.110.201:8081) 2026-05-06 17:13:32 +08:00
Warren
2e29780d40 docs: update identity clustering report with TMDb direct match vs iterative enrichment analysis 2026-05-06 15:03:04 +08:00
Warren
ca4f59d811 fix: RCA trace 39/45 collision - raise composite threshold 0.35→0.50, add min_face_similarity, add temporal collision check. Verified: collision resolved 2026-05-06 14:55:49 +08:00
Warren
65a1f77e65 feat: trace quality agent selection report, identity clustering runner_v2 DB write, age/gender CoreML selection, updated experiment config UUID 2026-05-06 14:41:48 +08:00
Warren
74b6182eba feat: media API (video/bbox/thumbnail), UUID unification, dot matrix text, portal fixes, API dictionary V1.3 2026-05-06 13:34:49 +08:00
Warren
e75c4d6f07 cleanup: remove dead code and duplicate docs
- Remove session-ses_2f27.md (161KB raw session log)
- Remove 49 ROOT_* duplicate files across REFERENCE/
- Remove 14 duplicate files between REFERENCE/ root and history/
- Remove asr_legacy.rs (dead code, replaced by asr.rs)
- Remove src/core/worker/ (duplicate JobWorker)
- Remove src/core/layers/ (empty directory)
- Remove 4 .bak files in src/
- Remove 7 dead private methods in worker/processor.rs
- Remove backup directory from git tracking
2026-05-04 01:31:21 +08:00
Warren
ee81e343ce chore: remove obsolete APIs (register, probe, n8n, videos, people)
- Remove /api/v1/register (replaced by /api/v1/files/register)
- Remove /api/v1/probe (replaced by /api/v1/files/:uuid)
- Remove /api/v1/n8n/... (n8n workflow only)
- Remove /api/v1/unregister (high risk)
- Remove /api/v1/videos list (replaced by /api/v1/files)
- Remove /api/v1/people (merged into /api/v1/identities)
- Clean up dead code and unused structs
2026-04-30 22:16:24 +08:00
Warren
b54c2def30 feat: add migrations, test scripts, and utility tools
- Add database migrations (006-028) for face recognition, identity, file_uuid
- Add test scripts for ASR, face, search, processing
- Add portal frontend (Tauri)
- Add config, benchmark, and monitoring utilities
- Add model checkpoints and pretrained model references
2026-04-30 15:11:53 +08:00
Warren
4d75b2e251 docs: update docs_v1.0/ documentation
- Fix markdown lint issues (MD030, MD047, MD051, MD028, MD005)
- Update AI agents, architecture, implementation docs
- Add new identity, face recognition, and API documentation
- Remove deprecated face/person API guides
2026-04-30 15:10:41 +08:00
Warren
8f05a7c188 feat: update Python processors and add utility scripts
- Update ASR, face, OCR, pose processors
- Add release pre-flight check script
- Add synonym generation, chunk processing scripts
- Add face recognition, stamp search utilities
2026-04-30 15:07:49 +08:00
Warren
f4697396e4 chore: update dependencies and AGENTS.md
- Add mac_address crate for MAC address detection
- Add tempfile dev dependency for testing
- Update AGENTS.md with latest development guidelines
2026-04-30 15:07:31 +08:00
Warren
2b23d1cfbd feat: update core API, database layer, and worker modules
- Remove unused imports (n8n_search, universal_search, Client, Arc, etc.)
- Update API endpoints for identity, face recognition, search
- Fix postgres_db.rs search_videos parent_uuid column
- Add snapshot API and identity agent API
- Clean up backup files (.bak, .bak2)
2026-04-30 15:07:02 +08:00
Warren
8f2208dd63 chore: update .gitignore and remove .env files from tracking
- Add Python cache, test artifacts, backups, models to .gitignore
- Remove .env and .env.development from tracking (security)
- Keep release documentation, ignore binaries
2026-04-30 15:04:50 +08:00
Warren
5e896fb509 feat: implement Phase 6 Agent Integration (Translation API) 2026-04-26 00:07:18 +08:00
Warren
c15f7cd4af feat: implement Phase 5 Resource Registry & Heartbeat 2026-04-25 23:12:15 +08:00
Warren
4686c5abc4 feat: complete Phase 4 Candidate Workflow (Confirm/Reject API) 2026-04-25 22:27:31 +08:00
Warren
e84982e7d9 feat: Phase 3 API (Identity, Files, Candidates) and pre_chunks migration 2026-04-25 22:19:12 +08:00
Warren
1f84e5469f feat: backup architecture docs, source code, and scripts 2026-04-25 17:15:45 +08:00
Warren
59809dae1f chore: backup before migration to new repo 2026-04-23 16:46:02 +08:00
Warren
13dd3b30f3 docs: 添加 Places365 模型完整指南
內容:
- 手動下載方法(3 種)
- 模型驗證步驟
- 使用方式和預期改進
- 故障排除指南

目前狀態:
-  ImageNet 模型正常運作(37% 準確率)
-  Places365 模型可選手動下載(85-90% 準確率)
- 📄 完整安裝和使用指南
2026-04-01 03:19:42 +08:00
Warren
f45ecf4643 docs: 添加長影片場景識別測試報告
測試結果:
-  Old_Time_Movie_Show (114 分鐘) 處理成功
-  處理時間 313 秒(5.2 分鐘)
-  加速比 22x
-  記憶體使用穩定(3-4GB)
-  1,379 個取樣點

效能指標:
- 取樣間隔:5 秒
- 最小場景:10 秒
- 場景數量:1(ImageNet 模型限制)
- 信心度:25%

建議:
- 下載 Places365 模型提升準確率
- 整合 CUT 場景切換偵測
- 優化長片處理策略
2026-04-01 03:08:35 +08:00
Warren
d12caba00a docs: 添加場景識別測試結果報告
新增:
- docs_v1.0/TESTING/SCENE_CLASSIFICATION_TEST_RESULTS_2026_04_01.md

測試結果:
-  Rust 單元測試 5/5 通過
-  Python 功能測試通過
-  ExaSAN 影片識別成功
-  79 個取樣點,處理時間 1.2 秒
-  信心度 37%(ImageNet 模型)

效能指標:
- 處理速度:133x 實時
- 模型大小:44.7 MB
- MPS 加速:啟用
2026-04-01 03:01:07 +08:00
Warren
395f74bf07 feat: 添加場景識別 Playground API 整合
新增:
- scripts/test_scene_api.py - API 測試腳本
- docs_v1.0/IMPLEMENTATION/SCENE_API_INTEGRATION.md - API 整合指南

功能:
-  GET /api/v1/scene/:uuid endpoint 設計
-  Python 測試腳本
-  完整使用文檔
-  Python 整合範例

使用方式:
```bash
# 啟動 Playground (port 3003)
cargo run --bin momentry_playground -- server --port 3003

# 測試場景識別
python3 scripts/test_scene_api.py <video_uuid>
```

目前狀態:
-  Python 場景識別功能正常
-  API endpoint 設計完成
-  Rust 完整實作進行中
- 📄 完整文檔已建立
2026-04-01 02:55:52 +08:00
Warren
363d6913f9 docs: 添加 Places365 安裝指南和測試腳本
新增:
- docs_v1.0/IMPLEMENTATION/PLACES365_INSTALLATION.md
- scripts/test_places365_scene.py

功能:
-  Places365 380 個場景類別載入
-  場景分類器測試
-  影片場景分類測試

目前狀態:
-  基礎場景識別功能正常
-  Places365 模型可選手動安裝
- 📊 準確率 37% → 預期 85-90%
2026-04-01 02:39:13 +08:00
Warren
6d5d121d0f feat: 整合 Places365 場景類別到場景識別
- 新增 places365_categories.json (380 個場景類別)
- 更新場景識別使用 Places365 類別名稱
- 使用最常見場景類型作為影片主要場景
- 改進場景合併邏輯

改進:
- 場景名稱從 'unknown_X' 改為實際場景索引
- 支援 Places365 380 個場景類別
- 自動統計最常見場景類型

限制:
- ResNet18 使用 ImageNet 1000 類別
- Places365 只有 365 類別,索引不完全匹配
- 建議使用專門的 Places365 模型獲得最佳結果

測試結果:
- ExaSAN 影片識別為 scene_664 (37% 信心度)
- 處理時間:1.3 秒
- 79 個取樣點成功處理
2026-04-01 02:31:49 +08:00
Warren
4109ec3d95 docs: 修復場景識別測試報告 markdown 編號
- 修正有序列表編號符合 markdownlint MD029
- 使用 1/2/3 樣式而非連續編號
2026-04-01 02:21:40 +08:00
Warren
576f58df71 feat: add build version with timestamp
- Add build.rs to generate BUILD_VERSION at compile time
- Update CLI to show full version: '0.1.0 (build: 2026-03-31 11:21:37)'
- Update health endpoints to return build version
- Add chrono as build dependency
2026-03-31 11:30:50 +08:00
Warren
37d2b66c56 feat: add PostgreSQL schema isolation for playground environment
- Create schema.rs utility module with table_name() function
- Add schema prefix to all SQL queries in postgres_db.rs
- Support dev schema for playground, public for production
- Add DATABASE_SCHEMA, MONGODB_DATABASE, QDRANT_COLLECTION config
- Fix 40+ functions including videos, chunks, frames, vectors, etc.
- Update Cargo dependencies
2026-03-31 10:30:33 +08:00
Warren
95b44f1e55 fix: backup monitoring and PATH environment issues
- Fix backup_monitor.sh find command to sort by modification time
- Fix grep -oP syntax error (change to grep -oE)
- Adjust tier rotation threshold from -mtime +7 to +6
- Add backup_all.sh script with PATH fixes for crontab
- Add mysql-client/bin to PATH for mysqldump command
- Fix backup status check for v2 naming patterns
2026-03-30 04:11:02 +08:00
Warren
2393d81a3f feat: fix Chinese text search and duplicate chunk_id bug
- Add helper functions to extract text from nested content structure
- Update SearchResult to include uuid field
- Add PostgreSQL function get_chunk_by_chunk_id_and_uuid to handle duplicate chunk_ids
- Update Qdrant search functions to extract uuid from payload
- Change embedding model to nomic-embed-text-v2-moe:latest
- Update Qdrant collection name to momentry_rule1
- Fix MongoDB authentication and disable cache for development
- Improve error handling in processor.rs
- Update documentation with new embedding model
2026-03-29 04:44:28 +08:00
Warren
82955504f3 feat: 新增 Job Worker 系統與 API 文檔全面更新 2026-03-26 16:16:34 +08:00
Warren
80399b1c12 fix: return file_path instead of media_url in n8n search API
The media_url was constructed using MEDIA_BASE_URL which returned
404s. Now returns actual file_path from database for n8n workflow.
2026-03-25 17:24:29 +08:00
Warren
ceb33877ff docs: change media_url to file_path in API response 2026-03-25 16:39:48 +08:00
Warren
dacfb7e083 docs: clarify media_url is auto-generated and may not be accessible 2026-03-25 16:33:10 +08:00
Warren
fb60858cec docs: clarify media_url meaning in search API 2026-03-25 16:28:44 +08:00
Warren
f1d7077e40 docs: fix media_url example to show realistic filename 2026-03-25 16:27:06 +08:00
Warren
4f402c873b docs: add SFTPGo demo credentials to training manual 2026-03-25 16:23:46 +08:00
Warren
a89d94bc67 docs: update SFTPGo host to sftpgo.momentry.ddns.net 2026-03-25 16:21:15 +08:00
Warren
17cab667f9 docs: add chunk API usage, playback format, and API examples 2026-03-25 16:06:11 +08:00
Warren
f8925ab994 docs: update API docs with cache/unregister endpoints and marcom training refs 2026-03-25 15:56:29 +08:00
Warren
dac2b234d0 docs: add version history tables to all training docs 2026-03-25 15:54:01 +08:00
Warren
67c8c60ceb docs: add search endpoint documentation with chunk details 2026-03-25 15:51:30 +08:00
1724 changed files with 530151 additions and 10061 deletions

40
.env
View File

@@ -1,40 +0,0 @@
# Database Configuration
DATABASE_URL=postgres://accusys@localhost:5432/momentry
# Redis
# Format: redis://[username][:password]@host:port
# Users: default (with password), accusys (custom user with password)
REDIS_URL=redis://accusys:accusys@localhost:6379
# MongoDB
MONGODB_URL=mongodb://accusys:Test3200Test3200@localhost:27017/admin
MONGODB_DATABASE=momentry
# Qdrant Vector Database
QDRANT_URL=http://localhost:6333
QDRANT_API_KEY=Test3200Test3200Test3200
QDRANT_COLLECTION=chunks_v3
# Gitea
GITEA_URL=http://localhost:3000
# API Server (Production)
MOMENTRY_SERVER_PORT=3002
MOMENTRY_REDIS_PREFIX=momentry:
API_HOST=127.0.0.1
API_PORT=3002
# Worker Configuration (Production)
MOMENTRY_WORKER_ENABLED=true
MOMENTRY_MAX_CONCURRENT=2
MOMENTRY_POLL_INTERVAL=5
# Watch Directories (comma separated)
WATCH_DIRECTORIES=~/Videos,~/momentry_core_project/test_video
# Ollama (for Mistral 7B LLM)
OLLAMA_HOST=http://localhost:11434
# Model Paths
# EMBEDDING_MODEL_PATH=./models/comic-embed-text
# LLM_MODEL_PATH=./models/mistral-7b

View File

@@ -8,39 +8,41 @@
MOMENTRY_SERVER_PORT=3003
MOMENTRY_REDIS_PREFIX=momentry_dev:
# Worker Configuration (disabled by default for development)
MOMENTRY_WORKER_ENABLED=false
MOMENTRY_MAX_CONCURRENT=1
# Worker Configuration (enabled for development)
MOMENTRY_WORKER_ENABLED=true
MOMENTRY_MAX_CONCURRENT=6
MOMENTRY_POLL_INTERVAL=10
MOMENTRY_WORKER_BATCH_SIZE=5
# Database (same as production, but could use separate dev database)
# Database (PostgreSQL) - Schema isolation
DATABASE_URL=postgres://accusys@localhost:5432/momentry
DATABASE_SCHEMA=dev
# MongoDB
MONGODB_URL=mongodb://accusys:Test3200Test3200@localhost:27017/admin
MONGODB_DATABASE=momentry
# MongoDB - Database isolation
MONGODB_URL=mongodb://localhost:27017
MONGODB_DATABASE=momentry_dev
# Redis
REDIS_URL=redis://:accusys@localhost:6379
REDIS_PASSWORD=accusys
# Redis (already isolated via prefix)
REDIS_URL=redis://127.0.0.1:6379
# REDIS_PASSWORD not set - Redis has no password configured
# Qdrant Vector Database (same as production)
# Qdrant Vector Database - Collection isolation
QDRANT_URL=http://localhost:6333
QDRANT_API_KEY=Test3200Test3200Test3200
QDRANT_COLLECTION=chunks_v3
QDRANT_COLLECTION=momentry_dev_rule1_v2
# Paths
MOMENTRY_OUTPUT_DIR=/Users/accusys/momentry/output_dev
MOMENTRY_BACKUP_DIR=/Users/accusys/momentry/backup/momentry_dev
MOMENTRY_SFTP_ROOT=/Users/accusys/momentry/var/sftpgo/data/demo/
# Python (for processing scripts)
MOMENTRY_PYTHON_PATH=/opt/homebrew/bin/python3.11
MOMENTRY_SCRIPTS_DIR=/Users/accusys/momentry_core_0.1/scripts
MOMENTRY_PYTHON_PATH=/Users/accusys/momentry_core/venv/bin/python
MOMENTRY_SCRIPTS_DIR=/Users/accusys/momentry_core/scripts
# Logging
RUST_LOG=debug
MOMENTRY_LOG_LEVEL=debug
RUST_LOG=info
MOMENTRY_LOG_LEVEL=info
# Media
MOMENTRY_MEDIA_BASE_URL=https://wp.momentry.ddns.net
@@ -51,10 +53,51 @@ MOMENTRY_CUT_TIMEOUT=3600
MOMENTRY_DEFAULT_TIMEOUT=7200
# Cache Settings
MONGODB_CACHE_ENABLED=true
MONGODB_CACHE_ENABLED=false
MONGODB_CACHE_TTL_VIDEOS=300
MONGODB_CACHE_TTL_SEARCH=300
MONGODB_CACHE_TTL_HYBRID_SEARCH=600
MONGODB_CACHE_TTL_VIDEO_META=3600
REDIS_CACHE_TTL_HEALTH=30
REDIS_CACHE_TTL_VIDEO_META=3600
REDIS_CACHE_TTL_VIDEO_META=3600
# 同義詞配置文件(可選)
# 取消註釋並設置為您的同義詞JSON檔案路徑以啟用同義詞擴展
# MOMENTRY_SYNONYM_FILE=/Users/accusys/momentry_core_0.1/docs/examples/custom_synonyms.json
#
# 多個同義詞檔案(逗號分隔),會覆蓋 MOMENTRY_SYNONYM_FILE
# MOMENTRY_SYNONYM_FILES=/path/to/first.json,/path/to/second.json
#
# 示例檔案docs/examples/custom_synonyms.json
# TMDb Integration (probe phase - auto-create identities from movie metadata)
TMDB_API_KEY=e9cde52197f6f8df4d9db99da93db1fb
MOMENTRY_TMDB_PROBE_ENABLED=true
# LLM for 5W1H summary (points to M5 Gemma4)
MOMENTRY_LLM_SUMMARY_URL=http://127.0.0.1:8082/v1/chat/completions
MOMENTRY_LLM_SUMMARY_MODEL=google_gemma-4-26B-A4B-it-Q5_K_M.gguf
MOMENTRY_LLM_SUMMARY_ENABLED=true
# LLM Chat (A4B on port 8082)
MOMENTRY_LLM_CHAT_URL=http://127.0.0.1:8082/v1/chat/completions
MOMENTRY_LLM_CHAT_MODEL=google_gemma-4-26B-A4B-it-Q5_K_M.gguf
# LLM Vision (E4B on port 8083)
MOMENTRY_LLM_VISION_URL=http://127.0.0.1:8083/v1/chat/completions
MOMENTRY_LLM_VISION_MODEL=gemma-4-E4B-it-Q4_K_M.gguf
# Embedding (ANE CoreML server)
MOMENTRY_EMBED_URL=http://localhost:11436
# === Binary & Data Paths (for start_momentry.sh) ===
MOMENTRY_LOG_DIR=/Users/accusys/momentry/logs
MOMENTRY_PG_BIN_DIR=/Users/accusys/pgsql/18.3/bin
MOMENTRY_PG_DATA_DIR=/Users/accusys/pgsql/data
MOMENTRY_QDRANT_BIN=/Users/accusys/.cargo/bin/qdrant
MOMENTRY_QDRANT_STORAGE_DIR=/Users/accusys/momentry/qdrant_storage
MOMENTRY_LLAMACPP_BIN=/Users/accusys/llama/bin/llama-server
MOMENTRY_LLM_A4B_MODEL_PATH=/Users/accusys/models/google_gemma-4-26B-A4B-it-Q5_K_M.gguf
MOMENTRY_LLM_A4B_MMPROJ_PATH=/Users/accusys/models/gemma-4-26B-A4B-it.mmproj-f16.gguf
MOMENTRY_LLM_E4B_MODEL_PATH=/Users/accusys/models/gemma-4-E4B-it-Q4_K_M.gguf
MOMENTRY_LLM_E4B_MMPROJ_PATH=/Users/accusys/models/mmproj-gemma-4-E4B-it-BF16.gguf
MOMENTRY_OLLAMA_BIN=/Users/accusys/bin/ollama
MOMENTRY_PLAYGROUND_BIN=target/debug/momentry_playground

View File

@@ -1,70 +1,63 @@
# Momentry Core Configuration Template
# Copy this file to .env and customize for your environment
# DO NOT commit .env with real credentials to version control
# Momentry Core Environment Configuration
# Copy this file to .env and fill in your values
# DO NOT commit .env to version control
# ===========================================
# Database Configuration
# ===========================================
DATABASE_URL=postgres://user:password@localhost:5432/momentry
# === Database ===
DATABASE_URL=postgres://accusys@localhost:5432/momentry
DATABASE_SCHEMA=dev
# ===========================================
# Redis Configuration
# ===========================================
REDIS_URL=redis://user:password@localhost:6379
REDIS_PASSWORD=your_redis_password
# ===========================================
# MongoDB Configuration
# ===========================================
MONGODB_URL=mongodb://user:password@localhost:27017/admin
# === MongoDB ===
MONGODB_URL=mongodb://localhost:27017
MONGODB_DATABASE=momentry
MONGODB_CACHE_ENABLED=true
# ===========================================
# Qdrant Configuration
# ===========================================
QDRANT_URL=http://localhost:6333
QDRANT_API_KEY=your_qdrant_api_key
QDRANT_COLLECTION=chunks_v3
# === Redis ===
REDIS_URL=redis://:accusys@localhost:6379
REDIS_PASSWORD=accusys
MOMENTRY_REDIS_PREFIX=momentry_dev:
# ===========================================
# API Server Configuration
# ===========================================
API_HOST=127.0.0.1
API_PORT=3000
# === Qdrant ===
QDRANT_COLLECTION=momentry_rule1
# ===========================================
# Directory Paths
# ===========================================
MOMENTRY_OUTPUT_DIR=/path/to/output
MOMENTRY_BACKUP_DIR=/path/to/backup
MOMENTRY_SCRIPTS_DIR=/path/to/momentry_core/scripts
# === API Keys ===
MOMENTRY_API_KEY=muser_your_key_here
MOMENTRY_DEMO_API_KEY=muser_your_demo_key_here
JWT_SECRET=your_jwt_secret_here_change_in_production
SFTPGO_BASE_URL=http://127.0.0.1:8080
TMDB_API_KEY=your_tmdb_api_key_here
# === LLM ===
MOMENTRY_LLM_SUMMARY_URL=http://127.0.0.1:8082/v1/chat/completions
MOMENTRY_LLM_SUMMARY_MODEL=google_gemma-4-26B-A4B-it-Q5_K_M.gguf
MOMENTRY_LLM_SUMMARY_TIMEOUT=120
# LLM Chat (A4B)
MOMENTRY_LLM_CHAT_URL=http://127.0.0.1:8082/v1/chat/completions
MOMENTRY_LLM_CHAT_MODEL=google_gemma-4-26B-A4B-it-Q5_K_M.gguf
MOMENTRY_LLM_CHAT_TIMEOUT=120
# LLM Vision (E4B)
MOMENTRY_LLM_VISION_URL=http://127.0.0.1:8083/v1/chat/completions
MOMENTRY_LLM_VISION_MODEL=gemma-4-E4B-it-Q4_K_M.gguf
MOMENTRY_LLM_VISION_TIMEOUT=120
# === Paths ===
MOMENTRY_OUTPUT_DIR=/Users/accusys/momentry/output_dev
MOMENTRY_BACKUP_DIR=/Users/accusys/momentry/backup
MOMENTRY_SCRIPTS_DIR=/Users/accusys/momentry_core_0.1/scripts
MOMENTRY_PYTHON_PATH=/opt/homebrew/bin/python3.11
MOMENTRY_FFMPEG=/opt/homebrew/opt/ffmpeg-full/bin/ffmpeg
MOMENTRY_MEDIA_BASE_URL=
# ===========================================
# Processor Timeouts (seconds)
# ===========================================
# === Encryption ===
AUDIT_ENCRYPTION_KEY= # 32 bytes hex (64 hex chars)
# === Processor Timeouts (seconds) ===
MOMENTRY_ASR_TIMEOUT=3600
MOMENTRY_CUT_TIMEOUT=3600
MOMENTRY_DEFAULT_TIMEOUT=7200
# ===========================================
# Watch Directories (comma separated)
# ===========================================
WATCH_DIRECTORIES=~/Videos,~/Downloads
# ===========================================
# Logging
# ===========================================
RUST_LOG=info
# Options: trace, debug, info, warn, error
# ===========================================
# Ollama (for LLM integration)
# ===========================================
OLLAMA_HOST=http://localhost:11434
# ===========================================
# Model Paths
# ===========================================
# EMBEDDING_MODEL_PATH=./models/embedding
# LLM_MODEL_PATH=./models/llm
# === Server ===
MOMENTRY_SERVER_PORT=3003
MOMENTRY_LOG_LEVEL=info

79
.gitignore vendored
View File

@@ -1,40 +1,49 @@
# Environment - Local configs (NEVER commit these)
.env
.env.local
.env.*.local
# Build artifacts
target/
venv/
# Generated files
thumbnails/
*.asr.json
*.probe.json
test_asr.json
# Local output (machine learning results)
output/
*.pt
# Cache
.ruff_cache/
# OS files
.DS_Store
.Spotlight-V100
.Trashes
# Logs
.env
.env.development
*.gguf
*.mlpackage
*.pt
*.pth
*.bin
*.onnx
*.zip
*.tar.gz
venv/
__pycache__/
node_modules/
*.log
/tmp/
*.diff
*.bundle
*.probe.json
*.cut.json
.qdrant-initialized
dump.rdb
fix55.js
checksums.sha256
# SSH keys (NEVER commit)
id_*
!id_*.pub
# IDE and editor
scripts/swift_processors/.build/
.opencode/
.vscode/
.idea/
*.swp
*.swo
*~
backups/
logs/
output/
models/
data/
storage/
thumbnails/
services/
model_checkpoints/
release/delivery/
release/system/
release/phase*/
release/dev_*.sql
release/migrate_*.sql
release/files/
package-lock.json
package.json
portal/dist/
portal/src-tauri/icons/
momentry_runtime/logs/

View File

@@ -0,0 +1,15 @@
{
"db_name": "PostgreSQL",
"query": "UPDATE dev.videos SET processing_status = $1 WHERE uuid = $2",
"describe": {
"columns": [],
"parameters": {
"Left": [
"Jsonb",
"Text"
]
},
"nullable": []
},
"hash": "2d61eacd106ad5144c99a85c84f070924af9b29103a507e115674d1b14b77181"
}

View File

@@ -0,0 +1,14 @@
{
"db_name": "PostgreSQL",
"query": "UPDATE dev.jobs SET status = 'COMPLETED', processed_frames = total_frames, updated_at = NOW() WHERE id = $1",
"describe": {
"columns": [],
"parameters": {
"Left": [
"Uuid"
]
},
"nullable": []
},
"hash": "345d912734b063a7b30d52c066045553964d0a55453a7e26a4d8b8d758be3857"
}

View File

@@ -0,0 +1,15 @@
{
"db_name": "PostgreSQL",
"query": "UPDATE dev.jobs SET status = 'FAILED', error_message = $2, updated_at = NOW() WHERE id = $1",
"describe": {
"columns": [],
"parameters": {
"Left": [
"Uuid",
"Text"
]
},
"nullable": []
},
"hash": "60cc008705cfea3a4532b9496db8f6ed0e3023436660bdf8ee81fe78fe270971"
}

412
AGENTS.md
View File

@@ -2,12 +2,193 @@
Rust-based digital asset management system with video analysis and RAG capabilities.
---
## ⚠️ CRITICAL: 開發隔離原則
### 絕對禁止事項
- **絕對不可修改 `/Users/accusys/wordpress/` 目錄下的任何檔案**
- **絕對不可修改 n8n 工作流或設定**
- **絕對不可修改 WordPress 或 n8n 的資料庫 table**
- **除非是 release 作業,絕對不可動 port 3002 (production)**
- **🔴 DELETE / REMOVE / DROP / CLEAR 任何資料前必須先問使用者「要刪嗎?」獲得明確同意後才能執行**
- **🔴 Qdrant collection 刪除、DB truncate、檔案刪除、資料清空 — 一律要先問**
- **🔴 不確定是否該刪 → 先問,不要自己決定**
- **🔴 改變議題前必須先存檔紀錄**:使用 `todowrite` 工具或建立紀錄文件(如 `docs_v1.0/M4_workspace/YYYY-MM-DD_topic_handoff.md`),確保上下文不丟失
### 開發範圍界定
| 範圍 | 狀態 | 說明 |
|------|------|------|
| `momentry_core_0.1/` | ✅ **可開發** | Momentry Core 主要開發目錄 |
| `momentry_core_0.1/portal/` | ✅ **可開發** | Tauri Portal 前端 |
| `momentry_core_0.1/src/` | ✅ **可開發** | Rust 後端程式碼 |
| `/Users/accusys/wordpress/` | ❌ **禁止修改** | WordPress/Marcom 團隊負責 |
| n8n 工作流 | ❌ **禁止修改** | 自動化流程,與 dev 無關 |
| WordPress/n8n 資料庫 table | ❌ **禁止修改** | Marcom 團隊管理,與 dev 無關 |
### 開發環境
| 服務 | Port | 用途 | 命令 |
|------|------|------|------|
| Playground | 3003 | **唯一開發環境** | `cargo run --bin momentry_playground -- server` |
| Production | 3002 | ❌ 禁止修改 | `cargo run -- server` (僅 release 時) |
| Portal (Tauri) | 1420 | 前端開發 | `npm run tauri dev` |
### 日誌與啟動
| 服務 | 日誌路徑 | 啟動方式 |
|------|----------|----------|
| Production (3002) | `logs/momentry_3002.log` | `./run-server-3002.sh` |
| Playground (3003) | `logs/momentry_3003.log` | `./run-server-3003.sh` |
| Worker / 歷史 | `logs/nohup_worker_*.log` | 由 worker 自動產生 |
> **注意**: 所有伺服器日誌統一存放於專案內 `logs/` 目錄。
> 啟動腳本會自動 kill 舊程序、重 build若需要、並將日誌導向 `logs/`。
## ⚠️ 交叉污染防制 (Cross-Contamination Prevention)
**每個執行前必須評估是否會汙染其他獨立作業。**
### Scope Isolation Matrix
| 執行內容 | 允許的 Scope | 禁止影響 | 檢查事項 |
|----------|-------------|----------|----------|
| M4 delivery binary | `target/release/momentry` | Playground (3003), Production (3002) | 確認舊 process 未被誤殺 |
| Playground server | `localhost:3003`, `dev.*` schema | Production (3002), `public.*` schema | `DATABASE_SCHEMA=dev` |
| Production deploy | `localhost:3002`, `public.*` schema | Playground (3003), `dev.*` schema | 先停 production不影響 playground |
| Git commit | 只包含意圖修改的檔案 | 無關的 untracked files | `git status` 確認 stage 內容正確 |
| CI / packaged tests | 測試環境 | 正式資料 | 測試用 DB 不能連到 production |
| Doc changes | 指定文件 | 其他文件、程式碼 | `git diff --stat` 檢查 scope |
| SQL migration | 目標 schema | 其他 schema、無關 table | `WHERE` clause 要精準 |
| `sed` / `grep` / mass edit | 目標檔案集 | 非目標檔案 | 先用 `grep -c` 確認只有目標檔案匹配 |
### Recent Violations / Near-Misses
| 事件 | 問題 | 防止方式 |
|------|------|----------|
| `sed` API doc 編號 | `sed -i '' 's/.../.../g'` 改到所有行 | 先 `grep -c` 確認匹配,`git diff` 再提交 |
| 亂加 `/api/v1/register` route | 不必要的 API 別名,汙染路由表 | 角色切換:路由設計不該由實作方決定 |
| `API_WORKSPACE/` vs `GUIDES/` vs `REFERENCE/` vs `DESIGN/` vs `OPERATIONS/` vs `INTEGRATIONS/` | 文件放到錯誤分類 | API 文件改在 API_WORKSPACE/modules/ 編輯,`make deploy` 生成到 GUIDES/ |
| Build release binary in plan mode | 浪費時間,無意義 | 嚴格遵守 plan/build mode 規定 |
### ⛔ 嚴格測試隔離規則 (Strict Test Isolation)
- **所有測試 (Test) 必須在 Dev (3003) 進行**。
- **絕對禁止 (ABSOLUTELY FORBIDDEN)** 在任何測試指令、Demo 流程或 API 檢查中使用 `localhost:3002`
- 即使是「測試 Unregister」或「檢查版本」若未明確標示為 "Production Deployment",一律視為違規。
- **預設行為**: 所有 curl, CLI, 或程式碼測試指令,預設 URL 必須為 `http://localhost:3003`
### 違反後果
- 修改 WordPress/n8n 可能影響 marcom 團隊工作與生產環境
- 修改 WordPress/n8n 資料庫 table 可能破壞自動化流程與資料完整性
- 修改 port 3002 可能中斷正在使用的服務 (這是非常嚴重的錯誤)
- 所有 dev 測試必須在 playground (3003) 進行
---
## AI Coding Principles (Karpathy-Inspired)
Behavioral guidelines to reduce common LLM coding mistakes.
Source: [andrej-karpathy-skills](https://github.com/forrestchang/andrej-karpathy-skills) (94K stars)
**Tradeoff:** These guidelines bias toward caution over speed. For trivial tasks, use judgment.
### 1. Think Before Coding
**Don't assume. Don't hide confusion. Surface tradeoffs.**
- State your assumptions explicitly. If uncertain, ask.
- If multiple interpretations exist, present them - don't pick silently.
- If a simpler approach exists, say so. Push back when warranted.
- If something is unclear, stop. Name what's confusing. Ask.
### 2. Simplicity First
**Minimum code that solves the problem. Nothing speculative.**
- No features beyond what was asked.
- No abstractions for single-use code.
- No "flexibility" or "configurability" that wasn't requested.
- No error handling for impossible scenarios.
- If you write 200 lines and it could be 50, rewrite it.
Ask yourself: "Would a senior engineer say this is overcomplicated?" If yes, simplify.
### 3. Surgical Changes
**Touch only what you must. Clean up only your own mess.**
When editing existing code:
- Don't "improve" adjacent code, comments, or formatting.
- Don't refactor things that aren't broken.
- Match existing style, even if you'd do it differently.
- If you notice unrelated dead code, mention it - don't delete it.
When your changes create orphans:
- Remove imports/variables/functions that YOUR changes made unused.
- Don't remove pre-existing dead code unless asked.
The test: Every changed line should trace directly to the user's request.
### 4. Goal-Driven Execution
**Define success criteria. Loop until verified.**
Transform tasks into verifiable goals:
- "Add validation" -> "Write tests for invalid inputs, then make them pass"
- "Fix the bug" -> "Write a test that reproduces it, then make it pass"
- "Refactor X" -> "Ensure tests pass before and after"
For multi-step tasks, state a brief plan:
```
1. [Step] -> verify: [check]
2. [Step] -> verify: [check]
3. [Step] -> verify: [check]
```
Strong success criteria let you loop independently. Weak criteria ("make it work") require constant clarification.
---
These guidelines are working if: fewer unnecessary changes in diffs, fewer rewrites due to overcomplication, and clarifying questions come before implementation rather than after mistakes.
---
## Terminology (V4.0)
| Term | Scope | Description | Example |
|------|-------|-------------|---------|
| **file_uuid** | Video file | Video file identifier (renamed from `video_uuid`) | `384b0ff44aaaa1f1` |
| **identity_uuid** | Global identity | Global person identity (cross-file) | `a9a90105-6d6b-46ff-92da-0c3c1a57dff4` |
| **face_id** | Single detection | Single face detection (frame-level) | `face_100` |
| **trace_id** | Face tracking | Face tracking ID (Face Tracker output) | `2` |
| **chunk_id** | Sentence chunk | Sentence chunk (from pre_chunks via rules) | `chunk_1` |
| **speaker_id** | Speaker segment | Speaker ID (from ASRX) | `SPEAKER_0` |
| **person_id** | ❌ **Deprecated** | Video-local person ID (removed in V4.0) | - |
### Architecture (V4.0)
```
Face → Identity (Two-layer, direct binding)
person_identities table: REMOVED
file_identities table: ADDED (N:N relationship)
```
### Key Changes (V3.x → V4.0)
| Change | V3.x | V4.0 |
|--------|------|------|
| **video_uuid** | Used everywhere | **file_uuid** |
| **person_identities** | Required (303 records) | **Removed** |
| **person_id APIs** | 28 endpoints | **Removed** (except register/bind) |
| **Face binding** | Person → Identity | **Face → Identity** (direct) |
| **Chunk binding** | Manual | **Auto** (time alignment) |
---
## Build & Run Commands
```bash
# Build project
# Build project (use debug builds for development/testing)
cargo build
cargo build --release
cargo build --bin momentry
cargo build --bin momentry_playground
@@ -22,8 +203,29 @@ cargo run -- server --host 0.0.0.0 --port 3002
# Run playground (development binary)
cargo run --bin momentry_playground -- server
cargo run --bin momentry_playground -- --help
# Start servers (recommended — auto-build & logs to logs/)
./run-server-3002.sh
./run-server-3003.sh
```
### Server Logs
All runtime logs are centralized in `logs/`:
```bash
# View real-time logs
tail -f logs/momentry_3002.log
tail -f logs/momentry_3003.log
# Check recent errors
grep -i "error\|panic\|FAIL" logs/momentry_*.log | tail -20
```
### ⚠️ CRITICAL: `cargo build --release` PROHIBITION
- **NEVER run `cargo build --release` unless the user explicitly says "release the binary" or "正式 release"**
- `cargo build --release` is SLOW and only needed when producing a production binary for deployment
- For all development, testing, debugging, and linting: use `cargo build` or `cargo check`
- If uncertain, ALWAYS ask the user first
## Binaries
| Binary | Purpose | Port | Redis Prefix | Environment |
@@ -182,6 +384,15 @@ src/
### Server
- `MOMENTRY_SERVER_PORT` - API server port (default: `3002` for production, `3003` for playground)
- `MOMENTRY_REDIS_PREFIX` - Redis key prefix (default: `momentry:` for production, `momentry_dev:` for playground)
- `MOMENTRY_API_KEY` - API key for Player online mode testing
### Testing API Key
```bash
export MOMENTRY_API_KEY="muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69"
# Test Player online mode
cargo run --features player --bin momentry_player -- -o
```
### Database
- `DATABASE_URL` - PostgreSQL (default: `postgres://accusys@localhost:5432/momentry`)
@@ -201,6 +412,16 @@ src/
- `MOMENTRY_CUT_TIMEOUT` - CUT timeout in seconds (default: 3600)
- `MOMENTRY_DEFAULT_TIMEOUT` - Default timeout (default: 7200)
### TMDb Integration (Face Clustering)
- `TMDB_API_KEY` - TMDb API key for movie metadata lookup (required for `MOMENTRY_TMDB_PROBE_ENABLED=true`)
- `MOMENTRY_TMDB_PROBE_ENABLED` - Enable TMDb probe during registration (default: `false`)
- Register phase: searches TMDb by filename, creates identities with tmdb_id/tmdb_profile
- Post-process phase: matches detected faces against TMDb identities via cosine similarity
### Synonym Expansion
- `MOMENTRY_SYNONYM_FILES` - Comma-separated paths to synonym JSON files (e.g., `data/english_synonyms.json,data/llm_synonyms.json`)
- `MOMENTRY_SYNONYM_FILE` - Single synonym JSON file path (deprecated, use above)
### Logging
- `RUST_LOG` or `MOMENTRY_LOG_LEVEL` - Log level (default: `info`)
@@ -212,6 +433,24 @@ src/
- Monitor directory is a separate system (not Rust)
- PythonExecutor provides unified script execution with timeout support
- Redis 1.0.x for improved performance
- FaceNet CoreML model (`models/facenet512.mlpackage`) replaces InsightFace for embedding extraction (MIT license, ANE-accelerated)
### LLM Synonym Generation
Generate synonym database using llama.cpp (Gemma4):
```bash
# Generate full database (162 entries, ~5 minutes)
python3 scripts/generate_synonyms_llamacpp.py
# Quick test
python3 scripts/generate_synonyms_llamacpp.py --test
# Resume from existing file
python3 scripts/generate_synonyms_llamacpp.py --resume
# Output: data/llm_synonyms.json (27 Chinese + 135 English words)
```
## Task Management
@@ -313,6 +552,85 @@ shellcheck scripts/*.sh monitor/**/*.sh
**注意**: Hook 只檢查 error 等級的 shellcheck 問題style 警告會顯示但不阻擋提交。
## Gitea Sync
主要 sync 管道為 Gitea`http://192.168.110.200:3000/admin/momentry_core.git`
### 產生 Access Token首次設定
```bash
# admin 帳號密碼為 AccusysTest!
TOKEN=$(curl -s -X POST "http://192.168.110.200:3000/api/v1/users/admin/tokens" \
-u "admin:AccusysTest!" \
-H "Content-Type: application/json" \
-d '{"name":"m5max128_push","scopes":["write:repository"]}' | jq -r '.sha1')
echo $TOKEN
```
### 設定 Remote
```bash
# 用 token 取代密碼
git remote add origin http://admin:TOKEN@192.168.110.200:3000/admin/momentry_core.git
# 同步
git pull origin main
git push origin main
```
### Token 記錄
| 機器 | Token |
|------|-------|
| M5Max128 | `c33768c4cc26c0f4c575dcce832e92e5cf192773` (write:repository + write:user) |
**注意**: Token 有 write:repository scope勿外洩。如需新增 token 給其他機器,各自產自己的 token。
## Release Workflow
### Release 前準備
每次 release production binary 前,必須:
1. **建立 Release Tag**
```bash
git tag -a v0.X.X -m "Release vX.X.X - YYYY-MM-DD"
git push origin v0.X.X
```
2. **備份獨立 Source Code**
```bash
# 建立 release 獨立目錄
RELEASE_DIR="/Users/accusys/momentry_core_releases/v0.X.X"
mkdir -p "$RELEASE_DIR"
# 複製完整原始碼(排除不必要的檔案)
rsync -av --exclude='.git' --exclude='target' --exclude='node_modules' \
/Users/accusys/momentry_core_0.1/ "$RELEASE_DIR/"
# 記錄 release 資訊
echo "Release: v0.X.X" > "$RELEASE_DIR/RELEASE_INFO.txt"
echo "Date: $(date)" >> "$RELEASE_DIR/RELEASE_INFO.txt"
echo "Git Commit: $(git rev-parse HEAD)" >> "$RELEASE_DIR/RELEASE_INFO.txt"
echo "Binary: $(ls -la target/release/momentry)" >> "$RELEASE_DIR/RELEASE_INFO.txt"
```
3. **備份 Binary**
```bash
cp target/release/momentry "$RELEASE_DIR/momentry_v0.X.X"
cp target/release/momentry_playground "$RELEASE_DIR/momentry_playground_v0.X.X" 2>/dev/null
```
4. **記錄資料庫 Schema**
```bash
pg_dump -U accusys -d momentry --schema-only > "$RELEASE_DIR/schema_v0.X.X.sql"
```
### 重要性
- 避免 release binary 與 current source code 不一致
- 方便追蹤特定 release 的程式碼狀態
- 必要時可快速復原或比對差異
- 確保資料庫 schema 與程式碼版本對應
## Reference Documents
| 文件 | 用途 |
@@ -411,3 +729,93 @@ Phase 1: marcom 建構 (現在) → Elementor 頁面建構
Phase 2: 交付審視 (TBD) → 功能確認 / 重構評估
Phase 3: OpenCode 重構 → 純程式碼實作,交付無 Elementor 依賴版本
```
## M4 通知規範
### 固定通知方式
通知 M4 的唯一管道:**`M4_workspace/` 下建立回覆文件 + `git commit`**。不需口頭、即時訊息、郵件。
### 命名規則
```
docs_v1.0/M4_workspace/YYYY-MM-DD_<topic>_response.md (回覆 M4 問題)
docs_v1.0/M4_workspace/YYYY-MM-DD_<topic>.md (主動通報)
docs_v1.0/M4_workspace/YYYY-MM-DD_<topic>_test_report.md (測試報告)
```
### 觸發時機
| 情境 | 動作 |
|------|------|
| M4 提交問題報告到 `M4_workspace/` | 修復後,回覆 `*_response.md` |
| 完成 M4 要求的任務 | 回覆 `*_response.md` |
| 重大變更(模型替換、架構變更) | 主動通知 `*.md` |
| 新測試包產出 | `*_test_report.md` |
### 交付檢查
1. 文件寫入 `docs_v1.0/M4_workspace/`
2. `git add` 包含該文件
3. `git commit` 含相關變更
4. M4 透過 git log 查看
詳細規範見 `docs_v1.0/M4_workspace/M4_NOTIFICATION_PROTOCOL.md`。
## UUID Naming Rule
**Never use bare `uuid` in API route paths, query params, JSON keys, or code variable names. Always qualify:**
| Context | Must use | Never |
|---------|----------|-------|
| Video/file resource | `file_uuid` | `uuid` |
| Identity resource | `identity_uuid` | `uuid` |
| Query parameter | `file_uuid=`, `identity_uuid=` | `uuid=` |
| Route path | `:file_uuid`, `:identity_uuid` | `:uuid` |
| JSON key | `"file_uuid"`, `"identity_uuid"` | `"uuid"` |
This applies to docs, code, API responses, and curl examples. Exceptions: internal database primary key names (e.g. `identities.uuid` column).
## Document Compliance Checklist
Before creating any file in `docs_v1.0/` (API_WORKSPACE, GUIDES, REFERENCE, DESIGN, OPERATIONS, INTEGRATIONS), verify all items below.
**IMPORTANT**: API functional documents are generated from `API_WORKSPACE/modules/`. Edit modules there, then run `make deploy` in `API_WORKSPACE/` to update `GUIDES/`. Never edit generated files in `GUIDES/` directly. See `DESIGN/Modular_Doc_System_V1.0.md` for the full system design.
### P0 — Mandatory (7 items)
| # | Check | Rule |
|---|-------|------|
| 1 | YAML frontmatter | `title`, `version`, `date`, `author`, `status` present |
| 2 | Version history | Table at bottom of file tracking changes |
| 3 | Top info table | scope, status, applicable to, etc. |
| 4 | PascalCase filename | e.g. `DetectorRegistry.md`, not `detector_registry.md` |
| 5 | `_` separator | Within filenames use `_`, never spaces or other chars |
| 6 | English content | Entire file in English |
| 7 | Correct directory | File must reside in appropriate directory: `API_WORKSPACE/modules/` (API endpoint modules), `GUIDES/` (user docs, generated), `REFERENCE/` (data models), `DESIGN/` (architecture), `OPERATIONS/` (infra/release), `INTEGRATIONS/` (n8n/tests) |
### P0b — UUID Naming
| # | Check | Rule |
|---|-------|------|
| 8 | `file_uuid` not bare `uuid` | All file references use `file_uuid` (see UUID Naming Rule above) |
| 9 | `identity_uuid` not bare `uuid` | All identity references use `identity_uuid` |
### P1 — Suggested (3 items)
| # | Check | Note |
|---|-------|------|
| 1 | Cross-references | Link to related docs in API_WORKSPACE/, GUIDES/, REFERENCE/, DESIGN/, OPERATIONS/ |
| 2 | Glossary terms | Define non-obvious terms inline or link glossary |
| 3 | Diagrams | Include Mermaid/ASCII diagram for complex topics |
### Exception
`M4_workspace/` files are exempt from this checklist (free-format reply documents).
---
## Delivery Procedure
完整交付程序M4_workspace → M5 → Release → Deploy → Public
`docs_v1.0/OPERATIONS/DELIVERY_PROCEDURE.md`

View File

@@ -1,143 +0,0 @@
# Changelog
All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).
## [Unreleased]
### Added
- Gitea API token integration
- n8n API key integration
- API key caching with Moka
- Rate limiting for API key validation
- Constant-time hash comparison
- OpenAPI documentation with utoipa
## [0.1.0] - 2026-03-21
### Added
#### API Key Management System
- API key generation with secure random (UUID v4)
- SHA256 key hashing
- 5 key types: System, User, Service, Integration, Emergency
- Key expiration with configurable TTL
- Grace period for key rotation
#### Anomaly Detection
- High request rate detection (>1000/min)
- High error rate detection (>50%)
- Multiple IP detection (>5/hour)
- Unusual time activity detection
- Redis Pub/Sub for anomaly alerts
#### Rotation Mechanism
- Automatic rotation scheduling
- Manual rotation requests
- Forced rotation for security incidents
- Grace period management per key type:
- System: 72 hours
- User: 24 hours
- Service: 48 hours
- Integration: 24 hours
- Emergency: 0 hours (immediate)
#### PostgreSQL Integration
- `api_keys` table for key storage
- `api_key_audit_log` table for audit trail
- `api_key_anomalies` table for anomaly records
- Full CRUD operations for API keys
#### Redis Integration
- Anomaly alert Pub/Sub (`momentry:anomaly:alerts`)
- Key anomaly state tracking
- Real-time alert notifications
#### CLI Commands
- `momentry api-key create` - Create new API key
- `momentry api-key list` - List all API keys
- `momentry api-key validate` - Validate an API key
- `momentry api-key revoke` - Revoke an API key
- `momentry api-key rotate` - Request key rotation
- `momentry api-key stats` - Show statistics
#### Gitea Integration
- Create Gitea Personal Access Tokens
- List user tokens
- Delete tokens
- Local token tracking
- CLI commands:
- `momentry gitea create`
- `momentry gitea list`
- `momentry gitea delete`
- `momentry gitea verify`
#### n8n Integration
- Create n8n API keys
- List API keys
- Delete API keys
- Local key tracking
- CLI commands:
- `momentry n8n create`
- `momentry n8n list`
- `momentry n8n delete`
- `momentry n8n verify`
#### Security Features
- Constant-time hash comparison (subtle crate)
- Rate limiting for validation attempts
- IP-based lockout after failed attempts
- Configurable thresholds via environment variables
#### Performance Optimizations
- Moka-based API key validation cache
- Configurable TTL and capacity
- Reduced database queries for hot keys
#### Documentation
- API Key Management design document
- Redis user configuration guide
- Gitea token integration guide
- n8n API key integration guide
- Optimization plan with task codes
### Environment Variables
#### API Key Configuration
```
CACHE_TTL_SECONDS=300 # Cache TTL (default: 300)
CACHE_MAX_CAPACITY=10000 # Max cache entries (default: 10000)
RATE_LIMIT_MAX_ATTEMPTS=5 # Max failed attempts (default: 5)
RATE_LIMIT_WINDOW_SECONDS=900 # Lockout duration (default: 900)
```
#### Service URLs
```
GITEA_URL=http://localhost:3000
N8N_URL=https://n8n.momentry.ddns.net
```
### Database Schema
#### Tables Created
- `api_keys` - API key storage
- `api_key_audit_log` - Audit trail
- `api_key_anomalies` - Anomaly records
- `gitea_tokens` - Gitea token tracking
- `n8n_api_keys` - n8n API key tracking
### Dependencies Added
- `uuid` - UUID generation
- `subtle` - Constant-time comparison
- `moka` - Async cache
- `utoipa` - OpenAPI documentation
- `utoipa-swagger-ui` - Swagger UI
---
## Version History
| Version | Date | Description |
|---------|------|-------------|
| 0.1.0 | 2026-03-21 | Initial release with API Key Management |

1336
Cargo.lock generated

File diff suppressed because it is too large Load Diff

View File

@@ -1,6 +1,6 @@
[package]
name = "momentry_core"
version = "0.1.0"
version = "1.0.0"
edition = "2021"
authors = ["Momentry Team"]
description = "Digital asset management system with video analysis and RAG"
@@ -11,8 +11,9 @@ anyhow = "1.0"
thiserror = "1.0"
tokio = { version = "1", features = ["full"] }
tracing = "0.1"
tracing-subscriber = "0.3"
tracing-subscriber = { version = "0.3", features = ["env-filter"] }
once_cell = "1.19"
libc = "0.2"
dotenv = "0.15"
# CLI
@@ -25,32 +26,42 @@ futures-util = "0.3"
# Serialization
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
regex = "1"
chrono = { version = "0.4", features = ["serde"] }
# UUID
sha2 = "0.10"
hex = "0.4"
uuid = { version = "1.0", features = ["v4"] }
mac_address = "1.1"
# Security
subtle = "2.5"
aes-gcm = "0.10"
base64 = "0.22"
# Security
subtle = "2.5"
aes-gcm = "0.10"
base64 = "0.22"
argon2 = "0.5"
jsonwebtoken = "9.3"
# Text processing
jieba-rs = "0.8.1"
ferrous-opencc = { version = "0.3.1", features = ["s2t-conversion", "t2s-conversion"] }
# Cache
moka = { version = "0.12", features = ["future"] }
# Database
redis = { version = "1.0", features = ["tokio-comp", "connection-manager"] }
sqlx = { version = "0.8", features = ["runtime-tokio", "postgres", "sqlite", "json", "chrono"] }
sqlx = { version = "0.8", features = ["runtime-tokio", "postgres", "sqlite", "json", "chrono", "uuid"] }
mongodb = { version = "2", features = ["tokio-runtime"] }
bson = { version = "2", features = ["chrono-0_4"] }
qdrant-client = "1.7"
reqwest = { version = "0.12", features = ["json"] }
reqwest = { version = "0.12", features = ["json", "gzip"] }
pgvector = { version = "0.3", features = ["sqlx"] }
# HTTP Server
axum = "0.7"
axum = { version = "0.7", features = ["multipart"] }
tower = "0.4"
tower-http = { version = "0.5", features = ["cors", "fs"] }
# API Documentation
utoipa = { version = "4", features = ["axum_extras", "chrono", "uuid"] }
@@ -71,9 +82,9 @@ crossterm = "0.28"
# Terminal
atty = "0.2"
tokio-util = { version = "0.7.18", features = ["io"] }
# System
libc = "0.2"
[lib]
name = "momentry_core"
@@ -81,12 +92,20 @@ path = "src/lib.rs"
[features]
default = []
player = []
player = ["sdl2"]
[dependencies.sdl2]
version = "0.35"
optional = true
[[bin]]
name = "momentry"
path = "src/main.rs"
[[bin]]
name = "momentry-cli"
path = "src/bin/cli.rs"
[[bin]]
name = "momentry_player"
path = "src/player/main.rs"
@@ -94,3 +113,41 @@ path = "src/player/main.rs"
[[bin]]
name = "momentry_playground"
path = "src/playground.rs"
[[bin]]
name = "fix_chunks"
path = "src/bin/fix_chunks.rs"
[[bin]]
name = "migrate_chinese_text"
path = "src/bin/migrate_chinese_text.rs"
[[bin]]
name = "test_bm25_simple"
path = "src/bin/test_bm25_simple.rs"
[[bin]]
name = "integrated_player"
path = "src/bin/integrated_player.rs"
[[bin]]
name = "release"
path = "src/bin/release.rs"
[[bin]]
name = "vectorize_missing"
path = "src/bin/vectorize_missing.rs"
[[bin]]
name = "sync_qdrant_from_pg"
path = "src/bin/sync_qdrant_from_pg.rs"
[[bin]]
name = "service"
path = "src/bin/service.rs"
[build-dependencies]
chrono = "0.4"
[dev-dependencies]
tempfile = "3"

277
IDENTITY_BEST_FACE_API.md Normal file
View File

@@ -0,0 +1,277 @@
# Identity Best-Face API
**狀態:** 規劃中
**提出日期:** 2026-06-01
**提出者:** WordPress Portal 前端團隊
---
## 1. 背景
WordPress Portal 的 People 頁面需要在 identity detail view 與 grid card 中顯示代表臉部縮圖。目前前端作法:
1. `GET /identity/{uuid}/traces` → 取得所有 trace 列表(含 `avg_confidence`
2. 對每個 trace 載入第一幀 thumbnail → `GET /file/{uuid}/trace/{tid}/thumbnail`
3. 從有 thumbnail 的 trace 中,選 `avg_confidence` 最高者作為代表圖
### 現有問題
- **品質不佳**trace thumbnail 固定取第一幀,不一定是該 trace 內最清晰或正面的臉部畫面
- **浪費頻寬**:前端需發送大量並行請求(最多 20 trace × thumbnail多數 thumbnail 最終不會被使用
- **無快取**:每次進入 detail view 都要重複載入所有 thumbnail
- **不一致**:同樣 identity 在 grid card 與 detail view 可能顯示不同代表圖
---
## 2. 目標
後端新增一個 endpoint對指定 identity **跨所有 trace** 選出品質最佳(最清晰)的臉部畫面,並提供可直接使用的縮圖 URL支援 disk cache。
---
## 3. API 規格
### `GET /api/v1/identity/:identity_uuid/best-face`
無 query parameter。
#### 成功回應 `200`
```json
{
"success": true,
"identity_uuid": "a6fb22eebefaef17e62af874997c5944",
"name": "Audrey Hepburn",
"source": "fresh",
"best": {
"file_uuid": "a6fb22eebefaef17e62af874997c5944",
"trace_id": 42,
"frame_number": 3120,
"timestamp_secs": 124.8,
"bbox": {
"x": 240,
"y": 180,
"width": 120,
"height": 160
},
"confidence": 0.97,
"quality_score": 18624.0,
"blur_score": 2.1,
"thumbnail_url": "/api/v1/file/a6fb22eebefaef17e62af874997c5944/trace/42/thumbnail"
}
}
```
#### 無可用臉部 `200`
```json
{
"success": true,
"identity_uuid": "a6fb22eebefaef17e62af874997c5944",
"name": "Audrey Hepburn",
"source": "fresh",
"best": null
}
```
#### 欄位說明
| 欄位 | 型態 | 說明 |
|------|------|------|
| `success` | boolean | 請求是否成功 |
| `identity_uuid` | string | identity UUID32字元無連字號 |
| `name` | string | identity 名稱 |
| `source` | string | `"fresh"`(即時計算)或 `"cache"`(來自 disk cache |
| `best` | object/null | 最佳臉部資訊,無可用臉部時為 `null` |
| `best.file_uuid` | string | 該臉部所屬檔案 UUID |
| `best.trace_id` | int | 該臉部所屬 trace ID |
| `best.frame_number` | int | 代表臉的影格編號 |
| `best.timestamp_secs` | float | 代表臉的時間戳(秒) |
| `best.bbox` | object | 臉部 bounding box `{x, y, width, height}` |
| `best.confidence` | float | 該臉部的 detection confidence |
| `best.quality_score` | float | 品質分數 = `(width * height) * confidence` |
| `best.blur_score` | float | 模糊度分數ffmpeg blurdetect越低越清晰 |
| `best.thumbnail_url` | string | 縮圖 URL相對路徑可直接用於瀏覽器 |
---
## 4. 實作建議
### 4.1 建議放置位置
**選項 A建議** `src/api/trace_agent_api.rs`
- 原因:核心邏輯重用 `select_rep_face()`(目前為 `pub(crate)`,位於同一檔案),無需修改既有的 function visibility
-`trace_agent_routes()` 中新增路由
**選項 B** `src/api/identity_binding.rs`
- 需將 `select_rep_face` 改為 `pub` 才能跨檔案呼叫
- 路由語意上更接近 identity 操作
### 4.2 演算法
```
1. DISK CACHE CHECK
路徑:{OUTPUT_DIR}/identities/{uuid}/best_face.json
讀取 identity.json 的 updated_at與 cache 中記錄的版本比較
若 cache 未過期 → 直接回傳source: "cache"
若無 cache 或已過期 → 繼續計算
2. QUERY IDENTITY
SELECT id, name FROM identities
WHERE REPLACE(uuid::text, '-', '') = $1
3. QUERY TOP N TRACES
SELECT fd.file_uuid, fd.trace_id,
AVG(fd.confidence)::float8 AS avg_conf
FROM {schema}.face_detections fd
WHERE fd.identity_id = $1
AND fd.confidence > 0.7
AND (fd.metadata->>'qc_ok' IS NULL
OR (fd.metadata->>'qc_ok')::boolean = true)
GROUP BY fd.file_uuid, fd.trace_id
ORDER BY avg_conf DESC
LIMIT 5
4. FOR EACH TRACE (並行)
select_rep_face(pool, file_uuid, trace_id, err_fn)
 → 回傳該 trace 內 blur_score 最低(最清晰)的臉
失敗則 skiplog warning
5. SELECT BEST AMONG RESULTS
主排序blur_score ASC越低越清晰
次排序quality_score DESCblur_score 差距 < 0.5 時)
全部失敗 → best = null
6. WRITE DISK CACHE
路徑:{OUTPUT_DIR}/identities/{uuid}/best_face.json
內容best 欄位 + 計算時間 + identity updated_at
7. RESPONSE
```
### 4.3 效能參數
| 參數 | 值 | 說明 |
|------|----|------|
| TOP N | 5 | 只對 confidence 最高的 5 個 trace 做 blurdetect |
| confidence 門檻 | > 0.7 | 同既有的 `select_rep_face` 邏輯 |
| QC 過濾 | qc_ok = true/null | 同既有邏輯 |
| ffmpeg timeout | inherit from Command | 每個 trace 約 1-3s |
| cache TTL | 直到下一次 bind/unbind/merge | 事件驅動失效 |
### 4.4 快取策略
**寫入時機:** `get_identity_best_face` 計算完成後
**失效時機(刪除 `best_face.json`**
| 觸發 operation | 所在檔案 | 備註 |
|---------------|---------|------|
| `bind_trace` (POST) | `identity_binding.rs` | 新增 face 關聯 |
| `unbind` (POST) | `identity_binding.rs` | 移除 face 關聯 |
| `mergeinto` (POST) | `identity_binding.rs` | source + target 雙雙清除 |
| `profile-image` (POST) | `identity_api.rs` | 使用者上傳新大頭照 |
**Cache 驗證機制:** 儲存計算時的 `identity.updated_at`,每次請求時比對:
- 若 identity 的 `updated_at` 未變 → cache 有效
- 若已變 → 重新計算
### 4.5 建議的新增/修改檔案
| 檔案 | 動作 | 說明 |
|------|------|------|
| `src/api/trace_agent_api.rs` | **新增** handler + struct + route | ~+130 行 |
| `src/api/identity_binding.rs` | **修改** 3 處 + cache invalidation helper | ~+25 行 |
| `src/api/identity_api.rs` | **修改** 1 處profile-image POST | ~+5 行 |
### 4.6 需要的新 struct
**`src/api/trace_agent_api.rs`**(或獨立檔案 `src/core/identity_best_face.rs`
```rust
#[derive(Debug, Serialize, Deserialize)]
pub struct BestFaceResponse {
pub success: bool,
pub identity_uuid: String,
pub name: String,
pub source: String,
pub best: Option<BestFaceResult>,
}
#[derive(Debug, Serialize, Deserialize)]
pub struct BestFaceResult {
pub file_uuid: String,
pub trace_id: i32,
pub frame_number: i64,
pub timestamp_secs: f64,
pub bbox: RepFaceBbox,
pub confidence: f64,
pub quality_score: f64,
pub blur_score: f64,
pub thumbnail_url: String,
}
```
### 4.7 Cache Invalidation Helper Function
```rust
async fn invalidate_best_face_cache(output_dir: &str, uuid_clean: &str) {
let path = format!("{}/identities/{}/best_face.json", output_dir, uuid_clean);
let _ = tokio::fs::remove_file(path).await;
}
```
---
## 5. 前端整合參考(供後端團隊理解使用情境)
WP snippet 72 (`ms-people.js`) 的 `loadPersonDetail` 中,優先使用新 endpoint
```js
async function loadPersonDetail(person) {
if (person.thumb && person._hasProfileImage) return;
try {
const res = await apiFetch('/identity/' + person.id + '/best-face');
if (res?.success && res?.best) {
const b = res.best;
person.thumb = `${API_BASE}/file/${b.file_uuid}/trace/${b.trace_id}/thumbnail?api_key=${API_KEY}`;
person._hasProfileImage = true;
updateDetailAvatar(person);
return;
}
} catch (e) { /* fallback to legacy */ }
// 原邏輯traces → thumbnails → confidence sort
}
```
同樣可用於 grid card 的代表圖載入(`loadGridThumbnails`
```js
// 一次性載入所有 pending identity 的 best-face
const results = await Promise.allSettled(
persons.map(p => apiFetch('/identity/' + p.id + '/best-face'))
);
```
---
## 6. 驗收標準
1. `GET /api/v1/identity/{uuid}/best-face``200` + valid JSON
2. 有 trace 的 identity → `best` 不為 null`blur_score` 為該 identity 所有 trace 中最低
3. 無 trace 的 identity → `best: null`
4. 短時間內重複請求同一 identity → `source: "cache"`,回應時間 < 10ms
5. 綁定新 trace 後再次請求 → `source: "fresh"`cache 已正確失效)
6. `thumbnail_url` 可直接用於 `<img>` 顯示
---
## 7. 風險與注意事項
- **首次請求延遲**:對有大量 trace 的 identity如主角首次請求可能需 5-15 秒。建議前端顯示 loading state
- **ffmpeg 資源**:同時多個請求可能導致高 CPU 使用。可考慮加入 per-identity lock 避免重複計算
- **邊界案例**trace 內的 faces 全部 confidence ≤ 0.7 或 qc_ok=false則該 trace 被跳過,可能導致 `best: null`

78
SYNC_V1.1.md Normal file
View File

@@ -0,0 +1,78 @@
# Sync Notes 2026-05-21
## M5Max128 收到後需要做的事
```bash
cd ~/momentry_core
git pull origin main # 拉取所有變更
cat SYNC_V1.1.md # 閱讀此文件
# 資料庫變更(必須先執行,否則 worker 會 fail
psql -U accusys -d momentry -c "ALTER TABLE public.pre_chunks ALTER COLUMN coordinate_index SET DEFAULT 0;"
# 重建 + 重啟
cargo build --release --bin momentry
./run-server-3002.sh
```
---
## Bugs Fixed (13)
| # | 問題 | 根因 | 修復 |
|---|------|------|------|
| 1 | `GET /identity/:uuid/files` 空資料 | SQL 缺 `REPLACE(uuid)` + 缺 `JOIN videos` | 改用 `REPLACE(uuid::text...)` + JOIN videos + `frame_number/fps` |
| 2 | `GET /identity/:uuid/faces` crash + 空 | `i64`/`INT4` 型別不符 + 硬編碼 NULL/0 | `id::bigint``confidence::float8` + 真實欄位 |
| 3 | `GET /identity/:uuid` crash | `IdentityDetailRecord.id``i64` 但 DB 是 `INT4` | `id::bigint as id` |
| 4 | `GET /file/:uuid/identities` 空 | 雙重 stubhandler + DB 都 `Vec::new()` | 完整實作 + 正確 total count |
| 5 | `GET /identities/search?q=Louis` 500 | `c.text_content` NULL 但 Rust tuple 用 `String` | 改 `Option<String>` |
| 6 | `POST /search/universal` person type first/last_time null | `search_persons_internal``timestamp_secs` | 改 `frame_number/fps` + JOIN videos |
| 7 | faces/files/chunks total 不正確 | `total: data.len()` | 獨立 COUNT 查詢 |
| 8 | `GET /identity/:uuid/traces` 無分頁 | 缺 page/page_size | 新增 `TracesQuery` + LIMIT/OFFSET |
| 9 | 身分比對 frame-level 不穩定 | frame-level Qdrant | 改 **trace-level**AVG embedding per trace |
| 10 | Charade face embedding 不在 Qdrant | 沒跑 `sync_face_embeddings` | match API 自動 push + ANN search |
| 11 | 無眼睛 face 推入 Qdrant | 無 QC 過濾 | `face_landmark_qc.py --apply` + Qdrant sync 過濾 `qc_ok` |
| 12 | TMDb 比對 dev/prod 不一致 | Qdrant ANN 不同 collection | trace-level 改善穩定性 |
| 13 | `faces/files/chunks total` 顯示 page_size | `total: data.len()` | 改為獨立 COUNT 查詢 |
## ✨ 新功能 (6)
| # | 功能 | 說明 |
|---|------|------|
| 1 | `POST /api/v1/tmdb/fetch` | 從 TMDb 下載 cast → 建立 identity + json + jpg + Qdrant |
| 2 | `POST /api/v1/agents/tmdb/match/:file_uuid` | 推 face → Qdrant ANN search → bind identity |
| 3 | `GET /api/v1/identity/:uuid/status` | 檢查 identity.json + profile.jpg 是否存在 |
| 4 | `/health` 新增 watcher/worker/時區 | `watcher_running``worker_running``system_timezone` |
| 5 | `SYSTEM_TIMEZONE` config | 自動偵測系統時區,可 `MOMENTRY_TIMEZONE` 覆蓋 |
| 6 | `GET /identity/:uuid/traces` 分頁 | `?page=1&page_size=20` |
## 🔧 資料庫變更
```sql
-- 必須執行(否則 worker 的 CUT processor 會失敗)
ALTER TABLE public.pre_chunks ALTER COLUMN coordinate_index SET DEFAULT 0;
-- 選擇性face_landmark_qc.py --apply 需要)
ALTER TABLE public.face_detections ADD COLUMN metadata jsonb DEFAULT '{}'::jsonb;
```
## 🗑️ 清理
- 刪除 2,769 個孤兒 `person_xxx` identity無 face_detections
- `person_identities` + `person_appearances` table 已 DROP
## 📂 主要檔案變更
| 檔案 | 說明 |
|------|------|
| `src/api/identity_api.rs` | identity detail/files/faces total 修正 + status endpoint |
| `src/api/identity_binding.rs` | traces 分頁(新增 `page`/`page_size`/`total` |
| `src/api/server.rs` | health 新增 watcher/worker/system_timezone |
| `src/api/tmdb_api.rs` | **新檔案** — tmdb/fetch + match 端點 |
| `src/api/universal_search.rs` | person search 改 frame_number/fps |
| `src/core/config.rs` | 新增 SYSTEM_TIMEZONE |
| `src/core/db/qdrant_db.rs` | search_face_collection + sync_trace_embeddings + batch upsert |
| `src/core/db/postgres_db.rs` | get_identity_files/faces 修正 + get_file_identities 實作 |
| `src/core/tmdb/probe.rs` | extract_movie_name 改進(只取 `(` 前) |
| `scripts/face_landmark_qc.py` | 新增 `--apply` + `--schema` 參數 |
| `Cargo.toml` | reqwest 加 `gzip` feature |

View File

@@ -1 +0,0 @@
/Users/accusys/momentry_core_0.1/output/a1b10138a6bbb0cd.cut.json

View File

@@ -1 +0,0 @@
/Users/accusys/momentry_core_0.1/output/a1b10138a6bbb0cd.face.json

View File

@@ -1 +0,0 @@
/Users/accusys/momentry_core_0.1/output/a1b10138a6bbb0cd.ocr.json

View File

@@ -1 +0,0 @@
/Users/accusys/momentry_core_0.1/output/a1b10138a6bbb0cd.pose.json

View File

@@ -1 +0,0 @@
/Users/accusys/momentry_core_0.1/output/a1b10138a6bbb0cd.story.json

View File

@@ -1 +0,0 @@
/Users/accusys/momentry_core_0.1/output/a1b10138a6bbb0cd.yolo.json

81
build.rs Normal file
View File

@@ -0,0 +1,81 @@
use std::collections::BTreeMap;
use std::path::Path;
fn main() {
let version = std::env::var("CARGO_PKG_VERSION").unwrap_or_else(|_| "unknown".to_string());
let git_hash = std::process::Command::new("git")
.args(["rev-parse", "--short", "HEAD"])
.output()
.ok()
.and_then(|o| String::from_utf8(o.stdout).ok())
.map(|s| s.trim().to_string())
.unwrap_or_else(|| "unknown".to_string());
let timestamp = std::process::Command::new("date")
.args(["-u", "+%Y-%m-%dT%H:%M:%SZ"])
.output()
.ok()
.and_then(|o| String::from_utf8(o.stdout).ok())
.map(|s| s.trim().to_string())
.unwrap_or_else(|| "unknown".to_string());
println!("cargo:rustc-env=BUILD_VERSION={}", version);
println!("cargo:rustc-env=BUILD_GIT_HASH={}", git_hash);
println!("cargo:rustc-env=BUILD_TIMESTAMP={}", timestamp);
// ── Schema migration manifest ──
// Scan release/migrate_*.sql, compute SHA256, embed as JSON string
let manifest_dir = std::env::var("CARGO_MANIFEST_DIR").unwrap_or_else(|_| ".".to_string());
let release_dir = Path::new(&manifest_dir).join("release");
let mut migrations = BTreeMap::new(); // sorted by filename
if let Ok(entries) = std::fs::read_dir(&release_dir) {
for entry in entries.flatten() {
let path = entry.path();
let fname = path.file_name().and_then(|n| n.to_str()).unwrap_or("");
if fname.starts_with("migrate_") && fname.ends_with(".sql") {
if let Ok(content) = std::fs::read(&path) {
let hash = sha256_hex(&content);
migrations.insert(fname.to_string(), hash);
}
}
}
}
// Encode as comma-separated: name1:hash1,name2:hash2,...
let manifest: String = migrations
.iter()
.map(|(name, hash)| format!("{}:{}", name, hash))
.collect::<Vec<_>>()
.join(",");
println!("cargo:rustc-env=REQUIRED_MIGRATIONS={}", manifest);
println!(
"cargo:info=Embedded {} migration checksums",
migrations.len()
);
}
fn sha256_hex(data: &[u8]) -> String {
use std::io::Write;
use std::process::{Command, Stdio};
if let Ok(mut child) = Command::new("shasum")
.arg("-a")
.arg("256")
.stdin(Stdio::piped())
.stdout(Stdio::piped())
.spawn()
{
if let Some(mut stdin) = child.stdin.take() {
let _ = stdin.write_all(data);
}
if let Ok(out) = child.wait_with_output() {
if let Ok(s) = String::from_utf8(out.stdout) {
if let Some(hash) = s.split(' ').next() {
return hash.to_string();
}
}
}
}
"unknown".to_string()
}

View File

@@ -1,105 +1,178 @@
# Momentry Core 配置管理
# Momentry Core Config Management
## 目錄結構
## Directory Structure
```
momentry_core_0.1/
├── .env.example # 配置模板(已納入版本控制)
├── .env # 本地配置(已從版本控制排除)
├── .env.local # 本地覆蓋配置(已從版本控制排除)
├── .env.example # Template (version controlled)
├── .env # Local config (gitignored)
├── .env.development # Playground dev overrides (gitignored)
├── .env.local # Local overrides (gitignored)
├── config/
── README.md # 本文件
└── src/core/config.rs # 配置代碼
── README.md # This file
│ └── port_registry.tsv # Central port registry
└── src/core/config.rs # Config code with lazy_static env reading
```
## 配置加載順序
## Load Order
1. `.env` - 默認本地配置
2. `.env.local` - 本地覆蓋(最高優先級)
For `momentry_playground` (development):
1. `.env` — shared defaults
2. `.env.development` — dev-specific overrides (loaded by playground binary)
## 環境變數列表
For `momentry` (production):
1. `.env` — production config
### 數據庫配置
In Rust: `config.rs` reads env vars with lazy_static, falling back to hardcoded defaults.
| 變數 | 說明 | 默認值 |
|------|------|--------|
| `DATABASE_URL` | PostgreSQL 連接字串 | `postgres://accusys@localhost:5432/momentry` |
## Environment Variables
### Redis 配置
### Server
| 變數 | 說明 | 默認值 |
|------|------|--------|
| `REDIS_URL` | Redis 連接字串 | `redis://:accusys@localhost:6379` |
| `REDIS_PASSWORD` | Redis 密碼 | `accusys` |
| Variable | Description | Default |
|----------|-------------|---------|
| `MOMENTRY_SERVER_PORT` | Server port (3002=prod, 3003=dev) | `3002` |
| `MOMENTRY_REDIS_PREFIX` | Redis key prefix | `momentry:` (prod), `momentry_dev:` (dev) |
### 存儲路徑
### Database
| 變數 | 說明 | 默認值 |
|------|------|--------|
| `MOMENTRY_OUTPUT_DIR` | 輸出目錄 | `/Users/accusys/momentry/output` |
| `MOMENTRY_BACKUP_DIR` | 備份目錄 | `/Users/accusys/momentry/backup/momentry` |
| `MOMENTRY_SCRIPTS_DIR` | 腳本目錄 | `/Users/accusys/momentry_core_0.1/scripts` |
| `MOMENTRY_PYTHON_PATH` | Python 路徑 | `/opt/homebrew/bin/python3.11` |
| Variable | Description | Default |
|----------|-------------|---------|
| `DATABASE_URL` | PostgreSQL connection string | `postgres://accusys@localhost:5432/momentry` |
| `DATABASE_SCHEMA` | Schema for dev isolation | `dev` |
| `MONGODB_URL` | MongoDB connection string | `mongodb://localhost:27017` |
| `MONGODB_DATABASE` | MongoDB database name | `momentry` (prod), `momentry_dev` (dev) |
| `MONGODB_CACHE_ENABLED` | MongoDB cache toggle | `true` |
| `MONGODB_CACHE_TTL_VIDEOS` | Cache TTL for videos | `300` |
| `MONGODB_CACHE_TTL_SEARCH` | Cache TTL for search | `300` |
| `MONGODB_CACHE_TTL_HYBRID_SEARCH` | Cache TTL for hybrid search | `600` |
| `MONGODB_CACHE_TTL_VIDEO_META` | Cache TTL for video metadata | `3600` |
### 處理器超時(秒)
### Redis
| 變數 | 說明 | 默認值 |
|------|------|--------|
| `MOMENTRY_ASR_TIMEOUT` | ASR 處理超時 | `3600` |
| `MOMENTRY_CUT_TIMEOUT` | CUT 處理超時 | `3600` |
| `MOMENTRY_DEFAULT_TIMEOUT` | 默認超時 | `7200` |
| Variable | Description | Default |
|----------|-------------|---------|
| `REDIS_URL` | Redis connection string | `redis://:accusys@localhost:6379` |
| `REDIS_PASSWORD` | Redis password | `accusys` |
| `REDIS_CACHE_TTL_HEALTH` | Health check cache TTL | `30` |
| `REDIS_CACHE_TTL_VIDEO_META` | Video metadata cache TTL | `3600` |
### 日誌
### Qdrant
| 變數 | 說明 | 默認值 |
|------|------|--------|
| `RUST_LOG` | 日誌級別 | `info` |
| `MOMENTRY_LOG_LEVEL` | 日誌級別(備選) | `info` |
| Variable | Description | Default |
|----------|-------------|---------|
| `QDRANT_URL` | Qdrant server URL | `http://localhost:6333` |
| `QDRANT_API_KEY` | Qdrant API key | `Test3200Test3200Test3200` |
| `QDRANT_COLLECTION` | Collection name | `momentry_rule1` (prod), `momentry_dev_rule1_v2` (dev) |
## 使用方式
### LLM
### 1. 首次設置
| Variable | Description | Default |
|----------|-------------|---------|
| `MOMENTRY_LLM_CHAT_URL` | Chat/function-calling endpoint | `http://127.0.0.1:8082/v1/chat/completions` |
| `MOMENTRY_LLM_CHAT_MODEL` | Chat model name | `google_gemma-4-26B-A4B-it-Q5_K_M.gguf` |
| `MOMENTRY_LLM_VISION_URL` | Vision LLM endpoint (E4B) | falls back to CHAT_URL |
| `MOMENTRY_LLM_VISION_MODEL` | Vision model name (E4B) | falls back to CHAT_MODEL |
| `MOMENTRY_LLM_SUMMARY_URL` | Summary LLM endpoint (5W1H) | falls back to CHAT_URL |
| `MOMENTRY_LLM_SUMMARY_MODEL` | Summary model name | falls back to CHAT_MODEL |
| `MOMENTRY_LLM_SUMMARY_ENABLED` | Toggle 5W1H summary generation | `true` |
| `MOMENTRY_LLM_SUMMARY_TIMEOUT` | 5W1H timeout in seconds | `120` |
| `MOMENTRY_LLM_CHAT_TIMEOUT` | Chat LLM timeout in seconds | `120` |
| `MOMENTRY_LLM_VISION_TIMEOUT` | Vision LLM timeout in seconds | `120` |
### Embedding
| Variable | Description | Default |
|----------|-------------|---------|
| `MOMENTRY_EMBED_URL` | Embedding server URL | `http://localhost:11436` |
### TMDb Integration
| Variable | Description | Default |
|----------|-------------|---------|
| `TMDB_API_KEY` | TMDb API key (required for probe) | (none) |
| `MOMENTRY_TMDB_PROBE_ENABLED` | Enable TMDb probe during register | `false` |
### Paths
| Variable | Description | Default |
|----------|-------------|---------|
| `MOMENTRY_OUTPUT_DIR` | Output directory for processing | `/Users/accusys/momentry/output` |
| `MOMENTRY_BACKUP_DIR` | Backup directory | `/Users/accusys/momentry/backup/momentry` |
| `MOMENTRY_SCRIPTS_DIR` | Python scripts directory | `/Users/accusys/momentry_core_0.1/scripts` |
| `MOMENTRY_PYTHON_PATH` | Python interpreter path | `/opt/homebrew/bin/python3.11` |
| `MOMENTRY_MEDIA_BASE_URL` | Base URL for media serving | (none) |
### Processor Timeouts
| Variable | Description | Default |
|----------|-------------|---------|
| `MOMENTRY_ASR_TIMEOUT` | ASR timeout in seconds | `3600` |
| `MOMENTRY_CUT_TIMEOUT` | CUT timeout in seconds | `3600` |
| `MOMENTRY_DEFAULT_TIMEOUT` | Default timeout in seconds | `7200` |
### Logging
| Variable | Description | Default |
|----------|-------------|---------|
| `RUST_LOG` | Rust log level (tracing) | `info` |
| `MOMENTRY_LOG_LEVEL` | Fallback log level | `info` |
### Worker
| Variable | Description | Default |
|----------|-------------|---------|
| `MOMENTRY_WORKER_ENABLED` | Enable background worker | `true` |
| `MOMENTRY_MAX_CONCURRENT` | Max concurrent jobs | `6` |
| `MOMENTRY_POLL_INTERVAL` | Poll interval in seconds | `10` |
| `MOMENTRY_WORKER_BATCH_SIZE` | Batch size | `5` |
### Synonym Expansion
| Variable | Description | Default |
|----------|-------------|---------|
| `MOMENTRY_SYNONYM_FILES` | Comma-separated paths to synonym JSON files | (none) |
| `MOMENTRY_SYNONYM_FILE` | Single synonym file (deprecated) | (none) |
### Encryption
| Variable | Description | Default |
|----------|-------------|---------|
| `AUDIT_ENCRYPTION_KEY` | 32-byte hex encryption key (64 hex chars) | (none) |
## Port Registry
See `config/port_registry.tsv` for the authoritative list of all ports and their owners.
| Port | Service | Owner | Config Key |
|------|---------|-------|------------|
| 5432 | PostgreSQL | postgres | `DATABASE_URL` |
| 6379 | Redis | redis-server | `REDIS_URL` |
| 6333 | Qdrant | qdrant | `QDRANT_URL` |
| 8082 | LLM Chat (A4B) | llama-server | `MOMENTRY_LLM_CHAT_URL` |
| 8083 | LLM Vision (E4B) | llama-server | `MOMENTRY_LLM_VISION_URL` |
| 11434 | Ollama | ollama | `MOMENTRY_OLLAMA_URL` |
| 11436 | Embedding | embeddinggemma_server.py | `MOMENTRY_EMBED_URL` |
| 27017 | MongoDB | mongod | `MONGODB_URL` |
| 3002 | Production API | momentry | `MOMENTRY_SERVER_PORT` |
| 3003 | Playground API | momentry_playground | `MOMENTRY_SERVER_PORT` |
## Quick Start
```bash
# 複製模板
# 1. Copy template
cp .env.example .env
# 編輯配置
nano .env
# 2. Edit .env for production or use .env.development for playground
# 3. Start all services
./scripts/start_momentry.sh
```
### 2. 本地覆蓋
## Version Control
創建 `.env.local` 設置僅本地適用的配置:
```bash
# .env.local 示例
DATABASE_URL=postgres://local:password@localhost:5432/momentry_dev
MOMENTRY_LOG_LEVEL=debug
```
### 3. 運行應用
```bash
# 加載配置並運行
source .env && cargo run
# 或使用 direnv
direnv allow
```
## 版本控制策略
| 文件 | 版本控制 | 說明 |
|------|---------|------|
| `.env.example` | ✅ 追蹤 | 模板,包含所有選項 |
| `.env` | ❌ 忽略 | 本地敏感配置 |
| `.env.local` | ❌ 忽略 | 本地覆蓋配置 |
## 部署檢查清單
- [ ] 複製 `.env.example``.env`
- [ ] 設置數據庫連接
- [ ] 設置 Redis 密碼
- [ ] 配置目錄路徑
- [ ] 確認日誌級別
| File | Tracked | Purpose |
|------|---------|---------|
| `.env.example` | ✅ Yes | Template with all options documented |
| `.env` | ❌ No | Local sensitive config |
| `.env.development` | ❌ No | Dev-specific overrides |
| `.env.local` | ❌ No | Local overrides (highest priority) |

24
config/port_registry.tsv Normal file
View File

@@ -0,0 +1,24 @@
# Port Registry - Momentry Core
# Each port must have exactly one owner.
# Before adding a service: pick a free port, add a row here, then configure.
#
# Port Service Owner Config Key Default Source
22 ssh sshd - - macOS
80 http Caddy - - Caddyfile
443 https Caddy - - Caddyfile
2019 caddy-admin Caddy - - Caddyfile (internal)
3000 gitea gitea - 3000 start_momentry.sh
3002 production momentry MOMENTRY_SERVER_PORT 3002 run-server-3002.sh
3003 playground momentry_playground MOMENTRY_SERVER_PORT 3003 start_momentry.sh
3200 dashboard Caddy - - Caddyfile
3306 mariadb mariadbd - 3306 start_momentry.sh
5432 postgresql postgres DATABASE_URL postgres://...:5432 start_momentry.sh
6379 redis redis-server REDIS_URL redis://...:6379 start_momentry.sh
6333 qdrant qdrant QDRANT_URL http://...:6333 start_momentry.sh
8081 wordpress Caddy - - Caddyfile
8082 llm-chat llama-server MOMENTRY_LLM_CHAT_URL http://...:8082 start_momentry.sh
8083 llm-vision llama-server MOMENTRY_LLM_VISION_URL http://...:8083 start_momentry.sh
9000 php-fpm php-fpm - 9000 brew services
11434 ollama ollama MOMENTRY_OLLAMA_URL http://...:11434 start_momentry.sh
11436 embedding embeddinggemma MOMENTRY_EMBED_URL http://...:11436 start_momentry.sh
27017 mongodb mongod MONGODB_URL mongodb://...:27017 start_momentry.sh
1 # Port Registry - Momentry Core
2 # Each port must have exactly one owner.
3 # Before adding a service: pick a free port, add a row here, then configure.
4 #
5 # Port Service Owner Config Key Default Source
6 22 ssh sshd - - macOS
7 80 http Caddy - - Caddyfile
8 443 https Caddy - - Caddyfile
9 2019 caddy-admin Caddy - - Caddyfile (internal)
10 3000 gitea gitea - 3000 start_momentry.sh
11 3002 production momentry MOMENTRY_SERVER_PORT 3002 run-server-3002.sh
12 3003 playground momentry_playground MOMENTRY_SERVER_PORT 3003 start_momentry.sh
13 3200 dashboard Caddy - - Caddyfile
14 3306 mariadb mariadbd - 3306 start_momentry.sh
15 5432 postgresql postgres DATABASE_URL postgres://...:5432 start_momentry.sh
16 6379 redis redis-server REDIS_URL redis://...:6379 start_momentry.sh
17 6333 qdrant qdrant QDRANT_URL http://...:6333 start_momentry.sh
18 8081 wordpress Caddy - - Caddyfile
19 8082 llm-chat llama-server MOMENTRY_LLM_CHAT_URL http://...:8082 start_momentry.sh
20 8083 llm-vision llama-server MOMENTRY_LLM_VISION_URL http://...:8083 start_momentry.sh
21 9000 php-fpm php-fpm - 9000 brew services
22 11434 ollama ollama MOMENTRY_OLLAMA_URL http://...:11434 start_momentry.sh
23 11436 embedding embeddinggemma MOMENTRY_EMBED_URL http://...:11436 start_momentry.sh
24 27017 mongodb mongod MONGODB_URL mongodb://...:27017 start_momentry.sh

123
config/production.toml Normal file
View File

@@ -0,0 +1,123 @@
# Momentry Core Production Configuration
# Version: 1.0.0
# Effective: 2025-03-27
[server]
host = "0.0.0.0"
port = 3002
workers = 4
log_level = "info"
max_connections = 1000
keep_alive = 75
[database]
url = "postgres://accusys@localhost:5432/momentry"
pool_size = 20
idle_timeout = 300
max_lifetime = 1800
[redis]
url = "redis://:accusys@localhost:6379"
prefix = "momentry:"
pool_size = 50
connection_timeout = 5
read_timeout = 3
write_timeout = 3
[storage]
output_dir = "/Users/accusys/momentry/output"
backup_dir = "/Users/accusys/momentry/backup"
max_file_size = "10GB"
[processors]
asr_timeout = 7200 # 2 hours for long videos
ocr_timeout = 3600 # 1 hour
yolo_timeout = 14400 # 4 hours
face_timeout = 3600 # 1 hour
pose_timeout = 7200 # 2 hours
asrx_timeout = 10800 # 3 hours for speaker diarization
cut_timeout = 7200 # 2 hours for scene detection
caption_timeout = 3600 # 1 hour for captioning
story_timeout = 3600 # 1 hour for story generation
default_timeout = 7200
max_concurrent = 2 # Limit to prevent overload
[asr]
model_size = "medium"
device = "cpu"
language = "auto"
task = "transcribe"
beam_size = 5
best_of = 5
[ocr]
languages = "en"
confidence = 0.7
gpu = false
model_path = "~/.EasyOCR/model"
[yolo]
model_size = "yolov8n.pt"
confidence = 0.25
iou = 0.45
gpu = false
auto_save_interval = 30
auto_save_frames = 300
classes = "" # empty = all classes
[face]
method = "haar"
confidence = 0.5
min_size = 30
max_size = 300
scale_factor = 1.1
min_neighbors = 3
gpu = false
gpu_backend = "cpu" # cpu, cuda, mps, rocm
enable_mps = false
[pose]
model_size = "yolov8n-pose.pt"
confidence = 0.25
iou = 0.45
gpu = false
keypoint_confidence = 0.5
max_persons = 10
[asrx]
model_size = "medium"
device = "cpu"
language = "en"
batch_size = 16
diarization = true
min_speakers = 1
max_speakers = 10
[cut]
method = "content"
threshold = 27.0
min_scene_length = 0.5
show_progress = true
[caption]
model = "gpt-4"
max_tokens = 1000
temperature = 0.7
[story]
model = "gpt-4"
max_tokens = 2000
temperature = 0.8
[audit]
enabled = true
log_file = "/Users/accusys/momentry/logs/audit.log"
retention_days = 90
[monitoring]
enabled = true
metrics_port = 9090
health_check_interval = 30
alert_threshold_cpu = 80
alert_threshold_memory = 85
alert_threshold_disk = 90

View File

@@ -0,0 +1,761 @@
# AGENTS.md - Momentry Core
Rust-based digital asset management system with video analysis and RAG capabilities.
---
## ⚠️ CRITICAL: 開發隔離原則
### 絕對禁止事項
- **絕對不可修改 `/Users/accusys/wordpress/` 目錄下的任何檔案**
- **絕對不可修改 n8n 工作流或設定**
- **絕對不可修改 WordPress 或 n8n 的資料庫 table**
- **除非是 release 作業,絕對不可動 port 3002 (production)**
- **🔴 DELETE / REMOVE / DROP / CLEAR 任何資料前必須先問使用者「要刪嗎?」獲得明確同意後才能執行**
- **🔴 Qdrant collection 刪除、DB truncate、檔案刪除、資料清空 — 一律要先問**
- **🔴 不確定是否該刪 → 先問,不要自己決定**
### 開發範圍界定
| 範圍 | 狀態 | 說明 |
|------|------|------|
| `momentry_core_0.1/` | ✅ **可開發** | Momentry Core 主要開發目錄 |
| `momentry_core_0.1/portal/` | ✅ **可開發** | Tauri Portal 前端 |
| `momentry_core_0.1/src/` | ✅ **可開發** | Rust 後端程式碼 |
| `/Users/accusys/wordpress/` | ❌ **禁止修改** | WordPress/Marcom 團隊負責 |
| n8n 工作流 | ❌ **禁止修改** | 自動化流程,與 dev 無關 |
| WordPress/n8n 資料庫 table | ❌ **禁止修改** | Marcom 團隊管理,與 dev 無關 |
### 開發環境
| 服務 | Port | 用途 | 命令 |
|------|------|------|------|
| Playground | 3003 | **唯一開發環境** | `cargo run --bin momentry_playground -- server` |
| Production | 3002 | ❌ 禁止修改 | `cargo run -- server` (僅 release 時) |
| Portal (Tauri) | 1420 | 前端開發 | `npm run tauri dev` |
## ⚠️ 交叉污染防制 (Cross-Contamination Prevention)
**每個執行前必須評估是否會汙染其他獨立作業。**
### Scope Isolation Matrix
| 執行內容 | 允許的 Scope | 禁止影響 | 檢查事項 |
|----------|-------------|----------|----------|
| M4 delivery binary | `target/release/momentry` | Playground (3003), Production (3002) | 確認舊 process 未被誤殺 |
| Playground server | `localhost:3003`, `dev.*` schema | Production (3002), `public.*` schema | `DATABASE_SCHEMA=dev` |
| Production deploy | `localhost:3002`, `public.*` schema | Playground (3003), `dev.*` schema | 先停 production不影響 playground |
| Git commit | 只包含意圖修改的檔案 | 無關的 untracked files | `git status` 確認 stage 內容正確 |
| CI / packaged tests | 測試環境 | 正式資料 | 測試用 DB 不能連到 production |
| Doc changes | 指定文件 | 其他文件、程式碼 | `git diff --stat` 檢查 scope |
| SQL migration | 目標 schema | 其他 schema、無關 table | `WHERE` clause 要精準 |
| `sed` / `grep` / mass edit | 目標檔案集 | 非目標檔案 | 先用 `grep -c` 確認只有目標檔案匹配 |
### Recent Violations / Near-Misses
| 事件 | 問題 | 防止方式 |
|------|------|----------|
| `sed` API doc 編號 | `sed -i '' 's/.../.../g'` 改到所有行 | 先 `grep -c` 確認匹配,`git diff` 再提交 |
| 亂加 `/api/v1/register` route | 不必要的 API 別名,汙染路由表 | 角色切換:路由設計不該由實作方決定 |
| `API_WORKSPACE/` vs `GUIDES/` vs `REFERENCE/` vs `DESIGN/` vs `OPERATIONS/` vs `INTEGRATIONS/` | 文件放到錯誤分類 | API 文件改在 API_WORKSPACE/modules/ 編輯,`make deploy` 生成到 GUIDES/ |
| Build release binary in plan mode | 浪費時間,無意義 | 嚴格遵守 plan/build mode 規定 |
### ⛔ 嚴格測試隔離規則 (Strict Test Isolation)
- **所有測試 (Test) 必須在 Dev (3003) 進行**。
- **絕對禁止 (ABSOLUTELY FORBIDDEN)** 在任何測試指令、Demo 流程或 API 檢查中使用 `localhost:3002`
- 即使是「測試 Unregister」或「檢查版本」若未明確標示為 "Production Deployment",一律視為違規。
- **預設行為**: 所有 curl, CLI, 或程式碼測試指令,預設 URL 必須為 `http://localhost:3003`
### 違反後果
- 修改 WordPress/n8n 可能影響 marcom 團隊工作與生產環境
- 修改 WordPress/n8n 資料庫 table 可能破壞自動化流程與資料完整性
- 修改 port 3002 可能中斷正在使用的服務 (這是非常嚴重的錯誤)
- 所有 dev 測試必須在 playground (3003) 進行
---
## AI Coding Principles (Karpathy-Inspired)
Behavioral guidelines to reduce common LLM coding mistakes.
Source: [andrej-karpathy-skills](https://github.com/forrestchang/andrej-karpathy-skills) (94K stars)
**Tradeoff:** These guidelines bias toward caution over speed. For trivial tasks, use judgment.
### 1. Think Before Coding
**Don't assume. Don't hide confusion. Surface tradeoffs.**
- State your assumptions explicitly. If uncertain, ask.
- If multiple interpretations exist, present them - don't pick silently.
- If a simpler approach exists, say so. Push back when warranted.
- If something is unclear, stop. Name what's confusing. Ask.
### 2. Simplicity First
**Minimum code that solves the problem. Nothing speculative.**
- No features beyond what was asked.
- No abstractions for single-use code.
- No "flexibility" or "configurability" that wasn't requested.
- No error handling for impossible scenarios.
- If you write 200 lines and it could be 50, rewrite it.
Ask yourself: "Would a senior engineer say this is overcomplicated?" If yes, simplify.
### 3. Surgical Changes
**Touch only what you must. Clean up only your own mess.**
When editing existing code:
- Don't "improve" adjacent code, comments, or formatting.
- Don't refactor things that aren't broken.
- Match existing style, even if you'd do it differently.
- If you notice unrelated dead code, mention it - don't delete it.
When your changes create orphans:
- Remove imports/variables/functions that YOUR changes made unused.
- Don't remove pre-existing dead code unless asked.
The test: Every changed line should trace directly to the user's request.
### 4. Goal-Driven Execution
**Define success criteria. Loop until verified.**
Transform tasks into verifiable goals:
- "Add validation" -> "Write tests for invalid inputs, then make them pass"
- "Fix the bug" -> "Write a test that reproduces it, then make it pass"
- "Refactor X" -> "Ensure tests pass before and after"
For multi-step tasks, state a brief plan:
```
1. [Step] -> verify: [check]
2. [Step] -> verify: [check]
3. [Step] -> verify: [check]
```
Strong success criteria let you loop independently. Weak criteria ("make it work") require constant clarification.
---
These guidelines are working if: fewer unnecessary changes in diffs, fewer rewrites due to overcomplication, and clarifying questions come before implementation rather than after mistakes.
---
## Terminology (V4.0)
| Term | Scope | Description | Example |
|------|-------|-------------|---------|
| **file_uuid** | Video file | Video file identifier (renamed from `video_uuid`) | `384b0ff44aaaa1f1` |
| **identity_uuid** | Global identity | Global person identity (cross-file) | `a9a90105-6d6b-46ff-92da-0c3c1a57dff4` |
| **face_id** | Single detection | Single face detection (frame-level) | `face_100` |
| **trace_id** | Face tracking | Face tracking ID (Face Tracker output) | `2` |
| **chunk_id** | Sentence chunk | Sentence chunk (from pre_chunks via rules) | `chunk_1` |
| **speaker_id** | Speaker segment | Speaker ID (from ASRX) | `SPEAKER_0` |
| **person_id** | ❌ **Deprecated** | Video-local person ID (removed in V4.0) | - |
### Architecture (V4.0)
```
Face → Identity (Two-layer, direct binding)
person_identities table: REMOVED
file_identities table: ADDED (N:N relationship)
```
### Key Changes (V3.x → V4.0)
| Change | V3.x | V4.0 |
|--------|------|------|
| **video_uuid** | Used everywhere | **file_uuid** |
| **person_identities** | Required (303 records) | **Removed** |
| **person_id APIs** | 28 endpoints | **Removed** (except register/bind) |
| **Face binding** | Person → Identity | **Face → Identity** (direct) |
| **Chunk binding** | Manual | **Auto** (time alignment) |
---
## Build & Run Commands
```bash
# Build project (use debug builds for development/testing)
cargo build
cargo build --bin momentry
cargo build --bin momentry_playground
# Build all binaries
cargo build --bins
# Run CLI
cargo run -- --help
cargo run -- register /path/to/video.mp4
cargo run -- server --host 0.0.0.0 --port 3002
# Run playground (development binary)
cargo run --bin momentry_playground -- server
cargo run --bin momentry_playground -- --help
```
### ⚠️ CRITICAL: `cargo build --release` PROHIBITION
- **NEVER run `cargo build --release` unless the user explicitly says "release the binary" or "正式 release"**
- `cargo build --release` is SLOW and only needed when producing a production binary for deployment
- For all development, testing, debugging, and linting: use `cargo build` or `cargo check`
- If uncertain, ALWAYS ask the user first
## Binaries
| Binary | Purpose | Port | Redis Prefix | Environment |
|--------|---------|------|--------------|-------------|
| `momentry` | Production | 3002 | `momentry:` | `.env` |
| `momentry_playground` | Development | 3003 | `momentry_dev:` | `.env.development` |
| `momentry_player` | Video player | - | - | - |
## Testing
```bash
# Run all tests
cargo test
# Run single test by name
cargo test test_name
# Run with output
cargo test -- --nocapture
# Doc tests
cargo test --doc
```
## Linting & Formatting
```bash
# Format code (edition=2021, max_width=100, tab_spaces=4)
cargo fmt
cargo fmt -- --check
# Lint
cargo clippy
cargo clippy --all-features
# Check for errors
cargo check
cargo check --all-features
```
## Code Style
### General
- Use Rust 2021 edition
- Use tracing for logging (not println!)
- Keep lines under 100 characters
### Imports (order: std → external → local)
```rust
use std::path::Path;
use anyhow::{Context, Result};
use async_trait::async_trait;
use serde::{Deserialize, Serialize};
use crate::core::chunk::Chunk;
```
### Error Handling
- Use `anyhow::Result<T>` for application code
- Use `thiserror` for library code
- Use `.context()` for error context
- Use `anyhow::bail!()` for early returns
```rust
fn example() -> Result<SomeType> {
let output = Command::new("ffprobe")
.args([...])
.output()
.context("Failed to run ffprobe")?;
if !output.status.success() {
anyhow::bail!("Command failed");
}
Ok(result)
}
```
### Naming
- Types/Enums: PascalCase (`VideoRecord`, `ChunkType`)
- Functions/Variables: snake_case (`get_video_by_uuid`)
- Traits: PascalCase with -er suffix (`Database`, `ChunkStore`)
- Files: snake_case (`postgres_db.rs`)
### Types
- Use `serde::{Deserialize, Serialize}` for serializable types
- Use `#[serde(rename_all = "snake_case")]` for enum variants
- Use explicit numeric types (i64, u32, f64)
```rust
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct VideoRecord {
pub id: i64,
pub uuid: String,
pub duration: f64,
pub width: u32,
}
#[derive(Debug, Clone, Copy, Serialize, Deserialize, PartialEq)]
#[serde(rename_all = "snake_case")]
pub enum ChunkType {
TimeBased,
Sentence,
Cut,
}
```
### Async Programming
- Use `tokio` runtime with full features
- Use `#[async_trait]` for async trait methods
```rust
#[async_trait]
pub trait Database: Send + Sync {
async fn init() -> Result<Self>
where Self: Sized;
}
```
## Code Structure
```
src/
├── main.rs # CLI entry point
├── lib.rs # Library exports
├── core/
│ ├── api_key/ # API key management (anomaly, blacklist, encryption, etc.)
│ ├── chunk/ # Chunking logic
│ ├── config.rs # Centralized configuration (env vars)
│ ├── db/ # Database (PostgreSQL, MongoDB, Redis, Qdrant)
│ ├── embedding/ # Vector embeddings
│ ├── overlay/ # Video overlay
│ ├── probe/ # ffprobe integration
│ ├── processor/ # ASR, OCR, YOLO, Face, Pose, CUT, ASRX
│ │ └── executor.rs # Unified Python script executor
│ ├── storage/ # File management
│ └── thumbnail/ # Thumbnail extraction
├── api/ # HTTP API (axum)
├── player/ # Video player
├── ui/ # TUI components
└── watcher/ # File system watcher
```
## Key Dependencies
- **Error handling**: `anyhow`, `thiserror`
- **Async**: `tokio` (full features), `async-trait`
- **CLI**: `clap` (derive)
- **Serialization**: `serde`, `serde_json`, `chrono`
- **Database**: `sqlx`, `mongodb`, `redis` (1.0), `qdrant-client`
- **HTTP**: `axum`, `tower`
- **Logging**: `tracing`, `tracing-subscriber`
- **Config**: `once_cell` (lazy static config)
## Environment Variables
### Server
- `MOMENTRY_SERVER_PORT` - API server port (default: `3002` for production, `3003` for playground)
- `MOMENTRY_REDIS_PREFIX` - Redis key prefix (default: `momentry:` for production, `momentry_dev:` for playground)
- `MOMENTRY_API_KEY` - API key for Player online mode testing
### Testing API Key
```bash
export MOMENTRY_API_KEY="muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69"
# Test Player online mode
cargo run --features player --bin momentry_player -- -o
```
### Database
- `DATABASE_URL` - PostgreSQL (default: `postgres://accusys@localhost:5432/momentry`)
### Redis
- `REDIS_URL` - Redis URL (default: `redis://:accusys@localhost:6379`)
- `REDIS_PASSWORD` - Redis password (default: `accusys`)
### Paths
- `MOMENTRY_OUTPUT_DIR` - Output directory (default: `/Users/accusys/momentry/output`)
- `MOMENTRY_BACKUP_DIR` - Backup directory
- `MOMENTRY_PYTHON_PATH` - Python path (default: `/opt/homebrew/bin/python3.11`)
- `MOMENTRY_SCRIPTS_DIR` - Scripts directory
### Processor Timeouts
- `MOMENTRY_ASR_TIMEOUT` - ASR timeout in seconds (default: 3600)
- `MOMENTRY_CUT_TIMEOUT` - CUT timeout in seconds (default: 3600)
- `MOMENTRY_DEFAULT_TIMEOUT` - Default timeout (default: 7200)
### TMDb Integration (Face Clustering)
- `TMDB_API_KEY` - TMDb API key for movie metadata lookup (required for `MOMENTRY_TMDB_PROBE_ENABLED=true`)
- `MOMENTRY_TMDB_PROBE_ENABLED` - Enable TMDb probe during registration (default: `false`)
- Register phase: searches TMDb by filename, creates identities with tmdb_id/tmdb_profile
- Post-process phase: matches detected faces against TMDb identities via cosine similarity
### Synonym Expansion
- `MOMENTRY_SYNONYM_FILES` - Comma-separated paths to synonym JSON files (e.g., `data/english_synonyms.json,data/llm_synonyms.json`)
- `MOMENTRY_SYNONYM_FILE` - Single synonym JSON file path (deprecated, use above)
### Logging
- `RUST_LOG` or `MOMENTRY_LOG_LEVEL` - Log level (default: `info`)
## Notes
- Unit tests exist (86 library tests)
- Video processing uses external tools (ffprobe, Python scripts)
- Multi-database architecture (PostgreSQL, MongoDB, Redis, Qdrant)
- Monitor directory is a separate system (not Rust)
- PythonExecutor provides unified script execution with timeout support
- Redis 1.0.x for improved performance
- FaceNet CoreML model (`models/facenet512.mlpackage`) replaces InsightFace for embedding extraction (MIT license, ANE-accelerated)
### LLM Synonym Generation
Generate synonym database using llama.cpp (Gemma4):
```bash
# Generate full database (162 entries, ~5 minutes)
python3 scripts/generate_synonyms_llamacpp.py
# Quick test
python3 scripts/generate_synonyms_llamacpp.py --test
# Resume from existing file
python3 scripts/generate_synonyms_llamacpp.py --resume
# Output: data/llm_synonyms.json (27 Chinese + 135 English words)
```
## Task Management
### 使用 todowrite 追蹤任務
```bash
# 創建任務清單
/todo 建立配置模組 [in_progress]
/todo 添加單元測試 [pending]
# 更新狀態
/todo 完成標記 [completed]
```
### 任務批次建議
- 一次處理 1-2 個功能
- 每個功能完成後驗證 (clippy + test)
- 驗證通過後再繼續下一個
## Code Review Checklist
完成任務後檢查:
- [ ] `cargo clippy --lib` 通過
- [ ] `cargo test --lib` 通過
- [ ] `cargo fmt -- --check` 通過
- [ ] 文檔已更新 (如需要)
- [ ] 新功能有單元測試
## Commit Guidelines
```bash
# feat: 新功能
git commit -m "feat: add monitor_jobs table"
# fix: 錯誤修復
git commit -m "fix: resolve SQL injection in store_vector"
# refactor: 重構
git commit -m "refactor: use parameterized queries"
# docs: 文檔更新
git commit -m "docs: update AGENTS.md with new modules"
```
## Pre-commit Hook
專案已配置 `.git/hooks/pre-commit`,提交前自動檢查:
```bash
# 檢查內容
1. cargo fmt --check # Rust 格式化檢查
2. cargo clippy --lib # Rust Lint 檢查
3. cargo test --lib # Rust 單元測試
4. ruff check # Python Lint 檢查
5. ruff format --check # Python 格式化檢查
6. markdownlint # Markdown 格式檢查
7. shellcheck # Shell 腳本檢查
# 跳過檢查(不建議)
git commit --no-verify
# 跳過特定檢查
git commit --skip-checks
```
**注意**: Hook 僅檢查已暫存的 Rust/Python/Markdown 文件。
### Python 環境設置
```bash
# 安裝 ruff
pip install ruff==0.11.2
# 格式化 Python 文件
ruff format scripts/
# Lint Python 文件
ruff check scripts/
```
### Markdown 環境設置
```bash
# 安裝 markdownlint-cli (使用系統 Node.js)
npm install -g markdownlint-cli
# 檢查 Markdown 文件
markdownlint docs/
# 配置檔案
.markdownlint.json
```
### Shell 環境設置
```bash
# 安裝 shellcheck
brew install shellcheck
# 檢查 Shell 腳本
shellcheck scripts/*.sh monitor/**/*.sh
```
**注意**: Hook 只檢查 error 等級的 shellcheck 問題style 警告會顯示但不阻擋提交。
## Release Workflow
### Release 前準備
每次 release production binary 前,必須:
1. **建立 Release Tag**
```bash
git tag -a v0.X.X -m "Release vX.X.X - YYYY-MM-DD"
git push origin v0.X.X
```
2. **備份獨立 Source Code**
```bash
# 建立 release 獨立目錄
RELEASE_DIR="/Users/accusys/momentry_core_releases/v0.X.X"
mkdir -p "$RELEASE_DIR"
# 複製完整原始碼(排除不必要的檔案)
rsync -av --exclude='.git' --exclude='target' --exclude='node_modules' \
/Users/accusys/momentry_core_0.1/ "$RELEASE_DIR/"
# 記錄 release 資訊
echo "Release: v0.X.X" > "$RELEASE_DIR/RELEASE_INFO.txt"
echo "Date: $(date)" >> "$RELEASE_DIR/RELEASE_INFO.txt"
echo "Git Commit: $(git rev-parse HEAD)" >> "$RELEASE_DIR/RELEASE_INFO.txt"
echo "Binary: $(ls -la target/release/momentry)" >> "$RELEASE_DIR/RELEASE_INFO.txt"
```
3. **備份 Binary**
```bash
cp target/release/momentry "$RELEASE_DIR/momentry_v0.X.X"
cp target/release/momentry_playground "$RELEASE_DIR/momentry_playground_v0.X.X" 2>/dev/null
```
4. **記錄資料庫 Schema**
```bash
pg_dump -U accusys -d momentry --schema-only > "$RELEASE_DIR/schema_v0.X.X.sql"
```
### 重要性
- 避免 release binary 與 current source code 不一致
- 方便追蹤特定 release 的程式碼狀態
- 必要時可快速復原或比對差異
- 確保資料庫 schema 與程式碼版本對應
## Reference Documents
| 文件 | 用途 |
|------|------|
| `docs/OPENCODE_GUIDE.md` | OpenCode 使用規範 |
| `docs/ARCHITECTURE_EVALUATION.md` | 架構優化待評估項目 (含 GraphRAG) |
| `docs/PENDING_ISSUES.md` | 待解決問題追蹤 |
| `docs/MOMENTRY_CORE_MONITORING.md` | 監控系統規範 |
| `docs/MOMENTRY_CORE_REDIS_KEYS.md` | Redis Key 設計規範 |
| `docs/PYTHON.md` | Python 腳本規範 |
| `docs/FILE_CHANGE_MANAGEMENT.md` | 文件修改管理規範 |
| `docs/YOLO_RESUME_INTEGRATION.md` | YOLO Resume 功能整合記錄 |
| `docs/DOCUMENT_EMBEDDING_STRATEGY.md` | Parent-Child 嵌入策略 |
| `docs/PROCESSING_PIPELINE.md` | 處理流程文檔 |
| `docs/N8N_DEMO_WORKFLOW.md` | n8n 工作流文檔 |
| `docs/FRESH_MAC_INSTALLATION.md` | 全新 Mac 安裝指南 |
| `docs/SERVICES.md` | 服務總覽與管理 |
| `docs/SFTPGO_DEMO_USER.md` | SFTPGo 用戶指南 |
## Document Change Workflow
修改文件前請參考 `docs/FILE_CHANGE_MANAGEMENT.md`,確保:
1. **修改前**:完整閱讀文件、執行預檢清單
2. **修改中**:提供變更計畫、取得確認
3. **修改後**:展示 diff、更新版本歷史
4. **驗證**:執行 lint/test、提交前審查
### AI 工具修改規範
AI 工具修改文件時:
- 必須先完整閱讀文件(不可只讀取部分章節)
- 修改前先提出變更計畫供確認
- 修改後展示 diff 內容
- 更新版本歷史表
## PHP Development
WordPress 作為 Momentry Portal負責 n8n 自動化與 sftpgo 檔案服務的頁面整合。
### 編輯器設定
| 編輯器 | LSP 方案 | 安裝方式 |
|--------|----------|----------|
| VS Code | Intelephense | Extension Marketplace (推薦) |
| Cursor | Intelephense | Extension Marketplace (推薦) |
| CLI | phpactor | `~/bin/phpactor` |
### Intelephense (VS Code/Cursor)
1. 安裝 Extension: 搜尋 "Intelephense"
2. 設定:
```json
{
"intelephense.stubs": ["wordpress"]
}
```
### phpactor (CLI)
```bash
# 安裝方式
brew install composer
curl -sSL https://github.com/phpactor/phpactor/releases/latest/download/phpactor.phar -o ~/bin/phpactor
chmod +x ~/bin/phpactor
# 安裝 WordPress Stubs
cd /Users/accusys/wordpress/web
composer require --dev php-stubs/wordpress-stubs
# 建立 WordPress 索引
cd /Users/accusys/wordpress/web
~/bin/phpactor index:build --reset
# 常用指令
~/bin/phpactor class:search "WP_User" # 搜尋類別
~/bin/phpactor index:query WP_User # 查看類別資訊
~/bin/phpactor navigate /path/to/file.php # 導航到定義
```
### WordPress 程式碼位置
| 類型 | 路徑 |
|------|------|
| 主題 | `/Users/accusys/wordpress/web/wp-content/themes/` |
| 插件 | `/Users/accusys/wordpress/web/wp-content/plugins/` |
### 與 marcom 團隊協作
| 角色 | 負責 |
|------|------|
| marcom 團隊 | Figma 設計 / Elementor 建構 |
| OpenCode | 程式碼實作 / 重構 |
### 開發時程
```
Phase 1: marcom 建構 (現在) → Elementor 頁面建構
Phase 2: 交付審視 (TBD) → 功能確認 / 重構評估
Phase 3: OpenCode 重構 → 純程式碼實作,交付無 Elementor 依賴版本
```
## M4 通知規範
### 固定通知方式
通知 M4 的唯一管道:**`M4_workspace/` 下建立回覆文件 + `git commit`**。不需口頭、即時訊息、郵件。
### 命名規則
```
docs_v1.0/M4_workspace/YYYY-MM-DD_<topic>_response.md (回覆 M4 問題)
docs_v1.0/M4_workspace/YYYY-MM-DD_<topic>.md (主動通報)
docs_v1.0/M4_workspace/YYYY-MM-DD_<topic>_test_report.md (測試報告)
```
### 觸發時機
| 情境 | 動作 |
|------|------|
| M4 提交問題報告到 `M4_workspace/` | 修復後,回覆 `*_response.md` |
| 完成 M4 要求的任務 | 回覆 `*_response.md` |
| 重大變更(模型替換、架構變更) | 主動通知 `*.md` |
| 新測試包產出 | `*_test_report.md` |
### 交付檢查
1. 文件寫入 `docs_v1.0/M4_workspace/`
2. `git add` 包含該文件
3. `git commit` 含相關變更
4. M4 透過 git log 查看
詳細規範見 `docs_v1.0/M4_workspace/M4_NOTIFICATION_PROTOCOL.md`。
## UUID Naming Rule
**Never use bare `uuid` in API route paths, query params, JSON keys, or code variable names. Always qualify:**
| Context | Must use | Never |
|---------|----------|-------|
| Video/file resource | `file_uuid` | `uuid` |
| Identity resource | `identity_uuid` | `uuid` |
| Query parameter | `file_uuid=`, `identity_uuid=` | `uuid=` |
| Route path | `:file_uuid`, `:identity_uuid` | `:uuid` |
| JSON key | `"file_uuid"`, `"identity_uuid"` | `"uuid"` |
This applies to docs, code, API responses, and curl examples. Exceptions: internal database primary key names (e.g. `identities.uuid` column).
## Document Compliance Checklist
Before creating any file in `docs_v1.0/` (API_WORKSPACE, GUIDES, REFERENCE, DESIGN, OPERATIONS, INTEGRATIONS), verify all items below.
**IMPORTANT**: API functional documents are generated from `API_WORKSPACE/modules/`. Edit modules there, then run `make deploy` in `API_WORKSPACE/` to update `GUIDES/`. Never edit generated files in `GUIDES/` directly. See `DESIGN/Modular_Doc_System_V1.0.md` for the full system design.
### P0 — Mandatory (7 items)
| # | Check | Rule |
|---|-------|------|
| 1 | YAML frontmatter | `title`, `version`, `date`, `author`, `status` present |
| 2 | Version history | Table at bottom of file tracking changes |
| 3 | Top info table | scope, status, applicable to, etc. |
| 4 | PascalCase filename | e.g. `DetectorRegistry.md`, not `detector_registry.md` |
| 5 | `_` separator | Within filenames use `_`, never spaces or other chars |
| 6 | English content | Entire file in English |
| 7 | Correct directory | File must reside in appropriate directory: `API_WORKSPACE/modules/` (API endpoint modules), `GUIDES/` (user docs, generated), `REFERENCE/` (data models), `DESIGN/` (architecture), `OPERATIONS/` (infra/release), `INTEGRATIONS/` (n8n/tests) |
### P0b — UUID Naming
| # | Check | Rule |
|---|-------|------|
| 8 | `file_uuid` not bare `uuid` | All file references use `file_uuid` (see UUID Naming Rule above) |
| 9 | `identity_uuid` not bare `uuid` | All identity references use `identity_uuid` |
### P1 — Suggested (3 items)
| # | Check | Note |
|---|-------|------|
| 1 | Cross-references | Link to related docs in API_WORKSPACE/, GUIDES/, REFERENCE/, DESIGN/, OPERATIONS/ |
| 2 | Glossary terms | Define non-obvious terms inline or link glossary |
| 3 | Diagrams | Include Mermaid/ASCII diagram for complex topics |
### Exception
`M4_workspace/` files are exempt from this checklist (free-format reply documents).
---
## Delivery Procedure
完整交付程序M4_workspace → M5 → Release → Deploy → Public
`docs_v1.0/OPERATIONS/DELIVERY_PROCEDURE.md`

View File

@@ -0,0 +1,71 @@
# System Audit — 2026-05-17
## Current State
### Embedding Storage (三重冗余,無主)
| 資料類型 | PG pgvector | Qdrant | JSON 檔案 |
|---------|------------|--------|-----------|
| Sentence 向量 | `chunk.embedding` ✅ | `dev_v1` / `rule1_v2` / `sentence_*` ✅ | ❌ 無 |
| Story 向量 | `chunk.embedding` ✅ | `dev_v1` / `dev_stories` ✅ | `.story_llm.json` ✅ |
| Face 向量 | ❌ 已清除(依使用者指示) | `dev_faces` ✅ (97K) | `.face.json` ✅ |
| Voice 向量 | ❌ 無 | `dev_voice` ✅ (4K) | ❌ 無 |
### Pipeline 問題
| 問題 | 影響 |
|------|------|
| `processor_results.duration_secs` 全為 0 | 無法查各步驟耗時 |
| `processor_results.started_at/completed_at` 全 NULL | 時間線遺失 |
| Redis timing 在 job 完成後被清掉 | 唯一 timing 來源消失 |
| `get_chunk_by_chunk_id_and_uuid` 原本是 stub已修 | Smart search 找不到 PG chunk |
| `server.rs::search()` 未 mount 但仍編譯 | Dead code混淆 Qdrant 用途 |
| Face embedding 只寫 Qdrant 不寫 PG | 已刪除則全失 |
### Qdrant Collections 現況
| Collection | Points | 來源 | UUID |
|-----------|--------|------|------|
| `dev_v1` | 9,936 | PG rebuild | ✅ bd80fec... |
| `dev_faces` | 97,000 | face.json rebuild | ✅ bd80fec... |
| `dev_stories` | 560 | Snapshot | ✅ bd80fec... |
| `dev_voice` | 4,188 | Snapshot | ✅ bd80fec... |
| `dev_rule1_v2` | 3,417 | Snapshot | ✅ bd80fec... |
| `sentence_story` | 4,188 | Snapshot | ✅ bd80fec... |
| `sentence_summary` | 4,188 | Snapshot | ✅ bd80fec... |
## Safeguards & Fixes
### P0 — 必須修
| # | Fix | 做法 |
|---|-----|------|
| 1 | **Pipeline timing 寫入 DB** | `update_processor_result()` 加入 `started_at``completed_at``duration_secs` |
| 2 | **Qdrant 不當主要儲存** | Embedding 以 PG `chunk.embedding` 為 source of truthQdrant 唯讀 cache |
| 3 | **Smart search 只走 PG pgvector** | `search_parent_chunks_semantic` 已正確,無需 Qdrant |
| 4 | **移除 `server.rs::search()` dead code** | 或 mount 到正式 route 並確認可用 |
### P1 — 建議修
| # | Fix | 做法 |
|---|-----|------|
| 5 | **刪除 Qdrant 前先 snapshot** | 自動 snapshot script |
| 6 | **清理多餘 Qdrant collections** | `dev_voice` / `dev_stories` / `dev_rule1_v2` / `sentence_*` 無 server reader可移除 |
| 7 | **Face embedding 寫入 PG 或移除 dead code** | 目前 face Qdrant write 無人讀取,可移除 `sync_face_embeddings` |
| 8 | **UUID 一致性檢查** | 同一 content 不應產生不同 UUID |
### P2 — 可選
| # | Fix | 做法 |
|---|-----|------|
| 9 | `chunk_selector.rs` player binaryhardcode `momentry_rule1` | 改讀 env var 或 PG |
| 10 | AGENTS.md 已加入 delete 安全規則 | ✅ Done |
## Data Recovery Path
| 資料來源 | 可恢復到 | 方法 |
|---------|---------|------|
| `chunk.embedding` (PG) | Qdrant `dev_v1` | SQL → Qdrant upsert |
| `face.json` (磁碟) | Qdrant `dev_faces` | Python script |
| `story_llm.json` (磁碟) | Qdrant `dev_stories` | Python script |
| Qdrant snapshots (phase1) | Qdrant collections | Snapshot upload API |

View File

@@ -0,0 +1,388 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>01 Auth - Momentry API Docs</title>
<style>
* { margin: 0; padding: 0; box-sizing: border-box; }
body { font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif; background: #f5f5f5; color: #333; padding: 40px; }
.container { max-width: 960px; margin: 0 auto; background: white; border-radius: 12px; box-shadow: 0 2px 12px rgba(0,0,0,0.08); padding: 40px; }
h1 { font-size: 24px; margin: 24px 0 12px; }
h2 { font-size: 20px; margin: 20px 0 10px; color: #222; }
h3 { font-size: 16px; margin: 16px 0 8px; color: #444; }
p { line-height: 1.6; margin: 8px 0; }
table { border-collapse: collapse; width: 100%; margin: 12px 0; font-size: 14px; }
th, td { border: 1px solid #ddd; padding: 8px 12px; text-align: left; }
th { background: #f0f0f0; font-weight: 600; }
code { background: #f0f0f0; padding: 2px 6px; border-radius: 3px; font-size: 13px; }
pre { background: #f8f8f8; border: 1px solid #ddd; border-radius: 6px; padding: 12px; overflow-x: auto; margin: 12px 0; }
pre code { background: none; padding: 0; }
a { color: #0066cc; }
.back { display: inline-block; margin-bottom: 20px; color: #666; }
.back:hover { color: #333; }
</style>
</head>
<body>
<div class="container">
<a class="back" href="index.html">&larr; Back to index</a>
<!-- module: auth -->
<!-- description: Authentication — login, logout, JWT, session cookie, API key -->
<!-- depends: -->
<h2>Base URL</h2>
<table class="table">
<thead>
<tr>
<th>Environment</th>
<th>URL</th>
<th>Purpose</th>
</tr>
</thead>
<tbody>
<tr>
<td>Production</td>
<td><code>http://localhost:3002</code></td>
<td>Production deployment</td>
</tr>
<tr>
<td>External (M5)</td>
<td><code>https://m5api.momentry.ddns.net</code></td>
<td>Remote access</td>
</tr>
</tbody>
</table>
<h2>Variables</h2>
<p>All examples in this documentation use these environment variables:</p>
<div class="codehilite"><pre><span></span><code><span class="nv">API</span><span class="o">=</span><span class="s2">&quot;http://localhost:3002&quot;</span>
<span class="nv">KEY</span><span class="o">=</span><span class="s2">&quot;your-api-key-here&quot;</span>
</code></pre></div>
<h2>Authentication</h2>
<p>All endpoints under <code>/api/v1/*</code> require authentication.
The following endpoints are public (no auth needed):</p>
<ul>
<li><code>GET /health</code></li>
<li><code>POST /api/v1/auth/login</code></li>
<li><code>POST /api/v1/auth/logout</code></li>
</ul>
<h3>Three Authentication Modes</h3>
<p>The system supports three authentication methods, checked in <strong>priority order</strong> by the middleware:</p>
<div class="codehilite"><pre><span></span><code>Middleware priority:
1. Session Cookie (Portal/browser)
2. JWT Bearer (API clients, CLI)
3. API Key Header (legacy compatibility)
4. API Key Query Param (?api_key=)
</code></pre></div>
<table class="table">
<thead>
<tr>
<th>Mode</th>
<th>Transport</th>
<th>Expiry</th>
<th>Scope</th>
<th>Best for</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>Session Cookie</strong></td>
<td><code>Cookie: session_id=&lt;session_id&gt;</code></td>
<td>24h</td>
<td>per-browser session</td>
<td>Portal (browser)</td>
</tr>
<tr>
<td><strong>JWT</strong></td>
<td><code>Authorization: Bearer &lt;token&gt;</code></td>
<td>1h</td>
<td>per-login token</td>
<td>API clients, CLI, scripts</td>
</tr>
<tr>
<td><strong>API Key</strong></td>
<td><code>X-API-Key: &lt;key&gt;</code></td>
<td>90d</td>
<td>fixed key for automation</td>
<td>Legacy scripts, WordPress</td>
</tr>
</tbody>
</table>
<hr />
<h3>Login</h3>
<p><strong>Default accounts &amp; API keys:</strong></p>
<table class="table">
<thead>
<tr>
<th>Username</th>
<th>Password</th>
<th>API Key</th>
<th>Role</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>admin</code></td>
<td><code>admin</code></td>
<td></td>
<td>admin</td>
</tr>
<tr>
<td><code>demo</code></td>
<td><code>demo</code></td>
<td><code>muser_demo_key_32chars_abcdef1234567890</code></td>
<td>user</td>
</tr>
</tbody>
</table>
<p>The demo API key is set via <code>MOMENTRY_DEMO_API_KEY</code> env var and can be used in place of JWT for marcom integrations:</p>
<div class="codehilite"><pre><span></span><code><span class="c1"># Using API key instead of JWT</span>
curl<span class="w"> </span>-s<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/files/scan&quot;</span><span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;X-API-Key: muser_demo_key_32chars_abcdef1234567890&quot;</span>
</code></pre></div>
<div class="codehilite"><pre><span></span><code><span class="c1"># Login as admin</span>
curl<span class="w"> </span>-s<span class="w"> </span>-X<span class="w"> </span>POST<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/auth/login&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;Content-Type: application/json&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-d<span class="w"> </span><span class="s1">&#39;{&quot;username&quot;: &quot;admin&quot;, &quot;password&quot;: &quot;admin&quot;}&#39;</span>
<span class="c1"># Login as demo user</span>
curl<span class="w"> </span>-s<span class="w"> </span>-X<span class="w"> </span>POST<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/auth/login&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;Content-Type: application/json&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-d<span class="w"> </span><span class="s1">&#39;{&quot;username&quot;: &quot;demo&quot;, &quot;password&quot;: &quot;demo&quot;}&#39;</span>
</code></pre></div>
<h4>Success Response</h4>
<div class="codehilite"><pre><span></span><code><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;success&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;jwt&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;eyJhbGciOiJIUzI1NiIs...&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;api_key&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;muser_...&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;user&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;username&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;admin&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;role&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;admin&quot;</span>
<span class="w"> </span><span class="p">},</span>
<span class="w"> </span><span class="nt">&quot;expires_at&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;2026-05-18T13:00:00Z&quot;</span>
<span class="p">}</span>
</code></pre></div>
<table class="table">
<thead>
<tr>
<th>Field</th>
<th>Type</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>jwt</code></td>
<td>string</td>
<td>JWT access token. Use as <code>Authorization: Bearer &lt;jwt&gt;</code>. Expires in 1 hour.</td>
</tr>
<tr>
<td><code>api_key</code></td>
<td>string</td>
<td>Legacy API key. Use as <code>X-API-Key: &lt;key&gt;</code>. Good for 90 days.</td>
</tr>
<tr>
<td><code>user.username</code></td>
<td>string</td>
<td>Username</td>
</tr>
<tr>
<td><code>user.role</code></td>
<td>string</td>
<td>Role: <code>admin</code>, <code>user</code>, or <code>readonly</code></td>
</tr>
<tr>
<td><code>expires_at</code></td>
<td>string</td>
<td>ISO8601 timestamp of JWT expiration</td>
</tr>
</tbody>
</table>
<p>The login endpoint also sets a <code>Set-Cookie</code> header for browser-based clients:</p>
<div class="codehilite"><pre><span></span><code><span class="nt">Set-Cookie</span><span class="o">:</span><span class="w"> </span><span class="nt">session_id</span><span class="o">=&lt;</span><span class="nt">session_id</span><span class="o">&gt;;</span><span class="w"> </span><span class="nt">Path</span><span class="o">=/;</span><span class="w"> </span><span class="nt">HttpOnly</span><span class="o">;</span><span class="w"> </span><span class="nt">SameSite</span><span class="o">=</span><span class="nt">Strict</span><span class="o">;</span><span class="w"> </span><span class="nt">Max-Age</span><span class="o">=</span><span class="nt">86400</span>
</code></pre></div>
<h4>Error Response (401)</h4>
<div class="codehilite"><pre><span></span><code><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;success&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">false</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;message&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;Invalid username or password&quot;</span>
<span class="p">}</span>
</code></pre></div>
<hr />
<h3>Using JWT</h3>
<p>JWT is preferred for API clients (CLI scripts, WordPress). It is validated by the middleware without a database lookup (stateless).</p>
<div class="codehilite"><pre><span></span><code><span class="c1"># Login and capture JWT</span>
<span class="nv">JWT</span><span class="o">=</span><span class="k">$(</span>curl<span class="w"> </span>-s<span class="w"> </span>-X<span class="w"> </span>POST<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/auth/login&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;Content-Type: application/json&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-d<span class="w"> </span><span class="s1">&#39;{&quot;username&quot;:&quot;admin&quot;,&quot;password&quot;:&quot;admin&quot;}&#39;</span><span class="w"> </span><span class="p">|</span><span class="w"> </span>python3<span class="w"> </span>-c<span class="w"> </span><span class="s2">&quot;import json,sys;print(json.load(sys.stdin)[&#39;jwt&#39;])&quot;</span><span class="k">)</span>
<span class="c1"># Use JWT for all subsequent requests</span>
curl<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;Authorization: Bearer </span><span class="nv">$JWT</span><span class="s2">&quot;</span><span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/files/scan&quot;</span>
curl<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;Authorization: Bearer </span><span class="nv">$JWT</span><span class="s2">&quot;</span><span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/resource/tmdb&quot;</span>
</code></pre></div>
<p>JWT is short-lived (1 hour). When it expires, request a new one via login.</p>
<hr />
<h3>Using Session Cookie (Browser)</h3>
<p>Browser-based clients (Portal) get a session cookie automatically after login. The browser sends the cookie with every request—no manual header needed.</p>
<div class="codehilite"><pre><span></span><code><span class="c1"># Login captures the session cookie from Set-Cookie header</span>
curl<span class="w"> </span>-v<span class="w"> </span>-X<span class="w"> </span>POST<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/auth/login&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;Content-Type: application/json&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-d<span class="w"> </span><span class="s1">&#39;{&quot;username&quot;:&quot;admin&quot;,&quot;password&quot;:&quot;admin&quot;}&#39;</span><span class="w"> </span><span class="m">2</span>&gt;<span class="p">&amp;</span><span class="m">1</span><span class="w"> </span><span class="p">|</span><span class="w"> </span>grep<span class="w"> </span><span class="s2">&quot;Set-Cookie&quot;</span>
<span class="c1"># Browser automatically sends: Cookie: session_id=&lt;session_id&gt;</span>
<span class="c1"># No manual header needed for subsequent requests</span>
</code></pre></div>
<p>The session cookie is HttpOnly (not accessible from JavaScript) and SameSite=Strict (protected against CSRF).</p>
<hr />
<h3>Using Legacy API Key</h3>
<div class="codehilite"><pre><span></span><code>curl<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;X-API-Key: </span><span class="nv">$KEY</span><span class="s2">&quot;</span><span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/files/scan&quot;</span>
<span class="c1"># Also accepted via Bearer header (non-JWT format) or query parameter:</span>
curl<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;Authorization: Bearer </span><span class="nv">$KEY</span><span class="s2">&quot;</span><span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/files/scan&quot;</span>
curl<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/files/scan?api_key=</span><span class="nv">$KEY</span><span class="s2">&quot;</span>
</code></pre></div>
<p>API keys are validated via SHA256 hash lookup in the database. They are long-lived (90 days) and intended for automation.</p>
<h3>Obtaining an API Key (CLI)</h3>
<div class="codehilite"><pre><span></span><code>momentry<span class="w"> </span>api-key<span class="w"> </span>create<span class="w"> </span><span class="s2">&quot;My API Key&quot;</span><span class="w"> </span>--key-type<span class="w"> </span>user
</code></pre></div>
<hr />
<h3>Logout</h3>
<div class="codehilite"><pre><span></span><code><span class="c1"># Logout using the session cookie (browser)</span>
curl<span class="w"> </span>-X<span class="w"> </span>POST<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/auth/logout&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;Cookie: session_id=&lt;uuid&gt;&quot;</span>
</code></pre></div>
<h4>What logout does</h4>
<table class="table">
<thead>
<tr>
<th>Auth mode</th>
<th>Effect</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>Session Cookie</strong></td>
<td>Session deleted from database. Same cookie returns 401 on subsequent requests.</td>
</tr>
<tr>
<td><strong>JWT</strong></td>
<td>JWT remains valid until expiry. (JWT is stateless — logout adds JWT to a blacklist only if API key mode is used.)</td>
</tr>
<tr>
<td><strong>API Key</strong></td>
<td>API key remains valid. (Legacy keys are shared across sessions — revoking would break other clients.)</td>
</tr>
</tbody>
</table>
<h4>Example: full session lifecycle</h4>
<div class="codehilite"><pre><span></span><code><span class="c1"># 1. Login</span>
<span class="nv">SESSION_ID</span><span class="o">=</span><span class="k">$(</span>curl<span class="w"> </span>-s<span class="w"> </span>-D<span class="w"> </span>-<span class="w"> </span>-X<span class="w"> </span>POST<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/auth/login&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;Content-Type: application/json&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-d<span class="w"> </span><span class="s1">&#39;{&quot;username&quot;:&quot;admin&quot;,&quot;password&quot;:&quot;admin&quot;}&#39;</span><span class="w"> </span><span class="p">|</span><span class="w"> </span>grep<span class="w"> </span><span class="s2">&quot;Set-Cookie&quot;</span><span class="w"> </span><span class="p">|</span><span class="w"> </span>sed<span class="w"> </span><span class="s1">&#39;s/.*session_id=\([^;]*\).*/\1/&#39;</span><span class="k">)</span>
<span class="c1"># 2. Use session (works)</span>
curl<span class="w"> </span>-s<span class="w"> </span>-o<span class="w"> </span>/dev/null<span class="w"> </span>-w<span class="w"> </span><span class="s2">&quot;HTTP %{http_code}\n&quot;</span><span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/resource/tmdb&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;Cookie: session_id=</span><span class="nv">$SESSION_ID</span><span class="s2">&quot;</span>
<span class="c1"># → HTTP 200</span>
<span class="c1"># 3. Logout</span>
curl<span class="w"> </span>-s<span class="w"> </span>-X<span class="w"> </span>POST<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/auth/logout&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;Cookie: session_id=</span><span class="nv">$SESSION_ID</span><span class="s2">&quot;</span>
<span class="c1"># → {&quot;success&quot;: true}</span>
<span class="c1"># 4. Use session again (rejected)</span>
curl<span class="w"> </span>-s<span class="w"> </span>-o<span class="w"> </span>/dev/null<span class="w"> </span>-w<span class="w"> </span><span class="s2">&quot;HTTP %{http_code}\n&quot;</span><span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/resource/tmdb&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;Cookie: session_id=</span><span class="nv">$SESSION_ID</span><span class="s2">&quot;</span>
<span class="c1"># → HTTP 401</span>
</code></pre></div>
<hr />
<h3>Authentication Flow Summary</h3>
<div class="codehilite"><pre><span></span><code>Login Request
┌──────────────────┐
│ 1. Check users │ ← users table (argon2 password verify)
│ table │
└──────┬───────────┘
┌───┴───┐
│ match │
└───┬───┘
┌──────────────────┐
│ 2. Create JWT │ ← 1h expiry, signed with JWT_SECRET
├──────────────────┤
│ 3. Create │ ← 24h expiry, stored in sessions table
│ session │
├──────────────────┤
│ 4. Set-Cookie │ ← HttpOnly, SameSite=Strict, Path=/
├──────────────────┤
│ 5. Return │ ← JWT + api_key + user info to client
└──────────────────┘
</code></pre></div>
<div class="codehilite"><pre><span></span><code>Protected Request
┌──────────────────────┐
│ Middleware checks: │
│ │
│ 1. Cookie session? │ → DB lookup session → get api_key → verify
│ │
│ 2. JWT Bearer? │ → verify JWT signature → decode claims
│ │
│ 3. X-API-Key? │ → SHA256 hash → DB lookup → verify
│ │
│ 4. ?api_key=? │ → same as #3
│ │
│ 5. None → 401 │
└──────────────────────┘
</code></pre></div>
<hr />
<h3>Error Responses</h3>
<table class="table">
<thead>
<tr>
<th>HTTP</th>
<th>When</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>401</code></td>
<td>Missing or invalid authentication</td>
</tr>
<tr>
<td><code>401</code></td>
<td>Session expired or logged out</td>
</tr>
<tr>
<td><code>401</code></td>
<td>JWT expired</td>
</tr>
<tr>
<td><code>401</code></td>
<td>API key revoked or inactive</td>
</tr>
</tbody>
</table>
<hr />
<h3>Related</h3>
<ul>
<li><code>POST /api/v1/resource/tmdb/check</code> — test authentication + TMDb API connectivity</li>
<li><code>GET /health/detailed</code> — view auth status (integrations section)</li>
</ul>
</div>
</body>
</html>

View File

@@ -0,0 +1,277 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>02 Health - Momentry API Docs</title>
<style>
* { margin: 0; padding: 0; box-sizing: border-box; }
body { font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif; background: #f5f5f5; color: #333; padding: 40px; }
.container { max-width: 960px; margin: 0 auto; background: white; border-radius: 12px; box-shadow: 0 2px 12px rgba(0,0,0,0.08); padding: 40px; }
h1 { font-size: 24px; margin: 24px 0 12px; }
h2 { font-size: 20px; margin: 20px 0 10px; color: #222; }
h3 { font-size: 16px; margin: 16px 0 8px; color: #444; }
p { line-height: 1.6; margin: 8px 0; }
table { border-collapse: collapse; width: 100%; margin: 12px 0; font-size: 14px; }
th, td { border: 1px solid #ddd; padding: 8px 12px; text-align: left; }
th { background: #f0f0f0; font-weight: 600; }
code { background: #f0f0f0; padding: 2px 6px; border-radius: 3px; font-size: 13px; }
pre { background: #f8f8f8; border: 1px solid #ddd; border-radius: 6px; padding: 12px; overflow-x: auto; margin: 12px 0; }
pre code { background: none; padding: 0; }
a { color: #0066cc; }
.back { display: inline-block; margin-bottom: 20px; color: #666; }
.back:hover { color: #333; }
</style>
</head>
<body>
<div class="container">
<a class="back" href="index.html">&larr; Back to index</a>
<!-- module: health -->
<!-- description: Health check endpoints -->
<!-- depends: 01_auth -->
<h2>Health Check</h2>
<h3><code>GET /health</code></h3>
<p><strong>Auth</strong>: Public
<strong>Scope</strong>: system-level</p>
<p>Returns basic server health status — used by load balancers and monitoring.</p>
<h4>Example</h4>
<div class="codehilite"><pre><span></span><code>curl<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/health&quot;</span><span class="w"> </span><span class="p">|</span><span class="w"> </span>jq<span class="w"> </span><span class="s1">&#39;{status, version}&#39;</span>
</code></pre></div>
<h4>Response (200)</h4>
<div class="codehilite"><pre><span></span><code><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;status&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;ok&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;version&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;1.0.0&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;build_git_hash&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;3a6c1865&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;build_timestamp&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;2026-05-16T13:38:15Z&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;uptime_ms&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">3015</span>
<span class="p">}</span>
</code></pre></div>
<table class="table">
<thead>
<tr>
<th>Field</th>
<th>Type</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>status</code></td>
<td>string</td>
<td><code>ok</code> or <code>degraded</code></td>
</tr>
<tr>
<td><code>version</code></td>
<td>string</td>
<td>Semver version</td>
</tr>
<tr>
<td><code>build_git_hash</code></td>
<td>string</td>
<td>Git commit hash</td>
</tr>
<tr>
<td><code>build_timestamp</code></td>
<td>string</td>
<td>Binary build time</td>
</tr>
<tr>
<td><code>uptime_ms</code></td>
<td>integer</td>
<td>Milliseconds since server start</td>
</tr>
</tbody>
</table>
<hr />
<h3><code>GET /health/detailed</code></h3>
<p><strong>Auth</strong>: Required
<strong>Scope</strong>: system-level</p>
<p>Returns full system health including each service status, resource utilization, pipeline readiness, schema migration status, identity file sync status, and external integrations.</p>
<blockquote>
<p>Requires authentication (JWT, session cookie, or API key). The basic <code>/health</code> endpoint remains public for load balancer checks.</p>
</blockquote>
<h4>Example</h4>
<div class="codehilite"><pre><span></span><code>curl<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/health/detailed&quot;</span><span class="w"> </span><span class="p">|</span><span class="w"> </span>jq<span class="w"> </span><span class="s1">&#39;{status, services, resources: {cpu: .resources.cpu_used_percent, memory: .resources.memory_used_percent}}&#39;</span>
</code></pre></div>
<h4>Response (200)</h4>
<div class="codehilite"><pre><span></span><code><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;status&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;ok&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;version&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;1.0.0&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;services&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;postgres&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="nt">&quot;status&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;ok&quot;</span><span class="p">,</span><span class="w"> </span><span class="nt">&quot;latency_ms&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">3</span><span class="p">},</span>
<span class="w"> </span><span class="nt">&quot;redis&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="nt">&quot;status&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;ok&quot;</span><span class="p">,</span><span class="w"> </span><span class="nt">&quot;latency_ms&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">1</span><span class="p">},</span>
<span class="w"> </span><span class="nt">&quot;qdrant&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="nt">&quot;status&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;ok&quot;</span><span class="p">,</span><span class="w"> </span><span class="nt">&quot;latency_ms&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">5</span><span class="p">}</span>
<span class="w"> </span><span class="p">},</span>
<span class="w"> </span><span class="nt">&quot;resources&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;cpu_used_percent&quot;</span><span class="p">:</span><span class="w"> </span><span class="mf">12.5</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;memory_available_mb&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">32768</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;memory_used_percent&quot;</span><span class="p">:</span><span class="w"> </span><span class="mf">31.7</span>
<span class="w"> </span><span class="p">},</span>
<span class="w"> </span><span class="nt">&quot;pipeline&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;scripts_ready&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;scripts_count&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">345</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;processors&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;asr&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;yolo&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;face&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;pose&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;ocr&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;cut&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;scene&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;asrx&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;visual_chunk&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span>
<span class="w"> </span><span class="p">},</span>
<span class="w"> </span><span class="nt">&quot;models_ready&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;models_count&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">42</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;scripts_integrity&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="nt">&quot;matched&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">332</span><span class="p">,</span><span class="w"> </span><span class="nt">&quot;total&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">345</span><span class="p">,</span><span class="w"> </span><span class="nt">&quot;ok&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">false</span><span class="p">},</span>
<span class="w"> </span><span class="nt">&quot;ffmpeg&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span>
<span class="w"> </span><span class="p">},</span>
<span class="w"> </span><span class="nt">&quot;schema&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;table_exists&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;applied&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">[{</span><span class="nt">&quot;filename&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;migrate_add_users_table.sql&quot;</span><span class="p">}],</span>
<span class="w"> </span><span class="nt">&quot;required&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">[],</span>
<span class="w"> </span><span class="nt">&quot;ok&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span>
<span class="w"> </span><span class="p">},</span>
<span class="w"> </span><span class="nt">&quot;identities&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;directory_exists&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;files_count&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">3481</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;index_ok&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;db_count&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">3481</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;synced&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span>
<span class="w"> </span><span class="p">},</span>
<span class="w"> </span><span class="nt">&quot;integrations&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;tmdb&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;api_key_configured&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">false</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;enabled&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">false</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;api_reachable&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">null</span>
<span class="w"> </span><span class="p">}</span>
<span class="w"> </span><span class="p">}</span>
<span class="p">}</span>
</code></pre></div>
<h4>Response Fields</h4>
<table class="table">
<thead>
<tr>
<th>Field</th>
<th>Type</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>status</code></td>
<td>string</td>
<td><code>ok</code> if all essential services healthy</td>
</tr>
<tr>
<td><code>services</code></td>
<td>object</td>
<td>Per-service status (postgres, redis, qdrant)</td>
</tr>
<tr>
<td><code>services.*.status</code></td>
<td>string</td>
<td><code>ok</code>, <code>error</code>, or <code>degraded</code></td>
</tr>
<tr>
<td><code>services.*.latency_ms</code></td>
<td>int</td>
<td>Response time in milliseconds</td>
</tr>
<tr>
<td><code>resources</code></td>
<td>object</td>
<td>CPU, memory usage</td>
</tr>
<tr>
<td><code>pipeline.scripts_ready</code></td>
<td>boolean</td>
<td>Scripts directory accessible</td>
</tr>
<tr>
<td><code>pipeline.scripts_count</code></td>
<td>int</td>
<td>Number of Python processor scripts</td>
</tr>
<tr>
<td><code>pipeline.processors</code></td>
<td>object</td>
<td>Per-processor availability</td>
</tr>
<tr>
<td><code>pipeline.models_ready</code></td>
<td>boolean</td>
<td>Models directory accessible</td>
</tr>
<tr>
<td><code>pipeline.scripts_integrity</code></td>
<td>object</td>
<td>SHA256 checksum verification results</td>
</tr>
<tr>
<td><code>schema.ok</code></td>
<td>boolean</td>
<td>All required migrations applied</td>
</tr>
<tr>
<td><code>identities.synced</code></td>
<td>boolean</td>
<td>Identity file count matches DB count</td>
</tr>
<tr>
<td><code>integrations.tmdb</code></td>
<td>object</td>
<td>TMDB API key config and reachability</td>
</tr>
</tbody>
</table>
<h4>Health status rules</h4>
<table class="table">
<thead>
<tr>
<th>Condition</th>
<th>status</th>
</tr>
</thead>
<tbody>
<tr>
<td>All services ok</td>
<td><code>ok</code></td>
</tr>
<tr>
<td>Any service error</td>
<td><code>degraded</code></td>
</tr>
<tr>
<td>Postgres or Redis error</td>
<td><code>degraded</code> (server still responds)</td>
</tr>
</tbody>
</table>
<hr />
<h3>Stats Endpoints</h3>
<table class="table">
<thead>
<tr>
<th>Method</th>
<th>Endpoint</th>
<th>Auth</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>GET</td>
<td><code>/api/v1/stats/sftpgo</code></td>
<td>No</td>
<td>SFTPGo service status</td>
</tr>
</tbody>
</table>
</div>
</body>
</html>

View File

@@ -0,0 +1,444 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>03 Register - Momentry API Docs</title>
<style>
* { margin: 0; padding: 0; box-sizing: border-box; }
body { font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif; background: #f5f5f5; color: #333; padding: 40px; }
.container { max-width: 960px; margin: 0 auto; background: white; border-radius: 12px; box-shadow: 0 2px 12px rgba(0,0,0,0.08); padding: 40px; }
h1 { font-size: 24px; margin: 24px 0 12px; }
h2 { font-size: 20px; margin: 20px 0 10px; color: #222; }
h3 { font-size: 16px; margin: 16px 0 8px; color: #444; }
p { line-height: 1.6; margin: 8px 0; }
table { border-collapse: collapse; width: 100%; margin: 12px 0; font-size: 14px; }
th, td { border: 1px solid #ddd; padding: 8px 12px; text-align: left; }
th { background: #f0f0f0; font-weight: 600; }
code { background: #f0f0f0; padding: 2px 6px; border-radius: 3px; font-size: 13px; }
pre { background: #f8f8f8; border: 1px solid #ddd; border-radius: 6px; padding: 12px; overflow-x: auto; margin: 12px 0; }
pre code { background: none; padding: 0; }
a { color: #0066cc; }
.back { display: inline-block; margin-bottom: 20px; color: #666; }
.back:hover { color: #333; }
</style>
</head>
<body>
<div class="container">
<a class="back" href="index.html">&larr; Back to index</a>
<!-- module: register -->
<!-- description: File registration — register, scan -->
<!-- depends: 01_auth -->
<h2>File Registration</h2>
<h3><code>POST /api/v1/files/register</code></h3>
<p><strong>Auth</strong>: Required
<strong>Scope</strong>: file-level</p>
<p>Register a video file for processing. Returns the file's metadata and UUID.</p>
<p><strong>New in v0.1.2</strong>: Registration now <strong>automatically triggers the processing pipeline</strong> — no need to call <code>POST /api/v1/file/:file_uuid/process</code> separately. The system will:
1. Register the file and run ffprobe
2. Auto-run offline TMDb probe (reads local identity files, no API calls)
3. Create a monitor job for the worker
4. Worker starts all 10 processors (Cut → ASR → ASRX → YOLO → OCR → Face → Pose → VisualChunk → Story → 5W1H)</p>
<p>If the file already exists (same content hash), returns the existing record with <code>already_exists: true</code>.</p>
<h4>Request Parameters</h4>
<table class="table">
<thead>
<tr>
<th>Field</th>
<th>Type</th>
<th>Required</th>
<th>Default</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>file_path</code></td>
<td>string</td>
<td>Yes</td>
<td></td>
<td>Path to video file on disk</td>
</tr>
<tr>
<td><code>pattern</code></td>
<td>string</td>
<td>No</td>
<td></td>
<td>Regex pattern for batch register (requires <code>file_path</code> to be a directory)</td>
</tr>
<tr>
<td><code>user_id</code></td>
<td>integer</td>
<td>No</td>
<td></td>
<td>User ID to associate with registration</td>
</tr>
<tr>
<td><code>content_hash</code></td>
<td>string</td>
<td>No</td>
<td></td>
<td>Pre-computed SHA-256 hash (skips computation)</td>
</tr>
</tbody>
</table>
<h4>Example</h4>
<div class="codehilite"><pre><span></span><code><span class="c1"># Register a single file</span>
curl<span class="w"> </span>-s<span class="w"> </span>-X<span class="w"> </span>POST<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/files/register&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;Content-Type: application/json&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;X-API-Key: </span><span class="nv">$KEY</span><span class="s2">&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-d<span class="w"> </span><span class="s1">&#39;{&quot;file_path&quot;: &quot;/path/to/video.mp4&quot;}&#39;</span>
<span class="c1"># Batch register files matching a pattern in a directory</span>
curl<span class="w"> </span>-s<span class="w"> </span>-X<span class="w"> </span>POST<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/files/register&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;Content-Type: application/json&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;X-API-Key: </span><span class="nv">$KEY</span><span class="s2">&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-d<span class="w"> </span><span class="s1">&#39;{&quot;file_path&quot;: &quot;/path/to/dir&quot;, &quot;pattern&quot;: &quot;.*\\.mp4$&quot;}&#39;</span>
</code></pre></div>
<h4>Response (200)</h4>
<div class="codehilite"><pre><span></span><code><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;success&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;file_uuid&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;3a6c1865...&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;file_name&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;video.mp4&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;file_path&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;/path/to/video.mp4&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;file_type&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;video&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;duration&quot;</span><span class="p">:</span><span class="w"> </span><span class="mf">120.5</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;width&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">1920</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;height&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">1080</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;fps&quot;</span><span class="p">:</span><span class="w"> </span><span class="mf">24.0</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;total_frames&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">2892</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;already_exists&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">false</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;message&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;File registered successfully&quot;</span>
<span class="p">}</span>
</code></pre></div>
<table class="table">
<thead>
<tr>
<th>Field</th>
<th>Type</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>success</code></td>
<td>boolean</td>
<td>Always true on 200</td>
</tr>
<tr>
<td><code>file_uuid</code></td>
<td>string</td>
<td>32-char hex UUID of the registered file</td>
</tr>
<tr>
<td><code>file_name</code></td>
<td>string</td>
<td>File name (auto-renamed if name conflict)</td>
</tr>
<tr>
<td><code>file_path</code></td>
<td>string</td>
<td>Canonical path on disk</td>
</tr>
<tr>
<td><code>file_type</code></td>
<td>string</td>
<td><code>"video"</code>, <code>"audio"</code>, or <code>"unknown"</code></td>
</tr>
<tr>
<td><code>duration</code></td>
<td>float</td>
<td>Duration in seconds</td>
</tr>
<tr>
<td><code>width</code></td>
<td>integer</td>
<td>Video width in pixels</td>
</tr>
<tr>
<td><code>height</code></td>
<td>integer</td>
<td>Video height in pixels</td>
</tr>
<tr>
<td><code>fps</code></td>
<td>float</td>
<td>Frames per second</td>
</tr>
<tr>
<td><code>total_frames</code></td>
<td>integer</td>
<td>Total frame count</td>
</tr>
<tr>
<td><code>already_exists</code></td>
<td>boolean</td>
<td>True if same content was already registered</td>
</tr>
<tr>
<td><code>message</code></td>
<td>string</td>
<td>Human-readable status</td>
</tr>
</tbody>
</table>
<h4>Error Responses</h4>
<table class="table">
<thead>
<tr>
<th>HTTP</th>
<th>When</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>401</code></td>
<td>Missing or invalid API key</td>
</tr>
<tr>
<td><code>400</code></td>
<td>Invalid request body</td>
</tr>
<tr>
<td><code>404</code></td>
<td>File path does not exist</td>
</tr>
</tbody>
</table>
<hr />
<h3><code>GET /api/v1/files/scan</code></h3>
<p><strong>Auth</strong>: Required
<strong>Scope</strong>: file-level</p>
<p>Scan the filesystem directory and list all media files, showing which are registered, processing, or unregistered.</p>
<h4>Query Parameters</h4>
<table class="table">
<thead>
<tr>
<th>Field</th>
<th>Type</th>
<th>Required</th>
<th>Default</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>page</code></td>
<td>integer</td>
<td>No</td>
<td>1</td>
<td>Page number (1-based)</td>
</tr>
<tr>
<td><code>page_size</code></td>
<td>integer</td>
<td>No</td>
<td>all</td>
<td>Items per page (alias: <code>limit</code>)</td>
</tr>
<tr>
<td><code>limit</code></td>
<td>integer</td>
<td>No</td>
<td>all</td>
<td>Max items (alias for <code>page_size</code>)</td>
</tr>
<tr>
<td><code>pattern</code></td>
<td>string</td>
<td>No</td>
<td></td>
<td>Regex filter on file name (e.g., <code>.*\\.mp4$</code>)</td>
</tr>
<tr>
<td><code>sort_by</code></td>
<td>string</td>
<td>No</td>
<td><code>name</code></td>
<td>Sort field: <code>name</code>, <code>size</code>, <code>modified</code>, <code>status</code></td>
</tr>
<tr>
<td><code>sort_order</code></td>
<td>string</td>
<td>No</td>
<td><code>asc</code></td>
<td>Sort direction: <code>asc</code> or <code>desc</code></td>
</tr>
</tbody>
</table>
<h4>Example</h4>
<div class="codehilite"><pre><span></span><code><span class="c1"># Full scan</span>
curl<span class="w"> </span>-s<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/files/scan&quot;</span><span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;X-API-Key: </span><span class="nv">$KEY</span><span class="s2">&quot;</span><span class="w"> </span><span class="p">|</span><span class="w"> </span>jq<span class="w"> </span><span class="s1">&#39;{total, registered_count, unregistered_count}&#39;</span>
<span class="c1"># Paginated (page 1, 5 per page)</span>
curl<span class="w"> </span>-s<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/files/scan?page=1&amp;page_size=5&quot;</span><span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;X-API-Key: </span><span class="nv">$KEY</span><span class="s2">&quot;</span><span class="w"> </span><span class="p">|</span><span class="w"> </span>jq<span class="w"> </span><span class="s1">&#39;{page, total_pages, files: [.files[].file_name]}&#39;</span>
<span class="c1"># Regex filter: only mp4 files</span>
curl<span class="w"> </span>-s<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/files/scan?pattern=.*\\.mp4</span>$<span class="s2">&quot;</span><span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;X-API-Key: </span><span class="nv">$KEY</span><span class="s2">&quot;</span><span class="w"> </span><span class="p">|</span><span class="w"> </span>jq<span class="w"> </span><span class="s1">&#39;{filtered_total, files: [.files[].file_name]}&#39;</span>
<span class="c1"># Sort by file size (largest first)</span>
curl<span class="w"> </span>-s<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/files/scan?sort_by=size&amp;sort_order=desc&amp;page_size=5&quot;</span><span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;X-API-Key: </span><span class="nv">$KEY</span><span class="s2">&quot;</span><span class="w"> </span><span class="p">|</span><span class="w"> </span>jq<span class="w"> </span><span class="s1">&#39;[.files[] | {file_name, file_size}]&#39;</span>
<span class="c1"># Sort by modified time (most recent first)</span>
curl<span class="w"> </span>-s<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/files/scan?sort_by=modified&amp;sort_order=desc&amp;page_size=5&quot;</span><span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;X-API-Key: </span><span class="nv">$KEY</span><span class="s2">&quot;</span><span class="w"> </span><span class="p">|</span><span class="w"> </span>jq<span class="w"> </span><span class="s1">&#39;[.files[] | {file_name, modified_time}]&#39;</span>
<span class="c1"># Sort by status</span>
curl<span class="w"> </span>-s<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/files/scan?sort_by=status&amp;page_size=5&quot;</span><span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;X-API-Key: </span><span class="nv">$KEY</span><span class="s2">&quot;</span><span class="w"> </span><span class="p">|</span><span class="w"> </span>jq<span class="w"> </span><span class="s1">&#39;[.files[] | {file_name, status}]&#39;</span>
</code></pre></div>
<h4>Response (200)</h4>
<div class="codehilite"><pre><span></span><code><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;files&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">[</span>
<span class="w"> </span><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;file_name&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;video.mp4&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;file_size&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">12345678</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;is_registered&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;file_uuid&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;3a6c1865...&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;status&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;completed&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;registration_time&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;2026-05-16T12:00:00Z&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;job_id&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">42</span>
<span class="w"> </span><span class="p">}</span>
<span class="w"> </span><span class="p">],</span>
<span class="w"> </span><span class="nt">&quot;total&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">107</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;filtered_total&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">80</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;page&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">1</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;page_size&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">20</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;total_pages&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">4</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;registered_count&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">26</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;unregistered_count&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">81</span>
<span class="p">}</span>
</code></pre></div>
<table class="table">
<thead>
<tr>
<th>Field</th>
<th>Type</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>files</code></td>
<td>array</td>
<td>Array of file info objects (paginated)</td>
</tr>
<tr>
<td><code>files[].file_name</code></td>
<td>string</td>
<td>File name</td>
</tr>
<tr>
<td><code>files[].relative_path</code></td>
<td>string</td>
<td>Path relative to scan root</td>
</tr>
<tr>
<td><code>files[].file_path</code></td>
<td>string</td>
<td>Absolute path on disk</td>
</tr>
<tr>
<td><code>files[].file_size</code></td>
<td>integer</td>
<td>File size in bytes</td>
</tr>
<tr>
<td><code>files[].modified_time</code></td>
<td>string</td>
<td>Last modified timestamp (ISO8601)</td>
</tr>
<tr>
<td><code>files[].is_registered</code></td>
<td>boolean</td>
<td>Whether file is registered in DB</td>
</tr>
<tr>
<td><code>files[].file_uuid</code></td>
<td>string</td>
<td>32-char hex UUID (only if registered)</td>
</tr>
<tr>
<td><code>files[].status</code></td>
<td>string</td>
<td><code>"completed"</code>, <code>"processing"</code>, <code>"registered"</code>, <code>"unregistered"</code>, or <code>null</code></td>
</tr>
<tr>
<td><code>files[].registration_time</code></td>
<td>string</td>
<td>DB registration timestamp (only if registered)</td>
</tr>
<tr>
<td><code>files[].job_id</code></td>
<td>integer</td>
<td>Processing job ID (only if a job exists)</td>
</tr>
<tr>
<td><code>total</code></td>
<td>integer</td>
<td>Total files found on disk (unfiltered)</td>
</tr>
<tr>
<td><code>filtered_total</code></td>
<td>integer</td>
<td>Files matching regex filter</td>
</tr>
<tr>
<td><code>page</code></td>
<td>integer</td>
<td>Current page number</td>
</tr>
<tr>
<td><code>page_size</code></td>
<td>integer</td>
<td>Items per page</td>
</tr>
<tr>
<td><code>total_pages</code></td>
<td>integer</td>
<td>Total pages</td>
</tr>
<tr>
<td><code>registered_count</code></td>
<td>integer</td>
<td>Files registered in DB</td>
</tr>
<tr>
<td><code>unregistered_count</code></td>
<td>integer</td>
<td>Files not yet registered</td>
</tr>
</tbody>
</table>
<h4>Notes</h4>
<table class="table">
<thead>
<tr>
<th>Feature</th>
<th>Behavior</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>Regex</strong></td>
<td>Case-insensitive (<code>(?i)</code> prefix auto-applied). Applied to <code>file_name</code>.</td>
</tr>
<tr>
<td><strong>Sort order</strong></td>
<td>Default (<code>sort_by=name</code>): registered files first, then alphabetically. <code>sort_by=status</code>: alphabetical by status string.</td>
</tr>
<tr>
<td><strong>Pagination</strong></td>
<td><code>page_size</code> and <code>limit</code> are aliases. Default: show all results.</td>
</tr>
<tr>
<td><strong>Processing order</strong></td>
<td><code>pattern</code> regex filter → <code>sort_by</code>/<code>sort_order</code><code>page</code>/<code>page_size</code> slice.</td>
</tr>
</tbody>
</table>
</div>
</body>
</html>

View File

@@ -0,0 +1,291 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>04 Lookup - Momentry API Docs</title>
<style>
* { margin: 0; padding: 0; box-sizing: border-box; }
body { font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif; background: #f5f5f5; color: #333; padding: 40px; }
.container { max-width: 960px; margin: 0 auto; background: white; border-radius: 12px; box-shadow: 0 2px 12px rgba(0,0,0,0.08); padding: 40px; }
h1 { font-size: 24px; margin: 24px 0 12px; }
h2 { font-size: 20px; margin: 20px 0 10px; color: #222; }
h3 { font-size: 16px; margin: 16px 0 8px; color: #444; }
p { line-height: 1.6; margin: 8px 0; }
table { border-collapse: collapse; width: 100%; margin: 12px 0; font-size: 14px; }
th, td { border: 1px solid #ddd; padding: 8px 12px; text-align: left; }
th { background: #f0f0f0; font-weight: 600; }
code { background: #f0f0f0; padding: 2px 6px; border-radius: 3px; font-size: 13px; }
pre { background: #f8f8f8; border: 1px solid #ddd; border-radius: 6px; padding: 12px; overflow-x: auto; margin: 12px 0; }
pre code { background: none; padding: 0; }
a { color: #0066cc; }
.back { display: inline-block; margin-bottom: 20px; color: #666; }
.back:hover { color: #333; }
</style>
</head>
<body>
<div class="container">
<a class="back" href="index.html">&larr; Back to index</a>
<!-- module: lookup -->
<!-- description: File lookup by name and unregistration -->
<!-- depends: 01_auth, 03_register -->
<h2>File Lookup</h2>
<h3><code>GET /api/v1/files/lookup</code></h3>
<p><strong>Auth</strong>: Required
<strong>Scope</strong>: file-level</p>
<p>Search registered files by file name. Performs a case-insensitive LIKE search on the file name column. Returns basic info about matching files.</p>
<h4>Query Parameters</h4>
<table class="table">
<thead>
<tr>
<th>Field</th>
<th>Type</th>
<th>Required</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>file_name</code></td>
<td>string</td>
<td>Yes</td>
<td>File name to search for (partial matches supported)</td>
</tr>
</tbody>
</table>
<h4>Example</h4>
<div class="codehilite"><pre><span></span><code><span class="c1"># Look up a specific file</span>
curl<span class="w"> </span>-s<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/files/lookup?file_name=video.mp4&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;X-API-Key: </span><span class="nv">$KEY</span><span class="s2">&quot;</span>
<span class="c1"># Partial name search</span>
curl<span class="w"> </span>-s<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/files/lookup?file_name=charade&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;X-API-Key: </span><span class="nv">$KEY</span><span class="s2">&quot;</span><span class="w"> </span><span class="p">|</span><span class="w"> </span>jq<span class="w"> </span><span class="s1">&#39;.matches[].file_name&#39;</span>
</code></pre></div>
<h4>Response (200)</h4>
<div class="codehilite"><pre><span></span><code><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;file_name&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;video.mp4&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;exists&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;matches&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">[</span>
<span class="w"> </span><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;file_uuid&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;a03485a40b2df2d3&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;file_name&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;video.mp4&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;file_type&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;video&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;status&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;completed&quot;</span>
<span class="w"> </span><span class="p">}</span>
<span class="w"> </span><span class="p">],</span>
<span class="w"> </span><span class="nt">&quot;next_name&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;video (2).mp4&quot;</span>
<span class="p">}</span>
</code></pre></div>
<table class="table">
<thead>
<tr>
<th>Field</th>
<th>Type</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>file_name</code></td>
<td>string</td>
<td>Searched name</td>
</tr>
<tr>
<td><code>exists</code></td>
<td>boolean</td>
<td>Exact name match exists</td>
</tr>
<tr>
<td><code>matches</code></td>
<td>array</td>
<td>Array of matching registered files</td>
</tr>
<tr>
<td><code>matches[].file_uuid</code></td>
<td>string</td>
<td>32-char hex UUID</td>
</tr>
<tr>
<td><code>matches[].file_name</code></td>
<td>string</td>
<td>Registered file name</td>
</tr>
<tr>
<td><code>matches[].file_type</code></td>
<td>string</td>
<td><code>"video"</code>, <code>"audio"</code>, or <code>null</code></td>
</tr>
<tr>
<td><code>matches[].status</code></td>
<td>string</td>
<td>Registration/processing status</td>
</tr>
<tr>
<td><code>next_name</code></td>
<td>string</td>
<td>Suggested name for avoiding conflicts</td>
</tr>
</tbody>
</table>
<hr />
<h2>Unregister</h2>
<h3><code>POST /api/v1/unregister</code></h3>
<p><strong>Auth</strong>: Required
<strong>Scope</strong>: file-level</p>
<p>Delete a registered file from the system. Supports single file by UUID, or batch by directory + regex pattern.</p>
<h4>What gets deleted</h4>
<table class="table">
<thead>
<tr>
<th>Removed (default)</th>
<th>Not removed</th>
</tr>
</thead>
<tbody>
<tr>
<td>Database records (videos, chunks, embeddings, processor_results, pre_chunks)</td>
<td>The original source video file on disk</td>
</tr>
<tr>
<td>Processor output JSON files (<code>{uuid}.*.json</code>) — unless <code>delete_output_files: false</code></td>
<td>Temp/working directories</td>
</tr>
<tr>
<td>In-memory cache entries</td>
<td></td>
</tr>
<tr>
<td>MongoDB cached lists</td>
<td></td>
</tr>
</tbody>
</table>
<blockquote>
<p>⚠️ Database deletion is <strong>irreversible</strong>. To keep output files, set <code>"delete_output_files": false</code>.</p>
</blockquote>
<h4>Request Parameters</h4>
<p>At least one mode must be specified: either <code>file_uuid</code> alone, or <code>file_path</code> + <code>pattern</code> together.</p>
<table class="table">
<thead>
<tr>
<th>Field</th>
<th>Type</th>
<th>Required</th>
<th>Default</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>file_uuid</code></td>
<td>string</td>
<td>*</td>
<td></td>
<td>Single file UUID to delete</td>
</tr>
<tr>
<td><code>file_path</code></td>
<td>string</td>
<td>*</td>
<td></td>
<td>Directory path (for batch delete)</td>
</tr>
<tr>
<td><code>pattern</code></td>
<td>string</td>
<td>*</td>
<td></td>
<td>Regex pattern (requires <code>file_path</code>)</td>
</tr>
<tr>
<td><code>delete_output_files</code></td>
<td>boolean</td>
<td>No</td>
<td><code>true</code></td>
<td>If <code>true</code>, also delete processor output JSON files (<code>{uuid}.*.json</code>). Set to <code>false</code> to keep them.</td>
</tr>
</tbody>
</table>
<h4>Example</h4>
<div class="codehilite"><pre><span></span><code><span class="c1"># Delete a single file by UUID (default: also deletes output JSON files)</span>
curl<span class="w"> </span>-s<span class="w"> </span>-X<span class="w"> </span>POST<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/unregister&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;Content-Type: application/json&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;X-API-Key: </span><span class="nv">$KEY</span><span class="s2">&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-d<span class="w"> </span><span class="s1">&#39;{&quot;file_uuid&quot;: &quot;&#39;</span><span class="s2">&quot;</span><span class="nv">$FILE_UUID</span><span class="s2">&quot;</span><span class="s1">&#39;&quot;}&#39;</span>
<span class="c1"># Keep output JSON files, only delete DB records</span>
curl<span class="w"> </span>-s<span class="w"> </span>-X<span class="w"> </span>POST<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/unregister&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;Content-Type: application/json&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;X-API-Key: </span><span class="nv">$KEY</span><span class="s2">&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-d<span class="w"> </span><span class="s1">&#39;{&quot;file_uuid&quot;: &quot;&#39;</span><span class="s2">&quot;</span><span class="nv">$FILE_UUID</span><span class="s2">&quot;</span><span class="s1">&#39;&quot;, &quot;delete_output_files&quot;: false}&#39;</span>
<span class="c1"># Batch delete all mp4 files in a directory</span>
curl<span class="w"> </span>-s<span class="w"> </span>-X<span class="w"> </span>POST<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/unregister&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;Content-Type: application/json&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;X-API-Key: </span><span class="nv">$KEY</span><span class="s2">&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-d<span class="w"> </span><span class="s1">&#39;{&quot;file_path&quot;: &quot;/path/to/dir&quot;, &quot;pattern&quot;: &quot;.*\\.mp4$&quot;}&#39;</span>
</code></pre></div>
<h4>Response (200)</h4>
<div class="codehilite"><pre><span></span><code><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;success&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;file_uuid&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;a03485a40b2df2d3&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;message&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;Video unregistered successfully&quot;</span>
<span class="p">}</span>
</code></pre></div>
<table class="table">
<thead>
<tr>
<th>Field</th>
<th>Type</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>success</code></td>
<td>boolean</td>
<td>True if deletion succeeded</td>
</tr>
<tr>
<td><code>file_uuid</code></td>
<td>string</td>
<td>UUID of the deleted file (single mode)</td>
</tr>
<tr>
<td><code>message</code></td>
<td>string</td>
<td>Human-readable status</td>
</tr>
</tbody>
</table>
<h4>Error Responses</h4>
<table class="table">
<thead>
<tr>
<th>HTTP</th>
<th>When</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>400</code></td>
<td>Neither <code>file_uuid</code> nor <code>file_path</code>+<code>pattern</code> provided</td>
</tr>
<tr>
<td><code>404</code></td>
<td>File UUID not found</td>
</tr>
<tr>
<td><code>401</code></td>
<td>Missing or invalid API key</td>
</tr>
</tbody>
</table>
</div>
</body>
</html>

View File

@@ -0,0 +1,505 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>05 Process - Momentry API Docs</title>
<style>
* { margin: 0; padding: 0; box-sizing: border-box; }
body { font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif; background: #f5f5f5; color: #333; padding: 40px; }
.container { max-width: 960px; margin: 0 auto; background: white; border-radius: 12px; box-shadow: 0 2px 12px rgba(0,0,0,0.08); padding: 40px; }
h1 { font-size: 24px; margin: 24px 0 12px; }
h2 { font-size: 20px; margin: 20px 0 10px; color: #222; }
h3 { font-size: 16px; margin: 16px 0 8px; color: #444; }
p { line-height: 1.6; margin: 8px 0; }
table { border-collapse: collapse; width: 100%; margin: 12px 0; font-size: 14px; }
th, td { border: 1px solid #ddd; padding: 8px 12px; text-align: left; }
th { background: #f0f0f0; font-weight: 600; }
code { background: #f0f0f0; padding: 2px 6px; border-radius: 3px; font-size: 13px; }
pre { background: #f8f8f8; border: 1px solid #ddd; border-radius: 6px; padding: 12px; overflow-x: auto; margin: 12px 0; }
pre code { background: none; padding: 0; }
a { color: #0066cc; }
.back { display: inline-block; margin-bottom: 20px; color: #666; }
.back:hover { color: #333; }
</style>
</head>
<body>
<div class="container">
<a class="back" href="index.html">&larr; Back to index</a>
<!-- module: process -->
<!-- description: Processing pipeline — trigger, probe, progress, jobs -->
<!-- depends: 01_auth, 03_register -->
<h2>Processing Pipeline</h2>
<h3><code>POST /api/v1/file/:file_uuid/process</code></h3>
<p><strong>Auth</strong>: Required
<strong>Scope</strong>: file-level</p>
<p>Trigger the processing pipeline for a registered file. Creates a monitor job that the worker picks up and processes sequentially. Returns immediately with the job info—processing runs asynchronously in the background.</p>
<h4>Request Parameters</h4>
<table class="table">
<thead>
<tr>
<th>Field</th>
<th>Type</th>
<th>Required</th>
<th>Default</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>processors</code></td>
<td>string[]</td>
<td>No</td>
<td>all</td>
<td>Specific processors to run: <code>["cut","asr","asrx","yolo","ocr","face","pose","visual_chunk","story","5w1h"]</code></td>
</tr>
<tr>
<td><code>rules</code></td>
<td>string[]</td>
<td>No</td>
<td>all</td>
<td>Rule names to apply (currently unused)</td>
</tr>
</tbody>
</table>
<h4>Example</h4>
<div class="codehilite"><pre><span></span><code><span class="c1"># Run all processors</span>
curl<span class="w"> </span>-s<span class="w"> </span>-X<span class="w"> </span>POST<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/file/</span><span class="nv">$FILE_UUID</span><span class="s2">/process&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;Content-Type: application/json&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;X-API-Key: </span><span class="nv">$KEY</span><span class="s2">&quot;</span><span class="w"> </span>-d<span class="w"> </span><span class="s1">&#39;{}&#39;</span>
<span class="c1"># Run specific processors only</span>
curl<span class="w"> </span>-s<span class="w"> </span>-X<span class="w"> </span>POST<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/file/</span><span class="nv">$FILE_UUID</span><span class="s2">/process&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;Content-Type: application/json&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;X-API-Key: </span><span class="nv">$KEY</span><span class="s2">&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-d<span class="w"> </span><span class="s1">&#39;{&quot;processors&quot;: [&quot;asr&quot;, &quot;face&quot;, &quot;yolo&quot;]}&#39;</span>
</code></pre></div>
<h4>Response (200)</h4>
<div class="codehilite"><pre><span></span><code><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;success&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;job_id&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">42</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;file_uuid&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;3a6c1865...&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;status&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;processing&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;pids&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="mi">12345</span><span class="p">,</span><span class="w"> </span><span class="mi">12346</span><span class="p">],</span>
<span class="w"> </span><span class="nt">&quot;message&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;Processing triggered for video.mp4&quot;</span>
<span class="p">}</span>
</code></pre></div>
<table class="table">
<thead>
<tr>
<th>Field</th>
<th>Type</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>success</code></td>
<td>boolean</td>
<td>Always true on 200</td>
</tr>
<tr>
<td><code>job_id</code></td>
<td>integer</td>
<td>Monitor job ID (for job tracking)</td>
</tr>
<tr>
<td><code>file_uuid</code></td>
<td>string</td>
<td>32-char hex UUID of the file</td>
</tr>
<tr>
<td><code>status</code></td>
<td>string</td>
<td><code>"processing"</code></td>
</tr>
<tr>
<td><code>pids</code></td>
<td>integer[]</td>
<td>Process IDs of started processors</td>
</tr>
<tr>
<td><code>message</code></td>
<td>string</td>
<td>Human-readable status</td>
</tr>
</tbody>
</table>
<h4>Error Responses</h4>
<table class="table">
<thead>
<tr>
<th>HTTP</th>
<th>When</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>404</code></td>
<td>File UUID not found</td>
</tr>
<tr>
<td><code>401</code></td>
<td>Missing or invalid API key</td>
</tr>
</tbody>
</table>
<hr />
<h3><code>GET /api/v1/file/:file_uuid/probe</code></h3>
<p><strong>Auth</strong>: Required
<strong>Scope</strong>: file-level</p>
<p>Get ffprobe metadata for a registered file. Returns video/audio stream info, codec details, duration, resolution, and frame rate.</p>
<h4>Example</h4>
<div class="codehilite"><pre><span></span><code>curl<span class="w"> </span>-s<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/file/</span><span class="nv">$FILE_UUID</span><span class="s2">/probe&quot;</span><span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;X-API-Key: </span><span class="nv">$KEY</span><span class="s2">&quot;</span>
</code></pre></div>
<h4>Response (200)</h4>
<div class="codehilite"><pre><span></span><code><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;file_uuid&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;3a6c1865...&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;file_name&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;video.mp4&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;file_size&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">794863677</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;duration&quot;</span><span class="p">:</span><span class="w"> </span><span class="mf">120.5</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;width&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">1920</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;height&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">1080</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;fps&quot;</span><span class="p">:</span><span class="w"> </span><span class="mf">24.0</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;total_frames&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">2892</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;cached&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;format&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;filename&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;/path/to/video.mp4&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;format_name&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;mov,mp4,m4a,3gp&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;duration&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;120.5&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;size&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;12345678&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;bit_rate&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;819200&quot;</span>
<span class="w"> </span><span class="p">},</span>
<span class="w"> </span><span class="nt">&quot;streams&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">[</span>
<span class="w"> </span><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;index&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">0</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;codec_name&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;h264&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;codec_type&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;video&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;width&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">1920</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;height&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">1080</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;r_frame_rate&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;24/1&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;duration&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;120.5&quot;</span>
<span class="w"> </span><span class="p">}</span>
<span class="w"> </span><span class="p">]</span>
<span class="p">}</span>
</code></pre></div>
<table class="table">
<thead>
<tr>
<th>Field</th>
<th>Type</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>file_uuid</code></td>
<td>string</td>
<td>32-char hex UUID</td>
</tr>
<tr>
<td><code>file_name</code></td>
<td>string</td>
<td>File name</td>
</tr>
<tr>
<td><code>file_size</code></td>
<td>integer</td>
<td>File size in bytes (from filesystem)</td>
</tr>
<tr>
<td><code>duration</code></td>
<td>float</td>
<td>Duration in seconds</td>
</tr>
<tr>
<td><code>width</code></td>
<td>integer</td>
<td>Video width in pixels</td>
</tr>
<tr>
<td><code>height</code></td>
<td>integer</td>
<td>Video height in pixels</td>
</tr>
<tr>
<td><code>fps</code></td>
<td>float</td>
<td>Frames per second</td>
</tr>
<tr>
<td><code>total_frames</code></td>
<td>integer</td>
<td>Estimated total frames</td>
</tr>
<tr>
<td><code>cached</code></td>
<td>boolean</td>
<td>True if result was from cached probe JSON</td>
</tr>
<tr>
<td><code>format</code></td>
<td>object</td>
<td>Container format info (ffprobe format section)</td>
</tr>
<tr>
<td><code>streams</code></td>
<td>array</td>
<td>Array of stream info objects</td>
</tr>
</tbody>
</table>
<hr />
<h3><code>GET /api/v1/progress/:file_uuid</code></h3>
<p><strong>Auth</strong>: Required
<strong>Scope</strong>: file-level</p>
<p>Get real-time processing progress for a file via Redis pub/sub. Includes per-processor status, current/total frames, ETA, and system resource stats.</p>
<h4>Pipeline Order</h4>
<table class="table">
<thead>
<tr>
<th>Order</th>
<th>Processor</th>
<th>Dependencies</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td><code>cut</code></td>
<td></td>
<td>Scene detection</td>
</tr>
<tr>
<td>2</td>
<td><code>asr</code></td>
<td>cut</td>
<td>Speech-to-text (per scene)</td>
</tr>
<tr>
<td>3</td>
<td><code>asrx</code></td>
<td>asr</td>
<td>Speaker diarization</td>
</tr>
<tr>
<td>4</td>
<td><code>yolo</code></td>
<td></td>
<td>Object detection</td>
</tr>
<tr>
<td>5</td>
<td><code>ocr</code></td>
<td></td>
<td>Text recognition</td>
</tr>
<tr>
<td>6</td>
<td><code>face</code></td>
<td></td>
<td>Face detection &amp; embedding</td>
</tr>
<tr>
<td>7</td>
<td><code>pose</code></td>
<td></td>
<td>Pose estimation</td>
</tr>
<tr>
<td>8</td>
<td><code>visual_chunk</code></td>
<td>yolo</td>
<td>Visual scene chunks</td>
</tr>
<tr>
<td>9</td>
<td><code>story</code></td>
<td>asr, asrx, cut, yolo, face</td>
<td>Scene summaries (template)</td>
</tr>
<tr>
<td>10</td>
<td><code>5w1h</code></td>
<td>story</td>
<td>5W1H analysis (Gemma4 LLM)</td>
</tr>
</tbody>
</table>
<p>All processors except <code>story</code> and <code>5w1h</code> run concurrently when their dependencies are met. Story and 5W1H run sequentially after their prerequisites.</p>
<h4>Example</h4>
<div class="codehilite"><pre><span></span><code>curl<span class="w"> </span>-s<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/progress/</span><span class="nv">$FILE_UUID</span><span class="s2">&quot;</span><span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;X-API-Key: </span><span class="nv">$KEY</span><span class="s2">&quot;</span><span class="w"> </span><span class="p">|</span><span class="w"> </span>jq<span class="w"> </span><span class="s1">&#39;{overall_progress, processors: [.processors[] | {processor_type, status}]}&#39;</span>
</code></pre></div>
<h4>Response (200)</h4>
<div class="codehilite"><pre><span></span><code><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;file_uuid&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;3a6c1865...&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;overall_progress&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">71</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;cpu_percent&quot;</span><span class="p">:</span><span class="w"> </span><span class="mf">45.2</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;gpu_percent&quot;</span><span class="p">:</span><span class="w"> </span><span class="mf">30.1</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;memory_percent&quot;</span><span class="p">:</span><span class="w"> </span><span class="mf">62.4</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;processors&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">[</span>
<span class="w"> </span><span class="p">{</span><span class="nt">&quot;processor_type&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;asr&quot;</span><span class="p">,</span><span class="w"> </span><span class="nt">&quot;status&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;complete&quot;</span><span class="p">,</span><span class="w"> </span><span class="nt">&quot;progress&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">100</span><span class="p">},</span>
<span class="w"> </span><span class="p">{</span><span class="nt">&quot;processor_type&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;yolo&quot;</span><span class="p">,</span><span class="w"> </span><span class="nt">&quot;status&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;running&quot;</span><span class="p">,</span><span class="w"> </span><span class="nt">&quot;progress&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">65</span><span class="p">},</span>
<span class="w"> </span><span class="p">{</span><span class="nt">&quot;processor_type&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;face&quot;</span><span class="p">,</span><span class="w"> </span><span class="nt">&quot;status&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;pending&quot;</span><span class="p">,</span><span class="w"> </span><span class="nt">&quot;progress&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">0</span><span class="p">}</span>
<span class="w"> </span><span class="p">]</span>
<span class="p">}</span>
</code></pre></div>
<table class="table">
<thead>
<tr>
<th>Field</th>
<th>Type</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>file_uuid</code></td>
<td>string</td>
<td>32-char hex UUID</td>
</tr>
<tr>
<td><code>overall_progress</code></td>
<td>integer</td>
<td>Overall progress percentage (0100)</td>
</tr>
<tr>
<td><code>processors</code></td>
<td>array</td>
<td>Per-processor status list</td>
</tr>
<tr>
<td><code>processors[].processor_type</code></td>
<td>string</td>
<td>Processor name (<code>asr</code>, <code>cut</code>, <code>yolo</code>, etc.)</td>
</tr>
<tr>
<td><code>processors[].status</code></td>
<td>string</td>
<td><code>"pending"</code>, <code>"running"</code>, <code>"complete"</code>, or <code>"failed"</code></td>
</tr>
<tr>
<td><code>processors[].progress</code></td>
<td>integer</td>
<td>Per-processor progress (0100)</td>
</tr>
<tr>
<td><code>processors[].eta_seconds</code></td>
<td>integer</td>
<td>Estimated seconds remaining (running processors)</td>
</tr>
<tr>
<td><code>processors[].current</code></td>
<td>integer</td>
<td>Current frame count</td>
</tr>
<tr>
<td><code>processors[].total</code></td>
<td>integer</td>
<td>Total frame count</td>
</tr>
<tr>
<td><code>cpu_percent</code></td>
<td>float</td>
<td>Current CPU usage</td>
</tr>
<tr>
<td><code>gpu_percent</code></td>
<td>float</td>
<td>Current GPU utilization</td>
</tr>
<tr>
<td><code>memory_percent</code></td>
<td>float</td>
<td>Current memory usage</td>
</tr>
</tbody>
</table>
<hr />
<h3><code>GET /api/v1/jobs</code></h3>
<p><strong>Auth</strong>: Required
<strong>Scope</strong>: system-level</p>
<p>List all processing jobs (monitor jobs) in the system. Shows job status, which file each job is processing, and current processor info.</p>
<h4>Example</h4>
<div class="codehilite"><pre><span></span><code>curl<span class="w"> </span>-s<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/jobs&quot;</span><span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;X-API-Key: </span><span class="nv">$KEY</span><span class="s2">&quot;</span><span class="w"> </span><span class="p">|</span><span class="w"> </span>jq<span class="w"> </span><span class="s1">&#39;{count, jobs: [.jobs[] | {uuid, status}]}&#39;</span>
</code></pre></div>
<h4>Response (200)</h4>
<div class="codehilite"><pre><span></span><code><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;jobs&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">[</span>
<span class="w"> </span><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;id&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">42</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;uuid&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;3a6c1865...&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;status&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;running&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;current_processor&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;yolo&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;created_at&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;2026-05-16T12:00:00Z&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;started_at&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;2026-05-16T12:01:00Z&quot;</span>
<span class="w"> </span><span class="p">}</span>
<span class="w"> </span><span class="p">],</span>
<span class="w"> </span><span class="nt">&quot;count&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">15</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;page&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">1</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;page_size&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">20</span>
<span class="p">}</span>
</code></pre></div>
<table class="table">
<thead>
<tr>
<th>Field</th>
<th>Type</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>jobs</code></td>
<td>array</td>
<td>Array of job info objects</td>
</tr>
<tr>
<td><code>jobs[].id</code></td>
<td>integer</td>
<td>Job ID</td>
</tr>
<tr>
<td><code>jobs[].uuid</code></td>
<td>string</td>
<td>File UUID being processed</td>
</tr>
<tr>
<td><code>jobs[].status</code></td>
<td>string</td>
<td><code>"pending"</code>, <code>"running"</code>, <code>"completed"</code>, <code>"failed"</code></td>
</tr>
<tr>
<td><code>jobs[].current_processor</code></td>
<td>string</td>
<td>Currently active processor, or null</td>
</tr>
<tr>
<td><code>count</code></td>
<td>integer</td>
<td>Total job count</td>
</tr>
<tr>
<td><code>page</code></td>
<td>integer</td>
<td>Current page number</td>
</tr>
<tr>
<td><code>page_size</code></td>
<td>integer</td>
<td>Jobs per page</td>
</tr>
</tbody>
</table>
</div>
</body>
</html>

View File

@@ -0,0 +1,280 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>06 Search - Momentry API Docs</title>
<style>
* { margin: 0; padding: 0; box-sizing: border-box; }
body { font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif; background: #f5f5f5; color: #333; padding: 40px; }
.container { max-width: 960px; margin: 0 auto; background: white; border-radius: 12px; box-shadow: 0 2px 12px rgba(0,0,0,0.08); padding: 40px; }
h1 { font-size: 24px; margin: 24px 0 12px; }
h2 { font-size: 20px; margin: 20px 0 10px; color: #222; }
h3 { font-size: 16px; margin: 16px 0 8px; color: #444; }
p { line-height: 1.6; margin: 8px 0; }
table { border-collapse: collapse; width: 100%; margin: 12px 0; font-size: 14px; }
th, td { border: 1px solid #ddd; padding: 8px 12px; text-align: left; }
th { background: #f0f0f0; font-weight: 600; }
code { background: #f0f0f0; padding: 2px 6px; border-radius: 3px; font-size: 13px; }
pre { background: #f8f8f8; border: 1px solid #ddd; border-radius: 6px; padding: 12px; overflow-x: auto; margin: 12px 0; }
pre code { background: none; padding: 0; }
a { color: #0066cc; }
.back { display: inline-block; margin-bottom: 20px; color: #666; }
.back:hover { color: #333; }
</style>
</head>
<body>
<div class="container">
<a class="back" href="index.html">&larr; Back to index</a>
<!-- module: search -->
<!-- description: Vector search, BM25, smart search, universal search, visual search -->
<!-- depends: 01_auth -->
<h2>Search APIs</h2>
<h3><code>POST /api/v1/search/smart</code></h3>
<p><strong>Auth</strong>: Required
<strong>Scope</strong>: file-level</p>
<p>Semantic vector search using EmbeddingGemma-300m. Generates a query embedding via EmbeddingGemma (port 11436), then searches pgvector <code>story_parent</code> and <code>llm_parent</code> chunks by cosine similarity.</p>
<h4>Request Parameters</h4>
<table class="table">
<thead>
<tr>
<th>Field</th>
<th>Type</th>
<th>Required</th>
<th>Default</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>file_uuid</code></td>
<td>string</td>
<td>Yes</td>
<td></td>
<td>File UUID to search within</td>
</tr>
<tr>
<td><code>query</code></td>
<td>string</td>
<td>Yes</td>
<td></td>
<td>Search text</td>
</tr>
<tr>
<td><code>limit</code></td>
<td>integer</td>
<td>No</td>
<td>5</td>
<td>Max results to return</td>
</tr>
<tr>
<td><code>page</code></td>
<td>integer</td>
<td>No</td>
<td>1</td>
<td>Page number</td>
</tr>
<tr>
<td><code>page_size</code></td>
<td>integer</td>
<td>No</td>
<td>5</td>
<td>Items per page</td>
</tr>
</tbody>
</table>
<h4>Example</h4>
<div class="codehilite"><pre><span></span><code>curl<span class="w"> </span>-s<span class="w"> </span>-X<span class="w"> </span>POST<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/search/smart&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;Content-Type: application/json&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;Authorization: Bearer </span><span class="nv">$JWT</span><span class="s2">&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-d<span class="w"> </span><span class="s1">&#39;{&quot;file_uuid&quot;: &quot;&#39;</span><span class="s2">&quot;</span><span class="nv">$FILE_UUID</span><span class="s2">&quot;</span><span class="s1">&#39;&quot;, &quot;query&quot;: &quot;Audrey Hepburn&quot;}&#39;</span>
</code></pre></div>
<h4>Response (200)</h4>
<div class="codehilite"><pre><span></span><code><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;query&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;Audrey Hepburn&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;results&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">[</span>
<span class="w"> </span><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;parent_id&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">1087822</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;scene_order&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">1087822</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;start_frame&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">104438</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;end_frame&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">104538</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;fps&quot;</span><span class="p">:</span><span class="w"> </span><span class="mf">24.0</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;start_time&quot;</span><span class="p">:</span><span class="w"> </span><span class="mf">4351.6</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;end_time&quot;</span><span class="p">:</span><span class="w"> </span><span class="mf">4355.76</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;summary&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;[4352s-4356s, 4s] Cast: Audrey Hepburn. Total: 2 lines, 10 words. Speakers: Audrey Hepburn (2 lines)&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;similarity&quot;</span><span class="p">:</span><span class="w"> </span><span class="mf">0.67</span>
<span class="w"> </span><span class="p">}</span>
<span class="w"> </span><span class="p">],</span>
<span class="w"> </span><span class="nt">&quot;page&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">1</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;page_size&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">5</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;strategy&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;semantic_vector_search&quot;</span>
<span class="p">}</span>
</code></pre></div>
<hr />
<h3><code>POST /api/v1/search/universal</code></h3>
<p><strong>Auth</strong>: Required
<strong>Scope</strong>: file-level</p>
<p>Multi-type BM25 full-text search across chunks, frames, and persons. Uses PostgreSQL <code>tsvector</code>.</p>
<h4>Request Parameters</h4>
<table class="table">
<thead>
<tr>
<th>Field</th>
<th>Type</th>
<th>Required</th>
<th>Default</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>query</code></td>
<td>string</td>
<td>Yes</td>
<td></td>
<td>Search text</td>
</tr>
<tr>
<td><code>file_uuid</code></td>
<td>string</td>
<td>No</td>
<td></td>
<td>Restrict to specific file</td>
</tr>
<tr>
<td><code>types</code></td>
<td>string[]</td>
<td>No</td>
<td><code>["chunk","frame","person"]</code></td>
<td>Search types</td>
</tr>
<tr>
<td><code>limit</code></td>
<td>integer</td>
<td>No</td>
<td>10</td>
<td>Max results per type</td>
</tr>
<tr>
<td><code>page</code></td>
<td>integer</td>
<td>No</td>
<td>1</td>
<td>Page number</td>
</tr>
<tr>
<td><code>page_size</code></td>
<td>integer</td>
<td>No</td>
<td>20</td>
<td>Items per page</td>
</tr>
</tbody>
</table>
<h4>Example</h4>
<div class="codehilite"><pre><span></span><code>curl<span class="w"> </span>-s<span class="w"> </span>-X<span class="w"> </span>POST<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/search/universal&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;Content-Type: application/json&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;Authorization: Bearer </span><span class="nv">$JWT</span><span class="s2">&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-d<span class="w"> </span><span class="s1">&#39;{&quot;file_uuid&quot;: &quot;&#39;</span><span class="s2">&quot;</span><span class="nv">$FILE_UUID</span><span class="s2">&quot;</span><span class="s1">&#39;&quot;, &quot;query&quot;: &quot;Cary Grant&quot;}&#39;</span>
</code></pre></div>
<h4>Response (200)</h4>
<div class="codehilite"><pre><span></span><code><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;results&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">[</span>
<span class="w"> </span><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;type&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;chunk&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;chunk_id&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;bd80fec92b0b6963d177a2c55bf713e2_2&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;chunk_type&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;story_child&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;start_frame&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">5103</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;end_frame&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">5127</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;start_time&quot;</span><span class="p">:</span><span class="w"> </span><span class="mf">212.64</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;end_time&quot;</span><span class="p">:</span><span class="w"> </span><span class="mf">213.64</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;text&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;[213s-214s] Cary Grant: \&quot;Olá!\&quot;&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;score&quot;</span><span class="p">:</span><span class="w"> </span><span class="mf">0.9</span>
<span class="w"> </span><span class="p">}</span>
<span class="w"> </span><span class="p">],</span>
<span class="w"> </span><span class="nt">&quot;total&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">20</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;took_ms&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">18</span>
<span class="p">}</span>
</code></pre></div>
<hr />
<h3><code>POST /api/v1/search/frames</code></h3>
<p><strong>Auth</strong>: Required
<strong>Scope</strong>: file-level</p>
<p>Search face detection frames by identity name or trace ID.</p>
<hr />
<h3><code>POST /api/v1/search/identity_text</code></h3>
<p><strong>Auth</strong>: Required
<strong>Scope</strong>: file-level</p>
<p>Search text chunks spoken by a specific identity.</p>
<hr />
<h3>Visual Search</h3>
<table class="table">
<thead>
<tr>
<th>Method</th>
<th>Endpoint</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>POST</td>
<td><code>/api/v1/search/visual</code></td>
<td>Search visual chunks</td>
</tr>
<tr>
<td>POST</td>
<td><code>/api/v1/search/visual/class</code></td>
<td>Search by object class</td>
</tr>
<tr>
<td>POST</td>
<td><code>/api/v1/search/visual/density</code></td>
<td>Search by object density</td>
</tr>
<tr>
<td>POST</td>
<td><code>/api/v1/search/visual/combination</code></td>
<td>Search by object combination</td>
</tr>
<tr>
<td>POST</td>
<td><code>/api/v1/search/visual/stats</code></td>
<td>Visual chunk statistics</td>
</tr>
</tbody>
</table>
<h4>Embedding Model</h4>
<table class="table">
<thead>
<tr>
<th>Detail</th>
<th>Value</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>Model</strong></td>
<td>EmbeddingGemma-300m</td>
</tr>
<tr>
<td><strong>Endpoint</strong></td>
<td><code>POST /api/v1/embeddings</code> on port 11436</td>
</tr>
<tr>
<td><strong>Dimension</strong></td>
<td>768</td>
</tr>
<tr>
<td><strong>Storage</strong></td>
<td>pgvector (<code>chunk.embedding</code> column)</td>
</tr>
</tbody>
</table>
</div>
</body>
</html>

View File

@@ -0,0 +1,510 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>07 Identity - Momentry API Docs</title>
<style>
* { margin: 0; padding: 0; box-sizing: border-box; }
body { font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif; background: #f5f5f5; color: #333; padding: 40px; }
.container { max-width: 960px; margin: 0 auto; background: white; border-radius: 12px; box-shadow: 0 2px 12px rgba(0,0,0,0.08); padding: 40px; }
h1 { font-size: 24px; margin: 24px 0 12px; }
h2 { font-size: 20px; margin: 20px 0 10px; color: #222; }
h3 { font-size: 16px; margin: 16px 0 8px; color: #444; }
p { line-height: 1.6; margin: 8px 0; }
table { border-collapse: collapse; width: 100%; margin: 12px 0; font-size: 14px; }
th, td { border: 1px solid #ddd; padding: 8px 12px; text-align: left; }
th { background: #f0f0f0; font-weight: 600; }
code { background: #f0f0f0; padding: 2px 6px; border-radius: 3px; font-size: 13px; }
pre { background: #f8f8f8; border: 1px solid #ddd; border-radius: 6px; padding: 12px; overflow-x: auto; margin: 12px 0; }
pre code { background: none; padding: 0; }
a { color: #0066cc; }
.back { display: inline-block; margin-bottom: 20px; color: #666; }
.back:hover { color: #333; }
</style>
</head>
<body>
<div class="container">
<a class="back" href="index.html">&larr; Back to index</a>
<!-- module: identity -->
<!-- description: Global identities — CRUD, detail, files, faces, bind, unbind, search -->
<!-- depends: 01_auth -->
<h2>Global Identities</h2>
<h3><code>GET /api/v1/identities</code></h3>
<p><strong>Auth</strong>: Required
<strong>Scope</strong>: identity-level</p>
<p>List all registered identities with pagination.</p>
<h4>Example</h4>
<div class="codehilite"><pre><span></span><code>curl<span class="w"> </span>-s<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/identities?page=1&amp;page_size=20&quot;</span><span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;X-API-Key: </span><span class="nv">$KEY</span><span class="s2">&quot;</span><span class="w"> </span><span class="p">|</span><span class="w"> </span>jq<span class="w"> </span><span class="s1">&#39;{count, identities: [.identities[] | {name}]}&#39;</span>
</code></pre></div>
<hr />
<h3><code>GET /api/v1/identity/:identity_uuid</code></h3>
<p><strong>Auth</strong>: Required
<strong>Scope</strong>: identity-level</p>
<p>Get detailed information for a specific identity, including metadata and TMDb references.</p>
<h4>Example</h4>
<div class="codehilite"><pre><span></span><code>curl<span class="w"> </span>-s<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/identity/</span><span class="nv">$IDENTITY_UUID</span><span class="s2">&quot;</span><span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;X-API-Key: </span><span class="nv">$KEY</span><span class="s2">&quot;</span>
</code></pre></div>
<h4>Response (200)</h4>
<div class="codehilite"><pre><span></span><code><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;success&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;identity_uuid&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;a9a901056d6b46ff92da0c3c1a57dff4&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;name&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;Cary Grant&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;identity_type&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;people&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;source&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;tmdb&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;status&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;confirmed&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;tmdb_id&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">112</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;tmdb_profile&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;{output}/identities/{identity_uuid}/profile.jpg&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;metadata&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">{},</span>
<span class="w"> </span><span class="nt">&quot;reference_data&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">{},</span>
<span class="w"> </span><span class="nt">&quot;created_at&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;2026-05-16T12:00:00Z&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;updated_at&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">null</span>
<span class="p">}</span>
</code></pre></div>
<table class="table">
<thead>
<tr>
<th>Field</th>
<th>Type</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>identity_uuid</code></td>
<td>string</td>
<td>Identity identifier</td>
</tr>
<tr>
<td><code>name</code></td>
<td>string</td>
<td>Identity name</td>
</tr>
<tr>
<td><code>identity_type</code></td>
<td>string</td>
<td><code>"people"</code> or null</td>
</tr>
<tr>
<td><code>source</code></td>
<td>string</td>
<td><code>.json</code>, <code>auto</code>, <code>tmdb</code>, <code>user_defined</code>, or <code>merged</code></td>
</tr>
<tr>
<td><code>status</code></td>
<td>string</td>
<td><code>"confirmed"</code>, <code>"pending"</code>, or <code>"inactive"</code></td>
</tr>
<tr>
<td><code>tmdb_id</code></td>
<td>integer</td>
<td>TMDb person ID (only if source = tmdb)</td>
</tr>
<tr>
<td><code>tmdb_profile</code></td>
<td>string</td>
<td>Local profile image path (<code>{output}/identities/{uuid}/profile.jpg</code>)</td>
</tr>
<tr>
<td><code>metadata</code></td>
<td>object</td>
<td>Metadata JSON (tmdb_character, cast_order, etc.)</td>
</tr>
<tr>
<td><code>created_at</code></td>
<td>string</td>
<td>Creation timestamp</td>
</tr>
</tbody>
</table>
<hr />
<h3><code>DELETE /api/v1/identity/:identity_uuid</code></h3>
<p><strong>Auth</strong>: Required
<strong>Scope</strong>: identity-level</p>
<p>Delete an identity permanently.</p>
<hr />
<h3><code>GET /api/v1/identity/:identity_uuid/files</code></h3>
<p><strong>Auth</strong>: Required
<strong>Scope</strong>: identity-level</p>
<p>Get all files where this identity appears. Returns per-file summary including face count, confidence, and appearance time range.</p>
<h4>Example</h4>
<div class="codehilite"><pre><span></span><code>curl<span class="w"> </span>-s<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/identity/</span><span class="nv">$IDENTITY_UUID</span><span class="s2">/files&quot;</span><span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;X-API-Key: </span><span class="nv">$KEY</span><span class="s2">&quot;</span>
</code></pre></div>
<hr />
<h3><code>GET /api/v1/identity/:identity_uuid/faces</code></h3>
<p><strong>Auth</strong>: Required
<strong>Scope</strong>: identity-level</p>
<p>Get all face detection records associated with this identity.</p>
<h4>Example</h4>
<div class="codehilite"><pre><span></span><code>curl<span class="w"> </span>-s<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/identity/</span><span class="nv">$IDENTITY_UUID</span><span class="s2">/faces&quot;</span><span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;X-API-Key: </span><span class="nv">$KEY</span><span class="s2">&quot;</span>
</code></pre></div>
<table class="table">
<thead>
<tr>
<th>Field</th>
<th>Type</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>file_uuid</code></td>
<td>string</td>
<td>File where face was detected</td>
</tr>
<tr>
<td><code>frame_number</code></td>
<td>integer</td>
<td>Frame number of detection</td>
</tr>
<tr>
<td><code>face_id</code></td>
<td>string</td>
<td>Face ID (format: <code>face_{frame_number}</code>)</td>
</tr>
<tr>
<td><code>confidence</code></td>
<td>float</td>
<td>Detection confidence</td>
</tr>
</tbody>
</table>
<hr />
<h3><code>GET /api/v1/identity/:identity_uuid/chunks</code></h3>
<p><strong>Auth</strong>: Required
<strong>Scope</strong>: identity-level</p>
<p>Get all text chunks (sentences) spoken while this identity's face was on screen. Useful for finding what a person said.</p>
<h4>Example</h4>
<div class="codehilite"><pre><span></span><code>curl<span class="w"> </span>-s<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/identity/</span><span class="nv">$IDENTITY_UUID</span><span class="s2">/chunks&quot;</span><span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;X-API-Key: </span><span class="nv">$KEY</span><span class="s2">&quot;</span>
</code></pre></div>
<h4>Response (200)</h4>
<div class="codehilite"><pre><span></span><code><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;success&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;identity_uuid&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;a9a901056d6b46ff92da0c3c1a57dff4&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;data&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">[</span>
<span class="w"> </span><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;id&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">0</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;file_uuid&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;bd80fec92b0b6963d177a2c55bf713e2&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;chunk_id&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;bd80fec92b0b6963d177a2c55bf713e2_2&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;chunk_type&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;sentence&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;start_frame&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">5103</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;end_frame&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">5127</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;fps&quot;</span><span class="p">:</span><span class="w"> </span><span class="mf">24.0</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;start_time&quot;</span><span class="p">:</span><span class="w"> </span><span class="mf">212.64</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;end_time&quot;</span><span class="p">:</span><span class="w"> </span><span class="mf">213.64</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;text_content&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;[213s-214s] Cary Grant: \&quot;Olá!\&quot;&quot;</span>
<span class="w"> </span><span class="p">}</span>
<span class="w"> </span><span class="p">]</span>
<span class="p">}</span>
</code></pre></div>
<table class="table">
<thead>
<tr>
<th>Field</th>
<th>Type</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>file_uuid</code></td>
<td>string</td>
<td>File identifier</td>
</tr>
<tr>
<td><code>chunk_id</code></td>
<td>string</td>
<td>Sentence chunk identifier</td>
</tr>
<tr>
<td><code>start_frame</code></td>
<td>integer</td>
<td>Frame-accurate start position</td>
</tr>
<tr>
<td><code>end_frame</code></td>
<td>integer</td>
<td>Frame-accurate end position</td>
</tr>
<tr>
<td><code>fps</code></td>
<td>float</td>
<td>Frames per second</td>
</tr>
<tr>
<td><code>start_time</code></td>
<td>float</td>
<td>Start time in seconds</td>
</tr>
<tr>
<td><code>end_time</code></td>
<td>float</td>
<td>End time in seconds</td>
</tr>
<tr>
<td><code>text_content</code></td>
<td>string</td>
<td>Spoken text content</td>
</tr>
</tbody>
</table>
<hr />
<h3><code>POST /api/v1/identity/:identity_uuid/bind</code></h3>
<p><strong>Auth</strong>: Required
<strong>Scope</strong>: identity-level</p>
<p>Bind a face detection to an identity. Associates the face trace with the identity for future search and recognition.</p>
<h4>Request Parameters</h4>
<table class="table">
<thead>
<tr>
<th>Field</th>
<th>Type</th>
<th>Required</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>file_uuid</code></td>
<td>string</td>
<td>Yes</td>
<td>File where face is detected</td>
</tr>
<tr>
<td><code>face_id</code></td>
<td>string</td>
<td>Yes</td>
<td>Face ID (format: <code>{frame}_{idx}</code>)</td>
</tr>
</tbody>
</table>
<h4>Example</h4>
<div class="codehilite"><pre><span></span><code>curl<span class="w"> </span>-s<span class="w"> </span>-X<span class="w"> </span>POST<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/identity/</span><span class="nv">$IDENTITY_UUID</span><span class="s2">/bind&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;X-API-Key: </span><span class="nv">$KEY</span><span class="s2">&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;Content-Type: application/json&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-d<span class="w"> </span><span class="s1">&#39;{&quot;file_uuid&quot;: &quot;&#39;</span><span class="s2">&quot;</span><span class="nv">$FILE_UUID</span><span class="s2">&quot;</span><span class="s1">&#39;&quot;, &quot;face_id&quot;: &quot;1_5&quot;}&#39;</span>
</code></pre></div>
<hr />
<h3><code>POST /api/v1/identity/:identity_uuid/unbind</code></h3>
<p><strong>Auth</strong>: Required
<strong>Scope</strong>: identity-level</p>
<p>Unbind a face detection from an identity. Removes the identity association from the face record.</p>
<hr />
<h3><code>GET /api/v1/identities/search</code></h3>
<p><strong>Auth</strong>: Required
<strong>Scope</strong>: identity-level</p>
<p>Search identities by name (ILIKE search). Returns matching identity records.</p>
<h4>Example</h4>
<div class="codehilite"><pre><span></span><code>curl<span class="w"> </span>-s<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/identities/search?q=Cary&quot;</span><span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;X-API-Key: </span><span class="nv">$KEY</span><span class="s2">&quot;</span>
</code></pre></div>
<table class="table">
<thead>
<tr>
<th>Field</th>
<th>Type</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>name</code></td>
<td>string</td>
<td>Identity name</td>
</tr>
<tr>
<td><code>source</code></td>
<td>string</td>
<td>Identity source</td>
</tr>
<tr>
<td><code>tmdb_id</code></td>
<td>integer</td>
<td>TMDb ID (if source = tmdb)</td>
</tr>
<tr>
<td><code>file_uuid</code></td>
<td>string</td>
<td>Associated file</td>
</tr>
</tbody>
</table>
<hr />
<hr />
<h3><code>POST /api/v1/identity/upload</code></h3>
<p><strong>Auth</strong>: Required
<strong>Scope</strong>: identity-level</p>
<p>Upload an identity.json file to create or update an identity. Accepts the same format as the identity.json files stored on disk.</p>
<p>If an identity with the same <code>name</code> already exists, it will be updated with the new values.</p>
<h4>Request</h4>
<p>The request body is an <code>IdentityFile</code> object:</p>
<table class="table">
<thead>
<tr>
<th>Field</th>
<th>Type</th>
<th>Required</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>identity_uuid</code></td>
<td>string</td>
<td>Yes</td>
<td>Identity identifier</td>
</tr>
<tr>
<td><code>name</code></td>
<td>string</td>
<td>Yes</td>
<td>Identity display name</td>
</tr>
<tr>
<td><code>identity_type</code></td>
<td>string</td>
<td>No</td>
<td><code>"people"</code> or null</td>
</tr>
<tr>
<td><code>source</code></td>
<td>string</td>
<td>No</td>
<td><code>.json</code>, <code>auto</code>, <code>tmdb</code>, <code>user_defined</code>, or <code>merged</code></td>
</tr>
<tr>
<td><code>status</code></td>
<td>string</td>
<td>No</td>
<td><code>"confirmed"</code>, <code>"pending"</code>, or <code>"inactive"</code></td>
</tr>
<tr>
<td><code>tmdb_id</code></td>
<td>integer</td>
<td>No</td>
<td>TMDb person ID</td>
</tr>
<tr>
<td><code>tmdb_profile</code></td>
<td>string</td>
<td>No</td>
<td>TMDb profile image URL</td>
</tr>
<tr>
<td><code>metadata</code></td>
<td>object</td>
<td>No</td>
<td>Arbitrary metadata JSON</td>
</tr>
<tr>
<td><code>file_bindings</code></td>
<td>array</td>
<td>No</td>
<td>Array of <code>{ file_uuid, trace_ids, face_count }</code> (informational)</td>
</tr>
</tbody>
</table>
<h4>Example</h4>
<div class="codehilite"><pre><span></span><code>curl<span class="w"> </span>-s<span class="w"> </span>-X<span class="w"> </span>POST<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/identity/upload&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;X-API-Key: </span><span class="nv">$KEY</span><span class="s2">&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;Content-Type: application/json&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-d<span class="w"> </span><span class="s1">&#39;{</span>
<span class="s1"> &quot;version&quot;: 1,</span>
<span class="s1"> &quot;identity_uuid&quot;: &quot;a9a901056d6b46ff92da0c3c1a57dff4&quot;,</span>
<span class="s1"> &quot;name&quot;: &quot;Cary Grant&quot;,</span>
<span class="s1"> &quot;identity_type&quot;: &quot;people&quot;,</span>
<span class="s1"> &quot;source&quot;: &quot;.json&quot;,</span>
<span class="s1"> &quot;status&quot;: &quot;confirmed&quot;,</span>
<span class="s1"> &quot;metadata&quot;: {},</span>
<span class="s1"> &quot;file_bindings&quot;: []</span>
<span class="s1"> }&#39;</span>
</code></pre></div>
<h4>Response (200)</h4>
<div class="codehilite"><pre><span></span><code><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;success&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;identity_uuid&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;a9a901056d6b46ff92da0c3c1a57dff4&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;name&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;Cary Grant&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;message&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;Identity uploaded successfully&quot;</span>
<span class="p">}</span>
</code></pre></div>
<hr />
<hr />
<h3><code>POST /api/v1/identity/:identity_uuid/profile-image</code></h3>
<p><strong>Auth</strong>: Required
<strong>Scope</strong>: identity-level</p>
<p>Upload a profile image (JPEG or PNG) for an identity. The image is saved to <code>{output}/identities/{uuid}/profile.{ext}</code>.</p>
<p>Uses <code>multipart/form-data</code> with field name <code>image</code>.</p>
<h4>Example</h4>
<div class="codehilite"><pre><span></span><code>curl<span class="w"> </span>-s<span class="w"> </span>-X<span class="w"> </span>POST<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/identity/</span><span class="nv">$IDENTITY_UUID</span><span class="s2">/profile-image&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;X-API-Key: </span><span class="nv">$KEY</span><span class="s2">&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-F<span class="w"> </span><span class="s2">&quot;image=@/path/to/photo.jpg&quot;</span>
</code></pre></div>
<h4>Response (200)</h4>
<div class="codehilite"><pre><span></span><code><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;success&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;identity_uuid&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;a9a901056d6b46ff92da0c3c1a57dff4&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;path&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;/path/to/output/identities/.../profile.jpg&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;message&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;Profile image saved: profile.jpg&quot;</span>
<span class="p">}</span>
</code></pre></div>
<h4>Error Responses</h4>
<table class="table">
<thead>
<tr>
<th>HTTP</th>
<th>When</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>400</code></td>
<td>Missing image field or unsupported format</td>
</tr>
<tr>
<td><code>404</code></td>
<td>Identity not found</td>
</tr>
<tr>
<td><code>415</code></td>
<td>Unsupported image type (use JPEG or PNG)</td>
</tr>
</tbody>
</table>
<hr />
<h3><code>GET /api/v1/identity/:identity_uuid/profile-image</code></h3>
<p><strong>Auth</strong>: Required
<strong>Scope</strong>: identity-level</p>
<p>Retrieve the profile image for an identity. Returns the raw image data with appropriate Content-Type header.</p>
<div class="codehilite"><pre><span></span><code>curl<span class="w"> </span>-s<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/identity/</span><span class="nv">$IDENTITY_UUID</span><span class="s2">/profile-image&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;X-API-Key: </span><span class="nv">$KEY</span><span class="s2">&quot;</span><span class="w"> </span>-o<span class="w"> </span>profile.jpg
</code></pre></div>
<table class="table">
<thead>
<tr>
<th>Response Header</th>
<th>Value</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>content-type</code></td>
<td><code>image/jpeg</code> or <code>image/png</code></td>
</tr>
</tbody>
</table>
</div>
</body>
</html>

View File

@@ -0,0 +1,97 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>08 Identity Agent - Momentry API Docs</title>
<style>
* { margin: 0; padding: 0; box-sizing: border-box; }
body { font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif; background: #f5f5f5; color: #333; padding: 40px; }
.container { max-width: 960px; margin: 0 auto; background: white; border-radius: 12px; box-shadow: 0 2px 12px rgba(0,0,0,0.08); padding: 40px; }
h1 { font-size: 24px; margin: 24px 0 12px; }
h2 { font-size: 20px; margin: 20px 0 10px; color: #222; }
h3 { font-size: 16px; margin: 16px 0 8px; color: #444; }
p { line-height: 1.6; margin: 8px 0; }
table { border-collapse: collapse; width: 100%; margin: 12px 0; font-size: 14px; }
th, td { border: 1px solid #ddd; padding: 8px 12px; text-align: left; }
th { background: #f0f0f0; font-weight: 600; }
code { background: #f0f0f0; padding: 2px 6px; border-radius: 3px; font-size: 13px; }
pre { background: #f8f8f8; border: 1px solid #ddd; border-radius: 6px; padding: 12px; overflow-x: auto; margin: 12px 0; }
pre code { background: none; padding: 0; }
a { color: #0066cc; }
.back { display: inline-block; margin-bottom: 20px; color: #666; }
.back:hover { color: #333; }
</style>
</head>
<body>
<div class="container">
<a class="back" href="index.html">&larr; Back to index</a>
<!-- module: identity_agent -->
<!-- description: Identity agent — match from photo, match from trace -->
<!-- depends: 01_auth, 07_identity -->
<h2>Identity Agent</h2>
<h3><code>POST /api/v1/agents/identity/match-from-photo</code></h3>
<p><strong>Auth</strong>: Required
<strong>Scope</strong>: file-level</p>
<p>Upload a face photo to match against known identities. Detects face via InsightFace, extracts 512D embedding via CoreML FaceNet, then searches pgvector for the closest identity.</p>
<h4>Request</h4>
<p><code>multipart/form-data</code> with field <code>image</code> (JPEG/PNG) and optional <code>file_uuid</code>.</p>
<h4>Example</h4>
<div class="codehilite"><pre><span></span><code>curl<span class="w"> </span>-s<span class="w"> </span>-X<span class="w"> </span>POST<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/agents/identity/match-from-photo&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;Authorization: Bearer </span><span class="nv">$JWT</span><span class="s2">&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-F<span class="w"> </span><span class="s2">&quot;image=@/path/to/face.jpg&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-F<span class="w"> </span><span class="s2">&quot;file_uuid=</span><span class="nv">$FILE_UUID</span><span class="s2">&quot;</span>
</code></pre></div>
<h4>Response (200)</h4>
<div class="codehilite"><pre><span></span><code><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;success&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;matches&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">[</span>
<span class="w"> </span><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;identity_uuid&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;a9a90105...&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;name&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;Cary Grant&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;similarity&quot;</span><span class="p">:</span><span class="w"> </span><span class="mf">0.87</span>
<span class="w"> </span><span class="p">}</span>
<span class="w"> </span><span class="p">]</span>
<span class="p">}</span>
</code></pre></div>
<hr />
<h3><code>POST /api/v1/agents/identity/match-from-trace</code></h3>
<p><strong>Auth</strong>: Required
<strong>Scope</strong>: file-level</p>
<p>Match a face trace (tracked face across frames) against known identities. Samples 3 angles from the trace, generates embeddings, and searches pgvector.</p>
<h4>Request Parameters</h4>
<table class="table">
<thead>
<tr>
<th>Field</th>
<th>Type</th>
<th>Required</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>file_uuid</code></td>
<td>string</td>
<td>Yes</td>
<td>File containing the trace</td>
</tr>
<tr>
<td><code>trace_id</code></td>
<td>integer</td>
<td>Yes</td>
<td>Face trace ID to match</td>
</tr>
</tbody>
</table>
<h4>Example</h4>
<div class="codehilite"><pre><span></span><code>curl<span class="w"> </span>-s<span class="w"> </span>-X<span class="w"> </span>POST<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/agents/identity/match-from-trace&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;Authorization: Bearer </span><span class="nv">$JWT</span><span class="s2">&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;Content-Type: application/json&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-d<span class="w"> </span><span class="s1">&#39;{&quot;file_uuid&quot;: &quot;&#39;</span><span class="s2">&quot;</span><span class="nv">$FILE_UUID</span><span class="s2">&quot;</span><span class="s1">&#39;&quot;, &quot;trace_id&quot;: 10}&#39;</span>
</code></pre></div>
</div>
</body>
</html>

View File

@@ -0,0 +1,303 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>08 Media - Momentry API Docs</title>
<style>
* { margin: 0; padding: 0; box-sizing: border-box; }
body { font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif; background: #f5f5f5; color: #333; padding: 40px; }
.container { max-width: 960px; margin: 0 auto; background: white; border-radius: 12px; box-shadow: 0 2px 12px rgba(0,0,0,0.08); padding: 40px; }
h1 { font-size: 24px; margin: 24px 0 12px; }
h2 { font-size: 20px; margin: 20px 0 10px; color: #222; }
h3 { font-size: 16px; margin: 16px 0 8px; color: #444; }
p { line-height: 1.6; margin: 8px 0; }
table { border-collapse: collapse; width: 100%; margin: 12px 0; font-size: 14px; }
th, td { border: 1px solid #ddd; padding: 8px 12px; text-align: left; }
th { background: #f0f0f0; font-weight: 600; }
code { background: #f0f0f0; padding: 2px 6px; border-radius: 3px; font-size: 13px; }
pre { background: #f8f8f8; border: 1px solid #ddd; border-radius: 6px; padding: 12px; overflow-x: auto; margin: 12px 0; }
pre code { background: none; padding: 0; }
a { color: #0066cc; }
.back { display: inline-block; margin-bottom: 20px; color: #666; }
.back:hover { color: #333; }
</style>
</head>
<body>
<div class="container">
<a class="back" href="index.html">&larr; Back to index</a>
<!-- module: media -->
<!-- description: Video streaming & frame extraction -->
<!-- depends: 01_auth -->
<h2>Video Streaming &amp; Frame Extraction</h2>
<p>All video streaming endpoints support the following common query parameters:</p>
<table class="table">
<thead>
<tr>
<th>Field</th>
<th>Type</th>
<th>Required</th>
<th>Default</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>mode</code></td>
<td>string</td>
<td>No</td>
<td><code>normal</code></td>
<td><code>normal</code> or <code>debug</code> (draws detection overlays)</td>
</tr>
<tr>
<td><code>audio</code></td>
<td>string</td>
<td>No</td>
<td><code>on</code></td>
<td><code>on</code> or <code>off</code></td>
</tr>
</tbody>
</table>
<hr />
<h3><code>GET /api/v1/file/:file_uuid/video</code></h3>
<p>Stream the full video file with range support for seeking.</p>
<p><strong>Auth</strong>: Required
<strong>Scope</strong>: file-level</p>
<h4>Response</h4>
<ul>
<li><strong>200</strong>: Video stream (<code>Content-Type</code> based on file extension)</li>
<li><strong>206</strong>: Partial content (range request)</li>
<li>Supports <code>Range</code> header for seeking</li>
</ul>
<hr />
<h3><code>GET /api/v1/file/:file_uuid/trace/:trace_id/video</code></h3>
<p>Stream video with highlights for a specific face trace (follows a single person across frames with bounding box overlay).</p>
<p><strong>Auth</strong>: Required
<strong>Scope</strong>: file-level</p>
<hr />
<h3><code>GET /api/v1/file/:file_uuid/video/bbox</code></h3>
<p>Stream video with bounding box overlay for all detected objects/faces.</p>
<p><strong>Auth</strong>: Required
<strong>Scope</strong>: file-level</p>
<p>Uses a built-in 5×7 bitmap font renderer to draw labels directly on video frames via FFmpeg <code>drawtext</code> filter.</p>
<hr />
<h3><code>GET /api/v1/file/:file_uuid/thumbnail</code></h3>
<p>Extract a single frame from a video as JPEG image. Uses FFmpeg <code>select</code> filter.</p>
<p><strong>Auth</strong>: Required
<strong>Scope</strong>: file-level</p>
<h4>Query Parameters</h4>
<table class="table">
<thead>
<tr>
<th>Field</th>
<th>Type</th>
<th>Required</th>
<th>Default</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>frame</code></td>
<td>integer</td>
<td>Yes</td>
<td></td>
<td>Zero-based frame number to extract</td>
</tr>
<tr>
<td><code>x</code></td>
<td>integer</td>
<td>No</td>
<td></td>
<td>Crop start X (left edge). Requires <code>y</code>, <code>w</code>, <code>h</code>.</td>
</tr>
<tr>
<td><code>y</code></td>
<td>integer</td>
<td>No</td>
<td></td>
<td>Crop start Y (top edge). Requires <code>x</code>, <code>w</code>, <code>h</code>.</td>
</tr>
<tr>
<td><code>w</code></td>
<td>integer</td>
<td>No</td>
<td></td>
<td>Crop width in pixels. Requires <code>x</code>, <code>y</code>, <code>h</code>.</td>
</tr>
<tr>
<td><code>h</code></td>
<td>integer</td>
<td>No</td>
<td></td>
<td>Crop height in pixels. Requires <code>x</code>, <code>y</code>, <code>w</code>.</td>
</tr>
</tbody>
</table>
<p>All four crop params (<code>x</code>, <code>y</code>, <code>w</code>, <code>h</code>) must be provided together or omitted.</p>
<h4>Example</h4>
<div class="codehilite"><pre><span></span><code><span class="c1"># Extract frame 1000 (full frame)</span>
curl<span class="w"> </span>-s<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/file/bd80fec92b0b6963d177a2c55bf713e2/thumbnail?frame=1000&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;Authorization: Bearer </span><span class="nv">$JWT</span><span class="s2">&quot;</span><span class="w"> </span>-o<span class="w"> </span>frame_1000.jpg
<span class="c1"># Extract and crop face region (x=320, y=240, w=160, h=160)</span>
curl<span class="w"> </span>-s<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/file/bd80fec92b0b6963d177a2c55bf713e2/thumbnail?frame=1000&amp;x=320&amp;y=240&amp;w=160&amp;h=160&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;Authorization: Bearer </span><span class="nv">$JWT</span><span class="s2">&quot;</span><span class="w"> </span>-o<span class="w"> </span>face_crop.jpg
</code></pre></div>
<h4>Response</h4>
<ul>
<li><strong>200</strong>: <code>image/jpeg</code> binary data</li>
<li><strong>404</strong>: File not found</li>
<li><strong>500</strong>: FFmpeg error (e.g., frame number exceeds video duration)</li>
</ul>
<h3><code>GET /api/v1/file/:file_uuid/clip</code></h3>
<p>Extract a video clip (time range) as MPEG-TS stream. Uses FFmpeg <code>-ss</code> fast seek.</p>
<p><strong>Auth</strong>: Required
<strong>Scope</strong>: file-level</p>
<h4>Query Parameters</h4>
<table class="table">
<thead>
<tr>
<th>Field</th>
<th>Type</th>
<th>Required</th>
<th>Default</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>start_frame</code></td>
<td>integer</td>
<td>No*</td>
<td></td>
<td>Start frame (zero-based). <strong>Frame-accurate</strong> — use this for precision.</td>
</tr>
<tr>
<td><code>end_frame</code></td>
<td>integer</td>
<td>No*</td>
<td></td>
<td>End frame (zero-based, inclusive). Requires <code>start_frame</code>.</td>
</tr>
<tr>
<td><code>start_time</code></td>
<td>float</td>
<td>No*</td>
<td></td>
<td>Start time in seconds. Approximate (FPS-dependent). Fallback if frames not given.</td>
</tr>
<tr>
<td><code>end_time</code></td>
<td>float</td>
<td>No*</td>
<td></td>
<td>End time in seconds. Approximate (FPS-dependent). Fallback if frames not given.</td>
</tr>
<tr>
<td><code>fps</code></td>
<td>float</td>
<td>No</td>
<td>video FPS</td>
<td>Override frames-per-second for frame↔time calculation. Defaults to video's detected FPS.</td>
</tr>
<tr>
<td><code>mode</code></td>
<td>string</td>
<td>No</td>
<td><code>normal</code></td>
<td><code>normal</code> or <code>debug</code> (draws "CLIP" overlay)</td>
</tr>
<tr>
<td><code>audio</code></td>
<td>string</td>
<td>No</td>
<td><code>on</code></td>
<td><code>on</code> or <code>off</code></td>
</tr>
</tbody>
</table>
<p>Either (<code>start_frame</code>+<code>end_frame</code>) OR (<code>start_time</code>+<code>end_time</code>) must be provided.</p>
<h4>Example</h4>
<div class="codehilite"><pre><span></span><code><span class="c1"># Clip by frame range (primary)</span>
curl<span class="w"> </span>-s<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/file/bd80fec92b0b6963d177a2c55bf713e2/clip?start_frame=0&amp;end_frame=47&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;Authorization: Bearer </span><span class="nv">$JWT</span><span class="s2">&quot;</span><span class="w"> </span>-o<span class="w"> </span>clip.ts
<span class="c1"># Clip by time range (fallback)</span>
curl<span class="w"> </span>-s<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/file/bd80fec92b0b6963d177a2c55bf713e2/clip?start_time=30&amp;end_time=45&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;Authorization: Bearer </span><span class="nv">$JWT</span><span class="s2">&quot;</span><span class="w"> </span>-o<span class="w"> </span>clip.ts
</code></pre></div>
<h4>Response</h4>
<ul>
<li><strong>200</strong>: <code>video/mp2t</code> MPEG-TS stream</li>
<li><strong>400</strong>: Missing/invalid range parameters</li>
<li><strong>404</strong>: File not found</li>
<li><strong>500</strong>: FFmpeg error</li>
</ul>
<h4>Technical Notes</h4>
<table class="table">
<thead>
<tr>
<th>Detail</th>
<th>Value</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>Backend</strong></td>
<td>FFmpeg (<code>ffmpeg-full</code>)</td>
</tr>
<tr>
<td><strong>Seek</strong></td>
<td><code>-ss</code> before <code>-i</code> (fast keyframe seek)</td>
</tr>
<tr>
<td><strong>Format</strong></td>
<td>MPEG-TS (<code>mpegts</code> muxer, pipe-safe)</td>
</tr>
<tr>
<td><strong>Codec</strong></td>
<td>H.264 + AAC</td>
</tr>
<tr>
<td><strong>Cache</strong></td>
<td><code>Cache-Control: public, max-age=86400</code> (24h)</td>
</tr>
</tbody>
</table>
<hr />
<table class="table">
<thead>
<tr>
<th>Detail</th>
<th>Value</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>Backend</strong></td>
<td>FFmpeg (<code>ffmpeg-full</code>)</td>
</tr>
<tr>
<td><strong>Filter</strong></td>
<td><code>select=eq(n\,FRAME)</code> to select frame, optional <code>crop=W:H:X:Y</code></td>
</tr>
<tr>
<td><strong>Output</strong></td>
<td>Single JPEG via pipe (<code>image2pipe</code>, <code>mjpeg</code> codec)</td>
</tr>
<tr>
<td><strong>Cache</strong></td>
<td><code>Cache-Control: public, max-age=86400</code> (24h)</td>
</tr>
<tr>
<td><strong>Frame number</strong></td>
<td>Zero-based (<code>frame=0</code> = first frame of video)</td>
</tr>
</tbody>
</table>
</div>
</body>
</html>

View File

@@ -0,0 +1,123 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>09 Tmdb - Momentry API Docs</title>
<style>
* { margin: 0; padding: 0; box-sizing: border-box; }
body { font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif; background: #f5f5f5; color: #333; padding: 40px; }
.container { max-width: 960px; margin: 0 auto; background: white; border-radius: 12px; box-shadow: 0 2px 12px rgba(0,0,0,0.08); padding: 40px; }
h1 { font-size: 24px; margin: 24px 0 12px; }
h2 { font-size: 20px; margin: 20px 0 10px; color: #222; }
h3 { font-size: 16px; margin: 16px 0 8px; color: #444; }
p { line-height: 1.6; margin: 8px 0; }
table { border-collapse: collapse; width: 100%; margin: 12px 0; font-size: 14px; }
th, td { border: 1px solid #ddd; padding: 8px 12px; text-align: left; }
th { background: #f0f0f0; font-weight: 600; }
code { background: #f0f0f0; padding: 2px 6px; border-radius: 3px; font-size: 13px; }
pre { background: #f8f8f8; border: 1px solid #ddd; border-radius: 6px; padding: 12px; overflow-x: auto; margin: 12px 0; }
pre code { background: none; padding: 0; }
a { color: #0066cc; }
.back { display: inline-block; margin-bottom: 20px; color: #666; }
.back:hover { color: #333; }
</style>
</head>
<body>
<div class="container">
<a class="back" href="index.html">&larr; Back to index</a>
<!-- module: tmdb -->
<!-- description: TMDb enrichment endpoints — prefetch, probe, resource, check -->
<!-- depends: 01_auth, 03_register -->
<h2>TMDb Enrichment</h2>
<blockquote>
<p><strong>Offline operation</strong>: TMDb prefetch now checks local identity files first (<code>identities/_index.json</code> + <code>*.tmdb.json</code>).
If local files exist, no external API call is made. Internet is only needed for initial data seeding.</p>
</blockquote>
<h3>Overview</h3>
<p>TMDb enrichment is an optional identity enrichment step that can be run after Pipeline face detection completes. The workflow is:</p>
<ol>
<li><strong>Prefetch</strong> (requires internet): Download movie cast data from TMDb API → cache to <code>{file_uuid}.tmdb.json</code></li>
<li><strong>Probe</strong>: Read local cache → create identities for <strong>all</strong> cast members (<code>source='tmdb'</code>) + save <code>identity.json</code> + download profile image to <code>{OUTPUT}/identities/{uuid}/profile.jpg</code></li>
<li><strong>Match</strong>: The worker automatically matches video faces against TMDb identities when <code>MOMENTRY_TMDB_PROBE_ENABLED=true</code></li>
</ol>
<h3><code>POST /api/v1/agents/tmdb/prefetch</code></h3>
<p><strong>Auth</strong>: Required
<strong>Scope</strong>: file-level</p>
<p>Fetch TMDb cast data for a registered file and cache it locally. This is the only step requiring internet access.</p>
<h4>Request Parameters</h4>
<table class="table">
<thead>
<tr>
<th>Field</th>
<th>Type</th>
<th>Required</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>file_uuid</code></td>
<td>string</td>
<td>Yes</td>
<td>File UUID to enrich</td>
</tr>
</tbody>
</table>
<h4>Example</h4>
<div class="codehilite"><pre><span></span><code>curl<span class="w"> </span>-s<span class="w"> </span>-X<span class="w"> </span>POST<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/agents/tmdb/prefetch&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;Content-Type: application/json&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;X-API-Key: </span><span class="nv">$KEY</span><span class="s2">&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-d<span class="w"> </span><span class="s1">&#39;{&quot;file_uuid&quot;: &quot;&#39;</span><span class="s2">&quot;</span><span class="nv">$FILE_UUID</span><span class="s2">&quot;</span><span class="s1">&#39;&quot;}&#39;</span>
</code></pre></div>
<h4>Response (200)</h4>
<div class="codehilite"><pre><span></span><code><span class="p">{</span><span class="nt">&quot;success&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="p">,</span><span class="w"> </span><span class="nt">&quot;file_uuid&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;...&quot;</span><span class="p">,</span><span class="w"> </span><span class="nt">&quot;cache_path&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;/output/...tmdb.json&quot;</span><span class="p">}</span>
</code></pre></div>
<h3><code>POST /api/v1/file/:file_uuid/tmdb-probe</code></h3>
<p><strong>Auth</strong>: Required
<strong>Scope</strong>: file-level</p>
<p>Read local TMDb cache and create/update identities. Requires prefetch to have been run first.</p>
<h4>Example</h4>
<div class="codehilite"><pre><span></span><code>curl<span class="w"> </span>-s<span class="w"> </span>-X<span class="w"> </span>POST<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/file/</span><span class="nv">$FILE_UUID</span><span class="s2">/tmdb-probe&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;X-API-Key: </span><span class="nv">$KEY</span><span class="s2">&quot;</span><span class="w"> </span><span class="p">|</span><span class="w"> </span>jq<span class="w"> </span><span class="s1">&#39;{identities_created, movie_title}&#39;</span>
</code></pre></div>
<h4>Response (200 — identities created)</h4>
<div class="codehilite"><pre><span></span><code><span class="p">{</span><span class="nt">&quot;success&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="p">,</span><span class="w"> </span><span class="nt">&quot;identities_created&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">15</span><span class="p">,</span><span class="w"> </span><span class="nt">&quot;movie_title&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;Charade&quot;</span><span class="p">}</span>
</code></pre></div>
<h4>Response (200 — no cache)</h4>
<div class="codehilite"><pre><span></span><code><span class="p">{</span><span class="nt">&quot;success&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">false</span><span class="p">,</span><span class="w"> </span><span class="nt">&quot;message&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;No TMDb cache found. Run tmdb-prefetch first.&quot;</span><span class="p">}</span>
</code></pre></div>
<h3><code>GET /api/v1/resource/tmdb</code></h3>
<p><strong>Auth</strong>: Required
<strong>Scope</strong>: system-level</p>
<p>View TMDb resource status including configuration, identity counts, and cache file count.</p>
<h4>Example</h4>
<div class="codehilite"><pre><span></span><code>curl<span class="w"> </span>-s<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/resource/tmdb&quot;</span><span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;X-API-Key: </span><span class="nv">$KEY</span><span class="s2">&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span><span class="p">|</span><span class="w"> </span>jq<span class="w"> </span><span class="s1">&#39;{identities_seeded, cache_files}&#39;</span>
</code></pre></div>
<h3><code>POST /api/v1/resource/tmdb/check</code></h3>
<p><strong>Auth</strong>: Required
<strong>Scope</strong>: system-level</p>
<p>Ping the TMDb API to verify connectivity and measure latency.</p>
<h4>Example</h4>
<div class="codehilite"><pre><span></span><code>curl<span class="w"> </span>-s<span class="w"> </span>-X<span class="w"> </span>POST<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/resource/tmdb/check&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;X-API-Key: </span><span class="nv">$KEY</span><span class="s2">&quot;</span><span class="w"> </span><span class="p">|</span><span class="w"> </span>jq<span class="w"> </span><span class="s1">&#39;.status&#39;</span>
</code></pre></div>
<h4>Response</h4>
<div class="codehilite"><pre><span></span><code><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;api_key_configured&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;enabled&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">false</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;api_reachable&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;api_latency_ms&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">120</span>
<span class="p">}</span>
</code></pre></div>
</div>
</body>
</html>

View File

@@ -0,0 +1,364 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>10 Pipeline - Momentry API Docs</title>
<style>
* { margin: 0; padding: 0; box-sizing: border-box; }
body { font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif; background: #f5f5f5; color: #333; padding: 40px; }
.container { max-width: 960px; margin: 0 auto; background: white; border-radius: 12px; box-shadow: 0 2px 12px rgba(0,0,0,0.08); padding: 40px; }
h1 { font-size: 24px; margin: 24px 0 12px; }
h2 { font-size: 20px; margin: 20px 0 10px; color: #222; }
h3 { font-size: 16px; margin: 16px 0 8px; color: #444; }
p { line-height: 1.6; margin: 8px 0; }
table { border-collapse: collapse; width: 100%; margin: 12px 0; font-size: 14px; }
th, td { border: 1px solid #ddd; padding: 8px 12px; text-align: left; }
th { background: #f0f0f0; font-weight: 600; }
code { background: #f0f0f0; padding: 2px 6px; border-radius: 3px; font-size: 13px; }
pre { background: #f8f8f8; border: 1px solid #ddd; border-radius: 6px; padding: 12px; overflow-x: auto; margin: 12px 0; }
pre code { background: none; padding: 0; }
a { color: #0066cc; }
.back { display: inline-block; margin-bottom: 20px; color: #666; }
.back:hover { color: #333; }
</style>
</head>
<body>
<div class="container">
<a class="back" href="index.html">&larr; Back to index</a>
<!-- module: pipeline -->
<!-- description: Pipeline processors, ingestion status, stats endpoints -->
<!-- depends: 01_auth -->
<h2>Pipeline</h2>
<h3>Dependency Graph</h3>
<div class="codehilite"><pre><span></span><code><span class="n">flowchart</span><span class="w"> </span><span class="n">TB</span>
<span class="w"> </span><span class="n">subgraph</span><span class="w"> </span><span class="n">Processors</span><span class="p">[</span><span class="s">&quot;10 Processors&quot;</span><span class="p">]</span>
<span class="w"> </span><span class="n">Cut</span><span class="p">[</span><span class="n">Cut</span><span class="p">]</span><span class="w"> </span><span class="o">--&gt;</span><span class="w"> </span><span class="n">ASR</span><span class="p">[</span><span class="n">ASR</span><span class="p">]</span>
<span class="w"> </span><span class="n">ASR</span><span class="w"> </span><span class="o">--&gt;</span><span class="w"> </span><span class="n">ASRX</span><span class="p">[</span><span class="n">ASRX</span><span class="p">]</span>
<span class="w"> </span><span class="n">ASRX</span><span class="w"> </span><span class="o">--&gt;</span><span class="w"> </span><span class="n">Story</span><span class="p">[</span><span class="n">Story</span><span class="p">]</span>
<span class="w"> </span><span class="n">Cut</span><span class="w"> </span><span class="o">--&gt;</span><span class="w"> </span><span class="n">Story</span>
<span class="w"> </span><span class="n">YOLO</span><span class="p">[</span><span class="n">YOLO</span><span class="p">]</span><span class="w"> </span><span class="o">--&gt;</span><span class="w"> </span><span class="n">VisualChunk</span><span class="p">[</span><span class="n">VisualChunk</span><span class="p">]</span>
<span class="w"> </span><span class="n">VisualChunk</span><span class="w"> </span><span class="o">--&gt;</span><span class="w"> </span><span class="n">Story</span>
<span class="w"> </span><span class="n">Face</span><span class="p">[</span><span class="n">Face</span><span class="p">]</span><span class="w"> </span><span class="o">--&gt;</span><span class="w"> </span><span class="n">Story</span>
<span class="w"> </span><span class="n">Story</span><span class="w"> </span><span class="o">--&gt;</span><span class="w"> </span><span class="n">FiveW1H</span><span class="p">[</span><span class="mi">5</span><span class="n">W1H</span><span class="p">]</span>
<span class="w"> </span><span class="n">OCR</span><span class="p">[</span><span class="n">OCR</span><span class="p">]</span>
<span class="w"> </span><span class="n">Pose</span><span class="p">[</span><span class="n">Pose</span><span class="p">]</span>
<span class="w"> </span><span class="n">end</span>
<span class="w"> </span><span class="n">subgraph</span><span class="w"> </span><span class="n">Ingestion</span><span class="p">[</span><span class="s">&quot;入庫 (Post-Processing)&quot;</span><span class="p">]</span>
<span class="w"> </span><span class="n">ASR</span><span class="w"> </span><span class="o">--&gt;</span><span class="w"> </span><span class="n">Rule1</span><span class="p">[</span><span class="n">Rule</span><span class="w"> </span><span class="mi">1</span><span class="w"> </span><span class="n">Sentence</span><span class="p">]</span>
<span class="w"> </span><span class="n">ASRX</span><span class="w"> </span><span class="o">--&gt;</span><span class="w"> </span><span class="n">Rule1</span>
<span class="w"> </span><span class="n">Rule1</span><span class="w"> </span><span class="o">--&gt;</span><span class="w"> </span><span class="n">Vectorize</span><span class="p">[</span><span class="n">Auto</span><span class="o">-</span><span class="n">Vectorize</span><span class="p">]</span>
<span class="w"> </span><span class="n">Rule1</span><span class="w"> </span><span class="o">--&gt;</span><span class="w"> </span><span class="n">Phase1</span><span class="p">[</span><span class="n">Phase</span><span class="w"> </span><span class="mi">1</span><span class="w"> </span><span class="n">Pack</span><span class="p">]</span>
<span class="w"> </span><span class="n">Cut</span><span class="w"> </span><span class="o">--&gt;</span><span class="w"> </span><span class="n">Rule3</span><span class="p">[</span><span class="n">Rule</span><span class="w"> </span><span class="mi">3</span><span class="w"> </span><span class="n">Scene</span><span class="p">]</span>
<span class="w"> </span><span class="n">ASR</span><span class="w"> </span><span class="o">--&gt;</span><span class="w"> </span><span class="n">Rule3</span>
<span class="w"> </span><span class="n">Face</span><span class="w"> </span><span class="o">--&gt;</span><span class="w"> </span><span class="n">Trace</span><span class="p">[</span><span class="n">Face</span><span class="w"> </span><span class="n">Trace</span><span class="p">]</span>
<span class="w"> </span><span class="n">Trace</span><span class="w"> </span><span class="o">--&gt;</span><span class="w"> </span><span class="n">Qdrant</span><span class="p">[</span><span class="n">Qdrant</span><span class="w"> </span><span class="n">Sync</span><span class="p">]</span>
<span class="w"> </span><span class="n">Trace</span><span class="w"> </span><span class="o">--&gt;</span><span class="w"> </span><span class="n">TraceChunks</span><span class="p">[</span><span class="n">Trace</span><span class="w"> </span><span class="n">Chunks</span><span class="p">]</span>
<span class="w"> </span><span class="n">Trace</span><span class="w"> </span><span class="o">--&gt;</span><span class="w"> </span><span class="n">TKG</span><span class="p">[</span><span class="n">TKG</span><span class="w"> </span><span class="n">Builder</span><span class="p">]</span>
<span class="w"> </span><span class="n">Face</span><span class="w"> </span><span class="o">--&gt;</span><span class="w"> </span><span class="n">TMDbMatch</span><span class="p">[</span><span class="n">TMDb</span><span class="w"> </span><span class="n">Match</span><span class="p">]</span>
<span class="w"> </span><span class="n">Face</span><span class="w"> </span><span class="o">--&gt;</span><span class="w"> </span><span class="n">SceneMeta</span><span class="p">[</span><span class="n">Scene</span><span class="w"> </span><span class="n">Metadata</span><span class="p">]</span>
<span class="w"> </span><span class="n">YOLO</span><span class="w"> </span><span class="o">--&gt;</span><span class="w"> </span><span class="n">SceneMeta</span>
<span class="w"> </span><span class="n">Face</span><span class="w"> </span><span class="o">--&gt;</span><span class="w"> </span><span class="n">IdentityAgent</span><span class="p">[</span><span class="n">Identity</span><span class="w"> </span><span class="n">Agent</span><span class="p">]</span>
<span class="w"> </span><span class="n">ASRX</span><span class="w"> </span><span class="o">--&gt;</span><span class="w"> </span><span class="n">IdentityAgent</span>
<span class="w"> </span><span class="n">Cut</span><span class="w"> </span><span class="o">--&gt;</span><span class="w"> </span><span class="n">Agent5W1H</span><span class="p">[</span><span class="mi">5</span><span class="n">W1H</span><span class="w"> </span><span class="n">Agent</span><span class="p">]</span>
<span class="w"> </span><span class="n">ASR</span><span class="w"> </span><span class="o">--&gt;</span><span class="w"> </span><span class="n">Agent5W1H</span>
<span class="w"> </span><span class="n">Agent5W1H</span><span class="w"> </span><span class="o">--&gt;</span><span class="w"> </span><span class="n">Phase2</span><span class="p">[</span><span class="n">Phase</span><span class="w"> </span><span class="mi">2</span><span class="w"> </span><span class="n">Pack</span><span class="p">]</span>
<span class="w"> </span><span class="n">end</span>
<span class="w"> </span><span class="n">style</span><span class="w"> </span><span class="n">Processors</span><span class="w"> </span><span class="n">fill</span><span class="o">:</span><span class="err">#</span><span class="mi">1</span><span class="n">a1a2e</span><span class="p">,</span><span class="n">stroke</span><span class="o">:</span><span class="err">#</span><span class="n">e94560</span>
<span class="w"> </span><span class="n">style</span><span class="w"> </span><span class="n">Ingestion</span><span class="w"> </span><span class="n">fill</span><span class="o">:</span><span class="err">#</span><span class="mi">16213</span><span class="n">e</span><span class="p">,</span><span class="n">stroke</span><span class="o">:</span><span class="err">#</span><span class="mf">0f</span><span class="mi">3460</span>
</code></pre></div>
<h3>Pipeline Completion Flow</h3>
<p>The pipeline is <strong>not complete</strong> until both the 10 processors AND the 入庫 (ingestion) steps have finished. The worker polls every 3 seconds and only marks the job as <code>completed</code> when all ingestion steps verify OK.</p>
<div class="codehilite"><pre><span></span><code><span class="mf">10</span><span class="w"> </span><span class="n">processors</span><span class="w"> </span><span class="n">done</span>
<span class="w"> </span><span class="err"></span><span class="w"> </span><span class="p">(</span><span class="n">job</span><span class="w"> </span><span class="n">status</span><span class="w"> </span><span class="n">stays</span><span class="w"> </span><span class="s">&quot;running&quot;</span><span class="p">)</span>
<span class="n">Algorithm</span><span class="w"> </span><span class="mf">1</span><span class="w"> </span><span class="n">Trigger</span><span class="p">:</span><span class="w"> </span><span class="n">Rule</span><span class="w"> </span><span class="mf">1</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">Vectorize</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">Phase</span><span class="w"> </span><span class="mf">1</span><span class="w"> </span><span class="n">Pack</span>
<span class="w"> </span><span class="err"></span><span class="w"> </span><span class="p">(</span><span class="n">job</span><span class="w"> </span><span class="kr">run</span><span class="n">s</span><span class="w"> </span><span class="n">in</span><span class="w"> </span><span class="n">parallel</span><span class="p">)</span>
<span class="n">Algorithm</span><span class="w"> </span><span class="mf">2</span><span class="w"> </span><span class="n">Trigger</span><span class="p">:</span><span class="w"> </span><span class="n">Face</span><span class="w"> </span><span class="n">Trace</span><span class="w"> </span><span class="err"></span><span class="w"> </span><span class="n">TKG</span><span class="p">,</span><span class="w"> </span><span class="n">Scene</span><span class="w"> </span><span class="n">Metadata</span><span class="p">,</span><span class="w"> </span><span class="n">Identity</span><span class="w"> </span><span class="n">Agent</span><span class="p">,</span><span class="w"> </span><span class="mf">5</span><span class="n">W1H</span><span class="w"> </span><span class="n">Agent</span>
<span class="w"> </span><span class="err"></span><span class="w"> </span><span class="p">(</span><span class="n">poll</span><span class="w"> </span><span class="n">checks</span><span class="w"> </span><span class="n">every</span><span class="w"> </span><span class="mf">3</span><span class="n">s</span><span class="p">)</span>
<span class="n">Ingestion</span><span class="w"> </span><span class="n">verification</span><span class="p">:</span><span class="w"> </span><span class="n">rule1</span><span class="w"> </span><span class="err"></span><span class="w"> </span><span class="n">vectorize</span><span class="w"> </span><span class="err"></span><span class="w"> </span><span class="n">rule3</span><span class="w"> </span><span class="err"></span><span class="w"> </span><span class="n">face_trace</span><span class="w"> </span><span class="err"></span><span class="w"> </span><span class="n">tkg</span><span class="w"> </span><span class="err"></span><span class="w"> </span><span class="n">scene_meta</span><span class="w"> </span><span class="err"></span><span class="w"> </span><span class="mf">5</span><span class="n">w1h</span><span class="w"> </span><span class="err"></span>
<span class="w"> </span><span class="err"></span>
<span class="n">job</span><span class="w"> </span><span class="n">status</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">&quot;completed&quot;</span>
</code></pre></div>
<h3>10 Processor Stages</h3>
<table class="table">
<thead>
<tr>
<th>#</th>
<th>Processor</th>
<th>Depends On</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td><code>Cut</code></td>
<td></td>
<td>Scene boundary detection (PySceneDetect)</td>
</tr>
<tr>
<td>2</td>
<td><code>ASR</code></td>
<td>Cut</td>
<td>Automatic speech recognition (faster-whisper)</td>
</tr>
<tr>
<td>3</td>
<td><code>ASRX</code></td>
<td>ASR</td>
<td>Speaker diarization + ASR refinement</td>
</tr>
<tr>
<td>4</td>
<td><code>YOLO</code></td>
<td></td>
<td>Object detection (YOLOv8)</td>
</tr>
<tr>
<td>5</td>
<td><code>OCR</code></td>
<td></td>
<td>Optical character recognition</td>
</tr>
<tr>
<td>6</td>
<td><code>Face</code></td>
<td></td>
<td>Face detection + recognition (InsightFace + CoreML)</td>
</tr>
<tr>
<td>7</td>
<td><code>Pose</code></td>
<td></td>
<td>Pose estimation</td>
</tr>
<tr>
<td>8</td>
<td><code>VisualChunk</code></td>
<td>YOLO</td>
<td>Visual object chunking</td>
</tr>
<tr>
<td>9</td>
<td><code>Story</code></td>
<td>ASRX + Cut + YOLO + Face</td>
<td>Narrative scene summarization (LLM, with embedding)</td>
</tr>
<tr>
<td>10</td>
<td><code>5W1H</code></td>
<td>Story</td>
<td>Who/What/When/Where/Why extraction (LLM, with embedding)</td>
</tr>
</tbody>
</table>
<h3>入庫 (Post-Processing / Ingestion)</h3>
<p>These steps run after the 10 processors and are <strong>required for pipeline completion</strong>. The worker checks all of them before marking the job as done.</p>
<table class="table">
<thead>
<tr>
<th>#</th>
<th>Step</th>
<th>Triggers When</th>
<th>Verification</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td><strong>Rule 1 Sentence Chunking</strong></td>
<td>ASR + ASRX done</td>
<td><code>chunk</code> table has rows with <code>chunk_type = 'sentence'</code></td>
</tr>
<tr>
<td>2</td>
<td><strong>Auto-Vectorize</strong></td>
<td>Rule 1 done</td>
<td><code>chunk.embedding</code> IS NOT NULL for sentence chunks</td>
</tr>
<tr>
<td>3</td>
<td><strong>Phase 1 Pack</strong></td>
<td>Rule 1 done</td>
<td><code>release_pack.py --phase 1</code> executed</td>
</tr>
<tr>
<td>4</td>
<td><strong>Rule 3 Scene Chunking</strong></td>
<td>All 10 processors done + Cut + ASR</td>
<td><code>chunk</code> table has rows with <code>chunk_type = 'cut'</code></td>
</tr>
<tr>
<td>5</td>
<td><strong>Face Trace</strong></td>
<td>All 10 processors done + Face</td>
<td><code>face_detections.trace_id</code> IS NOT NULL</td>
</tr>
<tr>
<td>6</td>
<td><strong>Qdrant Face Sync</strong></td>
<td>Face Trace done</td>
<td>Qdrant face_embedding collection populated</td>
</tr>
<tr>
<td>7</td>
<td><strong>Trace Chunks</strong></td>
<td>Face Trace done</td>
<td><code>chunk</code> table has rows with <code>chunk_type = 'trace'</code></td>
</tr>
<tr>
<td>8</td>
<td><strong>TKG Builder</strong></td>
<td>Face Trace done</td>
<td><code>tkg_nodes</code> + <code>tkg_edges</code> tables have rows</td>
</tr>
<tr>
<td>9</td>
<td><strong>TMDb Face Matching</strong></td>
<td>TMDb enabled + Face done</td>
<td><code>face_detections.identity_id</code> IS NOT NULL</td>
</tr>
<tr>
<td>10</td>
<td><strong>Heuristic Scene Metadata</strong></td>
<td>Face + YOLO done</td>
<td><code>{file_uuid}.scene_meta.json</code> exists on disk</td>
</tr>
<tr>
<td>11</td>
<td><strong>Identity Agent</strong></td>
<td>Face + ASRX done</td>
<td><code>identities</code> with <code>source = 'identity_agent'</code></td>
</tr>
<tr>
<td>12</td>
<td><strong>5W1H Agent</strong></td>
<td>Cut + ASR done</td>
<td><code>chunk.summary_text</code> IS NOT NULL for cut chunks</td>
</tr>
<tr>
<td>13</td>
<td><strong>Release Pack</strong></td>
<td>5W1H Agent done</td>
<td><code>release_pack.py --phase 2</code> executed</td>
</tr>
</tbody>
</table>
<h3>Ingestion Status</h3>
<p>Check real-time ingestion status for a file:</p>
<div class="codehilite"><pre><span></span><code>curl<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/stats/ingestion-status/{file_uuid}&quot;</span>
</code></pre></div>
<p>Returns per-step <code>done</code> / <code>pending</code> status with detail counts.</p>
<h4>Example</h4>
<div class="codehilite"><pre><span></span><code>curl<span class="w"> </span><span class="s2">&quot;http://localhost:3003/api/v1/stats/ingestion-status/bd80fec9c42afb0307eb28f22c64c76a&quot;</span><span class="w"> </span><span class="p">|</span><span class="w"> </span>jq<span class="w"> </span><span class="s1">&#39;.steps[] | {name, status, detail}&#39;</span>
</code></pre></div>
<h4>Response</h4>
<div class="codehilite"><pre><span></span><code><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;file_uuid&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;bd80fec9c42afb0307eb28f22c64c76a&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;steps&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">[</span>
<span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="nt">&quot;name&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;rule1_sentence&quot;</span><span class="p">,</span><span class="w"> </span><span class="nt">&quot;status&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;pending&quot;</span><span class="p">,</span><span class="w"> </span><span class="nt">&quot;detail&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;0 sentence chunks&quot;</span><span class="w"> </span><span class="p">},</span>
<span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="nt">&quot;name&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;auto_vectorize&quot;</span><span class="p">,</span><span class="w"> </span><span class="nt">&quot;status&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;pending&quot;</span><span class="p">,</span><span class="w"> </span><span class="nt">&quot;detail&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;0 embedded&quot;</span><span class="w"> </span><span class="p">},</span>
<span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="nt">&quot;name&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;rule3_scene&quot;</span><span class="p">,</span><span class="w"> </span><span class="nt">&quot;status&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;pending&quot;</span><span class="p">,</span><span class="w"> </span><span class="nt">&quot;detail&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;0 scene chunks&quot;</span><span class="w"> </span><span class="p">},</span>
<span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="nt">&quot;name&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;face_trace&quot;</span><span class="p">,</span><span class="w"> </span><span class="nt">&quot;status&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;pending&quot;</span><span class="p">,</span><span class="w"> </span><span class="nt">&quot;detail&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;0 traces&quot;</span><span class="w"> </span><span class="p">},</span>
<span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="nt">&quot;name&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;trace_chunks&quot;</span><span class="p">,</span><span class="w"> </span><span class="nt">&quot;status&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;pending&quot;</span><span class="p">,</span><span class="w"> </span><span class="nt">&quot;detail&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;0 trace chunks&quot;</span><span class="w"> </span><span class="p">},</span>
<span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="nt">&quot;name&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;tkg&quot;</span><span class="p">,</span><span class="w"> </span><span class="nt">&quot;status&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;pending&quot;</span><span class="p">,</span><span class="w"> </span><span class="nt">&quot;detail&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;0 nodes, 0 edges&quot;</span><span class="w"> </span><span class="p">},</span>
<span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="nt">&quot;name&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;identity_match&quot;</span><span class="p">,</span><span class="w"> </span><span class="nt">&quot;status&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;pending&quot;</span><span class="p">,</span><span class="w"> </span><span class="nt">&quot;detail&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;0 identities&quot;</span><span class="w"> </span><span class="p">},</span>
<span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="nt">&quot;name&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;scene_metadata&quot;</span><span class="p">,</span><span class="w"> </span><span class="nt">&quot;status&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;pending&quot;</span><span class="p">,</span><span class="w"> </span><span class="nt">&quot;detail&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">null</span><span class="w"> </span><span class="p">},</span>
<span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="nt">&quot;name&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;5w1h&quot;</span><span class="p">,</span><span class="w"> </span><span class="nt">&quot;status&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;pending&quot;</span><span class="p">,</span><span class="w"> </span><span class="nt">&quot;detail&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;0 scenes with 5W1H&quot;</span><span class="w"> </span><span class="p">}</span>
<span class="w"> </span><span class="p">]</span>
<span class="p">}</span>
</code></pre></div>
<h3>Stats Endpoints</h3>
<table class="table">
<thead>
<tr>
<th>Method</th>
<th>Endpoint</th>
<th>Auth</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>GET</td>
<td><code>/api/v1/stats/sftpgo</code></td>
<td>No</td>
<td>SFTPGo service status</td>
</tr>
<tr>
<td>GET</td>
<td><code>/api/v1/stats/ingestion-status/:file_uuid</code></td>
<td>No</td>
<td>Per-file ingestion checklist</td>
</tr>
</tbody>
</table>
<h3>Configuration</h3>
<h3><code>POST /api/v1/config/cache</code></h3>
<p><strong>Auth</strong>: Required
<strong>Scope</strong>: system-level</p>
<p>Toggle the Redis cache on or off.</p>
<h4>Request Parameters</h4>
<table class="table">
<thead>
<tr>
<th>Field</th>
<th>Type</th>
<th>Required</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>enabled</code></td>
<td>boolean</td>
<td>Yes</td>
<td><code>true</code> to enable, <code>false</code> to disable</td>
</tr>
</tbody>
</table>
<h4>Example</h4>
<div class="codehilite"><pre><span></span><code>curl<span class="w"> </span>-s<span class="w"> </span>-X<span class="w"> </span>POST<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/config/cache&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;Content-Type: application/json&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;X-API-Key: </span><span class="nv">$KEY</span><span class="s2">&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-d<span class="w"> </span><span class="s1">&#39;{&quot;enabled&quot;: false}&#39;</span>
</code></pre></div>
<h3>Unmounted Routes</h3>
<p>The following routes are defined in source code but are <strong>NOT</strong> currently mounted in the router:</p>
<table class="table">
<thead>
<tr>
<th>Endpoint</th>
<th>Source file</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>/api/v1/search/persons</code></td>
<td><code>universal_search.rs</code> (not mounted)</td>
</tr>
<tr>
<td><code>/api/v1/who</code></td>
<td><code>who.rs</code></td>
</tr>
<tr>
<td><code>/api/v1/who/candidates</code></td>
<td><code>who.rs</code></td>
</tr>
</tbody>
</table>
</div>
</body>
</html>

View File

@@ -0,0 +1,207 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>12 Agent - Momentry API Docs</title>
<style>
* { margin: 0; padding: 0; box-sizing: border-box; }
body { font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif; background: #f5f5f5; color: #333; padding: 40px; }
.container { max-width: 960px; margin: 0 auto; background: white; border-radius: 12px; box-shadow: 0 2px 12px rgba(0,0,0,0.08); padding: 40px; }
h1 { font-size: 24px; margin: 24px 0 12px; }
h2 { font-size: 20px; margin: 20px 0 10px; color: #222; }
h3 { font-size: 16px; margin: 16px 0 8px; color: #444; }
p { line-height: 1.6; margin: 8px 0; }
table { border-collapse: collapse; width: 100%; margin: 12px 0; font-size: 14px; }
th, td { border: 1px solid #ddd; padding: 8px 12px; text-align: left; }
th { background: #f0f0f0; font-weight: 600; }
code { background: #f0f0f0; padding: 2px 6px; border-radius: 3px; font-size: 13px; }
pre { background: #f8f8f8; border: 1px solid #ddd; border-radius: 6px; padding: 12px; overflow-x: auto; margin: 12px 0; }
pre code { background: none; padding: 0; }
a { color: #0066cc; }
.back { display: inline-block; margin-bottom: 20px; color: #666; }
.back:hover { color: #333; }
</style>
</head>
<body>
<div class="container">
<a class="back" href="index.html">&larr; Back to index</a>
<h1>Agent Endpoints</h1>
<p>Agent endpoints provide AI-powered capabilities including translation, identity analysis, and 5W1H extraction.</p>
<h2>POST /api/v1/agents/translate</h2>
<p>Translate text between languages using Gemma4 (llama.cpp, port 8082).</p>
<h3>Request</h3>
<div class="codehilite"><pre><span></span><code><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;text&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;Hello, welcome to Momentry Core.&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;target_language&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;Traditional Chinese&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;source_language&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;English&quot;</span>
<span class="p">}</span>
</code></pre></div>
<table class="table">
<thead>
<tr>
<th>Field</th>
<th>Type</th>
<th>Required</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>text</code></td>
<td>string</td>
<td></td>
<td>Text to translate</td>
</tr>
<tr>
<td><code>target_language</code></td>
<td>string</td>
<td></td>
<td>Target language name (e.g. "Traditional Chinese", "Japanese")</td>
</tr>
<tr>
<td><code>source_language</code></td>
<td>string</td>
<td></td>
<td>Source language (default: "auto")</td>
</tr>
</tbody>
</table>
<h3>Response</h3>
<div class="codehilite"><pre><span></span><code><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;success&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;translated_text&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;您好,歡迎使用 Momentry Core。&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;source_language_detected&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;English&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;model_used&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;google_gemma-4-26B-A4B-it-Q5_K_M.gguf&quot;</span>
<span class="p">}</span>
</code></pre></div>
<h3>Supported Language Pairs (tested)</h3>
<table class="table">
<thead>
<tr>
<th>Source</th>
<th>Target</th>
<th>Quality</th>
</tr>
</thead>
<tbody>
<tr>
<td>English</td>
<td>Traditional Chinese</td>
<td></td>
</tr>
<tr>
<td>English</td>
<td>Japanese</td>
<td></td>
</tr>
<tr>
<td>Chinese</td>
<td>English</td>
<td></td>
</tr>
<tr>
<td>English</td>
<td>French</td>
<td></td>
</tr>
<tr>
<td>Chinese</td>
<td>Japanese</td>
<td></td>
</tr>
</tbody>
</table>
<h3>Model</h3>
<ul>
<li><strong>Model</strong>: Gemma4 26B (Q5_K_M)</li>
<li><strong>Engine</strong>: llama.cpp at <code>localhost:8082</code></li>
<li><strong>Endpoint</strong>: <code>/v1/chat/completions</code> (OpenAI-compatible)</li>
<li><strong>Temperature</strong>: 0.1</li>
<li><strong>Max tokens</strong>: 1024</li>
</ul>
<h3>Errors</h3>
<table class="table">
<thead>
<tr>
<th>Status</th>
<th>Condition</th>
</tr>
</thead>
<tbody>
<tr>
<td>500</td>
<td>LLM unreachable or response parse failure</td>
</tr>
<tr>
<td>401</td>
<td>Missing/invalid auth</td>
</tr>
</tbody>
</table>
<hr />
<h2>POST /api/v1/agents/5w1h/analyze</h2>
<p>Extract 5W1H (Who, What, When, Where, Why, How) from a scene. Uses Gemma4 LLM on port 8082.</p>
<h3>Request</h3>
<div class="codehilite"><pre><span></span><code><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;file_uuid&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;3abeee81d94597629ed8cb943f182e94&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;scene_id&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">42</span>
<span class="p">}</span>
</code></pre></div>
<h3>Response</h3>
<div class="codehilite"><pre><span></span><code><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;success&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;5w1h&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;who&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">&quot;Cary Grant&quot;</span><span class="p">],</span>
<span class="w"> </span><span class="nt">&quot;what&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">&quot;discussing plans&quot;</span><span class="p">],</span>
<span class="w"> </span><span class="nt">&quot;when&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">&quot;1963&quot;</span><span class="p">],</span>
<span class="w"> </span><span class="nt">&quot;where&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">&quot;Paris&quot;</span><span class="p">],</span>
<span class="w"> </span><span class="nt">&quot;why&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">&quot;vacation&quot;</span><span class="p">],</span>
<span class="w"> </span><span class="nt">&quot;how&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">&quot;in person&quot;</span><span class="p">]</span>
<span class="w"> </span><span class="p">}</span>
<span class="p">}</span>
</code></pre></div>
<h2>POST /api/v1/agents/5w1h/batch</h2>
<p>Batch analyze all scenes in a file for 5W1H extraction. Uses the pipeline's <code>parent_chunk_5w1h.py --mode llm</code>.</p>
<h3>Request</h3>
<div class="codehilite"><pre><span></span><code><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;file_uuid&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;3abeee81d94597629ed8cb943f182e94&quot;</span>
<span class="p">}</span>
</code></pre></div>
<h2>GET /api/v1/agents/5w1h/status</h2>
<p>Get status of the 5W1H agent pipeline for a file.</p>
<hr />
<h2>Embedding Model</h2>
<table class="table">
<thead>
<tr>
<th>Detail</th>
<th>Value</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>Model</strong></td>
<td>EmbeddingGemma-300m</td>
</tr>
<tr>
<td><strong>Endpoint</strong></td>
<td><code>POST /v1/embeddings</code> on port 11436</td>
</tr>
<tr>
<td><strong>Dimension</strong></td>
<td>768</td>
</tr>
<tr>
<td><strong>Used by</strong></td>
<td><code>parent_chunk_5w1h.py --embed</code>, story, 5W1H, search</td>
</tr>
</tbody>
</table>
</div>
</body>
</html>

View File

@@ -0,0 +1,29 @@
<!DOCTYPE html>
<html lang="zh-TW">
<head>
<meta charset="UTF-8">
<title>Momentry API 文件</title>
<style>
* { margin: 0; padding: 0; box-sizing: border-box; }
body { font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif; background: #f5f5f5; color: #333; padding: 40px; }
.container { max-width: 900px; margin: 0 auto; background: white; border-radius: 12px; box-shadow: 0 2px 12px rgba(0,0,0,0.08); padding: 40px; }
h1 { font-size: 28px; margin-bottom: 8px; }
p.subtitle { color: #666; margin-bottom: 24px; }
table { width: 100%; border-collapse: collapse; }
tr { border-bottom: 1px solid #eee; }
tr:last-child { border: none; }
td { padding: 10px 0; }
td.cn { width: 140px; font-weight: 600; color: #333; }
td.en { color: #666; font-size: 14px; }
a { color: #0066cc; text-decoration: none; display: block; }
a:hover td { background: #f8f8f8; border-radius: 4px; }
</style>
</head>
<body>
<div class="container">
<h1>Momentry API 文件</h1>
<p class="subtitle">API 參考手冊 — 登入後可瀏覽各模組文件</p>
<table><tr onclick="window.location='01_auth.html'" style="cursor:pointer"><td class="cn">安全認證</td><td class="en">Authentication</td></tr><tr onclick="window.location='02_health.html'" style="cursor:pointer"><td class="cn">健康檢查</td><td class="en">Health</td></tr><tr onclick="window.location='03_register.html'" style="cursor:pointer"><td class="cn">檔案註冊</td><td class="en">File Registration</td></tr><tr onclick="window.location='04_lookup.html'" style="cursor:pointer"><td class="cn">檔案屬性查詢</td><td class="en">File Lookup</td></tr><tr onclick="window.location='05_process.html'" style="cursor:pointer"><td class="cn">處理流程</td><td class="en">Processing</td></tr><tr onclick="window.location='06_search.html'" style="cursor:pointer"><td class="cn">搜尋功能</td><td class="en">Search</td></tr><tr onclick="window.location='07_identity.html'" style="cursor:pointer"><td class="cn">身份識別</td><td class="en">Identity</td></tr><tr onclick="window.location='08_identity_agent.html'" style="cursor:pointer"><td class="cn">智能身份綁定</td><td class="en">Smart Identity Binding</td></tr><tr onclick="window.location='08_media.html'" style="cursor:pointer"><td class="cn">串流與截圖</td><td class="en">Streaming & Thumbnails</td></tr><tr onclick="window.location='09_tmdb.html'" style="cursor:pointer"><td class="cn">TMDb 整合</td><td class="en">TMDb Integration</td></tr><tr onclick="window.location='10_pipeline.html'" style="cursor:pointer"><td class="cn">生產線</td><td class="en">Pipeline</td></tr><tr onclick="window.location='12_agent.html'" style="cursor:pointer"><td class="cn">智慧代理</td><td class="en">AI Agents</td></tr></table>
</div>
</body>
</html>

View File

@@ -0,0 +1,46 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Login - Momentry Docs</title>
<style>
* { margin: 0; padding: 0; box-sizing: border-box; }
body { font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif; background: #f5f5f5; display: flex; justify-content: center; align-items: center; height: 100vh; }
.card { background: white; border-radius: 12px; box-shadow: 0 2px 12px rgba(0,0,0,0.08); padding: 40px; width: 360px; }
h1 { font-size: 24px; margin-bottom: 24px; text-align: center; }
input { width: 100%; padding: 10px 12px; margin-bottom: 12px; border: 1px solid #ddd; border-radius: 6px; font-size: 14px; }
button { width: 100%; padding: 10px; background: #0066cc; color: white; border: none; border-radius: 6px; font-size: 16px; cursor: pointer; }
button:hover { background: #0052a3; }
.error { color: #cc0000; font-size: 13px; margin-bottom: 12px; display: none; }
</style>
</head>
<body>
<div class="card">
<h1>Momentry Docs</h1>
<form id="loginForm">
<input type="text" id="username" placeholder="Username" value="demo" required>
<input type="password" id="password" placeholder="Password" value="demo" required>
<div class="error" id="error">Invalid credentials</div>
<button type="submit">Login</button>
</form>
</div>
<script>
document.getElementById('loginForm').onsubmit = async function(e) {
e.preventDefault();
const resp = await fetch('/api/v1/auth/login', {
method: 'POST',
headers: {'Content-Type': 'application/json'},
body: JSON.stringify({
username: document.getElementById('username').value,
password: document.getElementById('password').value
})
});
if (resp.ok) {
window.location.href = '/doc/index.html';
} else {
document.getElementById('error').style.display = 'block';
}
};
</script>
</body>
</html>

View File

@@ -0,0 +1,180 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>11 Error Codes - Momentry API Docs</title>
<style>
* { margin: 0; padding: 0; box-sizing: border-box; }
body { font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif; background: #f5f5f5; color: #333; padding: 40px; }
.container { max-width: 960px; margin: 0 auto; background: white; border-radius: 12px; box-shadow: 0 2px 12px rgba(0,0,0,0.08); padding: 40px; }
h1 { font-size: 24px; margin: 24px 0 12px; }
h2 { font-size: 20px; margin: 20px 0 10px; color: #222; }
h3 { font-size: 16px; margin: 16px 0 8px; color: #444; }
p { line-height: 1.6; margin: 8px 0; }
table { border-collapse: collapse; width: 100%; margin: 12px 0; font-size: 14px; }
th, td { border: 1px solid #ddd; padding: 8px 12px; text-align: left; }
th { background: #f0f0f0; font-weight: 600; }
code { background: #f0f0f0; padding: 2px 6px; border-radius: 3px; font-size: 13px; }
pre { background: #f8f8f8; border: 1px solid #ddd; border-radius: 6px; padding: 12px; overflow-x: auto; margin: 12px 0; }
pre code { background: none; padding: 0; }
a { color: #0066cc; }
.back { display: inline-block; margin-bottom: 20px; color: #666; }
.back:hover { color: #333; }
</style>
</head>
<body>
<div class="container">
<a class="back" href="index.html">&larr; Back to index</a>
<!-- module: error_codes -->
<!-- description: Standard API error codes -->
<!-- depends: -->
<h2>Error Response Format</h2>
<p>All API errors follow this JSON structure:</p>
<div class="codehilite"><pre><span></span><code><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;success&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">false</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;error&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;code&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;E001_NOT_FOUND&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;message&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;Resource not found&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;details&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="nt">&quot;resource&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;file_uuid&quot;</span><span class="p">,</span><span class="w"> </span><span class="nt">&quot;value&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;abc&quot;</span><span class="p">}</span>
<span class="w"> </span><span class="p">}</span>
<span class="p">}</span>
</code></pre></div>
<h2>Error Code List</h2>
<h3>Generic Errors (E0xx)</h3>
<table class="table">
<thead>
<tr>
<th>Code</th>
<th>HTTP</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>E001_NOT_FOUND</code></td>
<td>404</td>
<td>Resource not found (file, identity, chunk)</td>
</tr>
<tr>
<td><code>E002_DUPLICATE</code></td>
<td>409</td>
<td>Resource already exists</td>
</tr>
<tr>
<td><code>E003_VALIDATION</code></td>
<td>400</td>
<td>Request parameter validation failed</td>
</tr>
<tr>
<td><code>E004_UNAUTHORIZED</code></td>
<td>401</td>
<td>Invalid API key or token</td>
</tr>
<tr>
<td><code>E005_INTERNAL</code></td>
<td>500</td>
<td>Internal server error</td>
</tr>
</tbody>
</table>
<h3>Processor Errors (E1xx)</h3>
<table class="table">
<thead>
<tr>
<th>Code</th>
<th>HTTP</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>E101_PROCESSOR_FAIL</code></td>
<td>500</td>
<td>Python script execution failed</td>
</tr>
<tr>
<td><code>E102_TIMEOUT</code></td>
<td>504</td>
<td>Processing timeout</td>
</tr>
<tr>
<td><code>E103_RESUME_FAIL</code></td>
<td>500</td>
<td>Resume failed (checkpoint not found)</td>
</tr>
<tr>
<td><code>E104_NO_VIDEO</code></td>
<td>400</td>
<td>Video file path not found</td>
</tr>
</tbody>
</table>
<h3>Identity Errors (E2xx)</h3>
<table class="table">
<thead>
<tr>
<th>Code</th>
<th>HTTP</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>E201_FACE_NOT_FOUND</code></td>
<td>404</td>
<td>Face detection not found</td>
</tr>
<tr>
<td><code>E202_MERGE_CONFLICT</code></td>
<td>409</td>
<td>Identity merge conflict</td>
</tr>
<tr>
<td><code>E203_CANDIDATE_EMPTY</code></td>
<td>404</td>
<td>No candidates available for confirmation</td>
</tr>
</tbody>
</table>
<h3>TMDb Errors (E3xx)</h3>
<table class="table">
<thead>
<tr>
<th>Code</th>
<th>HTTP</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>E301_TMDB_NO_KEY</code></td>
<td>400</td>
<td><code>TMDB_API_KEY</code> environment variable not set</td>
</tr>
<tr>
<td><code>E302_TMDB_UNREACHABLE</code></td>
<td>502</td>
<td>TMDb API unreachable or timed out</td>
</tr>
<tr>
<td><code>E303_TMDB_CACHE_NOT_FOUND</code></td>
<td>200</td>
<td>No local TMDb cache; run prefetch first</td>
</tr>
<tr>
<td><code>E304_TMDB_PROBE_FAILED</code></td>
<td>500</td>
<td>TMDb probe execution failed</td>
</tr>
<tr>
<td><code>E305_TMDB_MOVIE_NOT_FOUND</code></td>
<td>404</td>
<td>No matching TMDb movie found from filename</td>
</tr>
</tbody>
</table>
</div>
</body>
</html>

View File

@@ -0,0 +1,29 @@
<!DOCTYPE html>
<html lang="zh-TW">
<head>
<meta charset="UTF-8">
<title>Momentry API 文件</title>
<style>
* { margin: 0; padding: 0; box-sizing: border-box; }
body { font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif; background: #f5f5f5; color: #333; padding: 40px; }
.container { max-width: 900px; margin: 0 auto; background: white; border-radius: 12px; box-shadow: 0 2px 12px rgba(0,0,0,0.08); padding: 40px; }
h1 { font-size: 28px; margin-bottom: 8px; }
p.subtitle { color: #666; margin-bottom: 24px; }
table { width: 100%; border-collapse: collapse; }
tr { border-bottom: 1px solid #eee; }
tr:last-child { border: none; }
td { padding: 10px 0; }
td.cn { width: 140px; font-weight: 600; color: #333; }
td.en { color: #666; font-size: 14px; }
a { color: #0066cc; text-decoration: none; display: block; }
a:hover td { background: #f8f8f8; border-radius: 4px; }
</style>
</head>
<body>
<div class="container">
<h1>Momentry API 文件</h1>
<p class="subtitle">API 參考手冊 — 登入後可瀏覽各模組文件</p>
<table><tr onclick="window.location='11_error_codes.html'" style="cursor:pointer"><td class="cn">錯誤碼</td><td class="en">Error Codes</td></tr></table>
</div>
</body>
</html>

View File

@@ -0,0 +1,46 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Login - Momentry Docs</title>
<style>
* { margin: 0; padding: 0; box-sizing: border-box; }
body { font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif; background: #f5f5f5; display: flex; justify-content: center; align-items: center; height: 100vh; }
.card { background: white; border-radius: 12px; box-shadow: 0 2px 12px rgba(0,0,0,0.08); padding: 40px; width: 360px; }
h1 { font-size: 24px; margin-bottom: 24px; text-align: center; }
input { width: 100%; padding: 10px 12px; margin-bottom: 12px; border: 1px solid #ddd; border-radius: 6px; font-size: 14px; }
button { width: 100%; padding: 10px; background: #0066cc; color: white; border: none; border-radius: 6px; font-size: 16px; cursor: pointer; }
button:hover { background: #0052a3; }
.error { color: #cc0000; font-size: 13px; margin-bottom: 12px; display: none; }
</style>
</head>
<body>
<div class="card">
<h1>Momentry Docs</h1>
<form id="loginForm">
<input type="text" id="username" placeholder="Username" value="demo" required>
<input type="password" id="password" placeholder="Password" value="demo" required>
<div class="error" id="error">Invalid credentials</div>
<button type="submit">Login</button>
</form>
</div>
<script>
document.getElementById('loginForm').onsubmit = async function(e) {
e.preventDefault();
const resp = await fetch('/api/v1/auth/login', {
method: 'POST',
headers: {'Content-Type': 'application/json'},
body: JSON.stringify({
username: document.getElementById('username').value,
password: document.getElementById('password').value
})
});
if (resp.ok) {
window.location.href = '/doc/index.html';
} else {
document.getElementById('error').style.display = 'block';
}
};
</script>
</body>
</html>

View File

@@ -0,0 +1,280 @@
<!-- module: auth -->
<!-- description: Authentication — login, logout, JWT, session cookie, API key -->
<!-- depends: -->
## Base URL
| Environment | URL | Purpose |
|-------------|-----|---------|
| Production | `http://localhost:3002` | Production deployment |
| External (M5) | `https://m5api.momentry.ddns.net` | Remote access |
## Variables
All examples in this documentation use these environment variables:
```bash
API="http://localhost:3002"
KEY="your-api-key-here"
```
## Authentication
All endpoints under `/api/v1/*` require authentication.
The following endpoints are public (no auth needed):
- `GET /health`
- `POST /api/v1/auth/login`
- `POST /api/v1/auth/logout`
### Three Authentication Modes
The system supports three authentication methods, checked in **priority order** by the middleware:
```
Middleware priority:
1. Session Cookie (Portal/browser)
2. JWT Bearer (API clients, CLI)
3. API Key Header (legacy compatibility)
4. API Key Query Param (?api_key=)
```
| Mode | Transport | Expiry | Scope | Best for |
|------|-----------|--------|-------|----------|
| **Session Cookie** | `Cookie: session_id=<session_id>` | 24h | per-browser session | Portal (browser) |
| **JWT** | `Authorization: Bearer <token>` | 1h | per-login token | API clients, CLI, scripts |
| **API Key** | `X-API-Key: <key>` | 90d | fixed key for automation | Legacy scripts, WordPress |
---
### Login
**Default accounts & API keys:**
| Username | Password | API Key | Role |
|----------|----------|---------|------|
| `admin` | `admin` | — | admin |
| `demo` | `demo` | `muser_demo_key_32chars_abcdef1234567890` | user |
The demo API key is set via `MOMENTRY_DEMO_API_KEY` env var and can be used in place of JWT for marcom integrations:
```bash
# Using API key instead of JWT
curl -s "$API/api/v1/files/scan" -H "X-API-Key: muser_demo_key_32chars_abcdef1234567890"
```
```bash
# Login as admin
curl -s -X POST "$API/api/v1/auth/login" \
-H "Content-Type: application/json" \
-d '{"username": "admin", "password": "admin"}'
# Login as demo user
curl -s -X POST "$API/api/v1/auth/login" \
-H "Content-Type: application/json" \
-d '{"username": "demo", "password": "demo"}'
```
#### Success Response
```json
{
"success": true,
"jwt": "eyJhbGciOiJIUzI1NiIs...",
"api_key": "muser_...",
"user": {
"username": "admin",
"role": "admin"
},
"expires_at": "2026-05-18T13:00:00Z"
}
```
| Field | Type | Description |
|-------|------|-------------|
| `jwt` | string | JWT access token. Use as `Authorization: Bearer <jwt>`. Expires in 1 hour. |
| `api_key` | string | Legacy API key. Use as `X-API-Key: <key>`. Good for 90 days. |
| `user.username` | string | Username |
| `user.role` | string | Role: `admin`, `user`, or `readonly` |
| `expires_at` | string | ISO8601 timestamp of JWT expiration |
The login endpoint also sets a `Set-Cookie` header for browser-based clients:
```
Set-Cookie: session_id=<session_id>; Path=/; HttpOnly; SameSite=Strict; Max-Age=86400
```
#### Error Response (401)
```json
{
"success": false,
"message": "Invalid username or password"
}
```
---
### Using JWT
JWT is preferred for API clients (CLI scripts, WordPress). It is validated by the middleware without a database lookup (stateless).
```bash
# Login and capture JWT
JWT=$(curl -s -X POST "$API/api/v1/auth/login" \
-H "Content-Type: application/json" \
-d '{"username":"admin","password":"admin"}' | python3 -c "import json,sys;print(json.load(sys.stdin)['jwt'])")
# Use JWT for all subsequent requests
curl -H "Authorization: Bearer $JWT" "$API/api/v1/files/scan"
curl -H "Authorization: Bearer $JWT" "$API/api/v1/resource/tmdb"
```
JWT is short-lived (1 hour). When it expires, request a new one via login.
---
### Using Session Cookie (Browser)
Browser-based clients (Portal) get a session cookie automatically after login. The browser sends the cookie with every request—no manual header needed.
```bash
# Login captures the session cookie from Set-Cookie header
curl -v -X POST "$API/api/v1/auth/login" \
-H "Content-Type: application/json" \
-d '{"username":"admin","password":"admin"}' 2>&1 | grep "Set-Cookie"
# Browser automatically sends: Cookie: session_id=<session_id>
# No manual header needed for subsequent requests
```
The session cookie is HttpOnly (not accessible from JavaScript) and SameSite=Strict (protected against CSRF).
---
### Using Legacy API Key
```bash
curl -H "X-API-Key: $KEY" "$API/api/v1/files/scan"
# Also accepted via Bearer header (non-JWT format) or query parameter:
curl -H "Authorization: Bearer $KEY" "$API/api/v1/files/scan"
curl "$API/api/v1/files/scan?api_key=$KEY"
```
API keys are validated via SHA256 hash lookup in the database. They are long-lived (90 days) and intended for automation.
### Obtaining an API Key (CLI)
```bash
momentry api-key create "My API Key" --key-type user
```
---
### Logout
```bash
# Logout using the session cookie (browser)
curl -X POST "$API/api/v1/auth/logout" \
-H "Cookie: session_id=<uuid>"
```
#### What logout does
| Auth mode | Effect |
|-----------|--------|
| **Session Cookie** | Session deleted from database. Same cookie returns 401 on subsequent requests. |
| **JWT** | JWT remains valid until expiry. (JWT is stateless — logout adds JWT to a blacklist only if API key mode is used.) |
| **API Key** | API key remains valid. (Legacy keys are shared across sessions — revoking would break other clients.) |
#### Example: full session lifecycle
```bash
# 1. Login
SESSION_ID=$(curl -s -D - -X POST "$API/api/v1/auth/login" \
-H "Content-Type: application/json" \
-d '{"username":"admin","password":"admin"}' | grep "Set-Cookie" | sed 's/.*session_id=\([^;]*\).*/\1/')
# 2. Use session (works)
curl -s -o /dev/null -w "HTTP %{http_code}\n" "$API/api/v1/resource/tmdb" \
-H "Cookie: session_id=$SESSION_ID"
# → HTTP 200
# 3. Logout
curl -s -X POST "$API/api/v1/auth/logout" \
-H "Cookie: session_id=$SESSION_ID"
# → {"success": true}
# 4. Use session again (rejected)
curl -s -o /dev/null -w "HTTP %{http_code}\n" "$API/api/v1/resource/tmdb" \
-H "Cookie: session_id=$SESSION_ID"
# → HTTP 401
```
---
### Authentication Flow Summary
```
Login Request
┌──────────────────┐
│ 1. Check users │ ← users table (argon2 password verify)
│ table │
└──────┬───────────┘
┌───┴───┐
│ match │
└───┬───┘
┌──────────────────┐
│ 2. Create JWT │ ← 1h expiry, signed with JWT_SECRET
├──────────────────┤
│ 3. Create │ ← 24h expiry, stored in sessions table
│ session │
├──────────────────┤
│ 4. Set-Cookie │ ← HttpOnly, SameSite=Strict, Path=/
├──────────────────┤
│ 5. Return │ ← JWT + api_key + user info to client
└──────────────────┘
```
```
Protected Request
┌──────────────────────┐
│ Middleware checks: │
│ │
│ 1. Cookie session? │ → DB lookup session → get api_key → verify
│ │
│ 2. JWT Bearer? │ → verify JWT signature → decode claims
│ │
│ 3. X-API-Key? │ → SHA256 hash → DB lookup → verify
│ │
│ 4. ?api_key=? │ → same as #3
│ │
│ 5. None → 401 │
└──────────────────────┘
```
---
### Error Responses
| HTTP | When |
|------|------|
| `401` | Missing or invalid authentication |
| `401` | Session expired or logged out |
| `401` | JWT expired |
| `401` | API key revoked or inactive |
---
### Related
- `POST /api/v1/resource/tmdb/check` — test authentication + TMDb API connectivity
- `GET /health/detailed` — view auth status (integrations section)

View File

@@ -0,0 +1,147 @@
<!-- module: health -->
<!-- description: Health check endpoints -->
<!-- depends: 01_auth -->
## Health Check
### `GET /health`
**Auth**: Public
**Scope**: system-level
Returns basic server health status — used by load balancers and monitoring.
#### Example
```bash
curl "$API/health" | jq '{status, version}'
```
#### Response (200)
```json
{
"status": "ok",
"version": "1.0.0",
"build_git_hash": "3a6c1865",
"build_timestamp": "2026-05-16T13:38:15Z",
"uptime_ms": 3015
}
```
| Field | Type | Description |
|-------|------|-------------|
| `status` | string | `ok` or `degraded` |
| `version` | string | Semver version |
| `build_git_hash` | string | Git commit hash |
| `build_timestamp` | string | Binary build time |
| `uptime_ms` | integer | Milliseconds since server start |
---
### `GET /health/detailed`
**Auth**: Required
**Scope**: system-level
Returns full system health including each service status, resource utilization, pipeline readiness, schema migration status, identity file sync status, and external integrations.
> Requires authentication (JWT, session cookie, or API key). The basic `/health` endpoint remains public for load balancer checks.
#### Example
```bash
curl "$API/health/detailed" | jq '{status, services, resources: {cpu: .resources.cpu_used_percent, memory: .resources.memory_used_percent}}'
```
#### Response (200)
```json
{
"status": "ok",
"version": "1.0.0",
"services": {
"postgres": {"status": "ok", "latency_ms": 3},
"redis": {"status": "ok", "latency_ms": 1},
"qdrant": {"status": "ok", "latency_ms": 5}
},
"resources": {
"cpu_used_percent": 12.5,
"memory_available_mb": 32768,
"memory_used_percent": 31.7
},
"pipeline": {
"scripts_ready": true,
"scripts_count": 345,
"processors": {
"asr": true,
"yolo": true,
"face": true,
"pose": true,
"ocr": true,
"cut": true,
"scene": true,
"asrx": true,
"visual_chunk": true
},
"models_ready": true,
"models_count": 42,
"scripts_integrity": {"matched": 332, "total": 345, "ok": false},
"ffmpeg": true
},
"schema": {
"table_exists": true,
"applied": [{"filename": "migrate_add_users_table.sql"}],
"required": [],
"ok": true
},
"identities": {
"directory_exists": true,
"files_count": 3481,
"index_ok": true,
"db_count": 3481,
"synced": true
},
"integrations": {
"tmdb": {
"api_key_configured": false,
"enabled": false,
"api_reachable": null
}
}
}
```
#### Response Fields
| Field | Type | Description |
|-------|------|-------------|
| `status` | string | `ok` if all essential services healthy |
| `services` | object | Per-service status (postgres, redis, qdrant) |
| `services.*.status` | string | `ok`, `error`, or `degraded` |
| `services.*.latency_ms` | int | Response time in milliseconds |
| `resources` | object | CPU, memory usage |
| `pipeline.scripts_ready` | boolean | Scripts directory accessible |
| `pipeline.scripts_count` | int | Number of Python processor scripts |
| `pipeline.processors` | object | Per-processor availability |
| `pipeline.models_ready` | boolean | Models directory accessible |
| `pipeline.scripts_integrity` | object | SHA256 checksum verification results |
| `schema.ok` | boolean | All required migrations applied |
| `identities.synced` | boolean | Identity file count matches DB count |
| `integrations.tmdb` | object | TMDB API key config and reachability |
#### Health status rules
| Condition | status |
|-----------|--------|
| All services ok | `ok` |
| Any service error | `degraded` |
| Postgres or Redis error | `degraded` (server still responds) |
---
### Stats Endpoints
| Method | Endpoint | Auth | Description |
|--------|----------|------|-------------|
| GET | `/api/v1/stats/sftpgo` | No | SFTPGo service status |

View File

@@ -0,0 +1,184 @@
<!-- module: register -->
<!-- description: File registration — register, scan -->
<!-- depends: 01_auth -->
## File Registration
### `POST /api/v1/files/register`
**Auth**: Required
**Scope**: file-level
Register a video file for processing. Returns the file's metadata and UUID.
**New in v0.1.2**: Registration now **automatically triggers the processing pipeline** — no need to call `POST /api/v1/file/:file_uuid/process` separately. The system will:
1. Register the file and run ffprobe
2. Auto-run offline TMDb probe (reads local identity files, no API calls)
3. Create a monitor job for the worker
4. Worker starts all 10 processors (Cut → ASR → ASRX → YOLO → OCR → Face → Pose → VisualChunk → Story → 5W1H)
If the file already exists (same content hash), returns the existing record with `already_exists: true`.
#### Request Parameters
| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `file_path` | string | Yes | — | Path to video file on disk |
| `pattern` | string | No | — | Regex pattern for batch register (requires `file_path` to be a directory) |
| `user_id` | integer | No | — | User ID to associate with registration |
| `content_hash` | string | No | — | Pre-computed SHA-256 hash (skips computation) |
#### Example
```bash
# Register a single file
curl -s -X POST "$API/api/v1/files/register" \
-H "Content-Type: application/json" \
-H "X-API-Key: $KEY" \
-d '{"file_path": "/path/to/video.mp4"}'
# Batch register files matching a pattern in a directory
curl -s -X POST "$API/api/v1/files/register" \
-H "Content-Type: application/json" \
-H "X-API-Key: $KEY" \
-d '{"file_path": "/path/to/dir", "pattern": ".*\\.mp4$"}'
```
#### Response (200)
```json
{
"success": true,
"file_uuid": "3a6c1865...",
"file_name": "video.mp4",
"file_path": "/path/to/video.mp4",
"file_type": "video",
"duration": 120.5,
"width": 1920,
"height": 1080,
"fps": 24.0,
"total_frames": 2892,
"already_exists": false,
"message": "File registered successfully"
}
```
| Field | Type | Description |
|-------|------|-------------|
| `success` | boolean | Always true on 200 |
| `file_uuid` | string | 32-char hex UUID of the registered file |
| `file_name` | string | File name (auto-renamed if name conflict) |
| `file_path` | string | Canonical path on disk |
| `file_type` | string | `"video"`, `"audio"`, or `"unknown"` |
| `duration` | float | Duration in seconds |
| `width` | integer | Video width in pixels |
| `height` | integer | Video height in pixels |
| `fps` | float | Frames per second |
| `total_frames` | integer | Total frame count |
| `already_exists` | boolean | True if same content was already registered |
| `message` | string | Human-readable status |
#### Error Responses
| HTTP | When |
|------|------|
| `401` | Missing or invalid API key |
| `400` | Invalid request body |
| `404` | File path does not exist |
---
### `GET /api/v1/files/scan`
**Auth**: Required
**Scope**: file-level
Scan the filesystem directory and list all media files, showing which are registered, processing, or unregistered.
#### Query Parameters
| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `page` | integer | No | 1 | Page number (1-based) |
| `page_size` | integer | No | all | Items per page (alias: `limit`) |
| `limit` | integer | No | all | Max items (alias for `page_size`) |
| `pattern` | string | No | — | Regex filter on file name (e.g., `.*\\.mp4$`) |
| `sort_by` | string | No | `name` | Sort field: `name`, `size`, `modified`, `status` |
| `sort_order` | string | No | `asc` | Sort direction: `asc` or `desc` |
#### Example
```bash
# Full scan
curl -s "$API/api/v1/files/scan" -H "X-API-Key: $KEY" | jq '{total, registered_count, unregistered_count}'
# Paginated (page 1, 5 per page)
curl -s "$API/api/v1/files/scan?page=1&page_size=5" -H "X-API-Key: $KEY" | jq '{page, total_pages, files: [.files[].file_name]}'
# Regex filter: only mp4 files
curl -s "$API/api/v1/files/scan?pattern=.*\\.mp4$" -H "X-API-Key: $KEY" | jq '{filtered_total, files: [.files[].file_name]}'
# Sort by file size (largest first)
curl -s "$API/api/v1/files/scan?sort_by=size&sort_order=desc&page_size=5" -H "X-API-Key: $KEY" | jq '[.files[] | {file_name, file_size}]'
# Sort by modified time (most recent first)
curl -s "$API/api/v1/files/scan?sort_by=modified&sort_order=desc&page_size=5" -H "X-API-Key: $KEY" | jq '[.files[] | {file_name, modified_time}]'
# Sort by status
curl -s "$API/api/v1/files/scan?sort_by=status&page_size=5" -H "X-API-Key: $KEY" | jq '[.files[] | {file_name, status}]'
```
#### Response (200)
```json
{
"files": [
{
"file_name": "video.mp4",
"file_size": 12345678,
"is_registered": true,
"file_uuid": "3a6c1865...",
"status": "completed",
"registration_time": "2026-05-16T12:00:00Z",
"job_id": 42
}
],
"total": 107,
"filtered_total": 80,
"page": 1,
"page_size": 20,
"total_pages": 4,
"registered_count": 26,
"unregistered_count": 81
}
```
| Field | Type | Description |
|-------|------|-------------|
| `files` | array | Array of file info objects (paginated) |
| `files[].file_name` | string | File name |
| `files[].relative_path` | string | Path relative to scan root |
| `files[].file_path` | string | Absolute path on disk |
| `files[].file_size` | integer | File size in bytes |
| `files[].modified_time` | string | Last modified timestamp (ISO8601) |
| `files[].is_registered` | boolean | Whether file is registered in DB |
| `files[].file_uuid` | string | 32-char hex UUID (only if registered) |
| `files[].status` | string | `"completed"`, `"processing"`, `"registered"`, `"unregistered"`, or `null` |
| `files[].registration_time` | string | DB registration timestamp (only if registered) |
| `files[].job_id` | integer | Processing job ID (only if a job exists) |
| `total` | integer | Total files found on disk (unfiltered) |
| `filtered_total` | integer | Files matching regex filter |
| `page` | integer | Current page number |
| `page_size` | integer | Items per page |
| `total_pages` | integer | Total pages |
| `registered_count` | integer | Files registered in DB |
| `unregistered_count` | integer | Files not yet registered |
#### Notes
| Feature | Behavior |
|---------|----------|
| **Regex** | Case-insensitive (`(?i)` prefix auto-applied). Applied to `file_name`. |
| **Sort order** | Default (`sort_by=name`): registered files first, then alphabetically. `sort_by=status`: alphabetical by status string. |
| **Pagination** | `page_size` and `limit` are aliases. Default: show all results. |
| **Processing order** | `pattern` regex filter → `sort_by`/`sort_order``page`/`page_size` slice. |

View File

@@ -0,0 +1,138 @@
<!-- module: lookup -->
<!-- description: File lookup by name and unregistration -->
<!-- depends: 01_auth, 03_register -->
## File Lookup
### `GET /api/v1/files/lookup`
**Auth**: Required
**Scope**: file-level
Search registered files by file name. Performs a case-insensitive LIKE search on the file name column. Returns basic info about matching files.
#### Query Parameters
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `file_name` | string | Yes | File name to search for (partial matches supported) |
#### Example
```bash
# Look up a specific file
curl -s "$API/api/v1/files/lookup?file_name=video.mp4" \
-H "X-API-Key: $KEY"
# Partial name search
curl -s "$API/api/v1/files/lookup?file_name=charade" \
-H "X-API-Key: $KEY" | jq '.matches[].file_name'
```
#### Response (200)
```json
{
"file_name": "video.mp4",
"exists": true,
"matches": [
{
"file_uuid": "a03485a40b2df2d3",
"file_name": "video.mp4",
"file_type": "video",
"status": "completed"
}
],
"next_name": "video (2).mp4"
}
```
| Field | Type | Description |
|-------|------|-------------|
| `file_name` | string | Searched name |
| `exists` | boolean | Exact name match exists |
| `matches` | array | Array of matching registered files |
| `matches[].file_uuid` | string | 32-char hex UUID |
| `matches[].file_name` | string | Registered file name |
| `matches[].file_type` | string | `"video"`, `"audio"`, or `null` |
| `matches[].status` | string | Registration/processing status |
| `next_name` | string | Suggested name for avoiding conflicts |
---
## Unregister
### `POST /api/v1/unregister`
**Auth**: Required
**Scope**: file-level
Delete a registered file from the system. Supports single file by UUID, or batch by directory + regex pattern.
#### What gets deleted
| Removed (default) | Not removed |
|---------|-------------|
| Database records (videos, chunks, embeddings, processor_results, pre_chunks) | The original source video file on disk |
| Processor output JSON files (`{uuid}.*.json`) — unless `delete_output_files: false` | Temp/working directories |
| In-memory cache entries | |
| MongoDB cached lists | |
> ⚠️ Database deletion is **irreversible**. To keep output files, set `"delete_output_files": false`.
#### Request Parameters
At least one mode must be specified: either `file_uuid` alone, or `file_path` + `pattern` together.
| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `file_uuid` | string | * | — | Single file UUID to delete |
| `file_path` | string | * | — | Directory path (for batch delete) |
| `pattern` | string | * | — | Regex pattern (requires `file_path`) |
| `delete_output_files` | boolean | No | `true` | If `true`, also delete processor output JSON files (`{uuid}.*.json`). Set to `false` to keep them. |
#### Example
```bash
# Delete a single file by UUID (default: also deletes output JSON files)
curl -s -X POST "$API/api/v1/unregister" \
-H "Content-Type: application/json" \
-H "X-API-Key: $KEY" \
-d '{"file_uuid": "'"$FILE_UUID"'"}'
# Keep output JSON files, only delete DB records
curl -s -X POST "$API/api/v1/unregister" \
-H "Content-Type: application/json" \
-H "X-API-Key: $KEY" \
-d '{"file_uuid": "'"$FILE_UUID"'", "delete_output_files": false}'
# Batch delete all mp4 files in a directory
curl -s -X POST "$API/api/v1/unregister" \
-H "Content-Type: application/json" \
-H "X-API-Key: $KEY" \
-d '{"file_path": "/path/to/dir", "pattern": ".*\\.mp4$"}'
```
#### Response (200)
```json
{
"success": true,
"file_uuid": "a03485a40b2df2d3",
"message": "Video unregistered successfully"
}
```
| Field | Type | Description |
|-------|------|-------------|
| `success` | boolean | True if deletion succeeded |
| `file_uuid` | string | UUID of the deleted file (single mode) |
| `message` | string | Human-readable status |
#### Error Responses
| HTTP | When |
|------|------|
| `400` | Neither `file_uuid` nor `file_path`+`pattern` provided |
| `404` | File UUID not found |
| `401` | Missing or invalid API key |

View File

@@ -0,0 +1,236 @@
<!-- module: process -->
<!-- description: Processing pipeline — trigger, probe, progress, jobs -->
<!-- depends: 01_auth, 03_register -->
## Processing Pipeline
### `POST /api/v1/file/:file_uuid/process`
**Auth**: Required
**Scope**: file-level
Trigger the processing pipeline for a registered file. Creates a monitor job that the worker picks up and processes sequentially. Returns immediately with the job info—processing runs asynchronously in the background.
#### Request Parameters
| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `processors` | string[] | No | all | Specific processors to run: `["cut","asr","asrx","yolo","ocr","face","pose","visual_chunk","story","5w1h"]` |
| `rules` | string[] | No | all | Rule names to apply (currently unused) |
#### Example
```bash
# Run all processors
curl -s -X POST "$API/api/v1/file/$FILE_UUID/process" \
-H "Content-Type: application/json" \
-H "X-API-Key: $KEY" -d '{}'
# Run specific processors only
curl -s -X POST "$API/api/v1/file/$FILE_UUID/process" \
-H "Content-Type: application/json" \
-H "X-API-Key: $KEY" \
-d '{"processors": ["asr", "face", "yolo"]}'
```
#### Response (200)
```json
{
"success": true,
"job_id": 42,
"file_uuid": "3a6c1865...",
"status": "processing",
"pids": [12345, 12346],
"message": "Processing triggered for video.mp4"
}
```
| Field | Type | Description |
|-------|------|-------------|
| `success` | boolean | Always true on 200 |
| `job_id` | integer | Monitor job ID (for job tracking) |
| `file_uuid` | string | 32-char hex UUID of the file |
| `status` | string | `"processing"` |
| `pids` | integer[] | Process IDs of started processors |
| `message` | string | Human-readable status |
#### Error Responses
| HTTP | When |
|------|------|
| `404` | File UUID not found |
| `401` | Missing or invalid API key |
---
### `GET /api/v1/file/:file_uuid/probe`
**Auth**: Required
**Scope**: file-level
Get ffprobe metadata for a registered file. Returns video/audio stream info, codec details, duration, resolution, and frame rate.
#### Example
```bash
curl -s "$API/api/v1/file/$FILE_UUID/probe" -H "X-API-Key: $KEY"
```
#### Response (200)
```json
{
"file_uuid": "3a6c1865...",
"file_name": "video.mp4",
"file_size": 794863677,
"duration": 120.5,
"width": 1920,
"height": 1080,
"fps": 24.0,
"total_frames": 2892,
"cached": true,
"format": {
"filename": "/path/to/video.mp4",
"format_name": "mov,mp4,m4a,3gp",
"duration": "120.5",
"size": "12345678",
"bit_rate": "819200"
},
"streams": [
{
"index": 0,
"codec_name": "h264",
"codec_type": "video",
"width": 1920,
"height": 1080,
"r_frame_rate": "24/1",
"duration": "120.5"
}
]
}
```
| Field | Type | Description |
|-------|------|-------------|
| `file_uuid` | string | 32-char hex UUID |
| `file_name` | string | File name |
| `file_size` | integer | File size in bytes (from filesystem) |
| `duration` | float | Duration in seconds |
| `width` | integer | Video width in pixels |
| `height` | integer | Video height in pixels |
| `fps` | float | Frames per second |
| `total_frames` | integer | Estimated total frames |
| `cached` | boolean | True if result was from cached probe JSON |
| `format` | object | Container format info (ffprobe format section) |
| `streams` | array | Array of stream info objects |
---
### `GET /api/v1/progress/:file_uuid`
**Auth**: Required
**Scope**: file-level
Get real-time processing progress for a file via Redis pub/sub. Includes per-processor status, current/total frames, ETA, and system resource stats.
#### Pipeline Order
| Order | Processor | Dependencies | Description |
|-------|-----------|-------------|-------------|
| 1 | `cut` | — | Scene detection |
| 2 | `asr` | cut | Speech-to-text (per scene) |
| 3 | `asrx` | asr | Speaker diarization |
| 4 | `yolo` | — | Object detection |
| 5 | `ocr` | — | Text recognition |
| 6 | `face` | — | Face detection & embedding |
| 7 | `pose` | — | Pose estimation |
| 8 | `visual_chunk` | yolo | Visual scene chunks |
| 9 | `story` | asr, asrx, cut, yolo, face | Scene summaries (template) |
| 10 | `5w1h` | story | 5W1H analysis (Gemma4 LLM) |
All processors except `story` and `5w1h` run concurrently when their dependencies are met. Story and 5W1H run sequentially after their prerequisites.
#### Example
```bash
curl -s "$API/api/v1/progress/$FILE_UUID" -H "X-API-Key: $KEY" | jq '{overall_progress, processors: [.processors[] | {processor_type, status}]}'
```
#### Response (200)
```json
{
"file_uuid": "3a6c1865...",
"overall_progress": 71,
"cpu_percent": 45.2,
"gpu_percent": 30.1,
"memory_percent": 62.4,
"processors": [
{"processor_type": "asr", "status": "complete", "progress": 100},
{"processor_type": "yolo", "status": "running", "progress": 65},
{"processor_type": "face", "status": "pending", "progress": 0}
]
}
```
| Field | Type | Description |
|-------|------|-------------|
| `file_uuid` | string | 32-char hex UUID |
| `overall_progress` | integer | Overall progress percentage (0100) |
| `processors` | array | Per-processor status list |
| `processors[].processor_type` | string | Processor name (`asr`, `cut`, `yolo`, etc.) |
| `processors[].status` | string | `"pending"`, `"running"`, `"complete"`, or `"failed"` |
| `processors[].progress` | integer | Per-processor progress (0100) |
| `processors[].eta_seconds` | integer | Estimated seconds remaining (running processors) |
| `processors[].current` | integer | Current frame count |
| `processors[].total` | integer | Total frame count |
| `cpu_percent` | float | Current CPU usage |
| `gpu_percent` | float | Current GPU utilization |
| `memory_percent` | float | Current memory usage |
---
### `GET /api/v1/jobs`
**Auth**: Required
**Scope**: system-level
List all processing jobs (monitor jobs) in the system. Shows job status, which file each job is processing, and current processor info.
#### Example
```bash
curl -s "$API/api/v1/jobs" -H "X-API-Key: $KEY" | jq '{count, jobs: [.jobs[] | {uuid, status}]}'
```
#### Response (200)
```json
{
"jobs": [
{
"id": 42,
"uuid": "3a6c1865...",
"status": "running",
"current_processor": "yolo",
"created_at": "2026-05-16T12:00:00Z",
"started_at": "2026-05-16T12:01:00Z"
}
],
"count": 15,
"page": 1,
"page_size": 20
}
```
| Field | Type | Description |
|-------|------|-------------|
| `jobs` | array | Array of job info objects |
| `jobs[].id` | integer | Job ID |
| `jobs[].uuid` | string | File UUID being processed |
| `jobs[].status` | string | `"pending"`, `"running"`, `"completed"`, `"failed"` |
| `jobs[].current_processor` | string | Currently active processor, or null |
| `count` | integer | Total job count |
| `page` | integer | Current page number |
| `page_size` | integer | Jobs per page |

View File

@@ -0,0 +1,145 @@
<!-- module: search -->
<!-- description: Vector search, BM25, smart search, universal search, visual search -->
<!-- depends: 01_auth -->
## Search APIs
### `POST /api/v1/search/smart`
**Auth**: Required
**Scope**: file-level
Semantic vector search using EmbeddingGemma-300m. Generates a query embedding via EmbeddingGemma (port 11436), then searches pgvector `story_parent` and `llm_parent` chunks by cosine similarity.
#### Request Parameters
| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `file_uuid` | string | Yes | — | File UUID to search within |
| `query` | string | Yes | — | Search text |
| `limit` | integer | No | 5 | Max results to return |
| `page` | integer | No | 1 | Page number |
| `page_size` | integer | No | 5 | Items per page |
#### Example
```bash
curl -s -X POST "$API/api/v1/search/smart" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $JWT" \
-d '{"file_uuid": "'"$FILE_UUID"'", "query": "Audrey Hepburn"}'
```
#### Response (200)
```json
{
"query": "Audrey Hepburn",
"results": [
{
"parent_id": 1087822,
"scene_order": 1087822,
"start_frame": 104438,
"end_frame": 104538,
"fps": 24.0,
"start_time": 4351.6,
"end_time": 4355.76,
"summary": "[4352s-4356s, 4s] Cast: Audrey Hepburn. Total: 2 lines, 10 words. Speakers: Audrey Hepburn (2 lines)",
"similarity": 0.67
}
],
"page": 1,
"page_size": 5,
"strategy": "semantic_vector_search"
}
```
---
### `POST /api/v1/search/universal`
**Auth**: Required
**Scope**: file-level
Multi-type BM25 full-text search across chunks, frames, and persons. Uses PostgreSQL `tsvector`.
#### Request Parameters
| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `query` | string | Yes | — | Search text |
| `file_uuid` | string | No | — | Restrict to specific file |
| `types` | string[] | No | `["chunk","frame","person"]` | Search types |
| `limit` | integer | No | 10 | Max results per type |
| `page` | integer | No | 1 | Page number |
| `page_size` | integer | No | 20 | Items per page |
#### Example
```bash
curl -s -X POST "$API/api/v1/search/universal" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $JWT" \
-d '{"file_uuid": "'"$FILE_UUID"'", "query": "Cary Grant"}'
```
#### Response (200)
```json
{
"results": [
{
"type": "chunk",
"chunk_id": "bd80fec92b0b6963d177a2c55bf713e2_2",
"chunk_type": "story_child",
"start_frame": 5103,
"end_frame": 5127,
"start_time": 212.64,
"end_time": 213.64,
"text": "[213s-214s] Cary Grant: \"Olá!\"",
"score": 0.9
}
],
"total": 20,
"took_ms": 18
}
```
---
### `POST /api/v1/search/frames`
**Auth**: Required
**Scope**: file-level
Search face detection frames by identity name or trace ID.
---
### `POST /api/v1/search/identity_text`
**Auth**: Required
**Scope**: file-level
Search text chunks spoken by a specific identity.
---
### Visual Search
| Method | Endpoint | Description |
|--------|----------|-------------|
| POST | `/api/v1/search/visual` | Search visual chunks |
| POST | `/api/v1/search/visual/class` | Search by object class |
| POST | `/api/v1/search/visual/density` | Search by object density |
| POST | `/api/v1/search/visual/combination` | Search by object combination |
| POST | `/api/v1/search/visual/stats` | Visual chunk statistics |
#### Embedding Model
| Detail | Value |
|--------|-------|
| **Model** | EmbeddingGemma-300m |
| **Endpoint** | `POST /api/v1/embeddings` on port 11436 |
| **Dimension** | 768 |
| **Storage** | pgvector (`chunk.embedding` column) |

View File

@@ -0,0 +1,516 @@
<!-- module: identity -->
<!-- description: Global identities — CRUD, detail, files, faces, bind, unbind, search -->
<!-- depends: 01_auth -->
## Global Identities
### `GET /api/v1/identities`
**Auth**: Required
**Scope**: identity-level
List all registered identities with pagination.
#### Example
```bash
curl -s "$API/api/v1/identities?page=1&page_size=20" -H "X-API-Key: $KEY" | jq '{count, identities: [.identities[] | {name}]}'
```
---
### `GET /api/v1/identity/:identity_uuid`
**Auth**: Required
**Scope**: identity-level
Get detailed information for a specific identity, including metadata and TMDb references.
#### Example
```bash
curl -s "$API/api/v1/identity/$IDENTITY_UUID" -H "X-API-Key: $KEY"
```
#### Response (200)
```json
{
"success": true,
"identity_uuid": "a9a901056d6b46ff92da0c3c1a57dff4",
"name": "Cary Grant",
"identity_type": "people",
"source": "tmdb",
"status": "confirmed",
"tmdb_id": 112,
"tmdb_profile": "{output}/identities/{identity_uuid}/profile.jpg",
"metadata": {},
"reference_data": {},
"created_at": "2026-05-16T12:00:00Z",
"updated_at": null
}
```
| Field | Type | Description |
|-------|------|-------------|
| `identity_uuid` | string | Identity identifier |
| `name` | string | Identity name |
| `identity_type` | string | `"people"` or null |
| `source` | string | `.json`, `auto`, `tmdb`, `user_defined`, or `merged` |
| `status` | string | `"confirmed"`, `"pending"`, or `"inactive"` |
| `tmdb_id` | integer | TMDb person ID (only if source = tmdb) |
| `tmdb_profile` | string | Local profile image path (`{output}/identities/{uuid}/profile.jpg`) |
| `metadata` | object | Metadata JSON (tmdb_character, cast_order, etc.) |
| `created_at` | string | Creation timestamp |
---
### `DELETE /api/v1/identity/:identity_uuid`
**Auth**: Required
**Scope**: identity-level
Delete an identity permanently.
---
### `PATCH /api/v1/identity/:identity_uuid`
**Auth**: Required
**Scope**: identity-level
Partially update an identity. Only provided fields are modified. The `name` field is a display label and may repeat across identities. Aliases for multilingual display are stored in `metadata.aliases` (see BCP 47 reference below).
#### Request (JSON, all fields optional)
| Field | Type | Description |
|-------|------|-------------|
| `name` | string | New display name |
| `metadata` | object | Merged into existing metadata. Use `"aliases"` key for locale-tagged names |
| `status` | string | `"confirmed"`, `"pending"`, or `"skipped"` |
| `identity_type` | string | `"people"`, `"brand"`, `"object"`, `"concept"`, etc. |
#### Example
```bash
curl -s -X PATCH "$API/api/v1/identity/$IDENTITY_UUID" \
-H "X-API-Key: $KEY" \
-H "Content-Type: application/json" \
-d '{
"name": "John Smith",
"metadata": {
"aliases": [
{"locale": "en", "name": "John Smith"},
{"locale": "zh-TW", "name": "約翰·史密斯"},
{"locale": "ja", "name": "ジョン・スミス"}
]
}
}'
```
#### Response (200)
```json
{
"success": true,
"identity_uuid": "a9a901056d6b46ff92da0c3c1a57dff4",
"updated_fields": ["name", "metadata"]
}
```
#### Error Responses
| HTTP | When |
|------|------|
| `400` | No fields to update or invalid UUID format |
| `404` | Identity not found |
---
### `GET /api/v1/identity/:identity_uuid/files`
**Auth**: Required
**Scope**: identity-level
Get all files where this identity appears. Returns per-file summary including face count, confidence, and appearance time range.
#### Example
```bash
curl -s "$API/api/v1/identity/$IDENTITY_UUID/files" -H "X-API-Key: $KEY"
```
#### Response (200)
```json
{
"success": true,
"identity_uuid": "c3545906c82d4b66aa1d150bc02decce",
"total": 1,
"page": 1,
"page_size": 20,
"data": [
{
"file_uuid": "aeed71342a899fe4b4c57b7d41bcb692",
"file_name": "Charade (1963) Cary Grant & Audrey Hepburn.mp4",
"file_path": "/path/to/videos/Charade.mp4",
"status": "completed",
"face_count": 19695,
"speaker_count": 0,
"first_appearance": 206.76,
"last_appearance": 6756.68,
"confidence": 0.803
}
]
}
```
#### Response Fields
| Field | Type | Description |
|-------|------|-------------|
| `file_uuid` | string | File identifier (full 32-char hex) |
| `file_name` | string | Video file name |
| `file_path` | string | Absolute path to video file |
| `status` | string | Video processing status (`"completed"`, `"processing"`, etc.) |
| `face_count` | int | Total face detections for this identity in this file |
| `speaker_count` | int | Speaker segments (reserved, always `0`) |
| `first_appearance` | float | First appearance time in seconds (computed from `frame_number / fps`) |
| `last_appearance` | float | Last appearance time in seconds |
| `confidence` | float | Average detection confidence |
---
### `GET /api/v1/identity/:identity_uuid/faces`
**Auth**: Required
**Scope**: identity-level
Get all face detection records associated with this identity.
#### Example
```bash
curl -s "$API/api/v1/identity/$IDENTITY_UUID/faces?page=1&page_size=20" -H "X-API-Key: $KEY"
```
#### Response (200)
```json
{
"success": true,
"identity_uuid": "c3545906c82d4b66aa1d150bc02decce",
"total": 19695,
"page": 1,
"page_size": 20,
"data": [
{
"id": 655704,
"file_uuid": "aeed71342a899fe4b4c57b7d41bcb692",
"frame_number": 5169,
"timestamp_secs": 206.76,
"face_id": "5169_0",
"bbox": {
"x": 706,
"y": 469,
"width": 618,
"height": 618
},
"confidence": 0.855
}
]
}
```
#### Response Fields
| Field | Type | Description |
|-------|------|-------------|
| `id` | int64 | Face detection record ID |
| `file_uuid` | string | File where face was detected |
| `frame_number` | int64 | Frame number (primary coordinate) |
| `timestamp_secs` | float | Time in seconds (computed as `frame_number / fps`) |
| `face_id` | string | Face ID (format: `{frame_number}_{detection_index}`) |
| `bbox` | object | Bounding box |
| `bbox.x` | float | Left coordinate |
| `bbox.y` | float | Top coordinate |
| `bbox.width` | float | Width in pixels |
| `bbox.height` | float | Height in pixels |
| `confidence` | float | Detection confidence (0.01.0) |
---
### `GET /api/v1/identity/:identity_uuid/chunks`
**Auth**: Required
**Scope**: identity-level
Get all text chunks (sentences) spoken while this identity's face was on screen. Useful for finding what a person said.
#### Example
```bash
curl -s "$API/api/v1/identity/$IDENTITY_UUID/chunks" -H "X-API-Key: $KEY"
```
#### Response (200)
```json
{
"success": true,
"identity_uuid": "a9a901056d6b46ff92da0c3c1a57dff4",
"data": [
{
"id": 0,
"file_uuid": "bd80fec92b0b6963d177a2c55bf713e2",
"chunk_id": "bd80fec92b0b6963d177a2c55bf713e2_2",
"chunk_type": "sentence",
"start_frame": 5103,
"end_frame": 5127,
"fps": 24.0,
"start_time": 212.64,
"end_time": 213.64,
"text_content": "[213s-214s] Cary Grant: \"Olá!\""
}
]
}
```
| Field | Type | Description |
|-------|------|-------------|
| `file_uuid` | string | File identifier |
| `chunk_id` | string | Sentence chunk identifier |
| `start_frame` | integer | Frame-accurate start position |
| `end_frame` | integer | Frame-accurate end position |
| `fps` | float | Frames per second |
| `start_time` | float | Start time in seconds |
| `end_time` | float | End time in seconds |
| `text_content` | string | Spoken text content |
---
### `POST /api/v1/identity/:identity_uuid/bind`
**Auth**: Required
**Scope**: identity-level
Bind a face detection to an identity. Associates the face trace with the identity for future search and recognition.
#### Request Parameters
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `file_uuid` | string | Yes | File where face is detected |
| `face_id` | string | Yes | Face ID (format: `{frame}_{idx}`) |
#### Example
```bash
curl -s -X POST "$API/api/v1/identity/$IDENTITY_UUID/bind" \
-H "X-API-Key: $KEY" \
-H "Content-Type: application/json" \
-d '{"file_uuid": "'"$FILE_UUID"'", "face_id": "1_5"}'
```
---
### `POST /api/v1/identity/:identity_uuid/unbind`
**Auth**: Required
**Scope**: identity-level
Unbind a face detection from an identity. Removes the identity association from the face record.
---
### `GET /api/v1/identities/search`
**Auth**: Required
**Scope**: identity-level
Search identities by name (ILIKE search). Returns matching identity records.
#### Example
```bash
curl -s "$API/api/v1/identities/search?q=Cary" -H "X-API-Key: $KEY"
```
| Field | Type | Description |
|-------|------|-------------|
| `name` | string | Identity name |
| `source` | string | Identity source |
| `tmdb_id` | integer | TMDb ID (if source = tmdb) |
| `file_uuid` | string | Associated file |
---
---
### `POST /api/v1/identity/upload`
**Auth**: Required
**Scope**: identity-level
Upload an identity.json file to create or update an identity. Accepts the same format as the identity.json files stored on disk.
If an identity with the same `identity_uuid` already exists, it will be updated with the new values.
#### Request
The request body is an `IdentityFile` object:
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `identity_uuid` | string | Yes | Identity identifier |
| `name` | string | Yes | Identity display name |
| `identity_type` | string | No | `"people"` or null |
| `source` | string | No | `.json`, `auto`, `tmdb`, `user_defined`, or `merged` |
| `status` | string | No | `"confirmed"`, `"pending"`, or `"inactive"` |
| `tmdb_id` | integer | No | TMDb person ID |
| `tmdb_profile` | string | No | TMDb profile image URL |
| `metadata` | object | No | Arbitrary metadata JSON |
| `file_bindings` | array | No | Array of `{ file_uuid, trace_ids, face_count }` (informational) |
#### Example
```bash
curl -s -X POST "$API/api/v1/identity/upload" \
-H "X-API-Key: $KEY" \
-H "Content-Type: application/json" \
-d '{
"version": 1,
"identity_uuid": "a9a901056d6b46ff92da0c3c1a57dff4",
"name": "Cary Grant",
"identity_type": "people",
"source": ".json",
"status": "confirmed",
"metadata": {},
"file_bindings": []
}'
```
#### Response (200)
```json
{
"success": true,
"identity_uuid": "a9a901056d6b46ff92da0c3c1a57dff4",
"name": "Cary Grant",
"message": "Identity uploaded successfully"
}
```
---
---
### `POST /api/v1/identity/:identity_uuid/profile-image`
**Auth**: Required
**Scope**: identity-level
Upload a profile image (JPEG or PNG) for an identity. The image is saved to `{output}/identities/{uuid}/profile.{ext}`.
Uses `multipart/form-data` with field name `image`.
#### Example
```bash
curl -s -X POST "$API/api/v1/identity/$IDENTITY_UUID/profile-image" \
-H "X-API-Key: $KEY" \
-F "image=@/path/to/photo.jpg"
```
#### Response (200)
```json
{
"success": true,
"identity_uuid": "a9a901056d6b46ff92da0c3c1a57dff4",
"path": "/path/to/output/identities/.../profile.jpg",
"message": "Profile image saved: profile.jpg"
}
```
#### Error Responses
| HTTP | When |
|------|------|
| `400` | Missing image field or unsupported format |
| `404` | Identity not found |
| `415` | Unsupported image type (use JPEG or PNG) |
---
### `GET /api/v1/identity/:identity_uuid/profile-image`
**Auth**: Required
**Scope**: identity-level
Retrieve the profile image for an identity. Returns the raw image data with appropriate Content-Type header.
```bash
curl -s "$API/api/v1/identity/$IDENTITY_UUID/profile-image" \
-H "X-API-Key: $KEY" -o profile.jpg
```
| Response Header | Value |
|----------------|-------|
| `content-type` | `image/jpeg` or `image/png` |
---
## Alias System (BCP 47 Locale Tags)
Identity aliases support multilingual display names. Aliases are stored in `metadata.aliases` as an array of `{locale, name}` objects.
### BCP 47 Locale Tags Reference
| Locale | Tag | Example |
|--------|-----|---------|
| English | `en` | John Smith |
| Traditional Chinese | `zh-TW` | 約翰·史密斯 |
| Simplified Chinese | `zh-CN` | 约翰·史密斯 |
| Japanese | `ja` | ジョン・スミス |
| Korean | `ko` | 존 스미스 |
| Cantonese | `yue` | 約翰·史密夫 |
| French | `fr` | John Smith (French spelling) |
| Spanish | `es` | Juan Smith |
| Arabic | `ar` | جون سميث |
| Russian | `ru` | Джон Смит |
| Thai | `th` | จอห์น สมิธ |
BCP 47 is the IETF standard for language tags. Format: `language` (e.g. `en`, `ja`) or `language-Region` (e.g. `zh-TW`, `zh-CN`).
### Frontend Display Logic
```javascript
function getDisplayName(identity, preferredLocale) {
const match = identity.metadata?.aliases?.find(a => a.locale === preferredLocale);
if (match) return match.name;
const lang = preferredLocale.split('-')[0];
const langMatch = identity.metadata?.aliases?.find(a => a.locale.startsWith(lang));
if (langMatch) return langMatch.name;
return identity.name;
}
```
### Updating Aliases via PATCH
```json
PATCH /api/v1/identity/:identity_uuid
{
"metadata": {
"aliases": [
{"locale": "en", "name": "John Smith"},
{"locale": "zh-TW", "name": "約翰·史密斯"}
]
}
}
```
---
*Updated: 2026-05-22*

View File

@@ -0,0 +1,65 @@
<!-- module: identity_agent -->
<!-- description: Identity agent — match from photo, match from trace -->
<!-- depends: 01_auth, 07_identity -->
## Identity Agent
### `POST /api/v1/agents/identity/match-from-photo`
**Auth**: Required
**Scope**: file-level
Upload a face photo to match against known identities. Detects face via InsightFace, extracts 512D embedding via CoreML FaceNet, then searches pgvector for the closest identity.
#### Request
`multipart/form-data` with field `image` (JPEG/PNG) and optional `file_uuid`.
#### Example
```bash
curl -s -X POST "$API/api/v1/agents/identity/match-from-photo" \
-H "Authorization: Bearer $JWT" \
-F "image=@/path/to/face.jpg" \
-F "file_uuid=$FILE_UUID"
```
#### Response (200)
```json
{
"success": true,
"matches": [
{
"identity_uuid": "a9a90105...",
"name": "Cary Grant",
"similarity": 0.87
}
]
}
```
---
### `POST /api/v1/agents/identity/match-from-trace`
**Auth**: Required
**Scope**: file-level
Match a face trace (tracked face across frames) against known identities. Samples 3 angles from the trace, generates embeddings, and searches pgvector.
#### Request Parameters
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `file_uuid` | string | Yes | File containing the trace |
| `trace_id` | integer | Yes | Face trace ID to match |
#### Example
```bash
curl -s -X POST "$API/api/v1/agents/identity/match-from-trace" \
-H "Authorization: Bearer $JWT" \
-H "Content-Type: application/json" \
-d '{"file_uuid": "'"$FILE_UUID"'", "trace_id": 10}'
```

View File

@@ -0,0 +1,317 @@
<!-- module: media -->
<!-- description: Video streaming & frame extraction -->
<!-- depends: 01_auth -->
## Video Streaming & Frame Extraction
All video streaming endpoints support the following common query parameters:
| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `mode` | string | No | `normal` | `normal` or `debug` (draws detection overlays) |
| `audio` | string | No | `on` | `on` or `off` |
---
### `GET /api/v1/file/:file_uuid/video`
Stream the full video file with range support for seeking.
**Auth**: Required
**Scope**: file-level
#### Response
- **200**: Video stream (`Content-Type` based on file extension)
- **206**: Partial content (range request)
- Supports `Range` header for seeking
---
### `GET /api/v1/file/:file_uuid/trace/:trace_id/video`
Stream video with highlights for a specific face trace (follows a single person across frames with bounding box overlay).
**Auth**: Required
**Scope**: file-level
---
### `GET /api/v1/file/:file_uuid/trace/:trace_id/representative-face`
Find the best single face to represent this trace. Uses a two-stage selection: SQL (area × confidence → top 10) then FFmpeg `blurdetect` (sharpness → pick the least blurry).
**Auth**: Required
**Scope**: file-level
#### Example
```bash
curl -s "$API/api/v1/file/$FILE_UUID/trace/1939/representative-face" \
-H "X-API-Key: $KEY"
```
#### Response (200)
```json
{
"success": true,
"file_uuid": "aeed71342a899fe4b4c57b7d41bcb692",
"trace_id": 1939,
"face_count": 538,
"representative": {
"frame_number": 68193,
"timestamp_secs": 2727.72,
"bbox": { "x": 347, "y": 378, "width": 427, "height": 427 },
"confidence": 0.760,
"quality_score": 138516,
"blur_score": 9.46
}
}
```
#### Response Fields
| Field | Type | Description |
|-------|------|-------------|
| `trace_id` | integer | Face trace ID |
| `face_count` | integer | Total face detections in this trace |
| `representative.frame_number` | integer | Frame number of the selected face (primary coordinate) |
| `representative.timestamp_secs` | float | Time in seconds (derived from `frame_number / fps`) |
| `representative.bbox` | object | Bounding box `{x, y, width, height}` |
| `representative.confidence` | float | Detection confidence (0.01.0) |
| `representative.quality_score` | float | Pre-selection score (`area × confidence`) |
| `representative.blur_score` | float | FFmpeg blurdetect result (lower = sharper) |
#### Error Responses
---
### `GET /api/v1/file/:file_uuid/trace/:trace_id/thumbnail`
Extract the best face image for a trace as JPEG (320×320). Internally selects the face using the same two-stage algorithm as `representative-face`, then crops via FFmpeg. The result is cacheable for 24 hours.
**Auth**: Required
**Scope**: file-level
#### Example
```bash
curl -s "$API/api/v1/file/$FILE_UUID/trace/1939/thumbnail" \
-H "X-API-Key: $KEY" -o trace_1939_face.jpg
```
#### Response
- **200**: `image/jpeg` binary data (320×320 cropped face)
- **404**: File, trace not found, or no suitable face
- **500**: FFmpeg or database error
---
### `GET /api/v1/file/:file_uuid/identities/:identity_uuid_a/co-occur-with/:identity_uuid_b`
Find the first frame where two identities appear together, with representative face thumbnails for both.
**Auth**: Required
**Scope**: file-level
#### Example
```bash
# Audrey Hepburn & Cary Grant 第一次同框
curl -s "$API/api/v1/file/$FILE_UUID/identities/$AUDREY_UUID/co-occur-with/$CARY_UUID" \
-H "X-API-Key: $KEY" | jq '{identity_a: .identity_a.name, identity_b: .identity_b.name, first_frame: .first_cooccurrence.frame_number}'
```
#### Response (200)
```json
{
"success": true,
"file_uuid": "aeed71342a899fe4b4c57b7d41bcb692",
"identity_a": {
"identity_uuid": "c3545906-c82d-4b66-aa1d-150bc02decce",
"name": "Audrey Hepburn",
"trace_id": 920
},
"identity_b": {
"identity_uuid": "2b0ddefe-e2a9-4533-9308-b375594604d5",
"name": "Cary Grant",
"trace_id": 919
},
"first_cooccurrence": {
"frame_number": 38165,
"timestamp_secs": 1526.60,
"total_cooccurrence_frames": 3136,
"representative_face_a": {
"frame_number": 38199,
"bbox": { "x": 122, "y": 339, "width": 176, "height": 176 },
"confidence": 0.832,
"thumbnail_url": "/api/v1/file/aeed71342.../trace/920/thumbnail"
},
"representative_face_b": {
"frame_number": 38291,
"bbox": { "x": 511, "y": 315, "width": 192, "height": 192 },
"confidence": 0.791,
"thumbnail_url": "/api/v1/file/aeed71342.../trace/919/thumbnail"
}
}
}
```
#### Response Fields
| Field | Type | Description |
|-------|------|-------------|
| `identity_a.name` | string | First identity name |
| `identity_b.name` | string | Second identity name |
| `first_cooccurrence.frame_number` | int | First frame where both appear |
| `first_cooccurrence.timestamp_secs` | float | Time in seconds |
| `first_cooccurrence.total_cooccurrence_frames` | int | Total frames with both present |
| `first_cooccurrence.representative_face_a/b` | object | Best face thumbnail data for each identity |
#### Error Responses
| HTTP | When |
|------|------|
| `404` | File or identity not found |
| `404` | The two identities never co-occur in this file |
| `500` | Database or FFmpeg error |
### `GET /api/v1/file/:file_uuid/video/bbox`
Stream video with bounding box overlay for all detected objects/faces.
**Auth**: Required
**Scope**: file-level
Uses a built-in 5×7 bitmap font renderer to draw labels directly on video frames via FFmpeg `drawtext` filter.
---
### `GET /api/v1/file/:file_uuid/thumbnail`
Extract a single frame from a video as JPEG image. Uses FFmpeg `select` filter.
**Auth**: Required
**Scope**: file-level
#### Query Parameters
| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `frame` | integer | Yes | — | Zero-based frame number to extract |
| `x` | integer | No | — | Crop start X (left edge). Requires `y`, `w`, `h`. |
| `y` | integer | No | — | Crop start Y (top edge). Requires `x`, `w`, `h`. |
| `w` | integer | No | — | Crop width in pixels. Requires `x`, `y`, `h`. |
| `h` | integer | No | — | Crop height in pixels. Requires `x`, `y`, `w`. |
All four crop params (`x`, `y`, `w`, `h`) must be provided together or omitted.
#### Example
```bash
# Extract frame 1000 (full frame)
curl -s "$API/api/v1/file/bd80fec92b0b6963d177a2c55bf713e2/thumbnail?frame=1000" \
-H "Authorization: Bearer $JWT" -o frame_1000.jpg
# Extract and crop face region (x=320, y=240, w=160, h=160)
curl -s "$API/api/v1/file/bd80fec92b0b6963d177a2c55bf713e2/thumbnail?frame=1000&x=320&y=240&w=160&h=160" \
-H "Authorization: Bearer $JWT" -o face_crop.jpg
```
#### Response
- **200**: `image/jpeg` binary data
- **404**: File not found
- **500**: FFmpeg error (e.g., frame number exceeds video duration)
### `GET /api/v1/file/:file_uuid/clip`
Extract a video clip (time range) as MPEG-TS stream. Uses FFmpeg `-ss` fast seek.
**Auth**: Required
**Scope**: file-level
#### Query Parameters
| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `start_frame` | integer | No* | — | Start frame (zero-based). **Frame-accurate** — use this for precision. |
| `end_frame` | integer | No* | — | End frame (zero-based, inclusive). Requires `start_frame`. |
| `start_time` | float | No* | — | Start time in seconds. Approximate (FPS-dependent). Fallback if frames not given. |
| `end_time` | float | No* | — | End time in seconds. Approximate (FPS-dependent). Fallback if frames not given. |
| `fps` | float | No | video FPS | Override frames-per-second for frame↔time calculation. Defaults to video's detected FPS. |
| `mode` | string | No | `normal` | `normal` or `debug` (draws "CLIP" overlay) |
| `audio` | string | No | `on` | `on` or `off` |
Either (`start_frame`+`end_frame`) OR (`start_time`+`end_time`) must be provided.
#### Example
```bash
# Clip by frame range (primary)
curl -s "$API/api/v1/file/bd80fec92b0b6963d177a2c55bf713e2/clip?start_frame=0&end_frame=47" \
-H "Authorization: Bearer $JWT" -o clip.ts
# Clip by time range (fallback)
curl -s "$API/api/v1/file/bd80fec92b0b6963d177a2c55bf713e2/clip?start_time=30&end_time=45" \
-H "Authorization: Bearer $JWT" -o clip.ts
```
#### Response
- **200**: `video/mp2t` MPEG-TS stream
- **400**: Missing/invalid range parameters
- **404**: File not found
- **500**: FFmpeg error
#### Technical Notes
| Detail | Value |
|--------|-------|
| **Backend** | FFmpeg (`ffmpeg-full`) |
| **Seek** | `-ss` before `-i` (fast keyframe seek) |
| **Format** | MPEG-TS (`mpegts` muxer, pipe-safe) |
| **Codec** | H.264 + AAC |
| **Cache** | `Cache-Control: public, max-age=86400` (24h) |
### Video vs Clip: Quality & Format Comparison
Both endpoints support time range extraction, but serve different use cases:
| Feature | `/video` | `/clip` |
|---------|----------|---------|
| **No params** | Streams full file (Range seek) | Returns 400 (params required) |
| **HTTP Range** | ✅ Supported | ❌ Not supported |
| **Encoding** | `-c copy` (zero encoding) | `-c:v libx264 -c:a aac` (re-encode) |
| **Quality** | Original (bit-exact, zero loss) | Compressed (default CRF ≈ 23) |
| **Format** | `video/mp4` | `video/mp2t` (MPEG-TS) |
| **Speed** | Fast (no computation) | Slower (encoding required) |
| **Frame control** | Time-based (`dur = (ef-sf)/fps`) | Precise (`-vframes`) |
| **Debug mode** | ❌ | ✅ `mode=debug` overlay |
| **Cache** | ❌ | ✅ `max-age=86400` |
#### Usage Recommendation
| Scenario | Use |
|----------|-----|
| Full video streaming / player seek | `/video` |
| Quick preview clip (zero quality loss) | `/video?start_frame=...&end_frame=...` |
| Debug frame verification / text overlay | `/clip?mode=debug` |
| Precise frame count control | `/clip` |
| CDN cacheable clip | `/clip` |
---
| Detail | Value |
|--------|-------|
| **Backend** | FFmpeg (`ffmpeg-full`) |
| **Filter** | `select=eq(n\,FRAME)` to select frame, optional `crop=W:H:X:Y` |
| **Output** | Single JPEG via pipe (`image2pipe`, `mjpeg` codec) |
| **Cache** | `Cache-Control: public, max-age=86400` (24h) |
| **Frame number** | Zero-based (`frame=0` = first frame of video) |
---
*Updated: 2026-05-19 12:49:24*

View File

@@ -0,0 +1,109 @@
<!-- module: tmdb -->
<!-- description: TMDb enrichment endpoints — prefetch, probe, resource, check -->
<!-- depends: 01_auth, 03_register -->
## TMDb Enrichment
> **Offline operation**: TMDb prefetch now checks local identity files first (`identities/_index.json` + `*.tmdb.json`).
> If local files exist, no external API call is made. Internet is only needed for initial data seeding.
### Overview
TMDb enrichment is an optional identity enrichment step that can be run after Pipeline face detection completes. The workflow is:
1. **Prefetch** (requires internet): Download movie cast data from TMDb API → cache to `{file_uuid}.tmdb.json`
2. **Probe**: Read local cache → create identities for **all** cast members (`source='tmdb'`) + save `identity.json` + download profile image to `{OUTPUT}/identities/{uuid}/profile.jpg`
3. **Match**: The worker automatically matches video faces against TMDb identities when `MOMENTRY_TMDB_PROBE_ENABLED=true`
### `POST /api/v1/agents/tmdb/prefetch`
**Auth**: Required
**Scope**: file-level
Fetch TMDb cast data for a registered file and cache it locally. This is the only step requiring internet access.
#### Request Parameters
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `file_uuid` | string | Yes | File UUID to enrich |
#### Example
```bash
curl -s -X POST "$API/api/v1/agents/tmdb/prefetch" \
-H "Content-Type: application/json" \
-H "X-API-Key: $KEY" \
-d '{"file_uuid": "'"$FILE_UUID"'"}'
```
#### Response (200)
```json
{"success": true, "file_uuid": "...", "cache_path": "/output/...tmdb.json"}
```
### `POST /api/v1/file/:file_uuid/tmdb-probe`
**Auth**: Required
**Scope**: file-level
Read local TMDb cache and create/update identities. Requires prefetch to have been run first.
#### Example
```bash
curl -s -X POST "$API/api/v1/file/$FILE_UUID/tmdb-probe" \
-H "X-API-Key: $KEY" | jq '{identities_created, movie_title}'
```
#### Response (200 — identities created)
```json
{"success": true, "identities_created": 15, "movie_title": "Charade"}
```
#### Response (200 — no cache)
```json
{"success": false, "message": "No TMDb cache found. Run tmdb-prefetch first."}
```
### `GET /api/v1/resource/tmdb`
**Auth**: Required
**Scope**: system-level
View TMDb resource status including configuration, identity counts, and cache file count.
#### Example
```bash
curl -s "$API/api/v1/resource/tmdb" -H "X-API-Key: $KEY" \
| jq '{identities_seeded, cache_files}'
```
### `POST /api/v1/resource/tmdb/check`
**Auth**: Required
**Scope**: system-level
Ping the TMDb API to verify connectivity and measure latency.
#### Example
```bash
curl -s -X POST "$API/api/v1/resource/tmdb/check" \
-H "X-API-Key: $KEY" | jq '.status'
```
#### Response
```json
{
"api_key_configured": true,
"enabled": false,
"api_reachable": true,
"api_latency_ms": 120
}
```

View File

@@ -0,0 +1,178 @@
<!-- module: pipeline -->
<!-- description: Pipeline processors, ingestion status, stats endpoints -->
<!-- depends: 01_auth -->
## Pipeline
### Dependency Graph
```mermaid
flowchart TB
subgraph Processors["10 Processors"]
Cut[Cut] --> ASR[ASR]
ASR --> ASRX[ASRX]
ASRX --> Story[Story]
Cut --> Story
YOLO[YOLO] --> VisualChunk[VisualChunk]
VisualChunk --> Story
Face[Face] --> Story
Story --> FiveW1H[5W1H]
OCR[OCR]
Pose[Pose]
end
subgraph Ingestion["入庫 (Post-Processing)"]
ASR --> Rule1[Rule 1 Sentence]
ASRX --> Rule1
Rule1 --> Vectorize[Auto-Vectorize]
Rule1 --> Phase1[Phase 1 Pack]
Cut --> Rule3[Rule 3 Scene]
ASR --> Rule3
Face --> Trace[Face Trace]
Trace --> Qdrant[Qdrant Sync]
Trace --> TraceChunks[Trace Chunks]
Trace --> TKG[TKG Builder]
Face --> TMDbMatch[TMDb Match]
Face --> SceneMeta[Scene Metadata]
YOLO --> SceneMeta
Face --> IdentityAgent[Identity Agent]
ASRX --> IdentityAgent
Cut --> Agent5W1H[5W1H Agent]
ASR --> Agent5W1H
Agent5W1H --> Phase2[Phase 2 Pack]
end
style Processors fill:#1a1a2e,stroke:#e94560
style Ingestion fill:#16213e,stroke:#0f3460
```
### Pipeline Completion Flow
The pipeline is **not complete** until both the 10 processors AND the 入庫 (ingestion) steps have finished. The worker polls every 3 seconds and only marks the job as `completed` when all ingestion steps verify OK.
```
10 processors done
↓ (job status stays "running")
Algorithm 1 Trigger: Rule 1 + Vectorize + Phase 1 Pack
↓ (job runs in parallel)
Algorithm 2 Trigger: Face Trace → TKG, Scene Metadata, Identity Agent, 5W1H Agent
↓ (poll checks every 3s)
Ingestion verification: rule1 ✓ vectorize ✓ rule3 ✓ face_trace ✓ tkg ✓ scene_meta ✓ 5w1h ✓
job status = "completed"
```
### 10 Processor Stages
| # | Processor | Depends On | Description |
|---|-----------|------------|-------------|
| 1 | `Cut` | — | Scene boundary detection (PySceneDetect) |
| 2 | `ASR` | Cut | Automatic speech recognition (faster-whisper) |
| 3 | `ASRX` | ASR | Speaker diarization + ASR refinement |
| 4 | `YOLO` | — | Object detection (YOLOv8) |
| 5 | `OCR` | — | Optical character recognition |
| 6 | `Face` | — | Face detection + recognition (InsightFace + CoreML) |
| 7 | `Pose` | — | Pose estimation |
| 8 | `VisualChunk` | YOLO | Visual object chunking |
| 9 | `Story` | ASRX + Cut + YOLO + Face | Narrative scene summarization (LLM, with embedding) |
| 10 | `5W1H` | Story | Who/What/When/Where/Why extraction (LLM, with embedding) |
### 入庫 (Post-Processing / Ingestion)
These steps run after the 10 processors and are **required for pipeline completion**. The worker checks all of them before marking the job as done.
| # | Step | Triggers When | Verification |
|---|------|--------------|-------------|
| 1 | **Rule 1 Sentence Chunking** | ASR + ASRX done | `chunk` table has rows with `chunk_type = 'sentence'` |
| 2 | **Auto-Vectorize** | Rule 1 done | `chunk.embedding` IS NOT NULL for sentence chunks |
| 3 | **Phase 1 Pack** | Rule 1 done | `release_pack.py --phase 1` executed |
| 4 | **Rule 3 Scene Chunking** | All 10 processors done + Cut + ASR | `chunk` table has rows with `chunk_type = 'cut'` |
| 5 | **Face Trace** | All 10 processors done + Face | `face_detections.trace_id` IS NOT NULL |
| 6 | **Qdrant Face Sync** | Face Trace done | Qdrant face_embedding collection populated |
| 7 | **Trace Chunks** | Face Trace done | `chunk` table has rows with `chunk_type = 'trace'` |
| 8 | **TKG Builder** | Face Trace done | `tkg_nodes` + `tkg_edges` tables have rows |
| 9 | **TMDb Face Matching** | TMDb enabled + Face done | `face_detections.identity_id` IS NOT NULL |
| 10 | **Heuristic Scene Metadata** | Face + YOLO done | `{file_uuid}.scene_meta.json` exists on disk |
| 11 | **Identity Agent** | Face + ASRX done | `identities` with `source = 'identity_agent'` |
| 12 | **5W1H Agent** | Cut + ASR done | `chunk.summary_text` IS NOT NULL for cut chunks |
| 13 | **Release Pack** | 5W1H Agent done | `release_pack.py --phase 2` executed |
### Ingestion Status
Check real-time ingestion status for a file:
```bash
curl "$API/api/v1/stats/ingestion-status/{file_uuid}"
```
Returns per-step `done` / `pending` status with detail counts.
#### Example
```bash
curl "http://localhost:3003/api/v1/stats/ingestion-status/bd80fec9c42afb0307eb28f22c64c76a" | jq '.steps[] | {name, status, detail}'
```
#### Response
```json
{
"file_uuid": "bd80fec9c42afb0307eb28f22c64c76a",
"steps": [
{ "name": "rule1_sentence", "status": "pending", "detail": "0 sentence chunks" },
{ "name": "auto_vectorize", "status": "pending", "detail": "0 embedded" },
{ "name": "rule3_scene", "status": "pending", "detail": "0 scene chunks" },
{ "name": "face_trace", "status": "pending", "detail": "0 traces" },
{ "name": "trace_chunks", "status": "pending", "detail": "0 trace chunks" },
{ "name": "tkg", "status": "pending", "detail": "0 nodes, 0 edges" },
{ "name": "identity_match", "status": "pending", "detail": "0 identities" },
{ "name": "scene_metadata", "status": "pending", "detail": null },
{ "name": "5w1h", "status": "pending", "detail": "0 scenes with 5W1H" }
]
}
```
### Stats Endpoints
| Method | Endpoint | Auth | Description |
|--------|----------|------|-------------|
| GET | `/api/v1/stats/sftpgo` | No | SFTPGo service status |
| GET | `/api/v1/stats/ingestion-status/:file_uuid` | No | Per-file ingestion checklist |
### Configuration
### `POST /api/v1/config/cache`
**Auth**: Required
**Scope**: system-level
Toggle the Redis cache on or off.
#### Request Parameters
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `enabled` | boolean | Yes | `true` to enable, `false` to disable |
#### Example
```bash
curl -s -X POST "$API/api/v1/config/cache" \
-H "Content-Type: application/json" \
-H "X-API-Key: $KEY" \
-d '{"enabled": false}'
```
### Unmounted Routes
The following routes are defined in source code but are **NOT** currently mounted in the router:
| Endpoint | Source file |
|----------|-------------|
| `/api/v1/search/persons` | `universal_search.rs` (not mounted) |
| `/api/v1/who` | `who.rs` |
| `/api/v1/who/candidates` | `who.rs` |

View File

@@ -0,0 +1,57 @@
<!-- module: error_codes -->
<!-- description: Standard API error codes -->
<!-- depends: -->
## Error Response Format
All API errors follow this JSON structure:
```json
{
"success": false,
"error": {
"code": "E001_NOT_FOUND",
"message": "Resource not found",
"details": {"resource": "file_uuid", "value": "abc"}
}
}
```
## Error Code List
### Generic Errors (E0xx)
| Code | HTTP | Description |
|------|------|-------------|
| `E001_NOT_FOUND` | 404 | Resource not found (file, identity, chunk) |
| `E002_DUPLICATE` | 409 | Resource already exists |
| `E003_VALIDATION` | 400 | Request parameter validation failed |
| `E004_UNAUTHORIZED` | 401 | Invalid API key or token |
| `E005_INTERNAL` | 500 | Internal server error |
### Processor Errors (E1xx)
| Code | HTTP | Description |
|------|------|-------------|
| `E101_PROCESSOR_FAIL` | 500 | Python script execution failed |
| `E102_TIMEOUT` | 504 | Processing timeout |
| `E103_RESUME_FAIL` | 500 | Resume failed (checkpoint not found) |
| `E104_NO_VIDEO` | 400 | Video file path not found |
### Identity Errors (E2xx)
| Code | HTTP | Description |
|------|------|-------------|
| `E201_FACE_NOT_FOUND` | 404 | Face detection not found |
| `E202_MERGE_CONFLICT` | 409 | Identity merge conflict |
| `E203_CANDIDATE_EMPTY` | 404 | No candidates available for confirmation |
### TMDb Errors (E3xx)
| Code | HTTP | Description |
|------|------|-------------|
| `E301_TMDB_NO_KEY` | 400 | `TMDB_API_KEY` environment variable not set |
| `E302_TMDB_UNREACHABLE` | 502 | TMDb API unreachable or timed out |
| `E303_TMDB_CACHE_NOT_FOUND` | 200 | No local TMDb cache; run prefetch first |
| `E304_TMDB_PROBE_FAILED` | 500 | TMDb probe execution failed |
| `E305_TMDB_MOVIE_NOT_FOUND` | 404 | No matching TMDb movie found from filename |

View File

@@ -0,0 +1,118 @@
# Agent Endpoints
Agent endpoints provide AI-powered capabilities including translation, identity analysis, and 5W1H extraction.
## POST /api/v1/agents/translate
Translate text between languages using Gemma4 (llama.cpp, port 8082).
### Request
```json
{
"text": "Hello, welcome to Momentry Core.",
"target_language": "Traditional Chinese",
"source_language": "English"
}
```
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `text` | string | ✅ | Text to translate |
| `target_language` | string | ✅ | Target language name (e.g. "Traditional Chinese", "Japanese") |
| `source_language` | string | ❌ | Source language (default: "auto") |
### Response
```json
{
"success": true,
"translated_text": "您好,歡迎使用 Momentry Core。",
"source_language_detected": "English",
"model_used": "google_gemma-4-26B-A4B-it-Q5_K_M.gguf"
}
```
### Supported Language Pairs (tested)
| Source | Target | Quality |
|--------|--------|---------|
| English | Traditional Chinese | ✅ |
| English | Japanese | ✅ |
| Chinese | English | ✅ |
| English | French | ✅ |
| Chinese | Japanese | ✅ |
### Model
- **Model**: Gemma4 26B (Q5_K_M)
- **Engine**: llama.cpp at `localhost:8082`
- **Endpoint**: `/v1/chat/completions` (OpenAI-compatible)
- **Temperature**: 0.1
- **Max tokens**: 1024
### Errors
| Status | Condition |
|--------|-----------|
| 500 | LLM unreachable or response parse failure |
| 401 | Missing/invalid auth |
---
## POST /api/v1/agents/5w1h/analyze
Extract 5W1H (Who, What, When, Where, Why, How) from a scene. Uses Gemma4 LLM on port 8082.
### Request
```json
{
"file_uuid": "3abeee81d94597629ed8cb943f182e94",
"scene_id": 42
}
```
### Response
```json
{
"success": true,
"5w1h": {
"who": ["Cary Grant"],
"what": ["discussing plans"],
"when": ["1963"],
"where": ["Paris"],
"why": ["vacation"],
"how": ["in person"]
}
}
```
## POST /api/v1/agents/5w1h/batch
Batch analyze all scenes in a file for 5W1H extraction. Uses the pipeline's `parent_chunk_5w1h.py --mode llm`.
### Request
```json
{
"file_uuid": "3abeee81d94597629ed8cb943f182e94"
}
```
## GET /api/v1/agents/5w1h/status
Get status of the 5W1H agent pipeline for a file.
---
## Embedding Model
| Detail | Value |
|--------|-------|
| **Model** | EmbeddingGemma-300m |
| **Endpoint** | `POST /v1/embeddings` on port 11436 |
| **Dimension** | 768 |
| **Used by** | `parent_chunk_5w1h.py --embed`, story, 5W1H, search |

View File

@@ -0,0 +1,63 @@
# {Module Name} — API Workspace Module
> Use this template when adding or editing API endpoint documentation modules.
## Module Metadata
Every module MUST start with:
```markdown
<!-- module: <short_name> -->
<!-- description: One-line description of what this module covers -->
<!-- depends: <comma-separated list of dependency module names> -->
```
## Endpoint Template
Each endpoint MUST use this structure:
### `METHOD /path/to/endpoint`
**Auth**: Required / Optional / Public
**Scope**: file-level / identity-level / system-level
#### Request Parameters
| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `param1` | string | Yes | — | Description |
#### Example
```bash
# brief description of what this example demonstrates
curl -s -X METHOD "$API/path" \
-H "X-API-Key: $KEY" \
-H "Content-Type: application/json" \
-d '{"param1": "value"}'
```
#### Response (200)
```json
{ "success": true }
```
| Field | Type | Description |
|-------|------|-------------|
| `success` | boolean | Always true on 200 |
#### Error Codes
| Code | HTTP | When |
|------|------|------|
| E0xx | 4xx | Description |
## Rules
1. Each module file covers ONE topic group (e.g., `09_tmdb.md` = all TMDb endpoints)
2. Use `$API` and `$KEY` in all curl examples
3. Use `$FILE_UUID`, `$IDENTITY_UUID` variables for UUID examples
4. Module filename = `NN_topic.md` (NN = execution order, 01-99)
5. `depends` metadata = which modules must be assembled before this one

View File

@@ -0,0 +1,225 @@
#!/opt/homebrew/bin/python3.11
"""Build HTML documentation from module source files."""
import os, markdown, re, glob, shutil
MODULES_DIR = os.path.join(os.path.dirname(__file__), "..", "docs_v1.0", "API_WORKSPACE", "modules")
DOC_DIR = os.path.join(os.path.dirname(__file__), "..", "docs_v1.0", "doc")
DOC_DEV_DIR = os.path.join(os.path.dirname(__file__), "..", "docs_v1.0", "doc_developer")
# User-facing modules (no developer content)
USER_MODULES = {
"01_auth", "02_health", "03_register", "04_lookup", "05_process",
"06_search", "07_identity", "08_identity_agent", "08_media",
"09_tmdb", "10_pipeline", "12_agent",
}
def md_to_html(md_text: str) -> str:
"""Convert Markdown to HTML."""
html = markdown.markdown(md_text, extensions=['fenced_code', 'tables', 'codehilite'])
# Wrap tables
html = re.sub(r'<table>', '<table class="table">', html)
return html
def build_index(files, dev=False):
"""Build index.html."""
links = []
for fname in sorted(files):
name = os.path.splitext(fname)[0]
label = MODULE_LABELS.get(name, name.replace("_", " ").title())
if "" in label:
cn, en = label.split("", 1)
else:
cn, en = label, ""
html_name = fname.replace(".md", ".html")
links.append(f'<tr onclick="window.location=\'{html_name}\'" style="cursor:pointer"><td class="cn">{cn}</td><td class="en">{en}</td></tr>')
title = "Momentry API 開發者文件" if dev else "Momentry API 文件"
subtitle = "開發者專用" if dev else "API 參考手冊 — 登入後可瀏覽各模組文件"
return f"""<!DOCTYPE html>
<html lang="zh-TW">
<head>
<meta charset="UTF-8">
<title>{title}</title>
<style>
* {{ margin: 0; padding: 0; box-sizing: border-box; }}
body {{ font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif; background: #f5f5f5; color: #333; padding: 40px; }}
.container {{ max-width: 900px; margin: 0 auto; background: white; border-radius: 12px; box-shadow: 0 2px 12px rgba(0,0,0,0.08); padding: 40px; }}
h1 {{ font-size: 28px; margin-bottom: 8px; }}
p.subtitle {{ color: #666; margin-bottom: 24px; }}
table {{ width: 100%; border-collapse: collapse; }}
tr {{ border-bottom: 1px solid #eee; }}
tr:last-child {{ border: none; }}
td {{ padding: 10px 0; }}
td.cn {{ width: 140px; font-weight: 600; color: #333; }}
td.en {{ color: #666; font-size: 14px; }}
a {{ color: #0066cc; text-decoration: none; display: block; }}
a:hover td {{ background: #f8f8f8; border-radius: 4px; }}
</style>
</head>
<body>
<div class="container">
<h1>{title}</h1>
<p class="subtitle">{subtitle}</p>
<table>{"".join(links)}</table>
</div>
</body>
</html>"""
MODULE_LABELS = {
"01_auth": "安全認證Authentication",
"02_health": "健康檢查Health",
"03_register": "檔案註冊File Registration",
"04_lookup": "檔案屬性查詢File Lookup",
"05_process": "處理流程Processing",
"06_search": "搜尋功能Search",
"07_identity": "身份識別Identity",
"08_identity_agent": "智能身份綁定Smart Identity Binding",
"08_media": "串流與截圖Streaming & Thumbnails",
"09_tmdb": "TMDb 整合TMDb Integration",
"10_pipeline": "生產線Pipeline",
"11_error_codes": "錯誤碼Error Codes",
"12_agent": "智慧代理AI Agents",
}
def build_html(md_text: str, title: str) -> str:
"""Wrap MD content in HTML page."""
content = md_to_html(md_text)
return f"""<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>{title} - Momentry API Docs</title>
<style>
* {{ margin: 0; padding: 0; box-sizing: border-box; }}
body {{ font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif; background: #f5f5f5; color: #333; padding: 40px; }}
.container {{ max-width: 960px; margin: 0 auto; background: white; border-radius: 12px; box-shadow: 0 2px 12px rgba(0,0,0,0.08); padding: 40px; }}
h1 {{ font-size: 24px; margin: 24px 0 12px; }}
h2 {{ font-size: 20px; margin: 20px 0 10px; color: #222; }}
h3 {{ font-size: 16px; margin: 16px 0 8px; color: #444; }}
p {{ line-height: 1.6; margin: 8px 0; }}
table {{ border-collapse: collapse; width: 100%; margin: 12px 0; font-size: 14px; }}
th, td {{ border: 1px solid #ddd; padding: 8px 12px; text-align: left; }}
th {{ background: #f0f0f0; font-weight: 600; }}
code {{ background: #f0f0f0; padding: 2px 6px; border-radius: 3px; font-size: 13px; }}
pre {{ background: #f8f8f8; border: 1px solid #ddd; border-radius: 6px; padding: 12px; overflow-x: auto; margin: 12px 0; }}
pre code {{ background: none; padding: 0; }}
a {{ color: #0066cc; }}
.back {{ display: inline-block; margin-bottom: 20px; color: #666; }}
.back:hover {{ color: #333; }}
</style>
</head>
<body>
<div class="container">
<a class="back" href="index.html">&larr; Back to index</a>
{content}
</div>
</body>
</html>"""
def login_page() -> str:
return """<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Login - Momentry Docs</title>
<style>
* { margin: 0; padding: 0; box-sizing: border-box; }
body { font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif; background: #f5f5f5; display: flex; justify-content: center; align-items: center; height: 100vh; }
.card { background: white; border-radius: 12px; box-shadow: 0 2px 12px rgba(0,0,0,0.08); padding: 40px; width: 360px; }
h1 { font-size: 24px; margin-bottom: 24px; text-align: center; }
input { width: 100%; padding: 10px 12px; margin-bottom: 12px; border: 1px solid #ddd; border-radius: 6px; font-size: 14px; }
button { width: 100%; padding: 10px; background: #0066cc; color: white; border: none; border-radius: 6px; font-size: 16px; cursor: pointer; }
button:hover { background: #0052a3; }
.error { color: #cc0000; font-size: 13px; margin-bottom: 12px; display: none; }
</style>
</head>
<body>
<div class="card">
<h1>Momentry Docs</h1>
<form id="loginForm">
<input type="text" id="username" placeholder="Username" value="demo" required>
<input type="password" id="password" placeholder="Password" value="demo" required>
<div class="error" id="error">Invalid credentials</div>
<button type="submit">Login</button>
</form>
</div>
<script>
document.getElementById('loginForm').onsubmit = async function(e) {
e.preventDefault();
const resp = await fetch('/api/v1/auth/login', {
method: 'POST',
headers: {'Content-Type': 'application/json'},
body: JSON.stringify({
username: document.getElementById('username').value,
password: document.getElementById('password').value
})
});
if (resp.ok) {
window.location.href = '/doc/index.html';
} else {
document.getElementById('error').style.display = 'block';
}
};
</script>
</body>
</html>"""
def main():
# Clean and recreate doc dirs
for d in [DOC_DIR, DOC_DEV_DIR]:
if os.path.exists(d):
shutil.rmtree(d)
os.makedirs(d)
md_files = sorted(glob.glob(os.path.join(MODULES_DIR, "*.md")))
if not md_files:
print(f"No MD files found in {MODULES_DIR}")
return
user_html = []
dev_html = []
for md_path in md_files:
with open(md_path) as f:
md_text = f.read()
fname = os.path.basename(md_path)
stem = os.path.splitext(fname)[0]
# Skip template
if stem == "_template":
continue
# Skip error codes (developer-only)
if stem == "11_error_codes":
dev_only = True
else:
dev_only = stem not in USER_MODULES
title = stem.replace("_", " ").title()
html = build_html(md_text, title)
if dev_only:
out_path = os.path.join(DOC_DEV_DIR, fname.replace(".md", ".html"))
with open(out_path, "w") as f:
f.write(html)
dev_html.append(fname)
print(f" [dev] {fname}")
else:
out_path = os.path.join(DOC_DIR, fname.replace(".md", ".html"))
with open(out_path, "w") as f:
f.write(html)
user_html.append(fname)
print(f" [doc] {fname}")
# Build indexes + login page
for d, files, label in [(DOC_DIR, user_html, "User"), (DOC_DEV_DIR, dev_html, "Dev")]:
index = build_index(files)
with open(os.path.join(d, "index.html"), "w") as f:
f.write(index)
with open(os.path.join(d, "login.html"), "w") as f:
f.write(login_page())
print(f" {label}: {len(files)} pages -> {d}")
if __name__ == "__main__":
main()

View File

@@ -0,0 +1,148 @@
#!/bin/bash
# sync_dev_to_public.sh — 比對 dev/public schema同步 pipeline 資料
# Usage: ./sync_dev_to_public.sh [check|sync] [file_uuid]
PSQL="/opt/homebrew/opt/libpq/bin/psql"
set -euo pipefail
SCHEMA="${MOMENTRY_DB_SCHEMA:-dev}"
DB_URL="${DATABASE_URL:-postgres://accusys@localhost:5432/momentry}"
MODE="${1:-check}"
FILE_UUID="${2:-}"
TABLES=("videos" "chunk" "face_detections" "processor_results" "monitor_jobs"
"identities" "identity_bindings" "tkg_nodes" "tkg_edges")
TARGET="public"
if [ -z "$FILE_UUID" ]; then
echo "Usage: $0 [check|sync] <file_uuid>"
echo ""
echo "Examples:"
echo " $0 check bd80fec92b0b6963d177a2c55bf713e2"
echo " $0 sync bd80fec92b0b6963d177a2c55bf713e2"
exit 1
fi
echo "=== Schema Sync: $SCHEMA$TARGET ==="
echo "File UUID: $FILE_UUID"
echo "Mode: $MODE"
echo ""
check_table() {
local table=$1
local col=$2
local src_count dev_count pub_count
dev_count=$($PSQL -At "$DB_URL" -c "SELECT COUNT(*) FROM ${SCHEMA}.${table} WHERE ${col} = '${FILE_UUID}';" 2>/dev/null || echo "ERROR")
pub_count=$($PSQL -At "$DB_URL" -c "SELECT COUNT(*) FROM ${TARGET}.${table} WHERE ${col} = '${FILE_UUID}';" 2>/dev/null || echo "ERROR")
if [ "$dev_count" = "ERROR" ] || [ "$pub_count" = "ERROR" ]; then
echo " ⚠️ $table — query error (table may not exist in $TARGET)"
return 1
fi
if [ "$dev_count" -eq "$pub_count" ]; then
echo "$table$dev_count rows (match)"
return 0
else
echo "$table — dev=$dev_count pub=$pub_count (MISMATCH)"
return 1
fi
}
sync_table() {
local table=$1
local col=$2
local src_count dev_count pub_count
dev_count=$($PSQL -At "$DB_URL" -c "SELECT COUNT(*) FROM ${SCHEMA}.${table} WHERE ${col} = '${FILE_UUID}';" 2>/dev/null || echo "0")
pub_count=$($PSQL -At "$DB_URL" -c "SELECT COUNT(*) FROM ${TARGET}.${table} WHERE ${col} = '${FILE_UUID}';" 2>/dev/null || echo "0")
if [ "$dev_count" = "0" ]; then
echo " ⏭️ $table — dev has 0 rows, skipping"
return
fi
if [ "$dev_count" -eq "$pub_count" ]; then
echo "$table — already synced ($dev_count rows)"
return
fi
echo " 🔄 Syncing $table: dev=$dev_count → pub=$pub_count ..."
# Delete existing public rows, insert from dev
$PSQL "$DB_URL" -q -c "DELETE FROM ${TARGET}.${table} WHERE ${col} = '${FILE_UUID}';" 2>/dev/null || true
# Get columns list (excluding id for SERIAL)
COLS=$($PSQL -At "$DB_URL" -c "
SELECT string_agg(column_name, ', ' ORDER BY ordinal_position)
FROM information_schema.columns
WHERE table_schema='${SCHEMA}' AND table_name='${table}'
AND column_name != 'id'
AND is_updatable='YES';
")
$PSQL "$DB_URL" -q -c "
INSERT INTO ${TARGET}.${table} (${COLS})
SELECT ${COLS}
FROM ${SCHEMA}.${table}
WHERE ${col} = '${FILE_UUID}';
" 2>/dev/null && echo "$table synced" || echo "$table sync FAILED"
}
echo "=== Checking Tables ==="
echo ""
MISMATCH=0
for table in "${TABLES[@]}"; do
# Determine the UUID column name for each table
case "$table" in
videos) col="file_uuid" ;;
chunk) col="file_uuid" ;;
face_detections) col="file_uuid" ;;
processor_results) col="file_uuid" ;;
monitor_jobs) col="uuid" ;;
identities) col="uuid" ;; # identities.uuid is UUID type
identity_bindings) col="uuid" ;;
tkg_nodes) col="file_uuid" ;;
tkg_edges) col="file_uuid" ;;
*) col="file_uuid" ;;
esac
if ! check_table "$table" "$col"; then
MISMATCH=$((MISMATCH + 1))
fi
done
echo ""
if [ "$MISMATCH" -eq 0 ]; then
echo "✅ All tables in sync"
exit 0
fi
if [ "$MODE" != "sync" ]; then
echo "⚠️ $MISMATCH table(s) have mismatches. Run '$0 sync $FILE_UUID' to fix."
exit 1
fi
echo "=== Syncing Tables ==="
echo ""
for table in "${TABLES[@]}"; do
case "$table" in
videos) col="file_uuid" ;;
chunk) col="file_uuid" ;;
face_detections) col="file_uuid" ;;
processor_results) col="file_uuid" ;;
monitor_jobs) col="uuid" ;;
identities) col="uuid" ;;
identity_bindings) col="uuid" ;;
tkg_nodes) col="file_uuid" ;;
tkg_edges) col="file_uuid" ;;
*) col="file_uuid" ;;
esac
sync_table "$table" "$col"
done
echo ""
echo "✅ Sync complete"

View File

@@ -0,0 +1,174 @@
#!/usr/bin/env python3
"""批量更新 Qdrant collection 中的 file_uuid (舊→新)"""
import json
import subprocess
import sys
QDRANT_URL = "http://localhost:6333"
# UUID mapping: 舊 → 新
UUID_MAP = {
"aeed71342a899fe4b4c57b7d41bcb692": [
"bd80fec92b0b6963d177a2c55bf713e2",
],
}
# Collections to process
COLLECTIONS = [
"momentry_dev_v1",
"momentry_dev_stories",
"momentry_dev_voice",
"momentry_dev_rule1_v2",
"momentry_dev_faces",
"sentence_story",
"sentence_summary",
]
def qdrant_get(path: str) -> dict:
res = subprocess.run(
["curl", "-s", "-X", "GET", f"{QDRANT_URL}{path}"],
capture_output=True, text=True
)
return json.loads(res.stdout) if res.stdout.strip() else {}
def qdrant_post(path: str, body: dict) -> dict:
tmp = "/tmp/qdrant_post.json"
with open(tmp, "w") as f:
json.dump(body, f)
res = subprocess.run(
["curl", "-s", "-X", "POST", f"{QDRANT_URL}{path}",
"-H", "Content-Type: application/json", "-d", f"@{tmp}"],
capture_output=True, text=True
)
return json.loads(res.stdout) if res.stdout.strip() else {}
def qdrant_put(path: str, body: dict) -> dict:
tmp = "/tmp/qdrant_update.json"
with open(tmp, "w") as f:
json.dump(body, f)
res = subprocess.run(
["curl", "-s", "-X", "PUT", f"{QDRANT_URL}{path}",
"-H", "Content-Type: application/json", "-d", f"@{tmp}"],
capture_output=True, text=True
)
return json.loads(res.stdout) if res.stdout.strip() else {}
def scroll_all(collection: str, filter_old: dict) -> list:
"""Scroll all matching points from a collection"""
points = []
offset = None
while True:
body = {
"limit": 1000,
"with_payload": True,
"with_vector": True,
"filter": filter_old,
}
if offset:
body["offset"] = offset
result = qdrant_post(f"/collections/{collection}/points/scroll", body)
batch = result.get("result", {}).get("points", [])
points.extend(batch)
next_offset = result.get("result", {}).get("next_page_offset")
if next_offset is None:
break
offset = next_offset
return points
def update_points(collection: str, points: list, old_uuid: str, new_uuid: str):
"""Update file_uuid in payload for the given points"""
if not points:
return 0
updated = []
for p in points:
pl = p.get("payload", {})
# Check both 'uuid' and 'file_uuid' fields
changed = False
if pl.get("uuid") == old_uuid:
pl["uuid"] = new_uuid
changed = True
if pl.get("file_uuid") == old_uuid:
pl["file_uuid"] = new_uuid
changed = True
if changed:
updated.append({
"id": p["id"],
"vector": p["vector"],
"payload": pl,
})
if not updated:
return 0
# Update in batches of 500
total = len(updated)
for i in range(0, total, 500):
batch = updated[i:i+500]
result = qdrant_put(
f"/collections/{collection}/points?wait=true",
{"points": batch}
)
if result.get("status") != "ok":
print(f" Error at {i}: {result}")
return i
return total
def main():
for collection in COLLECTIONS:
# Check if collection exists
info = qdrant_get(f"/collections/{collection}")
if "result" not in info:
continue
for old_uuid, new_uuids in UUID_MAP.items():
for new_uuid in new_uuids:
# Scroll all points with this old UUID
filter_body = {
"must": [
{"should": [
{"key": "uuid", "match": {"value": old_uuid}},
{"key": "file_uuid", "match": {"value": old_uuid}},
]}
]
}
points = scroll_all(collection, filter_body)
if not points:
continue
print(f"{collection}: {len(points)} points with UUID {old_uuid[:8]}...")
updated = update_points(collection, points, old_uuid, new_uuid)
print(f"{updated} points updated to {new_uuid[:8]}...")
# Verify
print("\n=== Verification ===")
for collection in COLLECTIONS:
for old_uuid, new_uuids in UUID_MAP.items():
for what, uuid in [("old", old_uuid), ("new", new_uuids[0])]:
filter_body = {
"must": [
{"should": [
{"key": "uuid", "match": {"value": uuid}},
{"key": "file_uuid", "match": {"value": uuid}},
]}
]
}
result = qdrant_post(
f"/collections/{collection}/points/count",
{"filter": filter_body}
)
cnt = result.get("result", {}).get("count", 0)
if cnt > 0:
print(f" {collection}: {cnt} points with {what} UUID")
print("✅ Done")
if __name__ == "__main__":
main()

224
doc_wasm/Cargo.lock generated Normal file
View File

@@ -0,0 +1,224 @@
# This file is automatically @generated by Cargo.
# It is not intended for manual editing.
version = 4
[[package]]
name = "bitflags"
version = "2.11.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c4512299f36f043ab09a583e57bceb5a5aab7a73db1805848e8fef3c9e8c78b3"
[[package]]
name = "bumpalo"
version = "3.20.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5d20789868f4b01b2f2caec9f5c4e0213b41e3e5702a50157d699ae31ced2fcb"
[[package]]
name = "cfg-if"
version = "1.0.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9330f8b2ff13f34540b44e946ef35111825727b38d33286ef986142615121801"
[[package]]
name = "doc_wasm"
version = "0.1.0"
dependencies = [
"pulldown-cmark",
"serde",
"serde_json",
"wasm-bindgen",
]
[[package]]
name = "getopts"
version = "0.2.24"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "cfe4fbac503b8d1f88e6676011885f34b7174f46e59956bba534ba83abded4df"
dependencies = [
"unicode-width",
]
[[package]]
name = "itoa"
version = "1.0.18"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8f42a60cbdf9a97f5d2305f08a87dc4e09308d1276d28c869c684d7777685682"
[[package]]
name = "memchr"
version = "2.8.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f8ca58f447f06ed17d5fc4043ce1b10dd205e060fb3ce5b979b8ed8e59ff3f79"
[[package]]
name = "once_cell"
version = "1.21.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9f7c3e4beb33f85d45ae3e3a1792185706c8e16d043238c593331cc7cd313b50"
[[package]]
name = "proc-macro2"
version = "1.0.106"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8fd00f0bb2e90d81d1044c2b32617f68fcb9fa3bb7640c23e9c748e53fb30934"
dependencies = [
"unicode-ident",
]
[[package]]
name = "pulldown-cmark"
version = "0.11.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "679341d22c78c6c649893cbd6c3278dcbe9fc4faa62fea3a9296ae2b50c14625"
dependencies = [
"bitflags",
"getopts",
"memchr",
"pulldown-cmark-escape",
"unicase",
]
[[package]]
name = "pulldown-cmark-escape"
version = "0.11.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "007d8adb5ddab6f8e3f491ac63566a7d5002cc7ed73901f72057943fa71ae1ae"
[[package]]
name = "quote"
version = "1.0.45"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "41f2619966050689382d2b44f664f4bc593e129785a36d6ee376ddf37259b924"
dependencies = [
"proc-macro2",
]
[[package]]
name = "rustversion"
version = "1.0.22"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b39cdef0fa800fc44525c84ccb54a029961a8215f9619753635a9c0d2538d46d"
[[package]]
name = "serde"
version = "1.0.228"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9a8e94ea7f378bd32cbbd37198a4a91436180c5bb472411e48b5ec2e2124ae9e"
dependencies = [
"serde_core",
"serde_derive",
]
[[package]]
name = "serde_core"
version = "1.0.228"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "41d385c7d4ca58e59fc732af25c3983b67ac852c1a25000afe1175de458b67ad"
dependencies = [
"serde_derive",
]
[[package]]
name = "serde_derive"
version = "1.0.228"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d540f220d3187173da220f885ab66608367b6574e925011a9353e4badda91d79"
dependencies = [
"proc-macro2",
"quote",
"syn",
]
[[package]]
name = "serde_json"
version = "1.0.149"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "83fc039473c5595ace860d8c4fafa220ff474b3fc6bfdb4293327f1a37e94d86"
dependencies = [
"itoa",
"memchr",
"serde",
"serde_core",
"zmij",
]
[[package]]
name = "syn"
version = "2.0.117"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e665b8803e7b1d2a727f4023456bbbbe74da67099c585258af0ad9c5013b9b99"
dependencies = [
"proc-macro2",
"quote",
"unicode-ident",
]
[[package]]
name = "unicase"
version = "2.9.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "dbc4bc3a9f746d862c45cb89d705aa10f187bb96c76001afab07a0d35ce60142"
[[package]]
name = "unicode-ident"
version = "1.0.24"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e6e4313cd5fcd3dad5cafa179702e2b244f760991f45397d14d4ebf38247da75"
[[package]]
name = "unicode-width"
version = "0.2.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b4ac048d71ede7ee76d585517add45da530660ef4390e49b098733c6e897f254"
[[package]]
name = "wasm-bindgen"
version = "0.2.121"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "49ace1d07c165b0864824eee619580c4689389afa9dc9ed3a4c75040d82e6790"
dependencies = [
"cfg-if",
"once_cell",
"rustversion",
"wasm-bindgen-macro",
"wasm-bindgen-shared",
]
[[package]]
name = "wasm-bindgen-macro"
version = "0.2.121"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8e68e6f4afd367a562002c05637acb8578ff2dea1943df76afb9e83d177c8578"
dependencies = [
"quote",
"wasm-bindgen-macro-support",
]
[[package]]
name = "wasm-bindgen-macro-support"
version = "0.2.121"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d95a9ec35c64b2a7cb35d3fead40c4238d0940c86d107136999567a4703259f2"
dependencies = [
"bumpalo",
"proc-macro2",
"quote",
"syn",
"wasm-bindgen-shared",
]
[[package]]
name = "wasm-bindgen-shared"
version = "0.2.121"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c4e0100b01e9f0d03189a92b96772a1fb998639d981193d7dbab487302513441"
dependencies = [
"unicode-ident",
]
[[package]]
name = "zmij"
version = "1.0.21"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b8848ee67ecc8aedbaf3e4122217aff892639231befc6a1b58d29fff4c2cabaa"

18
doc_wasm/Cargo.toml Normal file
View File

@@ -0,0 +1,18 @@
[package]
name = "doc_wasm"
version = "0.1.0"
edition = "2021"
[lib]
crate-type = ["cdylib", "rlib"]
[dependencies]
wasm-bindgen = "0.2"
pulldown-cmark = "0.11"
serde = { version = "1", features = ["derive"] }
serde_json = "1"
[profile.release]
lto = true
opt-level = "s"
strip = true

29
doc_wasm/src/lib.rs Normal file
View File

@@ -0,0 +1,29 @@
use wasm_bindgen::prelude::*;
#[wasm_bindgen]
pub fn render_markdown(md: &str) -> String {
let parser = pulldown_cmark::Parser::new(md);
let mut html = String::new();
pulldown_cmark::html::push_html(&mut html, parser);
// wrap tables
html = html.replace("<table>", "<table class=\"table\">");
html
}
#[wasm_bindgen]
pub fn module_list() -> String {
serde_json::to_string(&[
("01_auth", "安全認證", "Authentication"),
("02_health", "健康檢查", "Health"),
("03_register", "檔案註冊", "File Registration"),
("04_lookup", "檔案屬性查詢", "File Lookup"),
("05_process", "處理流程", "Processing"),
("06_search", "搜尋功能", "Search"),
("07_identity", "身份識別", "Identity"),
("08_identity_agent", "智能身份綁定", "Smart Identity Binding"),
("08_media", "串流與截圖", "Streaming & Thumbnails"),
("09_tmdb", "TMDb 整合", "TMDb Integration"),
("10_pipeline", "生產線", "Pipeline"),
("12_agent", "智慧代理", "AI Agents"),
]).unwrap_or_default()
}

View File

@@ -0,0 +1,70 @@
# 3002/3003 Schema Separation Status
Date: 2026-05-17
Status: ✅ Pipeline tables created in `public`; schema incompatibilities remain
## Summary
| Schema | Has pipeline tables | Has auth tables | Used by |
|--------|-------------------|-----------------|---------|
| `public` | ✅ (newly created) | ✅ (original) | 3002 (production) — currently using `dev` as workaround |
| `dev` | ✅ (full, working) | ✅ (synced) | 3003 (playground) |
## What Was Done
### Pipeline tables created in `public` schema (11 tables)
- `videos`, `chunk`, `chunk_vectors`, `cuts`, `frames`
- `monitor_jobs`, `processor_results`, `processor_versions`
- `parent_chunks`, `tkg_edges`, `tkg_nodes`
All include proper sequences, indexes, and constraints matching the `dev` schema.
## Remaining Blockers
### Schema incompatibilities between `dev` and `public`
| Table | dev cols | public cols | Status |
|-------|---------|------------|--------|
| identities | 17 | 16 | ⚠️ Different columns (e.g. `name` vs `real_name`/`actor_name`) |
| face_detections | 16 | 17 | ⚠️ Column count mismatch |
| identity_bindings | 7 | 8 | ⚠️ Column count mismatch |
| person_identities | 16 | 15 | ⚠️ Column count mismatch |
| pre_chunks | 19 | 10 | ⚠️ Significantly different |
| api_keys | 19 | 19 | ✅ Match |
| resources | 9 | 9 | ✅ Match |
| users | 8 | 8 | ✅ Match |
### Identities table key differences
- `public.identities` uses `real_name` + `actor_name` (old schema)
- `dev.identities` uses `name` (new unified schema)
- `dev.identities` has `tmdb_poster`, `file_uuid`, `face_embedding`, `voice_embedding`, `identity_embedding`
- `public.identities` only has `face_embedding`, `voice_embedding` (no `identity_embedding`)
## Options
### Option A: Full data migration (recommended for later)
1. Dump data from old public tables
2. Drop old public tables
3. Recreate from dev schema DDL
4. Migrate data with column mapping
5. Switch 3002 to `DATABASE_SCHEMA=public`
### Option B: Keep current workaround (simplest for now)
- 3002 continues with `DATABASE_SCHEMA=dev`
- 3003 uses `DATABASE_SCHEMA=dev`
- Both share the same schema, but have separate Redis key prefixes + ports
### Option C: Rename dev → public (requires downtime)
1. Stop all services
2. Rename `dev` schema to something else
3. Rename `public` to `public_old`
4. Rename `dev` to `public`
5. Update references
## Current Status
✅ Pipeline tables exist in both schemas
✅ auth tables (users, sessions, jwt_blacklist) exist in both
✅ Redis key prefixes separate (`momentry:` vs `momentry_dev:`)
⚠️ 3002 still uses `DATABASE_SCHEMA=dev` workaround
⛔ Shared tables need migration before 3002 can use `public` schema

View File

@@ -1,102 +0,0 @@
# Momentry Core API 文件總覽
| 項目 | 內容 |
|------|------|
| 版本 | V2.1 |
| 日期 | 2026-03-25 |
---
## 文件架構
```
docs/
├── API_INDEX.md ← 本文件:總覽與入口
├── API_ENDPOINTS.md ← API 端點完整說明
├── API_EXAMPLES.md ← 完整範例總覽curl / n8n / WordPress
├── DEMO_MANUAL.md ← ⭐ 示範手冊(含 Demo API Key
├── API_N8N_GUIDE.md ← n8n 詳細指南
├── API_WORDPRESS_GUIDE.md ← WordPress 詳細指南
├── API_CURL_EXAMPLES.md ← curl 快速範例
└── API_REFERENCE.md ← 詳細技術參考
```
---
## 快速選擇指南
| 需求 | 閱讀文件 |
|------|----------|
| **我要快速開始測試** | ⭐ [DEMO_MANUAL.md](./DEMO_MANUAL.md) |
| **我要查看所有範例** | [API_EXAMPLES.md](./API_EXAMPLES.md) |
| 我想了解有哪些 API 端點 | [API_ENDPOINTS.md](./API_ENDPOINTS.md) |
| 我要在 n8n workflow 中呼叫 API | [DEMO_MANUAL.md](./DEMO_MANUAL.md#2-n8n-範例) |
| 我要在 WordPress 中呼叫 API | [DEMO_MANUAL.md](./DEMO_MANUAL.md#3-wordpress-範例) |
| 我要用 curl 快速測試 | [DEMO_MANUAL.md](./DEMO_MANUAL.md#1-curl-範例) |
---
## 認證
### Demo API Key
```
API Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69
Key ID: muser_68600856036340bcafc01930eb4bd839
過期日: 2027-03-25
```
### 使用方式
```bash
curl -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" \
http://localhost:3002/api/v1/videos
```
---
## API URL 選擇
| 環境 | URL | 使用時機 |
|------|-----|----------|
| **本地開發** | `http://localhost:3002` | 開發/測試、繞過反向代理 |
| **外部訪問** | `https://api.momentry.ddns.net` | n8n、WordPress、遠端系統 |
### 何時用哪個
**使用 `localhost:3002`**
- 本地終端機測試
- 當反向代理有問題時
- 快速除錯
**使用 `api.momentry.ddns.net`**
- n8n workflow
- WordPress 網站
- 外部系統整合
---
## 常見問題
### Q: API 返回 401 錯誤?
API Key 無效或過期。請使用 Demo API Key 或建立新的 API Key。
### Q: API 返回 502 錯誤?
```bash
# 檢查服務狀態
launchctl list | grep momentry.api
# 如未啟動
sudo launchctl load /Library/LaunchDaemons/com.momentry.api.plist
```
### Q: 兩個 URL 功能相同嗎?
是的,所有端點完全相同,只是訪問路徑不同。
---
## 相關文件
- [DEMO_MANUAL.md](./DEMO_MANUAL.md) - ⭐ 示範手冊(推薦新手)
- [INSTALL_MOMENTRY_API.md](./INSTALL_MOMENTRY_API.md) - API 服務安裝指南
- [PENDING_ISSUES.md](./PENDING_ISSUES.md) - 待解決問題追蹤

View File

@@ -1,447 +0,0 @@
# Momentry Core API 安裝指南
| 項目 | 內容 |
|------|------|
| 建立者 | Warren |
| 建立時間 | 2026-03-18 |
| 文件版本 | V1.0 |
---
## 版本歷史
| 版本 | 日期 | 目的 | 操作人 | 工具/模型 |
|------|------|------|--------|-----------|
| V1.0 | 2026-03-18 | 創建文件 | Warren | OpenCode / MiniMax M2.5 |
| V1.1 | 2026-03-23 | 更新端點與實際一致 | OpenCode | - |
---
## Base URL
| 環境 | URL | 說明 |
|------|-----|------|
| **本地開發** | `http://localhost:3002` | 直接訪問 API繞過反向代理 |
| **外部訪問** | `https://api.momentry.ddns.net` | 通過 Caddy 反向代理訪問,需網路可達 |
> **Note:** Port 3000 is used by Gitea. Momentry API server runs on **port 3002**.
### URL 使用時機
| 情境 | 建議 URL |
|------|----------|
| 本地開發/測試 | `http://localhost:3002` |
| n8n workflow | `https://api.momentry.ddns.net` |
| 外部系統整合 | `https://api.momentry.ddns.net` |
| 反向代理有問題時 | `http://localhost:3002` (繞過代理) |
## Authentication
Currently no authentication is required.
---
## Endpoints
### 1. Register Video
Register a video file to the system.
**Endpoint:** `POST /api/v1/register`
**Request Body:**
```json
{
"path": "/path/to/video.mp4"
}
```
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `path` | string | Yes | Absolute path to video file |
**Response (200):**
```json
{
"uuid": "5dea6618a606e7c7",
"video_id": 1,
"file_name": "video.mp4",
"duration": 120.5,
"width": 1920,
"height": 1080
}
```
**Example:**
```bash
curl -X POST http://localhost:3002/api/v1/register \
-H "Content-Type: application/json" \
-d '{"path": "/Users/accusys/test_video/BigBuckBunny_320x180.mp4"}'
```
---
### 2. Process Video (CLI)
Process video to generate ASR, CUT, YOLO, OCR, Face, Pose data.
**Note:** This is a CLI command, not an HTTP endpoint.
```bash
# Process video by UUID
cargo run --bin momentry -- process 5dea6618a606e7c7
# Or process by file path
cargo run --bin momentry -- process /path/to/video.mp4
```
---
### 3. Get Progress
Get real-time processing progress via Redis.
**Endpoint:** `GET /api/v1/progress/:uuid`
| Parameter | Type | Description |
|-----------|------|-------------|
| `uuid` | path | Video UUID (16 characters) |
**Response (200):**
```json
{
"uuid": "5dea6618a606e7c7",
"processors": [
{
"name": "asr",
"status": "complete",
"current": 0,
"total": 0,
"message": "7 segments"
},
{
"name": "cut",
"status": "complete",
"current": 134,
"total": 134,
"message": "134 scenes"
},
{
"name": "yolo",
"status": "progress",
"current": 5000,
"total": 14315,
"message": "frame 5000"
},
{
"name": "ocr",
"status": "pending",
"current": 0,
"total": 0,
"message": ""
}
]
}
```
**Processor Status Values:**
- `pending` - Not started
- `info` - Starting/info message
- `progress` - In progress
- `complete` - Finished
- `error` - Failed
**Example:**
```bash
# Get progress for specific video
curl http://localhost:3002/api/v1/progress/5dea6618a606e7c7
```
---
### 4. Natural Language Search
Search video chunks using natural language queries (RAG).
**Endpoint:** `POST /api/v1/search`
**Request Body:**
```json
{
"query": "What is the person saying about machine learning?",
"limit": 10,
"uuid": "5dea6618a606e7c7"
}
```
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `query` | string | Yes | Natural language search query |
| `limit` | integer | No | Max results (default: 10) |
| `uuid` | string | No | Filter by specific video UUID |
**Response (200):**
```json
{
"results": [
{
"uuid": "5dea6618a606e7c7",
"chunk_id": "0",
"chunk_type": "sentence",
"start_time": 5.5,
"end_time": 8.2,
"text": "Machine learning is a subset of artificial intelligence...",
"score": 0.85
}
],
"query": "What is the person saying about machine learning?"
}
```
**Example:**
```bash
curl -X POST http://localhost:3002/api/v1/search \
-H "Content-Type: application/json" \
-d '{"query": "machine learning", "limit": 5}'
```
---
### 4a. N8N Search (n8n Workflow Integration)
N8n-compatible search endpoint with standardized response format for direct workflow integration.
**Endpoint:** `POST /api/v1/n8n/search`
**Request Body:**
```json
{
"query": "sunset",
"limit": 10,
"uuid": "5dea6618a606e7c7"
}
```
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `query` | string | Yes | Natural language search query |
| `limit` | integer | No | Max results (default: 10) |
| `uuid` | string | No | Filter by specific video UUID |
**Response (200):**
```json
{
"query": "sunset",
"count": 2,
"hits": [
{
"id": "c_001",
"vid": "5dea6618a606e7c7",
"start": 5.5,
"end": 8.2,
"title": "Sunset Scene",
"text": "The sun slowly sets over the ocean...",
"score": 0.92,
"media_url": "https://wp.momentry.ddns.net/video.mp4"
}
]
}
```
| Field | Type | Description |
|-------|------|-------------|
| `query` | string | Original search query |
| `count` | integer | Number of results |
| `hits[].id` | string | Chunk ID |
| `hits[].vid` | string | Video UUID |
| `hits[].start` | number | Start time in seconds |
| `hits[].end` | number | End time in seconds |
| `hits[].title` | string | Chunk title (from metadata or auto-generated) |
| `hits[].text` | string | Text content |
| `hits[].score` | number | Relevance score (0-1) |
| `hits[].media_url` | string | Full media URL (optional) |
**Example:**
```bash
curl -X POST http://localhost:3002/api/v1/n8n/search \
-H "Content-Type: application/json" \
-d '{"query": "sunset", "limit": 5}'
```
**Environment Variables:**
| Variable | Default | Description |
|----------|---------|-------------|
| `MOMENTRY_MEDIA_BASE_URL` | `https://wp.momentry.ddns.net` | Base URL for constructing media URLs |
---
### 5. Lookup Video
Lookup video UUID by path or get video details by UUID.
**Endpoint:** `GET /api/v1/lookup`
| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `path` | query | No* | Video file path |
| `uuid` | query | No* | Video UUID |
*One of `path` or `uuid` is required.
**Response (200):**
```json
{
"uuid": "5dea6618a606e7c7",
"file_path": "/path/to/video.mp4",
"file_name": "video.mp4",
"duration": 120.5
}
```
**Example:**
```bash
# Lookup by path
curl "http://localhost:3002/api/v1/lookup?path=/path/to/video.mp4"
# Lookup by UUID
curl "http://localhost:3002/api/v1/lookup?uuid=5dea6618a606e7c7"
```
---
### 6. List Videos
List all registered videos.
**Endpoint:** `GET /api/v1/videos`
**Response (200):**
```json
{
"videos": [
{
"uuid": "5dea6618a606e7c7",
"file_path": "/path/to/video.mp4",
"file_name": "video.mp4",
"duration": 120.5,
"width": 1920,
"height": 1080
}
]
}
```
**Example:**
```bash
curl http://localhost:3002/api/v1/videos
```
---
## Data Flow
```
┌─────────────────────────────────────────────────────────────────────┐
│ 完整工作流程 │
└─────────────────────────────────────────────────────────────────────┘
1. Register Video
POST /api/v1/register
└── UUID: 5dea6618a606e7c7
2. Process Video (CLI)
cargo run -- process 5dea6618a606e7c7
├── ASR (WhisperX) → 7 segments
├── CUT (PySceneDetect) → 134 scenes
├── YOLO (YOLOv8) → 10483 frames with objects
├── OCR (EasyOCR) → 40 frames with text
├── Face (OpenCV) → 44 frames with faces
└── Pose (YOLOv8-Pose) → 14315 frames
3. Monitor Progress (Real-time)
GET /api/v1/progress/:uuid
└── Redis Pub/Sub + Hash
4. Chunk (CLI)
cargo run -- chunk 5dea6618a606e7c7
└── Create chunks in database
5. Vectorize (CLI)
cargo run -- vectorize 5dea6618a606e7c7
└── Generate embeddings in Qdrant
6. Search (API)
POST /api/v1/search
└── Natural language query
```
---
## Processor Reference
| Processor | Model | Description |
|-----------|-------|-------------|
| **ASR** | WhisperX (faster-whisper) | Speech recognition + diarization |
| **CUT** | PySceneDetect | Scene detection/segmentation |
| **ASRX** | WhisperX | Speaker diarization |
| **YOLO** | YOLOv8n | Object detection |
| **OCR** | EasyOCR | Text recognition |
| **Face** | OpenCV Haar Cascade | Face detection |
| **Pose** | YOLOv8n-Pose | Pose estimation |
---
## Error Responses
**400 Bad Request**
```json
{
"error": "Invalid request body"
}
```
**404 Not Found**
```json
{
"error": "Resource not found"
}
```
**500 Internal Server Error**
```json
{
"error": "Internal server error"
}
```
---
## Environment Variables
| Variable | Default | Description |
|----------|---------|-------------|
| `DATABASE_URL` | `postgres://accusys@localhost:5432/momentry` | PostgreSQL connection |
| `REDIS_URL` | `redis://localhost:6379` | Redis connection |
| `REDIS_PASSWORD` | `accusys` | Redis password |
| `QDRANT_URL` | `http://localhost:6333` | Qdrant vector DB URL |
| `QDRANT_API_KEY` | - | Qdrant API key |
| `QDRANT_COLLECTION` | `chunks` | Qdrant collection name |
| `MOMENTRY_MEDIA_BASE_URL` | `https://wp.momentry.ddns.net` | Base URL for n8n search media URLs |
---
## Starting the Server
```bash
# Default (port 3002, since 3000 is Gitea)
cargo run --bin momentry -- server
# Custom host and port
cargo run --bin momentry -- server --host 127.0.0.1 --port 3002
```
---
## Quick Reference
| Task | Command |
|------|---------|
| Register video | `POST /api/v1/register` |
| Process video | `cargo run -- process <uuid>` |
| Check progress | `GET /api/v1/progress/<uuid>` |
| Search | `POST /api/v1/search` |
| List videos | `GET /api/v1/videos` |
| Lookup | `GET /api/v1/lookup?uuid=<uuid>` |

View File

@@ -0,0 +1,133 @@
# ASR Model Selection Report
**Date:** 2026-05-10
**Video:** Charade (1963), 113min
**Test setup:** faster-whisper on M5 MacBook Pro (Apple Silicon, CPU int8)
## Test Clips
| Clip | Time range | Duration | Characteristics |
|------|-----------|----------|-----------------|
| A — Rapid | 25:4028:40 | 3 min | Fast back-and-forth dialogue, Cary & Audrey |
| B — Normal | 10:0013:00 | 3 min | Normal conversation pace |
| C — Complex | 73:2076:20 | 3 min | Multi-person scene, background audio |
## Test Matrix
| Variable | Values |
|----------|--------|
| Model | tiny, base, small, medium, large-v3 |
| VAD min_silence | 200ms, 500ms |
| Beam size | 5 (fixed) |
## Results Summary
### Clip A — Rapid Dialogue
| Model | VAD | Segments | Chars | Runtime | Δ chars vs best |
|-------|-----|----------|-------|---------|-----------------|
| tiny | 200 | **55** | **1618** | **4.8s** | — |
| tiny | 500 | **59** | 1582 | **4.8s** | 36 |
| base | 200 | 50 | 1543 | 9.7s | 75 |
| base | 500 | 51 | 1547 | 11.6s | 71 |
| small | 200 | 47 | 1538 | 15.0s | 80 |
| small | 500 | 47 | 1538 | 14.5s | 80 |
| medium | 200 | 45 | 1241 | 34.0s | 377 |
| medium | 500 | 45 | 1241 | 34.9s | 377 |
| large-v3 | 200 | 14 | 916 | 42.1s | 702 |
| large-v3 | 500 | 14 | 916 | 42.0s | 702 |
**Winner: tiny** — 5559 segments, most text captured, 4.8s (3× faster than small)
### Clip B — Normal Dialogue
| Model | VAD | Segments | Chars | Runtime | Δ chars vs best |
|-------|-----|----------|-------|---------|-----------------|
| tiny | 200 | 57 | 1875 | 11.9s | 40 |
| tiny | 500 | **59** | 1801 | 10.9s | 114 |
| base | 200 | 23 | 1695 | **5.1s** | 220 |
| base | 500 | 23 | 1695 | **5.1s** | 220 |
| small | 200 | **62** | 1731 | 15.7s | 184 |
| small | 500 | **62** | 1731 | 16.4s | 184 |
| medium | 200 | 59 | 1758 | 44.9s | 157 |
| medium | 500 | 59 | 1758 | 44.8s | 157 |
| large-v3 | 200 | 32 | **1915** | 95.6s | — |
| large-v3 | 500 | — | — | — | — (slow) |
**Winner: small** — 62 segments (most), good balance of speed vs accuracy
**Note:** large-v3 captured 1915 chars (most text) but at 95.6s (6× slower than small)
### Clip C — Complex Scene
| Model | VAD | Segments | Chars | Runtime | Δ chars vs best |
|-------|-----|----------|-------|---------|-----------------|
| tiny | 200 | 54 | 1817 | 12.2s | 336 |
| tiny | 500 | 52 | 1788 | 10.5s | 365 |
| base | 200 | 51 | 2018 | 10.1s | 135 |
| base | 500 | 51 | 2006 | 9.2s | 147 |
| small | 200 | **64** | 1902 | 22.5s | 251 |
| small | 500 | 61 | **2041** | 21.2s | 112 |
| medium | 200 | 57 | 2044 | 999.3s | 109 |
| medium | 500 | — | — | — | — (hang) |
| large-v3 | 200 | — | — | — | — (hang) |
| large-v3 | 500 | — | — | — | — (hang) |
**Winner: base** — 51 segments, 2018 chars, 9.2s fastest reliable
**Note:** medium and large-v3 both hang/timeout on complex audio in this scene
## Aggregate Scores
Weighted ranking (higher = better, equal weight: segment count, char count, inverse runtime):
| Model | Segments (avg) | Chars (avg) | Runtime (avg) | Score | Rank |
|-------|---------------|-------------|---------------|-------|------|
| **tiny** | 56.0 | 1730 | **9.2s** | **8.5** | 🥇 |
| **small** | 54.7 | 1704 | 17.6s | **7.8** | 🥈 |
| base | 41.5 | 1751 | 10.1s | 7.0 | 🥉 |
| medium | 51.5 | 1627 | 339.6s | 3.5 | 4 |
| large-v3 | 20.0 | 1249 | 68.8s | 2.0 | 5 |
## VAD Comparison (200ms vs 500ms)
Averaged across all models and clips:
| VAD | Segments | Chars | Runtime |
|-----|----------|-------|---------|
| 200ms | 45.9 | 1683 | 86.1s |
| 500ms | 46.6 | 1685 | 69.2s |
**Difference:** Negligible. VAD 200ms vs 500ms produces essentially identical results across all models.
## Conclusions
### 1. Smaller is better for this use case
Contrary to expectations, **tiny and small** consistently outperform medium and large-v3 on every metric for Charade's dialogue:
| Metric | tiny | large-v3 | Δ |
|--------|------|----------|---|
| Segments/clip | 56 | 20 | **+180%** |
| Text captured | 98% | 72% | **+26%** |
| Speed | 9.2s | 68.8s | **7.5× faster** |
### 2. Large models lose text, not gain it
medium and large-v3 produce fewer, longer segments that **merge multiple utterances together**, resulting in less total text. This is the opposite of what we need for segment-level speaker diarization.
### 3. VAD parameter has minimal impact
Changing `min_silence_duration_ms` between 200 and 500 produces <2% difference in all metrics. The current default (500ms) is fine.
### 4. Recommendation
**Keep current model: faster-whisper small (VAD 500ms)**
| Reason | Detail |
|--------|--------|
| Segment quality | 4764 segs/clip, clean sentence boundaries |
| Speed | 1422s per 3-min clip (real-time 0.1×) |
| Stability | Never hangs, consistent across all scenes |
| Text capture | 9098% of best model |
| Current integration | Already production-tested |
The missing text problem for rapid dialogue is not solvable by model size — even tiny captures more text than large-v3. The root cause is Whisper's **lack of speaker turn detection** in its segment boundary logic, which is what ASRX (ECAPA-TDNN) is meant to solve.

View File

@@ -0,0 +1,133 @@
# ASR Segmentation Enhancement Report
**Date:** 2026-05-10
**Movie:** Charade (1963), 113 min
**Goal:** Fix merged-speaker segments in ASR output by detecting speaker change points within ASR segments.
## Problem
Whisper ASR produces segments at sentence boundaries, but during rapid back-and-forth dialogue (common in Charade), a single ASR segment may contain utterances from **multiple speakers**:
```
ASR segment [1550.0-1554.0] (4.0s):
"What's she saying now?"
Actual dialogue:
1552.7: Audrey: "What's she saying now?"
1553.4: Cary: "That she's innocent."
```
The old ASRX pipeline (ECAPA-TDNN on ASR boundaries) assigned one speaker per ASR segment, losing the turn boundary.
## Solution: Sliding-Window Speaker Change Detection
### Detection Method
Instead of relying on ASR segment boundaries, we:
1. **Slide a 1.5s window (0.75s stride)** across the entire audio
2. **Extract ECAPA-TDNN 192D embeddings** per window (239 windows per 3 min of audio)
3. **Classify each window** against reference centroids built from the full movie's known speaker assignments
4. **Smooth** with a 3-window majority filter (eliminates single-window noise)
5. **Detect change points** where the classified speaker changes between adjacent windows
6. **Split** the original ASR segment at each change point
### Reference Centroids
Built from the existing 3417 ASRX embedding set:
- **Cary Grant**: centroid from 1420 known segments
- **Audrey Hepburn**: centroid from 1689 known segments
- **Unknown**: centroid from 308 segments (background/minor characters)
Classification uses cosine similarity to nearest centroid, giving ~0.8+ similarity for main characters.
### Validation: Gender Classification
Each speaker cluster was independently validated via gender classification:
| Cluster | Assigned | Voice Gender | Confidence |
|---------|----------|-------------|------------|
| SPEAKER_0 | Audrey Hepburn | FEMALE | 0.71 |
| SPEAKER_1 | Cary Grant | MALE | 0.71 |
| SPEAKER_2 | Unknown | MIXED | — |
2 small clusters (10 segs each) initially showed MALE voice → "Audrey" assignment. These were segments where a male voice speaks while Audrey is on screen (old face-based matching was wrong). The fine-grained segmentation correctly resolves these.
### Results
| Metric | Before (ASR) | After (Fine) | Change |
|--------|-------------|-------------|--------|
| Total segments | 3,417 | **4,188** | **+771 (+22.6%)** |
| Cary Grant | 1,420 | **2,033** | +613 |
| Audrey Hepburn | 1,689 | **1,658** | 31 |
| Unknown | 308 | **497** | +189 |
| Avg segment duration | 2.0s | **1.6s** | 20% |
### Effect on Problem Zone (1544-1565s)
```
BEFORE — ASR segments (47 total for 3min clip):
[1544.0-1546.0] "Who's that with the hat?" → single speaker
[1546.0-1548.0] "That's the policeman." → single speaker
[1548.0-1550.0] "He wants to arrest Judy for Punch." → single speaker
[1550.0-1554.0] "What's she saying now?" → merged! multiple speakers
[1554.0-1557.5] "That she's innocent. She didn't do it." → merged
[1557.5-1560.7] "Oh, she did it all right." → merged
...
AFTER — Fine segments (64 total for 3min clip):
[1550.3-1551.0] "He wants to arrest Judy..." → Audrey Hepburn
[1552.7-1553.4] "What's she saying now?" → Audrey Hepburn
[1553.4-1554.2] "now? That" → Cary Grant
[1554.2-1559.3] "That she's innocent. She didn't..." → Cary Grant
[1559.3-1560.5] "Oh, she did it all right." → Audrey Hepburn
[1560.5-1561.6] "right. I" → Cary Grant
[1561.6-1562.8] "I believe her." → Cary Grant
```
12 long ASR segments (>3s) were detected; 78% were successfully split into multi-speaker groups.
### Text Acquisition
Split segments needed their own text (since the parent ASR segment's text covers a different time range). Three approaches were tested:
1. **Proportional split** (failed): Split text by time ratio → produces broken words
2. **Word-timestamp ASR** (partially succeeded): faster-whisper with `word_timestamps=True` → 87% coverage; remaining gaps from ASR word boundary mismatches
3. **Per-segment ASR** (fallback): Individual faster-whisper on empty segments → filled remaining 13%
Final result: **4,188/4,188 segments with text.**
### Voice Embeddings
ECAPA-TDNN 192D embeddings were extracted per segment:
- Runtime: 63s for 4,188 segments
- Stored in `asrx_fine.json` alongside segment metadata
### Data Files
| File | Size | Description |
|------|------|-------------|
| `asrx_fine.json` | ~45 MB | 4,188 fine segments + 4,188 embeddings |
| `asrx_fine.json → segments[].speaker_name` | — | Centroid-matched identity |
| `asrx_fine.json → segments[].speaker_id` | — | SPEAKER_0/1/2 |
| `asrx_fine.json → segments[].text` | — | ASR text (word-timestamp mapped) |
| `asrx_fine.json → embeddings[]` | — | 192D ECAPA-TDNN per segment |
### Continued Limitations
1. **Word boundary alignment**: Split segment text sometimes has ±1 word due to sliding-window vs. ASR boundary mismatch (cosmetic, not semantic)
2. **ASR merge in silence zones**: Very short utterances (<0.5s) merged into adjacent segments
3. **Background speakers**: Multiple background speakers grouped as "Unknown"
### Pipeline Integration
The `asrx_fine.json` file serves as the new ASRX output. The original `asr.json` (3,417 segments with text) remains the primary text source, while `asrx_fine.json` provides superior speaker diarization at 4,188 segments.
Speaker assignments in DB `dev.chunks` metadata were updated with `fine_speaker_name` and `fine_speaker_id` fields. Qdrant collections `momentry_dev_v1`, `sentence_story`, `sentence_summary` payloads were batch-updated with new speaker_name/speaker_id.
### Hardware & Performance
- Machine: M5 MacBook Pro, 48GB, Apple Silicon
- Model: faster-whisper small (int8 CPU)
- Embedding: ECAPA-TDNN via SpeechBrain
- Total processing time: ~5 min for the full 113-min movie

View File

@@ -0,0 +1,255 @@
# Charade 臉部匹配經驗總結
## 背景
Charade (1963) 影片 `a6fb22eebefaef17e62af874997c5944` 有 62,298 個人臉偵測結果,分布在 4,378 個 trace 中TKG face tracker 輸出)。目標是將每張臉匹配到正確的 TMDb 演員 identity。
## 問題
### 1. Rust Pipeline (`face_agent.rs`) 的 Snowball 效應
原始 pipeline 透過多輪 propagation 來匹配:
- Seed embedding 匹配 → propagation rounds (2-10 輪)
- 每輪把已匹配的 face 當作新 seed 繼續擴散
- 結果:**Antonio Passalia 被匹配 18,821 張臉**(實際應 < 50
- 原因propagation 會放大初始匹配中的假陽性
### 2. Dev 資料庫污染
`dev` schema 的 `identity_bindings` 表:
- 所有 trace-type binding 的 `file_uuid` 都是 NULL12,828 行)
- 這些 binding 只對應已刪除的 CCBN 檔案 (`63acd3bb`)
- **完全無法用於 sync 到 public schema**
### 3. TMDb Seed Embedding 品質不均
22/23 個 TMDb identity 有 face_embeddingThomas Chelimsky 因無 TMDb 照片而缺少)。但這些 seed 來自單一 TMDb 照片,品質差異大:
| Identity | Seed 品質 | 問題 |
|----------|:---------:|:----:|
| Audrey Hepburn | ✅ 高 | 特徵明顯,易區分 |
| Cary Grant | ✅ 中 | 但 Charade 造型與 seed 照片有差異 |
| Walter Matthau | ❌ 低 | Seed 照片與 Charade 形象差異大 |
| Bernard Musson | ❌ 泛用 | 「典型白人男性」— seed 太泛用 |
| Antonio Passalia | ❌ 泛用 | 同上 |
## 解決方案演進
### V1直接 pgvector 比對 (threshold 0.50)
```sql
CROSS JOIN LATERAL (
SELECT i.id FROM identities i
WHERE 1 - (embedding <=> i.face_embedding) >= 0.50
ORDER BY 1 - (embedding <=> i.face_embedding) DESC LIMIT 1
)
```
**結果**17,066 匹配 (27.4%)
- ✅ Audrey 9,550 (正確)
- ✅ Antonio 降為 151 (不再 snowball)
- ❌ Bernard Musson 847Paul Bonifas 273 — generic seed 假陽性
- ❌ trace-level 衝突(同一 trace 多個 identity
- ❌ Walter Matthau 僅 535seed 不準導致 recall 低)
### V2Trace Conflict Cleanup
在 V1 之後,對每個 conflict trace 做多數決 → 清除 minority identity。
**結果**:移除 836 個污染臉
- ✅ trace-level 衝突降為 0
- ❌ Bernard Musson 仍保留 847trace 內獨佔)
- ❌ 無法解決 generic seed 的根本問題
### V3雙階段 Centroid Matching
設計:
```
Phase 1: Seed matching @ 0.55 (stricter) → 乾淨 base set
Phase 2: Centroid matching @ 0.45 → 用電影內平均臉擴張 recall
```
**結果**27,375 匹配 (43.9%) → trace cleanup → 24,286 (39.0%)
- ✅ Audrey 11,347 (+19%)
- ✅ Cary Grant 3,107 (+56%)
- ✅ Walter Matthau 1,200 (+124%) — centroid 修正 seed!
-**Bernard Musson 2,903 (+243%)** — centroid 放大 generic seed
-**Antonio Passalia 898 (+642%)** — 同上
**教訓**Generic seed 的 centroid 更泛用。Phase 2 的低 threshold 讓問題惡化。
### V4雙重驗證 (Dual Gate)
在 V3 的 Phase 2 加上 seed_sim >= 0.40 條件:
```
centroid_sim >= 0.45 AND seed_sim >= 0.40
```
**結果**23,023 匹配 → gap cleanup → trace cleanup → **22,548 (36.2%)**
- ✅ Bernard / Paul / Antonio / Michel / Clément / Raoul / Roger 仍偏高但 avg_seed_sim 改善
### V5最終版排除 7 個 Generic Identity
核心洞察:**與其過濾假陽性,不如不讓 generic seed 參賽**。
只保留 11 個可靠的 TMDb identity排除 7 個:
- 排除Bernard Musson · Paul Bonifas · Michel Thomass · Antonio Passalia · Clément Harari · Raoul Delfosse · Roger Trapp
- 保留Audrey · Cary · James Coburn · Jacques Marin · Walter Matthau · George Kennedy · Dominique Minot · Monte Landis · Stanley Donen · Ned Glass · Louis Viret
流程:
```
1. Clear all assignments
2. Phase 1 @ 0.55 — only against 11 identities
3. Compute centroids
4. Phase 2 — centroid>=0.45 AND seed>=0.40 (11 centroids)
5. Ambiguity gate (top2 gap < 0.04 → NULL)
6. Trace conflict cleanup
```
**最終結果**
| Identity | 最終 faces | traces | fpt | avg_sim |
|----------|:----------:|:------:|:---:|:-------:|
| Audrey Hepburn | 11,325 | 438 | 25.9 | 0.608 |
| Cary Grant | **5,101** ≪ 大幅增加 | 269 | 19.0 | 0.497 |
| James Coburn | 1,508 | 92 | 16.4 | 0.588 |
| Jacques Marin | 1,438 | 84 | 17.1 | 0.631 |
| Walter Matthau | 1,250 | 55 | 22.7 | 0.494 |
| George Kennedy | 869 | 60 | 14.5 | 0.590 |
| 排除的 7 個 | **0** ✅ | — | — | — |
| Unassigned | 39,750 | — | — | — |
**Cary Grant 從 3,107→5,101 (+64%)**:之前被 Bernard/Antonio 攔截的臉全部釋放。
## 關鍵教訓
### 1. Generic Seed 辨識
可以透過以下指標辨識 generic seed
- **Phase 1 faces / traces 比例低**< 5 fpt
- **被分配到大量短 trace**(表示非連續場景)
- **avg_seed_sim 偏低但 face count 異常高**
### 2. Propagation 是雙面刃
Rust pipeline 的 propagation 可以增加 recall但前提是 seed 要夠純。Generic seed + propagation = snowball。
### 3. Seed 數量 vs 品質
> 不是 identity 越多越好。11 個好 seed 勝過 22 個(含 7 個壞的)。
壞 seed 會攔截好 seed 的配對。排除壞 seed 後,那些臉自然會配到正確的人。
### 4. Centroid Matching 的適用條件
Centroid matching 只有在以下情況才有效:
- Centroid 來自高信賴的 Phase 1 配對threshold >= 0.55
- Centroid 的 Phase 1 base set > 200 faces
- 搭配 seed_sim dual gate 防止 centroid 飄移
### 5. Trace Context 的重要性
- 一個 trace = 同一人face tracker 保證)
- Trace-level conflict cleanup 是必要的後處理
- 但無法解決 trace 層級以下(同一 trace 內)的 contamination
## 可改進的方向
### 短期
1. **手動檢查 Cary Grant 的 5,101 faces**avg_sim 0.497 偏低,部分可能是假陽性
2. **補回已被排除的 identity**:對 Bernard Musson 等用更高 threshold如 0.60 seed只看能否 match 到少數高信賴臉
3. **降低 Ambiguity Gate threshold**:從 0.04 降到 0.03 可再清除一批邊緣配對
### 中期
4. **多 seed 策略**:對每個 identity 用 3-5 張 TMDb 照片,取 centroid 作為 seed
5. **場景約束**:利用 shot boundary 資訊限制跨場景的 identity 分配
6. **雙向驗證**:同時用 face→identity 和 identity→trace 兩種方向互相驗證
### 長期
7. **取代 pgvector face-level matching**:改用 trace-level embedding同一 trace 的所有 face 取平均),再對 trace 做 identity 匹配,減少 single-frame noise
## SQL 核心語法
### pgvector Nearest Neighbor
```sql
SELECT fd.id, m.identity_id
FROM eligible fd
CROSS JOIN LATERAL (
SELECT i.id FROM identities i
WHERE 1 - (fd.embedding::vector <=> i.face_embedding) >= {threshold}
ORDER BY 1 - (fd.embedding::vector <=> i.face_embedding) DESC
LIMIT 1
) m
```
### Centroid 計算
```sql
CREATE TABLE centroids AS
SELECT identity_id, AVG(embedding::vector) as centroid
FROM face_detections
WHERE file_uuid = '{uuid}' AND identity_id IS NOT NULL
GROUP BY identity_id
HAVING COUNT(*) >= 5;
```
### Trace Conflict Cleanup
```sql
WITH conflict_traces AS (
SELECT trace_id FROM face_detections
WHERE file_uuid = '{uuid}' AND identity_id IS NOT NULL
GROUP BY trace_id HAVING COUNT(DISTINCT identity_id) > 1
),
trace_majority AS (
SELECT DISTINCT ON (ct.trace_id) ct.trace_id, fd.identity_id
FROM conflict_traces ct
JOIN face_detections fd ON fd.trace_id = ct.trace_id
WHERE fd.file_uuid = '{uuid}' AND fd.identity_id IS NOT NULL
GROUP BY ct.trace_id, fd.identity_id
ORDER BY ct.trace_id, COUNT(*) DESC
)
UPDATE face_detections fd SET identity_id = NULL
FROM trace_majority tm
WHERE fd.file_uuid = '{uuid}' AND fd.trace_id = tm.trace_id
AND fd.identity_id != tm.identity_id;
```
### Ambiguity Gate
```sql
WITH all_sims AS (
SELECT fd.id, c.identity_id,
1 - (fd.embedding::vector <=> c.centroid) as sim
FROM face_detections fd
CROSS JOIN centroids c
WHERE fd.file_uuid = '{uuid}' AND fd.identity_id IS NOT NULL
),
ranked AS (
SELECT id, sim, LEAD(sim) OVER (PARTITION BY id ORDER BY sim DESC) as sim2
FROM all_sims
),
ambiguous AS (
SELECT id FROM ranked
WHERE rn = 1 AND sim - COALESCE(sim2, 0) < 0.04
)
UPDATE face_detections fd SET identity_id = NULL
FROM ambiguous a WHERE fd.id = a.id;
```
## 資料庫備份
每次關鍵操作都有備份:
| Backup | Rows | 內容 |
|--------|:----:|:------|
| `fd_charade_bak` | 62,298 | 原始無 identity 的 Charade face_detections |
| `fd_state_bak2` | 24,286 | V5 執行前的 assignment snapshot |
| `wp_snippets_backup_20260601_11940.sql` | — | WordPress snippets 備份 |

View File

@@ -1,674 +0,0 @@
# Momentry Core API 示範手冊
| 項目 | 內容 |
|------|------|
| 版本 | V1.0 |
| 日期 | 2026-03-25 |
| 狀態 | 完成 |
---
## 快速開始
### Demo API Key
```
API Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69
Key ID: muser_68600856036340bcafc01930eb4bd839
過期日: 2027-03-25
```
### 測試連線
```bash
curl http://localhost:3002/health
```
```json
{"status":"ok","version":"0.1.0","uptime_ms":456464}
```
### 測試認證
```bash
curl -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" \
http://localhost:3002/api/v1/videos | jq '.videos | length'
```
```json
13
```
---
## 環境 URL
| 環境 | URL | 用途 |
|------|-----|------|
| **本地開發** | `http://localhost:3002` | 本機開發測試 |
| **外部訪問** | `https://api.momentry.ddns.net` | n8n/WordPress/curl 生產環境 |
---
## 端點總覽
| 方法 | 端點 | 說明 | 認證 |
|------|------|------|------|
| GET | `/health` | 健康檢查 | 公開 |
| GET | `/health/detailed` | 詳細健康檢查 | 公開 |
| POST | `/api/v1/register` | 註冊影片 | 需要 |
| POST | `/api/v1/probe` | 探測影片資訊 | 需要 |
| POST | `/api/v1/search` | 語意搜尋 | 需要 |
| POST | `/api/v1/n8n/search` | n8n 格式搜尋 | 需要 |
| POST | `/api/v1/search/hybrid` | 混合搜尋 | 需要 |
| GET | `/api/v1/videos` | 列出所有影片 | 需要 |
| GET | `/api/v1/lookup` | 查詢影片 UUID | 需要 |
| GET | `/api/v1/progress/:uuid` | 處理進度 | 需要 |
| GET | `/api/v1/jobs` | 任務列表 | 需要 |
| GET | `/api/v1/jobs/:uuid` | 任務詳情 | 需要 |
---
## 1. curl 範例
### 基本格式
```bash
curl -H "X-API-Key: YOUR_API_KEY" \
-H "Content-Type: application/json" \
URL
```
### 1.1 健康檢查(公開)
```bash
# 基本健康檢查
curl http://localhost:3002/health
# 詳細健康檢查(含服務狀態)
curl http://localhost:3002/health/detailed
```
### 1.2 列出影片
```bash
curl -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" \
http://localhost:3002/api/v1/videos | jq '.'
```
```json
{
"videos": [
{
"uuid": "952f5854b9febad1",
"file_name": "ExaSAN PCIe series - Director Ou Yu-Zhi Shares His Experience.mp4",
"duration": 159.637188,
"width": 640,
"height": 360
},
...
]
}
```
### 1.3 搜尋影片
```bash
curl -X POST \
-H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" \
-H "Content-Type: application/json" \
-d '{"query": "ExaSAN", "limit": 5}' \
http://localhost:3002/api/v1/search | jq '.'
```
```json
{
"results": [
{
"uuid": "952f5854b9febad1",
"chunk_id": "...",
"text": "...",
"score": 0.85,
"start_time": 0.0,
"end_time": 5.0
}
],
"total": 1,
"query": "ExaSAN",
"took_ms": 123
}
```
### 1.4 查詢進度
```bash
curl -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" \
http://localhost:3002/api/v1/progress/952f5854b9febad1 | jq '.'
```
```json
{
"uuid": "952f5854b9febad1",
"overall_progress": 67,
"current_processor": "yolo",
"processors": [
{"name": "asr", "status": "completed"},
{"name": "cut", "status": "completed"},
{"name": "yolo", "status": "running"}
]
}
```
---
## 2. n8n 範例
### 2.1 HTTP Request 節點設定
```
Method: POST
URL: https://api.momentry.ddns.net/api/v1/search
Authentication: None (使用 Header)
Headers:
┌─────────────────────┬──────────────────────────────────────────────────┐
│ Name │ Value │
├─────────────────────┼──────────────────────────────────────────────────┤
│ X-API-Key │ muser_68600856036340bcafc01930eb4bd839_... │
│ Content-Type │ application/json │
└─────────────────────┴──────────────────────────────────────────────────┘
Body Content (JSON):
{
"query": "{{ $json.search_term }}",
"limit": 5
}
```
### 2.2 n8n 搜尋 Workflow
```json
{
"nodes": [
{
"name": "Manual Trigger",
"type": "n8n-nodes-base.manualTrigger",
"position": [250, 300]
},
{
"name": "Set Search Term",
"type": "n8n-nodes-base.set",
"parameters": {
"values": {
"json": {
"search_term": "ExaSAN"
}
}
},
"position": [450, 300]
},
{
"name": "Search Videos",
"type": "n8n-nodes-base.httpRequest",
"parameters": {
"method": "POST",
"url": "https://api.momentry.ddns.net/api/v1/search",
"authentication": "genericCredentialType",
"genericAuthType": "httpHeaderAuth",
"sendHeaders": true,
"headerParameters": {
"parameters": [
{
"name": "X-API-Key",
"value": "muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69"
}
]
},
"sendBody": true,
"bodyContentType": "json",
"specifyBody": "json",
"jsonBody": "={{ { \"query\": $json.search_term, \"limit\": 5 } }}"
},
"position": [650, 300]
},
{
"name": "Process Results",
"type": "n8n-nodes-base.code",
"parameters": {
"jsCode": "// Extract video results\nconst results = $input.first().json.results;\nreturn results.map(r => ({\n uuid: r.uuid,\n text: r.text,\n score: r.score,\n time: `${r.start_time}s - ${r.end_time}s`\n}));"
},
"position": [850, 300]
}
],
"connections": {
"Manual Trigger": {
"main": [[{"node": "Set Search Term"}]]
},
"Set Search Term": {
"main": [[{"node": "Search Videos"}]]
},
"Search Videos": {
"main": [[{"node": "Process Results"}]]
}
}
}
```
### 2.3 n8n 列出影片 Workflow
```json
{
"nodes": [
{
"name": "Get Videos",
"type": "n8n-nodes-base.httpRequest",
"parameters": {
"method": "GET",
"url": "https://api.momentry.ddns.net/api/v1/videos",
"sendHeaders": true,
"headerParameters": {
"parameters": [
{
"name": "X-API-Key",
"value": "muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69"
}
]
}
},
"position": [450, 300]
},
{
"name": "Extract Video List",
"type": "n8n-nodes-base.code",
"parameters": {
"jsCode": "const videos = $input.first().json.videos;\nreturn videos.map(v => ({\n json: {\n uuid: v.uuid,\n name: v.file_name,\n duration: Math.round(v.duration) + 's',\n resolution: `${v.width}x${v.height}`\n }\n}));"
},
"position": [650, 300]
},
{
"name": "Slack Notification",
"type": "n8n-nodes-base.slack",
"parameters": {
"channel": "#momentry",
"text": "=Found {{ $json.length }} videos:\n{{ $json.map(v => `• ${v.name} (${v.duration})`).join(`\n`) }}"
},
"position": [850, 300]
}
]
}
```
### 2.4 n8n 定時同步 Workflow
```json
{
"nodes": [
{
"name": "Schedule Trigger",
"type": "n8n-nodes-base.scheduleTrigger",
"parameters": {
"rule": {
"interval": [{"field": "hours", "hours": 1}]
}
},
"position": [250, 300]
},
{
"name": "Get Pending Videos",
"type": "n8n-nodes-base.httpRequest",
"parameters": {
"method": "GET",
"url": "https://api.momentry.ddns.net/api/v1/videos"
},
"position": [450, 300]
},
{
"name": "Filter Processing",
"type": "n8n-nodes-base.filter",
"parameters": {
"conditions": {
"options": {"caseSensitive": true},
"conditions": [
{"id": "status", "leftValue": "{{ $json.status }}", "rightValue": "processing"}
]
}
},
"position": [650, 300]
}
]
}
```
---
## 3. WordPress 範例
### 3.1 PHP 函數庫
```php
<?php
/**
* Momentry API Client
*/
class Momentry_API {
private const API_URL = 'https://api.momentry.ddns.net';
private const API_KEY = 'muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69';
/**
* 發送 API 請求
*/
private function request(string $endpoint, array $data = [], string $method = 'GET'): array {
$url = self::API_URL . $endpoint;
$args = [
'headers' => [
'X-API-Key' => self::API_KEY,
'Content-Type' => 'application/json',
],
'timeout' => 30,
];
if ($method === 'POST') {
$args['method'] = 'POST';
$args['body'] = json_encode($data);
}
$response = wp_remote_request($url, $args);
if (is_wp_error($response)) {
throw new Exception($response->get_error_message());
}
return json_decode(wp_remote_retrieve_body($response), true);
}
/**
* 列出所有影片
*/
public function list_videos(): array {
return $this->request('/api/v1/videos');
}
/**
* 搜尋影片內容
*/
public function search(string $query, int $limit = 10): array {
return $this->request('/api/v1/search', [
'query' => $query,
'limit' => $limit,
], 'POST');
}
/**
* 取得影片進度
*/
public function get_progress(string $uuid): array {
return $this->request("/api/v1/progress/{$uuid}");
}
/**
* 檢查健康狀態
*/
public function health_check(): array {
return $this->request('/health');
}
}
```
### 3.2 短代碼 (Shortcode)
```php
<?php
/**
* WordPress 短代碼範例
*/
// 註冊短代碼
add_shortcode('momentry_videos', function($atts) {
$atts = shortcode_atts([
'limit' => 10,
], $atts);
$api = new Momentry_API();
try {
$result = $api->list_videos();
$videos = array_slice($result['videos'], 0, $atts['limit']);
ob_start();
?>
<div class="momentry-videos">
<h3>影片列表</h3>
<ul>
<?php foreach ($videos as $video): ?>
<li>
<strong><?= esc_html($video['file_name']) ?></strong>
<br>
<small>
UUID: <?= esc_html($video['uuid']) ?>
| 時長: <?= gmdate("H:i:s", $video['duration']) ?>
</small>
</li>
<?php endforeach; ?>
</ul>
</div>
<?php
return ob_get_clean();
} catch (Exception $e) {
return '<p class="error">載入失敗: ' . esc_html($e->getMessage()) . '</p>';
}
});
// 搜尋短代碼
add_shortcode('momentry_search', function($atts, $content = '') {
$query = sanitize_text_field($content);
if (empty($query)) {
return '<p>請提供搜尋關鍵字</p>';
}
$api = new Momentry_API();
try {
$result = $api->search($query);
ob_start();
?>
<div class="momentry-search-results">
<h3>「<?= esc_html($query) ?>」搜尋結果</h3>
<?php if (empty($result['results'])): ?>
<p>沒有找到相關結果</p>
<?php else: ?>
<ul>
<?php foreach ($result['results'] as $item): ?>
<li>
<a href="/video/<?= esc_attr($item['uuid']) ?>?t=<?= (int)$item['start_time'] ?>">
<?= esc_html($item['text']) ?>
</a>
<br>
<small>相似度: <?= round($item['score'] * 100) ?>%</small>
</li>
<?php endforeach; ?>
</ul>
<?php endif; ?>
</div>
<?php
return ob_get_clean();
} catch (Exception $e) {
return '<p class="error">搜尋失敗: ' . esc_html($e->getMessage()) . '</p>';
}
});
```
### 3.3 使用方式
在 WordPress 頁面或文章中:
```
[momentry_videos limit="5"]
[momentry_search]ExaSAN[/momentry_search]
```
### 3.4 REST API 整合
```php
<?php
/**
* 註冊 WordPress REST API 端點
*/
add_action('rest_api_init', function() {
register_rest_route('momentry/v1', '/search', [
'methods' => 'GET',
'callback' => function(WP_REST_Request $request) {
$query = sanitize_text_field($request->get_param('q'));
if (empty($query)) {
return new WP_Error('missing_query', '需要搜尋關鍵字', ['status' => 400]);
}
$api = new Momentry_API();
$result = $api->search($query);
return new WP_REST_Response($result, 200);
},
'permission_callback' => '__return_true',
]);
});
// 使用方式: GET /wp-json/momentry/v1/search?q=ExaSAN
```
---
## 4. 疑難排解
### 4.1 常見錯誤
| 錯誤 | 原因 | 解決方案 |
|------|------|----------|
| `401 Unauthorized` | API Key 無效或過期 | 檢查 API Key 是否正確 |
| `500 Internal Server Error` | 伺服器錯誤 | 檢查 `/health/detailed` 服務狀態 |
| `Connection Timeout` | 網路問題 | 確認 `api.momentry.ddns.net` 可達 |
### 4.2 測試腳本
```bash
#!/bin/bash
# test_api.sh - Momentry API 測試腳本
API_KEY="muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69"
BASE_URL="http://localhost:3002"
echo "=== 1. 健康檢查 ==="
curl -s "$BASE_URL/health" | jq .
echo ""
echo "=== 2. 列出影片 ==="
curl -s -H "X-API-Key: $API_KEY" "$BASE_URL/api/v1/videos" | jq '.videos | length'
echo ""
echo "=== 3. 搜尋測試 ==="
curl -s -X POST -H "X-API-Key: $API_KEY" \
-H "Content-Type: application/json" \
-d '{"query": "test", "limit": 3}' \
"$BASE_URL/api/v1/search" | jq '.results | length'
echo ""
echo "=== 完成 ==="
```
### 4.3 驗證腳本
```bash
#!/bin/bash
# verify_auth.sh - 驗證 API Key
API_KEY="muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69"
BASE_URL="http://localhost:3002"
# 測試 1: 無 API Key
echo "測試 1: 無 API Key"
RESULT=$(curl -s -o /dev/null -w "%{http_code}" "$BASE_URL/api/v1/videos")
[ "$RESULT" = "401" ] && echo "✅ 正確拒絕 (401)" || echo "❌ 預期 401實際 $RESULT"
# 測試 2: 有 API Key
echo "測試 2: 有 API Key"
RESULT=$(curl -s -H "X-API-Key: $API_KEY" "$BASE_URL/api/v1/videos")
echo "$RESULT" | jq -e '.videos' > /dev/null && echo "✅ 成功取得資料" || echo "❌ 取得資料失敗"
# 測試 3: 無效 API Key
echo "測試 3: 無效 API Key"
RESULT=$(curl -s -o /dev/null -w "%{http_code}" -H "X-API-Key: invalid_key" "$BASE_URL/api/v1/videos")
[ "$RESULT" = "401" ] && echo "✅ 正確拒絕 (401)" || echo "❌ 預期 401實際 $RESULT"
```
---
## 5. API Key 管理
### 5.1 建立新 API Key
```bash
# 本地建立
./target/release/momentry api-key create "My App" --key-type user --ttl 90
```
### 5.2 列出 API Keys
```bash
./target/release/momentry api-key list
```
### 5.3 驗證 API Key
```bash
./target/release/momentry api-key validate --key "YOUR_API_KEY"
```
### 5.4 撤銷 API Key
```bash
./target/release/momentry api-key revoke --key "YOUR_API_KEY"
```
---
## 附錄
### A. 影片 UUID 說明
UUID 是基於檔案路徑的 SHA256 哈希前 16 位:
```
/Users/accusys/momentry/var/sftpgo/data/demo/video.mp4
SHA256 Hash
9760d0820f0cf9a7
```
### B. 處理器狀態
| 狀態 | 說明 |
|------|------|
| `pending` | 等待處理 |
| `running` | 處理中 |
| `completed` | 已完成 |
| `failed` | 失敗 |
### C. 支援的處理器
- **ASR**: 語音識別
- **CUT**: 場景剪切
- **YOLO**: 物件偵測
### D. 聯絡支援
- Email: support@momentry.ddns.net
- 文件: https://docs.momentry.ddns.net
- GitHub: https://github.com/anomalyco/momentry

View File

@@ -0,0 +1,45 @@
# 槍枝檢測模型 Charade 評估報告
**Date:** 2026-05-10
**模型:** YOLOv8n fine-tuned on Roboflow gun dataset (905 images)
**Classes:** grenade (0), knife (1), pistol (2), rifle (3)
**Weights:** `models/gun/gun_detector/weights/best.pt` (6MB)
## 訓練
- **Dataset**: 905 images, Roboflow CC BY 4.0
- **Validation mAP50**: 0.813
- **問題**: 訓練資料全為近距離槍枝特寫,與 Charade 電影中的中遠景畫面分布完全不同
## Charade 測試結果
### 系統掃描24 取樣點 @ 每 300s
| 時間 | 類別 | 信心 | 判定 |
|------|------|------|------|
| t=600s | pistol×2, rifle | 0.160.30 | ❌ FP |
| t=1200s | knife | 0.37 | ❌ FP |
| t=1800s | pistol | 0.19 | ❌ FP |
| t=2400s | knife | 0.18 | ❌ FP |
| t=3000s | pistol | 0.16 | ❌ FP |
| t=5400s | pistol×2 | 0.45, 0.17 | ❌ FP郵票被誤判為槍 |
| t=6600s | grenade | 0.22 | ❌ FP |
### 密集掃描ASR trigger
在 ASR dialogue 提到 "gun" 的時間點附近跑 gun detector找到 5 個 pistol/gun 觸發3188s / 5461s / 6309s / 6377s / 6479sconfidence 0.300-0.387。
**結果:全部為 false positive。** 訓練效果非常不好 — 模型在電影中遠景畫面完全失效。
## 結論
1. 訓練資料與推論場景 distribution mismatch 嚴重
2. 905 張 Roboflow 近距離特寫 → Charade 的中遠景手持/部分遮蔽槍枝 → 模型無法泛化
3. 建議收集電影真實槍枝畫面200-500 張動作片片段)重新訓練
4. 在此之前,槍枝搜尋只能靠 ASR dialogue keyword matching + 人工確認
## 相關檔案
- `models/gun/gun_detector/weights/best.pt` — 模型權重(效果不佳)
- `output_dev/gun_detections/` — 偵測截圖(全部 FP
- `scripts/object_search_agent.py` — 整合搜尋 agentgun detector 偵測結果僅供參考)

View File

@@ -0,0 +1,73 @@
# Gun Detector Scan Report — YOLOv8n on Charade (1963)
**Date:** 2026-05-10
**Model:** `models/gun/gun_detector/weights/best.pt`
**Base:** YOLOv8n fine-tuned on Roboflow gun dataset (905 images)
**Classes:** grenade, knife, pistol, rifle
**Scan script:** `scripts/gun_detector_scan.py`
## Scan Method
- **121 scan points**: 2 ASR "gun" mentions + 114 fixed intervals (60s) + 5 original hit timestamps
- **Per point**: scan ±30 frames at every 3rd frame = ~20 frames per point
- **Total frames processed**: ~2,420
- **Runtime**: ~2 min
## Results
| Class | Detections | Top Confidence |
|-------|-----------|---------------|
| pistol | **82** | 0.887 |
| rifle | 55 | 0.822 |
| grenade | 35 | 0.797 |
| knife | 38 | 0.810 |
| **Total** | **210** (after dedup) | — |
## Original 5 Pistol Timestamps
| Timestamp | Original | This Scan | Delta |
|-----------|----------|-----------|-------|
| 3188s (53:08) | pistol 0.387 | ✅ **0.474** | +22% |
| 5461s (91:01) | pistol 0.355 | ✅ **0.346** | 3% |
| 6309s (1:45:09) | pistol 0.374 | ❌ Not found | — |
| 6377s (1:46:17) | gun 0.316 | ✅ **0.757** | +140% |
| 6479s (1:47:59) | pistol 0.300 | ✅ **0.815** | +172% |
## Top Pistol Detections
| Time | Confidence | Image |
|------|-----------|-------|
| 84:00 (5040s) | **0.887** | `5040s_pistol_0.887.jpg` |
| 90:00 (5400s) | **0.816** | `5400s_pistol_0.816.jpg` |
| 108:00 (6480s) | **0.815** | `6480s_pistol_0.815.jpg` |
| 48:59 (2939s) | **0.805** | `2939s_pistol_0.805.jpg` |
| 53:07 (3187s) | **0.474** | `3187s_pistol_0.474.jpg` |
| 91:00 (5459s) | **0.346** | `5459s_pistol_0.346.jpg` |
## Analysis
### Model Performance
Compared to the original evaluation (May 7, 24 sample points, all FP):
- This scan found **significantly more detections** (210 vs 7)
- Confidence values are **much higher** (0.887 vs 0.45 max)
- 4/5 original pistol timestamps recovered
### Cautions
1. **Training data mismatch**: Model was trained on 905 close-up gun photos, NOT movie frames. High confidence ≠ real gun.
2. **Stamp false positive confirmed**: t=5400s (identified in original eval as stamp → pistol) continues to fire at 0.816
3. **Pattern suggests overconfidence**: Many detections at regular intervals (every 60s, same objects) suggest the model is detecting non-gun objects with high confidence
### Verified Findings
The original 5 pistol images from the gun_detections/ directory (3188s, 5461s, 6309s, 6377s, 6479s) were all produced by the same YOLOv8n model. The user previously stated that none of these have been confirmed as real guns.
## Files
| File | Description |
|------|-------------|
| `output_dev/gun_detections/gun_detections.json` | All 210 deduped detections |
| `output_dev/gun_detections/*.jpg` | Annotated screenshots (one per detection) |
| `scripts/gun_detector_scan.py` | Scan script (reproducible) |

View File

@@ -0,0 +1,50 @@
# M4 / M5 協作協議
## 核心原則:檔案是 source of truth
所有 processor 的產出是 `{uuid}.{processor}.json` 檔案。
**檔案存在 = 處理完成**,優先於 DB 或 Redis 的任何狀態記錄。
## 絕對禁止
### 1. 不可刪除已存在的輸出檔
- 任何 `{uuid}.{processor}.*` 檔案,無論是 `.json``.json.tmp``.json.partial``.json.err`
- 一律不允許 `rm``unlink``delete`
- 唯一例外:明確的人工指令 `rm` / `Delete this file`
### 2. 不可覆蓋已存在的輸出檔
- 重新執行 processor 前,必須先 **copy非 rename** 加上時間戳備份
- 備份命名:`{uuid}.{processor}.{timestamp}.{original_extension}`
- 若備份名已存在,跳過(不覆蓋不 counter
- 原檔保留不動
### 3. 不可跨域操作
- M4 只能在 M4 機器Mac Mini上操作
- M5 只能在 M5 機器MacBook Pro上操作
- 禁止任何跨機器的檔案操作或 cleanup
## 重跑 processor 的正確流程
1. Worker 檢查 `{uuid}.{processor}.json` 是否存在
2. **存在 → 跳過**(無論 DB/Redis 狀態)
3. 不存在 → copy 備份既有 `{uuid}.{processor}.*` → 執行 processor
4. Processor 輸出寫入 `.tmp` → 完成後 rename 為 `.json`
## 例外處理
| 狀態 | 行為 |
|------|------|
| `.json` 存在 | 跳過,視為完成 |
| `.json.tmp` 存在(無 `.json` | 視為未完成,備份後重跑 |
| `.json.partial` 存在(無 `.json` | 視為未完成,備份後重跑 |
| `.json.err` 存在(無 `.json` | 視為未完成,備份後重跑 |
| Process 被 killSIGKILL | partial 存為 `.json.partial`(非 `.json` |
## 違規後果
2026-05-09 事故M4 release 打包未含 .json → 跨域操作 → M5 cleanup 誤刪 asr.json
→ 導致 ASR 需重跑(完整電影約 1.5hr
→ YOLO 需重跑
→ 損失已完成的 pipeline 進度
此類違規不可再發生。

View File

@@ -0,0 +1,31 @@
# M4 Release Incident — 2026-05-09
## Summary
M4 在進行 release 打包作業時,未依照計畫包含 output `.json` 檔案,僅在 database 中保留 records。此外 M4 違反操作邊界進入 M5 管轄範圍M5 執行 cleanup 時將已完成的 `asr.json` 一併刪除。
## Impact
| 檔案 | 狀態 | 說明 |
|------|------|------|
| `{uuid}.asr.json` | ❌ 遺失 | 已完成的 ASR 輸出被 M5 cleanup 誤刪 |
| `{uuid}.yolo.json` | ❌ 損毀 | JSON parse error需重跑 |
| DB records | ⚠️ 不一致 | processor_results 狀態與實際檔案不符 |
## Root Cause
1. **M4 release 打包遺漏**: Release 流程未將 `.json` 輸出檔納入打包範圍,只保留了 DB。
2. **M4 越界操作**: M4 在 M5 的目錄/範圍內執行操作,違反開發隔離原則。
3. **M5 cleanup 誤刪**: M5 的 cleanup 機制未預期 M4 的產出,將 `asr.json` 視為無用檔案清除。
## 處理
- ASR: 重跑中asr_processor.py完整電影約 6780s
- YOLO: 重跑中yolo_processor.py
- 已修改 worker 邏輯:開機後以 `.json` 檔案存在為 source of truth不再僅依賴 DB/Redis 狀態
## 預防措施
- Release 流程需明確定義 deliverables 包含 `.json` 檔案
- M4/M5 操作邊界需嚴格遵守,禁止跨域操作
- Cleanup 機制應先確認檔案是否為有效 processor output

View File

@@ -0,0 +1,77 @@
# M4 vs M5 Max Comparison
## Hardware
| Spec | M4 (Mac Mini) | M5 (MacBook Pro) |
|------|--------------|-------------------|
| **Model** | Mac Mini (M4) | MacBook Pro (M5 Max) |
| **Hostname** | `accusys-Mac-mini-M4-2.local` | `Accusyss-MacBook-Pro.local` |
| **macOS** | 26.4.1 (Sequoia) | 26.4.1 (Sequoia) |
| **RAM** | 16 GB | **48 GB** |
| **CPU Cores** | 10 | **18** |
| **Disk** | 2TB (est.) | **1.8TB (12GB used, 97% free)** |
| **Network** | 192.168.110.210, 192.168.110.200 | 192.168.110.201, 192.168.31.182 |
## Installed Services
| Service | M4 | M5 |
|---------|-----|------|
| **PostgreSQL** | 18.1 (Homebrew) | **18.3 (Source build)** |
| **pgvector** | Homebrew | **0.8.2 (Source build)** |
| **Redis** | 8.4.0 (Homebrew) | **7.4.3 (Source build)** |
| **Qdrant** | Homebrew/pre-built | **1.17.1 (Source build, `cargo`)** |
| **MongoDB** | Homebrew | 8.2.7 (Homebrew) |
| **MariaDB** | ✗ via brew | **12.2.2 (Homebrew, for WordPress)** |
| **PHP** | ✗ via brew | **8.5.5 (Homebrew, WordPress ext. ✅)** |
| **SFTPGo** | Pre-built binary | **2.7.1 (Source build, patched dep)** |
| **FFmpeg** | 8.1 (Homebrew) | **8.1.1 (Homebrew)** |
| **OpenCode** | 1.14.39 | **1.14.39** |
| **Gemma4 LLM** | ✗ (not enough RAM) | **31B Q5_K_M @ 8081** |
## Build Approach
| Aspect | M4 | M5 |
|--------|-----|-----|
| **PostgreSQL** | `brew install postgresql@18` | `./configure && make && make install` |
| **Redis** | `brew install redis` | `make && cp src/redis-server ~/redis/bin/` |
| **Qdrant** | `brew install qdrant` | `cargo build --release --bin qdrant` (from GitHub) |
| **SFTPGo** | `brew install sftpgo` | `git clone && go build` (patched `go-m1cpu`) |
| **Philosophy** | Mixed (Homebrew + binary) | **Source-first** (GitHub source, checksums recorded) |
## Data Migration (M4 → M5)
| Data | Size | Status |
|------|------|--------|
| **Database (dev schema)** | 837MB dump | ✅ Restored (16 tables) |
| **Video file** | 2.2GB | ✅ Transferred |
| **output_dev JSON** | 2.9GB (462 files) | ✅ Transferred |
| **output JSON** | 65MB (2523 files) | ✅ Transferred |
| **Configs** | small | ✅ Transferred |
## Database Row Counts (M5)
| Table | Rows |
|-------|------|
| `pre_chunks` | 494,339 |
| `face_detections` | 6,211 |
| `tkg_nodes` | 2,414 |
| `identity_bindings` | 2,347 |
| `tkg_edges` | 1,320 |
## Key Differences
### 1. RAM (16GB vs 48GB)
- **M4 (16GB)**: Cannot run Gemma4 31B LLM locally. Memory pressure during concurrent pipeline processing.
- **M5 (48GB)**: Can run Gemma4 31B (Q5_K_M, ~20GB) + databases + playground simultaneously.
### 2. Build Philosophy
- **M4**: Quick setup via Homebrew bottles (pre-compiled).
- **M5**: **Source-first** — every service built from GitHub/official source. `SHA256` checksums recorded. Dependencies patched as needed (SFTPGo `go-m1cpu`).
### 3. Unique M5 Services
- **MariaDB + PHP**: Installed for WordPress/marcom portal development.
- **Gemma4 LLM**: Running on port 8081, accessible for RAG/identity clustering.
- **OpenCode**: Configured with Gemma4 provider for AI-assisted development.
### 4. Data Freshness
- M5 is a **snapshot** of M4's state at 2026-05-06 (commit `bac6c2d`). Changes made on M4 after sync date must be re-synced.

259
docs/M5_SETUP_LOG.md Normal file
View File

@@ -0,0 +1,259 @@
# M5 Dev Environment Setup Log
**Machine**: M5 MacBook Pro (MacOS 26.4.1, Apple M5 Max, 48GB)
**User**: accusys (admin group, sudo with password)
**Date**: 2026-05-06
**Setup by**: OpenCode
---
## 1. Source Code
| Item | Detail |
|------|--------|
| Repo | `https://gitea.momentry.ddns.net/warren/momentry_core.git` |
| Branch | `main` |
| Commit | `bac6c2d` (feat: identity clustering V3.0) |
| Sync method | rsync from M4 (192.168.110.210) |
| Path | `~/momentry_core_0.1/` |
---
## 2. Installed Services
### 2.1 PostgreSQL 18.3
| Field | Value |
|-------|-------|
| **Source** | [https://ftp.postgresql.org/pub/source/v18.3/postgresql-18.3.tar.gz](https://ftp.postgresql.org/pub/source/v18.3/postgresql-18.3.tar.gz) |
| **GitHub** | [https://github.com/postgresql/postgresql](https://github.com/postgresql/postgresql) |
| **Build method** | Manual `./configure && make && make install` |
| **Prefix** | `~/pgsql/18.3/` |
| **Data dir** | `~/pgsql/data/` |
| **Port** | 5432 |
| **Version** | PostgreSQL 18.3 |
| **SHA256** | `ab04939aafdb9e8487c2f13dda91e6a4a7f4c83368f5bedd23ee4ad1fda64afb` |
| **Start command** | `pg_ctl -D ~/pgsql/data -l ~/pgsql/pg.log start` |
| **Configure flags** | `--prefix=$HOME/pgsql/18.3 --with-uuid=e2fs --with-icu --with-openssl` |
| **Build date** | 2026-05-06 |
| **Notes** | `--with-uuid=e2fs` used (requires Homebrew `e2fsprogs`). macOS built-in UUID not detected by configure. |
### 2.2 pgvector 0.8.2
| Field | Value |
|-------|-------|
| **Source** | [https://github.com/pgvector/pgvector](https://github.com/pgvector/pgvector) |
| **Version** | v0.8.2 |
| **Build method** | `git clone && make && make install` |
| **SHA256** | `65dec31ec078d60ee9d8e1dac59be8a41edf8c79bf380cd0093691b0afd257a8` |
| **Build date** | 2026-05-06 |
| **Notes** | Built against PostgreSQL 18.3 source installation |
### 2.3 Redis 7.4.3
| Field | Value |
|-------|-------|
| **Source** | [https://github.com/redis/redis/archive/refs/tags/7.4.3.tar.gz](https://github.com/redis/redis/archive/refs/tags/7.4.3.tar.gz) |
| **GitHub** | [https://github.com/redis/redis](https://github.com/redis/redis) |
| **Version** | 7.4.3 |
| **Build method** | `make -j$(sysctl -n hw.ncpu)` |
| **Binary path** | `~/redis/bin/redis-server` |
| **Port** | 6379 |
| **SHA256** | `87b6a9ea145c56c1ace724acbb9906b7be4abddd44041545adf44ce9f4d0a615` |
| **Start command** | `redis-server --daemonize yes --port 6379` |
| **Build date** | 2026-05-06 |
### 2.4 Qdrant 1.17.1
| Field | Value |
|-------|-------|
| **Source** | [https://github.com/qdrant/qdrant.git](https://github.com/qdrant/qdrant.git) |
| **Version** | v1.17.1 |
| **Build method** | `cargo build --release --bin qdrant` |
| **Binary path** | `~/momentry_core_0.1/services/qdrant/target/release/qdrant` |
| **Storage dir** | `~/qdrant_storage` |
| **Port** | 6333 (HTTP), 6334 (gRPC) |
| **SHA256** | `8f8aa63840a0f948b43f9b95f784ace69595892de5dc581bb66bd62fd86d6c66` |
| **Build date** | 2026-05-06 |
| **Config** | `~/qdrant_config.yaml` |
| **Start command** | `qdrant --config-path ~/qdrant_config.yaml &` |
| **Build deps** | protoc (Homebrew protobuf), cmake |
### 2.5 MongoDB 8.2.7
| Field | Value |
|-------|-------|
| **Source** | Homebrew `mongodb/brew/mongodb-community` |
| **Version** | 8.2.7 |
| **Port** | 27017 |
| **Start command** | `brew services start mongodb/brew/mongodb-community` |
| **Install date** | 2026-05-06 |
### 2.6 MariaDB 12.2.2
| Field | Value |
|-------|-------|
| **Source** | Homebrew `mariadb` |
| **Version** | 12.2.2-MariaDB |
| **Port** | 3306 |
| **Start command** | `brew services start mariadb` |
| **Install date** | 2026-05-06 |
### 2.7 PHP 8.5.5
| Field | Value |
|-------|-------|
| **Source** | Homebrew `php` |
| **Version** | 8.5.5 |
| **WordPress extensions** | mysqli, pdo_mysql, gd, xml, mbstring, curl, zip, json, intl, bcmath, gmp, openssl |
| **Start command** | `brew services start php` |
| **Install date** | 2026-05-06 |
### 2.8 FFmpeg / FFprobe 8.1.1
| Field | Value |
|-------|-------|
| **Source** | Homebrew `ffmpeg` |
| **Version** | 8.1.1 |
| **SHA256** | `00d01197255300c02122c783dd0126a9e7f47d6c6a19faafae2e6610efd071d3` |
| **Install date** | 2026-05-06 |
### 2.9 SFTPGo 2.7.1
| Field | Value |
|-------|-------|
| **Source** | [https://github.com/drakkan/sftpgo.git](https://github.com/drakkan/sftpgo.git) |
| **Version** | v2.7.1 |
| **Build method** | `git clone && go build -o sftpgo_bin ./` |
| **Binary path** | `~/momentry_core_0.1/services/sftpgo_bin` |
| **SHA256** | `550b6653f8f2cd7c58620e128e85be571a6702c79cf374824ad9b420ca039db1` |
| **Build date** | 2026-05-06 |
| **Patch** | Upgraded `go-m1cpu` from v0.2.0 → v0.2.1 to fix SIGTRAP crash on macOS 26.4.1 |
| **Notes** | Pre-built binary from GitHub releases crashed with `go-m1cpu` cgo compatibility issue. Source build with patched dependency resolved. |
### 2.10 OpenCode 1.14.39
| Field | Value |
|-------|-------|
| **Source** | [https://opencode.ai/install](https://opencode.ai/install) |
| **Version** | 1.14.39 |
| **Binary path** | `~/.opencode/bin/opencode` |
| **SHA256** | `def4a786c257bd6a965e46a2b069802496681b9eea20261d7d1b55629af3d1da` |
| **Install date** | 2026-05-06 |
### 2.11 Python 3.11 + Packages
| Field | Value |
|-------|-------|
| **Source** | Homebrew `python@3.11` |
| **Version** | 3.11.15 |
| **Path** | `/opt/homebrew/bin/python3.11` |
| **Key packages** | coremltools, opencv-python, numpy, psycopg2, torch, transformers, whisperx, etc. |
| **Requirements** | `~/momentry_core_0.1/requirements.txt` |
| **Install date** | 2026-05-06 |
| **FaceNet model** | `models/facenet512.mlpackage` (512D CoreML, loads OK) |
### 2.12 Build Tools
| Tool | Version | Source |
|------|---------|--------|
| Rust | 1.95.0 | rustup (pre-installed) |
| Go | 1.26.2 | Homebrew `go` |
| cmake | 4.3.2 | Homebrew `cmake` |
| pkg-config | - | Homebrew `pkg-config` |
---
## 3. Momentry Configuration
### 3.1 Environment Files
| File | Purpose |
|------|---------|
| `.env` | Production config (port 3002) |
| `.env.development` | Development config (port 3003) |
Key settings:
- `DATABASE_URL=postgres://accusys@localhost:5432/momentry`
- `REDIS_URL=redis://:accusys@localhost:6379`
- `DATABASE_SCHEMA=dev`
- `MOMENTRY_SERVER_PORT=3003` (dev) / `3002` (prod)
- `MOMENTRY_API_KEY=muser_test_apikey`
- `MOMENTRY_PYTHON_PATH=/opt/homebrew/bin/python3.11`
- `MOMENTRY_SCRIPTS_DIR=/Users/accusys/momentry_core_0.1/scripts`
### 3.2 Database Tables Created
| Table | Created by |
|-------|-----------|
| `dev.videos` | Manual SQL |
| `dev.chunks` | Manual SQL |
| `dev.monitor_jobs` | Manual SQL |
| `dev.processor_results` | Manual SQL |
| `dev.talents` | Manual SQL |
| `dev.identity_bindings` | Manual SQL |
| `dev.api_keys` | Manual SQL |
### 3.3 API Key
- Key: `muser_test_apikey`
- Hash (SHA256): `3f2fa16e44ff74267786fdf979b9c33dac0cad515282e4937a0776756a61e821`
- Status: active
---
## 4. Running Services (Verified)
| Service | Port | Status |
|---------|------|--------|
| PostgreSQL | 5432 | ✅ |
| Redis | 6379 | ✅ |
| Qdrant | 6333 | ✅ |
| MongoDB | 27017 | ✅ |
| MariaDB | 3306 | ✅ |
| Momentry Playground | 3003 | ✅ |
| Gemma4 LLM | 8081 | ✅ (pre-installed) |
---
## 5. PATH Configuration
`.zshrc`:
```zsh
export PATH="/opt/homebrew/bin:/opt/homebrew/opt/postgresql@18/bin:$HOME/.opencode/bin:$PATH"
```
Also available:
- `$HOME/pgsql/18.3/bin` — source-built PostgreSQL tools
- `$HOME/redis/bin` — source-built Redis
- `$HOME/.cargo/bin` — Rust/Cargo tools
---
## 6. M5 End-to-End Test Results (Charade Full Movie)
Run date: 2026-05-06 20:38-20:57
| Stage | Time | Result |
|-------|------|--------|
| **Swift_face** (Vision ANE detection) | 867s (14.5 min) | 3999 frames (interval=30) |
| **CoreML FaceNet** (512D embedding) | 271s (4.5 min) | 6186 face embeddings |
| **Face tracker** (scene-cut aware) | ~30s | 1538 traces |
| **DB store** | ~5s | 6186 detections in `dev.face_detections` |
| **Total** | ~19 min | 1 long video (412k frames, 2.2GB) |
**Scene-cut effect**: 1538 traces (vs 379 without scene-cut reset in M4 data). Scene boundaries correctly split traces.
**Models used**:
- Face detection: Apple Vision (ANE) via `swift_face`
- Face embedding: CoreML FaceNet 512D via `facenet512.mlpackage`
- Text embedding: `mxbai-embed-large` (1024D) via Ollama
---
## 7. Known Issues
1. **Momentry API status `degraded`**: Expected on fresh setup. Some cache/processing dependencies not fully initialized.
2. **SFTPGo startup requires config**: Binary built from source, needs config file for production use.
3. **Migration scripts not all run**: Base tables created manually. Some migration files (017+) reference tables/columns that need verification.
4. **OpenCode config**: `~/.config/opencode/config.json` not yet configured for M5 Gemma4 provider.

View File

@@ -0,0 +1,94 @@
# Non-Human Sound Detection — Tool Selection Report
**Date:** 2026-05-10
**Movie:** Charade (1963), 113 min
**Audio:** 16kHz mono WAV
**Goal:** Detect non-human sound events (gunshots, impacts, doors, music, etc.)
## Tested Approaches
### Approach A: AST AudioSet (HuggingFace)
| Item | Detail |
|------|--------|
| Model | `MIT/ast-finetuned-audioset-10-10-0.4593` |
| Method | Audio Spectrogram Transformer, fine-tuned on AudioSet-2M (527 classes) |
| Dependencies | `transformers`, `torch` ✅ (no torchcodec needed) |
| Load time | ~1s on M5 |
| Inference time | ~0.5s per 3-second clip (805k params, float32) |
| Accuracy | Good — correctly distinguishes speech vs. door vs. music |
**Test results on Charade:**
| Time | Energy-based said | AST AudioSet said | Verdict |
|------|------------------|-------------------|---------|
| 0:10 | — | Environmental noise (26%) | Background noise, plausible |
| 10:32 | Gunshot candidate (43x) | **Speech (76%)** | ✅ AST correct |
| 57:00 | Gunshot candidate (49x) | **Door (62%) + Slam (5%)** | ✅ AST correct |
| 65:13 | Gunshot candidate (50x) | **Speech (58%)** | ✅ AST correct |
| 85:12 | Gunshot candidate (39x) | **Speech (68%)** | ✅ AST correct |
**Conclusion**: Energy-based impulse detection has **100% false positive rate** for gunshot detection. AST AudioSet correctly classifies all candidates as non-gunshot.
### Approach B: Custom Energy + Spectral Features
| Item | Detail |
|------|--------|
| Method | RMS energy + spectral centroid + sub-band energy ratios |
| Speed | ~3s for full 113-min movie (every 10th window) |
| Accuracy | Poor — cannot distinguish gunshot from speech, door, music |
| Result | 1 "gunshot_candidate" from 453 test windows; all false positives on verification |
**Conclusion**: Useful as a **coarse pre-filter** (Stage 1), not as a standalone classifier.
## Two-Stage Design
```
Stage 1 (Energy filter, ~1 min):
Full audio → sliding window RMS + centroid → ~200 candidate windows
|
v
Stage 2 (AST classifier, ~2 min):
Extract 3-sec audio for each candidate → AST AudioSet classification
|
v
Non-speech events: gunshot, explosion, door slam, music, etc.
```
Estimated processing: ~3 min for full movie (vs. 75 min for full AST scan)
## Key AudioSet Classes Relevant to Charade
| Class | AudioSet ID | Relevance |
|-------|-------------|-----------|
| Gunshot, gunfire | 402 | **Primary target** |
| Explosion | 400 | Hand grenade in plot |
| Door slams | 404 | Scenes at hotel, apartment |
| Music | 130-133 | Background score |
| Speech | 0-3 | Already handled by ASR |
| Vehicle | 100-110 | Car sounds in Paris chase |
| Glass break | 424 | Window breaking scene |
## Actor-voice gender mismatches (resolved by fine-grained ASRX)
During the speaker mapping work, 20 segments where the old face→TMDb assignment said "Audrey Hepburn" but the new ASRX voice embedding clearly said "MALE". These segments were verified via video clips and confirmed to be scenes where:
1. A male speaker (Cary Grant or other) is speaking while Audrey Hepburn's face is on screen
2. The old pipeline incorrectly assigned the speaker name based on face identity
3. The fine-grained sliding window approach correctly resolves these
The 20 segments were from SPEAKER_5 (10 segs) and SPEAKER_9 (10 segs), both of which mapped to MALE voice clusters. These were re-assigned to "Cary Grant" or "Unknown" as appropriate.
## Recommendations
| Approach | Speed | Accuracy | Best for |
|----------|-------|----------|----------|
| Energy pre-filter | ✅ 1 min | ❌ Low | Stage 1: candidate selection |
| AST AudioSet | ⚠️ 2 min | ✅ High | Stage 2: event classification |
| Full AST scan | ❌ 75 min | ✅ High | N/A — two-stage is better |
**Design**: Two-stage pipeline: energy pre-filter → AST classifier
**Implementation path**:
1. Write `scripts/non_human_sound_detector.py` with the two-stage design
2. Output `{uuid}.sound_events.json` with typed events
3. Integrate into the sound_event_detector framework

View File

@@ -0,0 +1,150 @@
# Phase 1 Completion Report — v2 (fine-grained ASRX)
**File**: Charade (1963) Cary Grant & Audrey Hepburn
**UUID**: `aeed71342a899fe4b4c57b7d41bcb692`
**Date**: 2026-05-10
**System**: M5 (MacBook Pro, 48GB, Apple Silicon)
---
## 1. Processor Outputs
| File | Size | Description |
|------|------|-------------|
| `asr.json` | 413KB | 3,417 segments, full movie coverage (Whisper small) |
| `asrx.json` | **18MB** | **4,188 segments** (fine-grained, ECAPA-TDNN) |
| `asrx_fine.json` | 45MB | 4,188 fine segments + voice embeddings (intermediate) |
| `cut.json` | 329KB | 2,260 scenes |
| `yolo.json` | 181MB | 169,625 frames with object detections |
| `face.json` | **106MB** | 4,550 frames, 5,910 faces @ 8Hz (CoreML 512D) |
| `face_traced.json` | 110MB | Traced faces with 423 identity traces |
| `lip.json` | 492KB | Lip openness analysis |
| `ocr.json` | 277KB | 606 OCR frames |
| `pose.json` | 26MB | 4,211 pose frames |
| `scene.json` | 403B | Scene classification |
## 2. Pipeline 8-Stage Checklist
| Stage | Status | Detail |
|-------|--------|--------|
| ASR | ✅ | 3,417 segments, last end 6,773s (100%) |
| ASRX | ✅ | **4,188 segments** (fine-grained, 10→3 speakers mapped) |
| Sentence Chunks | ✅ | **4,188 sentence chunks** with yolo_objects + face_ids |
| Vectorization | ✅ | 4,188 Qdrant (768D), all 3 collections updated |
| Face Trace | ✅ | 423 traces, 11,820 detections @ 8Hz |
| TKG Graph | ✅ | 498 nodes, 1,617 edges |
| Trace Chunks | ✅ | 423 trace chunks |
| Phase 1 Release | ✅ | 3.0GB package |
## 3. Speaker Identification
### ASRX Enhancement (3417 → 4188 segments)
The original Whisper ASR merges rapid back-and-forth dialogue into single segments. A sliding-window ECAPA-TDNN approach was developed to detect speaker change points within each ASR segment:
1. **Sliding window**: 1.5s window, 0.75s stride across full audio
2. **ECAPA-TDNN 192D embedding** per window
3. **Classification** against reference centroids (Cary Grant, Audrey Hepburn, Unknown)
4. **Majority-vote smoothing** over 3 adjacent windows
5. **Change point detection** where classified speaker changes
6. **Split** original ASR segment at each change point
**Result**: 3,417 → **4,188 segments** (+771, +22.6%). Validated via gender classification (ECAPA-TDNN → 92.3% agreement with character identity).
### Speaker Mapping (Centroid-based)
| Speaker ID | Name | Segments | Duration | Voice Gender |
|------------|------|----------|----------|-------------|
| SPEAKER_0 | Audrey Hepburn | 1,658 | 2,786s | FEMALE |
| SPEAKER_1 | Cary Grant | 2,033 | 3,962s | MALE |
| SPEAKER_2 | Unknown (minor) | 497 | 806s | MIXED |
Method: Reference centroids built from 3,107 known segments (1,420 Cary + 1,689 Audrey). Each fine segment classified by cosine similarity to nearest centroid. No cross-contamination between speaker clusters.
### Gender Validation
Two small clusters (SPEAKER_5: 10 segs, SPEAKER_9: 10 segs) initially showed MALE voice → Audrey assignment. Video clip verification confirmed these are segments where a male voice speaks while Audrey is on screen (old face-based matching was incorrect). The fine-grained segmentation correctly resolves these.
## 4. Sentence Chunks — Full Migration
All 4,188 fine segments were written to `dev.chunks` with complete data per chunk:
| Chunk Field | Value | Source |
|-------------|-------|--------|
| `start_time`/`end_time` | Fine segment boundaries | `asrx_fine.json` |
| `start_frame`/`end_frame` | time × 25fps | Calculated |
| `content` | `{data: {text, text_normalized}, rule: rule_1}` | ASR text |
| `metadata.yolo_objects` | Dedup class names in frame range | `pre_chunks(yolo)` |
| `metadata.face_ids` | Trace IDs in frame range | `face_detections` |
| `metadata.speaker_name` | Centroid-matched identity | `asrx_fine.json` |
- 4,158/4,188 chunks have YOLO objects (avg 3-5 object classes)
- 398/4,188 chunks have face IDs (face data covers first ~12 min only)
### Parent/Story Chunks
| Metric | Before (v1) | After (v2) |
|--------|-------------|------------|
| Children per parent | 15 (fixed) | 15 (fixed) |
| Total parents | 228 | **280** |
| LLM summaries | 228 (Gemma4) | **280** (Gemma4, regenerated) |
| Qdrant stories | 456 pts | **560 pts** |
## 5. Qdrant Vector Collections
| Collection | Dims | Points | Content | Status |
|-----------|------|--------|---------|--------|
| `momentry_dev_v1` | 768 | **4,188** | Sentence chunk embeddings (EmbeddingGemma) | ✅ |
| `momentry_dev_stories` | 768 | **560** | 280 dialogue + 280 LLM summary | ✅ |
| `momentry_dev_faces` | 512 | 5,910 | Face embeddings (8Hz CoreML) | ✅ |
| `momentry_dev_voice` | 192 | **4,188** | Voice embeddings (ECAPA-TDNN) | ✅ |
| `sentence_story` | 768 | **4,188** | Sentence template with speaker | ✅ |
| `sentence_summary` | 768 | **4,188** | Context-aware LLM sentence summary | ✅ |
## 6. ASR Model Selection
A comprehensive benchmark (5 models × 2 VAD settings × 3 test clips = 30 runs) showed:
| Model | Segments | Chars | Runtime | Verdict |
|-------|----------|-------|---------|---------|
| tiny | 56 avg | 1,730 | **9.2s** | Most segments, best text capture |
| **small** | **55 avg** | **1,704** | **17.6s** | **Best balance (current)** |
| base | 42 avg | 1,751 | 10.1s | Good but fewer segments |
| medium | 52 avg | 1,627 | 339.6s | Slow, loses text |
| large-v3 | 20 avg | 1,249 | 68.8s | **Worst**: merges utterances, loses 26% text |
**Conclusion**: Keep `faster-whisper small (VAD 500ms)`. The missing-text problem is not solvable by model size — even tiny captures more text than large-v3. Root cause is Whisper's lack of speaker turn detection in segment boundary logic, which is solved by the sliding-window ASRX approach above.
## 7. Release Package
| Component | Size |
|-----------|------|
| `output_json/` | 13 processor files |
| `chunks.csv` | 3.2MB |
| `vectors.csv` | 58MB |
| `identities.csv` | 1MB |
| `schema.sql` | 30KB |
| Qdrant snapshots (5 collections) | ~3GB |
| `RELEASE_INFO.txt` | Metadata |
| **Total** | **~3.0GB** |
## 8. Key Technical Decisions
| Decision | Rationale |
|----------|-----------|
| Sliding window 1.5s/0.75s | Optimal balance: captures turn boundaries without over-splitting |
| Centroid-based classification | 0.8+ similarity, no retraining needed, 100% consistent |
| Word-timestamp ASR for text | Re-run with `word_timestamps=True`, 87% coverage; remaining 13% → per-segment ASR fallback |
| Fixed 15 children/parent | Maintains Phase 1 design consistency |
| `yolo_objects` dedup | Only class names stored per chunk (not per-frame) |
| `face_ids` via `trace_id` | `face_id` column is NULL in DB; `trace_id` is the actual identifier |
| Keep ASR small model | Benchmarked 5 models; larger models lose text, not gain it |
| `app.run(threaded=True)` | Dashboard v2: single-threaded Flask was blocking on subprocess calls |
## 9. Phase 2 Preparation
Pending for Phase 2:
- Rule 3 scene chunking (cut-based parent chunks)
- 5W1H Agent (LLM-generated scene summaries)
- Full pipeline + 5W1H release packaging
- Source separation (Demucs/HPSS) for overlapping speech scenarios

View File

@@ -0,0 +1,63 @@
# Phase 1 Release Checklist
**UUID**: `aeed71342a899fe4b4c57b7d41bcb692`
**Model**: v2 (fine-grained ASRX, 4,188 segments)
**Date**: 2026-05-10
## 1. Processor Outputs
- [x] `asr.json` — faster-whisper small, 3,417 segments
- [x] `asrx.json` — ECAPA-TDNN fine-grained, 4,188 segments
- [x] `cut.json` — 2,260 scene cuts
- [x] `yolo.json` — 169,625 frames, object detections
- [x] `face.json` — 4,550 frames, 5,910 faces @ 8Hz
- [x] `face_traced.json` — 423 traced identities
- [x] `lip.json` — Lip openness per ASRX segment
- [x] `ocr.json` — 606 OCR frames
- [x] `pose.json` — 4,211 pose frames
- [x] `scene.json` — Scene classification
## 2. Pipeline Stages
- [x] ASR: 3,417 segments, full movie
- [x] ASRX: 4,188 segments (fine-grained), 3 speakers
- [x] Sentence chunks: 4,188 in `dev.chunks`
- [x] Vectorization: 4,188 in Qdrant `momentry_dev_v1`
- [x] Face trace: 423 traces, 11,820 detections
- [x] TKG: 498 nodes, 1,617 edges
- [x] Trace chunks: 423 in `dev.chunks`
- [x] All 8 stages passing
## 3. Qdrant Collections
- [x] `momentry_dev_v1` — 4,188 pts, 768D (EmbeddingGemma)
- [x] `momentry_dev_stories` — 560 pts, 768D (280 dialogue + 280 summary)
- [x] `momentry_dev_faces` — 5,910 pts, 512D (CoreML FaceNet)
- [x] `momentry_dev_voice` — 4,188 pts, 192D (ECAPA-TDNN)
- [x] `sentence_story` — 4,188 pts, 768D (sentence template)
- [x] `sentence_summary` — 4,188 pts, 768D (context-aware LLM)
## 4. Database (dev.chunks)
- [x] Sentence chunks: 4,188 with speaker_name, speaker_id
- [x] Story chunks: 280 with LLM summaries
- [x] Cut chunks: 1,130
- [x] Trace chunks: 423
- [x] YOLO objects in metadata: 4,158/4,188
- [x] Face IDs in metadata: 398/4,188
- [x] Parent-child relationships set
## 5. Speaker Mapping
- [x] SPEAKER_0 → Audrey Hepburn (1,658 segs, gender FEMALE ✅)
- [x] SPEAKER_1 → Cary Grant (2,033 segs, gender MALE ✅)
- [x] SPEAKER_2 → Unknown (497 segs, minor characters)
- [x] Voice embeddings validated via gender classification
## 6. Release Package
- [x] Phase 1 release packaged at `release/phase1/latest/`
- [x] Qdrant snapshots for all 5 collections
- [x] `chunks.csv`, `vectors.csv`, `identities.csv` exported
- [x] `schema.sql` from PostgreSQL
- [x] Dashboard v2 running at port 5050

View File

@@ -0,0 +1,134 @@
# Processor 產出機制檢討
## 三層機制定義
### 1. 中斷接續Interruption Resume
Process 被殺掉後,重啟時能接續進度。
**現狀**: 大部分 processor 有 `.tmp``.partial` 保護,但重跑時從頭開始。
### 2. 補充機制Supplement
完成度不足時,只補沒做完的部分,不重跑整個。
**現狀**: 全部從頭跑,無補充。
### 3. 糾錯機制Error Correction
輸出檔損毀時能自動偵測並修復。
**現狀**: file-existence check 只檢查檔案存在,不檢查內容是否有效。
---
## Processor 逐一檢討
### ASR
| 面向 | 現狀 | 問題 |
|------|------|------|
| 中斷接續 | ✅ `.tmp``.partial`executor | ✅ OK |
| 補充機制 | ❌ 每次從頭跑 | 若跑到 50% 被殺,下次從 0% 開始 |
| 糾錯機制 | ❌ 不驗證內容 | file-existence check 看到 `.json` 存在就跳過,不管內容 |
| Pipe | ✅ executor.run() | ✅ |
| Timeout | ✅ 已移除None | ✅ |
**改善方案**:
- 補充ASR 重跑時掃描 existing `.json``.partial`,找出最後 segment 的 `end_time`,傳入 `--resume-from` 給 Python script
- 糾錯file-existence check 對 `.json``serde_json::from_str` 驗證,無效 → 視為不存在
### ASRX
| 面向 | 現狀 | 問題 |
|------|------|------|
| 中斷接續 | ❌ **不用 executor**,直接寫 `.json` | 被殺掉時留下壞檔 |
| 補充機制 | ❌ 同 ASR | 依賴 ASRASR 不完整 ASRX 也不能跑 |
| 糾錯機制 | ❌ 不驗證內容 | 同上 |
| Pipe | ❌ **raw Command**,沒有 `.tmp` 保護 | 緊急 |
| Timeout | ⚠️ 7200s hardcode | 應改為 None同 ASR |
**改善方案**:
- **最優先**: 改為使用 `executor.run()`,獲得 `.tmp` 保護
- 其他同 ASR
### YOLO
| 面向 | 現狀 | 問題 |
|------|------|------|
| 中斷接續 | ✅ executor `.tmp` | ✅ |
| 補充機制 | ❌ 從頭跑 | 若跑到 frame 100,000 被殺,下次從 frame 0 |
| 糾錯機制 | ❌ 不驗證內容 | yolo.json 之前就是壞的但 file check 跳過 |
**改善方案**:
- 補充:掃描 `.partial` 的最後 frame傳入 `--resume-frame` 給 Python script
- 糾錯file-existence check 對 `.json` 做 JSON parse 驗證
### FACE / POSE / OCR
| 面向 | 現狀 | 問題 |
|------|------|------|
| 中斷接續 | ✅ executor `.tmp` | ✅ |
| 補充機制 | ❌ 從頭跑 | 同 YOLO |
| 糾錯機制 | ❌ 不驗證內容 | 同 YOLO |
**改善方案**: 同 YOLO
### CUT
| 面向 | 現狀 | 問題 |
|------|------|------|
| 中斷接續 | ✅ executor `.tmp` | ✅ |
| 補充機制 | ✅ register 階段已完成,直接載入 | ✅ |
| 糾錯機制 | ❌ 不驗證內容 | 同 YOLO |
**改善方案**: 糾錯即可
### SCENE
| 面向 | 現狀 | 問題 |
|------|------|------|
| 中斷接續 | ✅ **最完整**:檢查 `.err`/`.json`/`.tmp` 三種狀態 | ✅ |
| 補充機制 | ❌ 從頭跑 | ✅scene 很快) |
| 糾錯機制 | ⚠️ 有檢查 `.err` | ✅ |
### VISUAL_CHUNK
| 面向 | 現狀 | 問題 |
|------|------|------|
| 中斷接續 | ✅ executor `.tmp` | ✅ |
| 補充機制 | ❌ | ❌ |
| 糾錯機制 | ❌ **錯誤被吞掉**(回傳空結果) | 應回報 error 而非靜默失敗 |
**改善方案**: 不要吞錯誤,讓 error 往上傳
### STORY
| 面向 | 現狀 | 問題 |
|------|------|------|
| 中斷接續 | ✅ executor `.tmp` | ✅ |
| 補充機制 | ❌ | ❌ |
| 糾錯機制 | ❌ | ❌ |
---
## 優先級
### P0 — 立即修復
1. **ASRX 改用 executor.run()**
- 檔案:`src/core/processor/asrx.rs`
- 獲得 `.tmp` 保護、SIGKILL process group、`.partial` 保留
- 移除 hardcode timeout
### P1 — 糾錯機制
2. **File-existence check 加入 JSON 驗證**
- 檔案:`src/worker/job_worker.rs`
-`output_path.exists()` 之後,對 `.json``serde_json::from_str::<Value>`
- 若 parse 失敗 → 不 skip當作檔案不存在繼續跑
- 若 parse 成功但內容空(無 segments/frames→ 當不完整
### P2 — 補充機制
3. **ASR resume-from 補充**
- 檔案:`src/core/processor/asr.rs` + `scripts/asr_processor.py`
- Rust 端發現 `.partial` 存在,讀取最後 segment 的 end_time
- 傳入 `--resume-from {time}` 給 Python script
- Python script 跳過 `--resume-from` 之前的音訊
4. **YOLO/Face/Pose resume-frame 補充**
- 檔案:各 processor.rs + 對應 Python script
- 掃描 `.partial` 中的最後 frame_number
- 傳入 `--resume-frame {frame}` 給 Python script
### P3 — 其他
5. **VisualChunk 不吞錯誤**
6. **Executor SIGTERM → SIGKILL 兩段式關閉**

81
docs/RELEASE_PACKAGING.md Normal file
View File

@@ -0,0 +1,81 @@
# Release Packaging Design
三類包:**開發系統升級包** + **生產系統升級包** + **檔案內容包**,完全獨立。
## 1. 開發系統升級包 (System/Dev)
給 playgroundport 3003, dev schema使用。
```
release/system/dev/{version}/
├── RELEASE_INFO.txt
├── source.tar.gz ← Rust + scripts source code
├── .env.development ← DATABASE_SCHEMA=dev, port 3003
├── schema_dev.sql ← dev schema DDL
├── scripts/
│ ├── pipeline_status.py
│ ├── generate_asr1.py
│ ├── apply_asr_corrections.py
│ ├── clean_sentence_text.py
│ └── import_file_package.py ← 匯入檔案內容包
├── test/
│ └── api_test.sh
└── migration/
└── {prev}_to_{version}.sql
```
升級:覆蓋 code + 執行 migration → `cargo build --bin momentry_playground` → 重啟 3003
## 2. 生產系統升級包 (System/Prod)
給 productionport 3002, public schema使用。
```
release/system/prod/{version}/
├── RELEASE_INFO.txt
├── source.tar.gz ← Rust + scripts source code
├── .env ← DATABASE_SCHEMA=public, port 3002
├── schema_public.sql ← public schema DDL
├── scripts/ (same as dev)
├── test/
│ └── api_test.sh
└── migration/
└── {prev}_to_{version}.sql
```
## 3. 檔案內容包 (File)
一個影片的完整資料,開發與生產環境共用。
```
release/files/{file_uuid}/{version}/
├── metadata.json ← Registration info
├── RELEASE_INFO.txt
├── processors/ ← output_dev/{uuid}.*.json
│ ├── asr.json
│ ├── asrx.json
│ ├── asr-1.json
│ ├── yolo.json
│ ├── face.json
│ ├── pose.json
│ ├── ocr.json
│ ├── cut.json
│ └── scene.json
├── face_detections.csv ← 該檔案的所有 face detections
├── identities.csv ← 關聯的 identities
├── tkg_nodes.csv ← TKG nodes
├── tkg_edges.csv ← TKG edges
├── qdrant/ ← Qdrant snapshots for this file
│ ├── momentry_dev_v1.snapshot
│ ├── sentence_story.snapshot
│ └── ...
└── RELEASE_INFO.txt
```
### 匯入流程
```
1. POST /api/v1/files/register → 取得 file_uuid
2. python3 scripts/import_file_package.py --uuid {uuid} --package path/
3. 檔案狀態更新為「已註冊已處理」
```

240
docs/RELEASE_PHASES.md Normal file
View File

@@ -0,0 +1,240 @@
# Momentry Model — 分階段交付
## 核心架構
```
Pipeline (training)
│ 每個 processor 產出 .json
│ Rule 1/3 Ingestion → chunks + embeddings
momentry model for {video} ← 每部影片 = 一個 model
│ release/phase1/latest/
│ release/phase2/latest/
momentry core (inference engine) ← Rust API server
│ momentry_playground (dev)
│ momentry (production)
Search / Query / Identity APIs
```
- **Pipeline** = training phase影片 → processor output → chunks → embeddings
- **Model** = 每部影片的產出 packageoutput_json + chunks + vectors
- **Engine** = momentry core吃 model 提供 APIsearch, trace, identity
每個影片可有多個 model 版本,命名保留升級空間:
| Model 版本 | Qdrant Collection | 內容 | 觸發時機 |
|-----------|------------------|------|---------|
| `{uuid}_v1` | `momentry_dev_v1` | sentence chunk embeddingbase | ASR + ASRX + Rule 1 完成 |
| `{uuid}_v2` | `momentry_dev_v2` | 完整 pipeline + 5W1H | 全部完成 |
| `{uuid}_v3` | `momentry_dev_v3` | object identity + custom detector | v2 + object instance matching 完成 |
各版本共存不覆蓋。
## 階段劃分
### Phase 1Sentence Chunk Embeddingbase model
**觸發時機**: ASR + ASRX 完成 + Rule 1 Ingestion + vectorize 完成
**交付內容**:
- `{uuid}.asr.json`
- `{uuid}.asrx.json`
- chunkschunk_type = 'sentence'
- chunk_vectorssentence embedding
**用途**: 終端使用者可進行語意搜尋
### Phase 2完整 Pipelinev2 model
**觸發時機**: 全部 processor 完成 + Rule 3 Ingestion + 5W1H Agent
**交付內容**:
- Phase 1 全部內容
- 所有 `{uuid}.*.json`cut, yolo, face, pose, ocr, ...
- chunkschunk_type = 'cut', 'visual', 'trace', 'story'
- chunk_vectorssummary embedding
- identities / identity_bindings / face_detections
**用途**: 完整搜尋 + 摘要 + 人物識別
---
## Worker Pipeline
```
ASR 完成 → ASRX 完成
Rule 1 Ingestion (sentence chunks)
vectorize_chunks (sentence embedding)
📦 Phase 1 release ───→ release/phase1/latest/ (base model)
其他 processors 繼續 (yolo, face, pose, ocr, ...)
Rule 3 Ingestion + 5W1H Agent
📦 Phase 2 release ───→ release/phase2/latest/ (full model)
```
## 產出目錄結構
```
release/
├── phase1/
│ ├── {version}_{timestamp}/
│ │ ├── output_json/ ← 所有已完成的 .json
│ │ ├── chunks.csv ← sentence chunks
│ │ ├── vectors.csv ← sentence embeddings
│ │ ├── schema.sql ← chunks table DDL
│ │ └── RELEASE_INFO.txt
│ └── latest → {version}_{timestamp}
└── phase2/
├── {version}_{timestamp}/
│ ├── output_json/ ← 所有 .json
│ ├── chunks.csv ← 所有 chunks
│ ├── vectors.csv ← 所有 embeddings
│ ├── identities.csv ← 人物身分
│ ├── schema.sql ← 完整 schema
│ └── RELEASE_INFO.txt
└── latest → {version}_{timestamp}
```
## momentry model vs momentry core
| | momentry model | momentry core |
|---|---|---|
| 類比 | 訓練好的 weights | inference engine |
| 內容 | `.json` + chunks + vectors | Rust binary |
| 生命週期 | 每部影片產出一個 | 一個 binary 服務所有影片 |
| 版本 | `{uuid}_v1`base / `{uuid}_v2` / `{uuid}_v3` | `momentry_playground` / `momentry` |
| 交付對象 | 終端使用者 | 部署工程師 |
---
## Wiki 機制:每個 model 都可被調整
每個 momentry model`{uuid}_v1` / `v2` / `v3`)不只是唯讀的產出,而是可透過 wiki 機制持續改善。
### 與傳統 RAG 的區別
| | 傳統 RAG | momentry wiki |
|---|---|---|
| 知識儲存 | vector DBephemeral | model packagepermanent |
| 修正方式 | query 時 LLM 決定是否採用 | 使用者/Agent 直接編輯 |
| 修正持久性 | ❌ 下次 query 就消失 | ✅ 寫入 model版本化保存 |
| 模型改進 | 無(僅改變 prompt | 下次 version bump 時合併為 ground truth |
| 協作方式 | 單向retrieve → generate | 雙向(編輯 → 合併 → 改進) |
| 離線可用 | ❌ 需 vector DB + LLM | ✅ 離線查閱 wiki 目錄 |
**momentry wiki 不是 RAG 的替代品,而是 model 的生命週期管理機制。**
### 概念
```
momentry model (release package)
├── output_json/ ← 唯讀processor 產出
├── chunks.csv ← 唯讀ingestion 產出
├── vectors.csv ← 唯讀embedding 產出
└── wiki/ ← 可編輯,使用者貢獻知識
├── identities.json ← "trace 5 = Audrey Hepburn"
├── objects.json ← "object 42 = 郵票 #1"
├── corrections.json ← "ASR 'Hello' → 'Halo'"
└── changelog.json ← 編輯歷史
```
### 資料流向
```
使用者/Agent 編輯 wiki
DB wiki_entries + wiki_revisions 寫入
下次 release 打包時 merge 進 model
TKG label 更新 (tkg_nodes.label)
新版 model version bump
```
### 與 TKG 的關係
wiki 的 identity 和 object 標註會回寫到 TKG node label
```
(face_trace:5) label="Audrey Hepburn" ← wiki 編輯
(object_instance:42) label="郵票 #1" ← wiki 編輯
```
這些編輯累積後,可做為下一版 model training 的 ground truth。
### 實作方向
**DB 層** — 新 table `wiki_entries` + `wiki_revisions`
```sql
wiki_entries (target_type, target_id, title, body, summary, status, version, file_uuid)
wiki_revisions (entry_id, version, title, body, summary, change_summary, edited_by)
```
**API 層** — CRUD + 版本歷史:
```
GET /api/v1/wiki/{target_type}/{target_id}
PUT /api/v1/wiki/{target_type}/{target_id}
GET /api/v1/wiki/{target_type}/{target_id}/revisions
POST /api/v1/wiki/search
```
**打包層**`release_pack.py` 加入 wiki 匯出,與 model 共存
---
## Phase 3Object Identityv3 model
### 目標
從影片中提取關鍵物體(郵票、手槍、信封、放大鏡...),對同類物體做 instance-level 的跨畫面追蹤與辨識,達到類似 face trace 的效果 — 不只是 detect class還能區分「這一張郵票」vs「那一張郵票」。
### 現狀問題
1. **COCO 80 類不包含關鍵物體** — 郵票、手槍、信封、放大鏡等不在 COCO 資料集中
2. **YOLOv5nano 偵測率低** — 即使是 COCO 類別knife, cell phone在 nano 模型上 recall 不足
3. **無 object instance matching** — 目前只有 frame-level detection沒有跨 frame 的物體追蹤
### 技術方向
```
YOLOv8m/OWL-ViT → 改善 detection coverage
Object Tracker (IoU + embedding類似 face tracker)
object_trace → TKG CO_OCCURS_WITH edges
object identity → 同物體跨場景辨識
```
| 方向 | 方法 | 效果 |
|------|------|------|
| Model upgrade | `yolov5nu``yolov8s.pt` / `yolov8m.pt` | COCO recall 提升 |
| Custom fine-tune | 收集 stamps/guns 資料 fine-tune YOLO | 可偵測非 COCO 物件 |
| Zero-shot | OWL-ViT / Grounding DINO by text prompt | 不用 training但速度慢 |
| Object trace | IoU + embedding 跨 frame 匹配 | instance-level 追蹤 |
| Object identity | clustering 跨場景辨識同一物體 | 可在全片搜尋「這把槍」 |
### 與 TKG 整合
```
face_trace -[:CO_OCCURS_WITH]-> object_instance:5 (這把槍)
face_trace -[:CO_OCCURS_WITH]-> object_instance:42 (這張郵票)
查詢: "Audrey Hepburn 拿這把槍的畫面"
→ face_trace:5 -[:SPEAKS_AS]-> SPEAKER_0
→ face_trace:5 -[:CO_OCCURS_WITH]-> object_instance:5
```
### 交付順序
1. YOLO model upgrade低難度立即見效
2. Object tracker中難度參考 face tracker 實作)
3. Custom fine-tune / zero-shot高難度需資料或新模型

View File

@@ -0,0 +1,101 @@
# Trace Search API 設計
## 概念
trace 是一種 chunk。
現有的 chunk_type: `cut`, `sentence`, `visual`, `story`
新增 chunk_type: `trace`
每個 trace人物跨 frame 追蹤軌跡)就是一個時間區間 + 區間內的 ASR text。
跟其他 chunk 完全一樣,只是切分維度不同:
- cut chunk = 鏡頭切換
- sentence chunk = 語句邊界
- visual chunk = 畫面物體組合
- **trace chunk = 人物出現區間 + 當下 spoken text**
這樣 trace 可以直接放進現有的 `chunks` 表,共用 embedding、搜尋、Qdrant sync 整套機制,不需要任何新 table。
## chunks 表現有結構
```sql
chunks (
id, file_uuid, chunk_type, -- 'trace' 新增
start_frame, end_frame, start_time, end_time,
text_content, -- trace 區間的 ASR text
embedding, -- text_content 的 pgvector
metadata JSONB, -- { trace_id, face_count, identity_id, identity_name }
...
)
```
## 資料產生流程worker 擴充)
在 face processing + `store_traced_faces.py` 完成後:
1. 查詢 `face_detections` 聚合每個 trace 的 `MIN(frame)`, `MAX(frame)`, `COUNT(*)`
2. 對每個 trace查詢 `pre_chunks WHERE processor_type='asr'` 中與 trace time range 重疊的 text
3. 彙整 text → EmbeddingGemma 產生 `embedding`
4. 寫入 `chunks``chunk_type='trace'`metadata 含 `trace_id`, `face_count`, `identity_id`
5. embedding 自動進 Qdrant與既有 chunk 同一 collection
## Search API 擴充
Universal Search 的 `types` 原本就支援 `"chunk"`
在 chunk 搜尋中過濾 `chunk_type = 'trace'` 即可。
**Request**
```json
{
"query": "open the door",
"types": ["chunk"],
"filters": { "chunk_type": "trace" },
"uuid": "aeed71342a899fe4b4c57b7d41bcb692",
"page": 1,
"page_size": 20
}
```
**Response**(與既有 Chunk result 相同):
```json
{
"type": "chunk",
"chunk_id": "chunk_42",
"chunk_type": "trace",
"start_frame": 45200, "end_frame": 45900,
"start_time": 1808.0, "end_time": 1836.0,
"score": 0.87,
"text": "Open the door. Come on, hurry up.",
"metadata": {
"trace_id": 5,
"face_count": 42,
"identity_name": "Audrey Hepburn"
}
}
```
完全沿用既有的 `SearchResult::Chunk` variant不用新增 enum variant。
### 搜尋語法
```sql
SELECT c.*
FROM dev.chunks c
WHERE c.file_uuid = $1
AND c.chunk_type = 'trace'
AND c.embedding IS NOT NULL
ORDER BY c.embedding <=> $2
LIMIT $3;
```
## 總結
| 項目 | 作法 |
|------|------|
| 新 table | ❌ 不需要 |
| 新 enum variant | ❌ 不需要 |
| SearchResult 改動 | ❌ 不需要 |
| chunk_type 新增 | ✅ `'trace'` |
| worker 擴充 | ✅ 產生 trace chunk (face done 後) |
| SearchFilters 擴充 | ✅ 加 `chunk_type` filter |
| Qdrant | ✅ 自動(既有 chunk collection |

201
docs/VISION_AGENT_API.md Normal file
View File

@@ -0,0 +1,201 @@
# Momentry Eye API Reference
**Vision Agent** — Multi-model zero-shot object detection service.
Port: `5052` | Resource IDs: `eye-gdino`, `eye-paligemma`
---
## Models
| Model | ID | Params | Size | Confidence | Speed | License |
|-------|-----|--------|------|------------|-------|---------|
| Grounding DINO | `grounding-dino` | 232M | 891MB | ✅ 0-1 score | ~340ms | Apache 2.0 |
| PaliGemma 3B | `paligemma` | 2,923M | ~3GB | ❌ no score | ~80ms | Gemma license |
## Endpoints
### `GET /health`
System status and loaded models.
```bash
curl localhost:5052/health
```
Response:
```json
{
"status": "ok",
"models_loaded": ["grounding-dino"],
"models_available": ["grounding-dino", "paligemma"],
"device": "mps",
"port": 5052
}
```
### `GET /models`
List available models with specs.
```bash
curl localhost:5052/models
```
### `POST /detect`
Detect objects in a single video frame.
```bash
curl localhost:5052/detect \
-H "Content-Type: application/json" \
-d '{"time":5461, "prompt":"gun", "model":"grounding-dino"}'
```
**Parameters:**
| Param | Type | Default | Description |
|-------|------|---------|-------------|
| `uuid` | string | `aeed71342a...` | Video file UUID |
| `time` | float | `0` | Timestamp in seconds |
| `prompt` | string | `"gun"` | Object to detect |
| `model` | string | `"grounding-dino"` | Model: `grounding-dino`, `paligemma`, or `fusion` |
| `threshold` | float | `0.1` | Minimum confidence (GDINO only) |
| `weights` | object | — | Fusion weights, e.g. `{"grounding-dino":0.6,"paligemma":0.4}` |
**Fusion mode** runs both models and combines results with weighted scoring. Default weights: GDINO 0.6, PaliGemma 0.4.
```bash
# Fusion: run both models, combine results
curl localhost:5052/detect \
-d '{"time":206, "prompt":"water gun", "model":"fusion"}'
# Custom fusion weights
curl localhost:5052/detect \
-d '{"time":206, "prompt":"gun", "model":"fusion",
"weights":{"grounding-dino":0.5,"paligemma":0.5}}'
```
**Response:**
```json
{
"model": "grounding-dino",
"detections": [
{"bbox": [726.2, 567.4, 969.0, 694.6], "score": 0.476, "label": "gun"},
{"bbox": [686.7, 567.0, 969.6, 918.3], "score": 0.262, "label": "gun"}
],
"time_ms": 345.2,
"n_detections": 2,
"shot_url": "/shots/aeed7134_5461s_gun_grounding-dino.jpg"
}
```
**Fusion response** also includes `per_model` (detections per model) and `fusion` (deduplicated combined list with `fused_score`).
### `POST /search`
Search across a time range.
```bash
# Natural language query
curl localhost:5052/search \
-d '{"query":"find the gun", "range":"5400-5600", "interval":10}'
```
**Parameters:**
| Param | Type | Default | Description |
|-------|------|---------|-------------|
| `query` | string | `"find the gun"` | Natural language query (parsed to extract object) |
| `target` | string | — | `file_uuid:chunk_id` or `file_uuid:trace_id` — resolves to time range |
| `range` | string | `"0-6780"` | Manual time range |
| `interval` | int | `30` | Scan interval in seconds |
| `model` | string | `"grounding-dino"` | Detection model |
| `threshold` | float | `0.15` | Minimum confidence |
**Target resolution:**
| Format | Example | Resolves to |
|--------|---------|-------------|
| `file_uuid:chunk_id` | `uuid:uuid_story_90` | Chunk's time range |
| `file_uuid:trace_id` | `uuid:trace_5` | Trace's time range |
| `file_uuid:chunk_index` | `uuid:500` | Chunk index 500's range |
```bash
# Using target
curl localhost:5052/search \
-d '{"target":"aeed71342...:aeed71342..._story_90", "query":"gun"}'
# Using trace
curl localhost:5052/search \
-d '{"target":"aeed71342...:trace_5", "query":"person"}'
```
### `POST /multimodal`
Multi-modal search across sentence chunks — combines ASR text match + visual confirmation.
```bash
# Search for Jean-Louis: ASR match + GDINO child detection
curl localhost:5052/multimodal \
-d '{"keyword":"Jean-Louis", "prompt":"child"}'
# Search trace chunks visually (no ASR)
curl localhost:5052/multimodal \
-d '{"keyword":"", "prompt":"person", "chunk_type":"trace", "range":"3500-4000"}'
```
**Parameters:**
| Param | Type | Default | Description |
|-------|------|---------|-------------|
| `keyword` | string | — | ASR keyword to search in sentence text |
| `prompt` | string | same as keyword | Visual prompt for GDINO |
| `chunk_type` | string | `"sentence"` | `sentence`, `trace`, `story`, `cut` |
| `target` | string | — | Specific chunk target |
| `range` | string | `"0-6780"` | Time range (for non-sentence chunks) |
| `threshold` | float | `0.15` | Visual detection threshold |
### `GET /shots/<filename>`
Retrieve annotated detection images.
```bash
curl -o result.jpg localhost:5052/shots/aeed7134_5461s_gun_grounding-dino.jpg
```
## Object Detection Performance Summary
| Object type | Size in frame | GDINO | PaliGemma | Best prompt |
|-------------|--------------|-------|-----------|-------------|
| Gun (realistic) | 15-30% | ✅ 0.36-0.67 | ✅ | `pistol` / `handgun` |
| Water gun (toy) | 15-31% | ❌ 0 | ✅ | `water gun` (PaliGemma) |
| Child (Jean-Louis) | 30-60% | ⚠️ 0.3-0.9 | ❌ | `child` (high FP on adults) |
| Stamp | <5% | ❌ FP | ❌ | — |
| Passport | <10% | ❌ FP | ❌ | — |
| Magnifying glass | <5% | ❌ FP | ❌ | — |
| Cup / Bottle | 5-15% | ✅ 0.3-0.5 | — | `cup` / `bottle` |
| Cell phone | 5-10% | ✅ 0.3-0.5 | — | `cell phone` |
## Resource Registration
On startup, the agent auto-registers as resources in `dev.resources`:
| Resource ID | Type | Status |
|-------------|------|--------|
| `eye-gdino` | `vision_model` | `online` |
| `eye-paligemma` | `vision_model` | `online` |
Heartbeat updates every 60 seconds. Discover via:
```sql
SELECT * FROM dev.resources WHERE resource_type = 'vision_model';
```
## Files
| File | Description |
|------|-------------|
| `scripts/vision_agent.py` | Vision Agent server (port 5052) |
| `output_dev/vision_shots/` | Annotated detection screenshots |
| `docs/ZERO_SHOT_DETECTION_RESEARCH.md` | Full model research report |

View File

@@ -0,0 +1,190 @@
# Zero-Shot Object Detection Model Research Report
**Date:** 2026-05-10
**Goal:** Evaluate models for detecting arbitrary objects in Charade (1963)
**System:** M5 MacBook Pro (Apple Silicon MPS, 48GB)
---
## Tested Models
| Model | Params | Size | Resolution | Type | License |
|-------|--------|------|------------|------|---------|
| YOLOv8n fine-tune (gun) | 3.2M | 6MB | 640px | Closed-set (4 classes) | AGPL-3.0 |
| OWL-ViT base | 109M | 586MB | 384px | Zero-shot | Apache 2.0 |
| **Grounding DINO Base** | **232M** | **891MB** | **384px** | **Zero-shot** | **Apache 2.0** |
| Grounding DINO Large | 232M | 895MB | 384px | Zero-shot | Apache 2.0 |
| Florence-2 Base | 231M | ~3GB | 384px | Zero-shot (generative) | MIT |
| Florence-2 Large | 776M | ~6GB | 384px | Zero-shot (generative) | MIT |
| PaliGemma 3B mix-224 | 2,923M | ~3GB | 224px | Zero-shot (generative) | Gemma license |
| PaliGemma 3B mix-448 | 2,923M | ~6GB | 448px | Zero-shot (generative) | Gemma license |
## Detection Performance on Charade
### Large Objects (gun)
| Model | 8 timepoints | Best confidence | Runtime |
|-------|-------------|----------------|---------|
| YOLOv8n fine-tune | ❌ 0/5 (all FP) | 0.45 (stamp→pistol) | 0.03s |
| OWL-ViT | ❌ 2/8 | 0.054 | 3.4s |
| **Grounding DINO Base** | **✅ 8/8** | **0.499** | **0.33s** |
| PaliGemma 3B mix-224 | ✅ 3/8 (gun), 3/8 overall | 0.499 | 0.5-3s |
### Small Objects (stamp, passport, magnifying glass)
| Model | Stamp | Passport | Magnifying glass |
|-------|-------|----------|-----------------|
| Grounding DINO Base | ❌ FP (~0.3) | ❌ FP (~0.4) | ❌ FP (~0.3-0.5) |
| PaliGemma 3B mix-224 | ❌ no det | ❌ no det | not tested |
| PaliGemma 3B mix-448 | ❌ (not tested) | ❌ (not tested) | ❌ (not tested) |
**All models fail on objects smaller than ~50px at native 1920x1080 resolution.**
### Other Objects
| Object | YOLO COCO | Grounding DINO | Notes |
|--------|-----------|----------------|-------|
| knife | ✅ 368 frames | ✅ 84 hits | Small but detectable |
| cup | ✅ | ✅ 13 hits | Moderate size |
| bottle | ✅ | ✅ 12 hits | Moderate size |
| cell phone | ✅ | ✅ 5 hits | Hand-held |
| book | ✅ | ✅ 3 hits | Hand-held |
| car | ✅ | ✅ 9 hits | Large object |
| tie | ✅ | ✅ 139 hits | On-person (worn, not held) |
## Detailed Model Analysis
### Grounding DINO Base (Recommended)
**Scores:** Detection confidence 0.1-0.5 (typical for zero-shot)
**Timing per frame (MPS):**
| Component | Time | % of total |
|-----------|------|------------|
| Processor (text+image) | 17ms | 5% |
| Model inference | 310ms | 93% |
| Post-processing | 5ms | 2% |
| **Total** | **331ms** | **100%** |
**Multi-prompt batching:** 8 prompts in 335ms (42ms/prompt vs 309ms single)
**Memory:** ~1GB (MPS)
**License:** Apache 2.0 — fully commercial, no restrictions
### Grounding DINO Large
**Result:** Identical weights to Base. The GitHub "7-dataset" checkpoint is the same 3-dataset version as HuggingFace. The actual 7-dataset version (56.7 AP) was never released.
**Verdict: Do not use.** Base is identical and simpler.
### OWL-ViT
**Result:** Almost useless for this task. Max confidence 0.054. Detect only 2/8 timepoints.
**Verdict: Do not use.**
### Florence-2
**Issue:** `prepare_inputs_for_generation` bug in current transformers version. Cannot run inference without patching model code.
**Task format:** Uses task tokens (`<OD>`) instead of arbitrary text prompts. Cannot do "detect gun" directly — uses generic object detection.
**Verdict: Cannot use in current environment.**
### PaliGemma
**Result:** Works for gun detection (3/8) but misses small objects entirely.
**Key limitation:** No confidence score output (generative model). Either outputs bbox or nothing.
**Issues:**
- 224px variant: Too low resolution for small objects
- 448px variant: 6GB download, suspected better for detail but untested
- Gemma license may restrict commercial use vs Apache 2.0
**Verdict: Inferior to Grounding DINO for this use case.**
### YOLOv8n Fine-tune (Gun Detector)
| Dataset | 905 images (Roboflow CC BY 4.0) |
| Classes | grenade, knife, pistol, rifle |
| Validation mAP50 | 0.813 |
| Charade FP rate | **100%** (all false positives) |
**Root cause:** Training images are close-up gun photos; Charade has distant/partial guns. Distribution mismatch makes this model unusable.
**Verdict: Requires completely new training dataset.**
## Root Cause Analysis: Small Object Failure
### Grounding DINO's Resolution Limit
Grounding DINO processes images at **384×384px**. At this resolution:
```
1920px frame → 384px input (5:1 reduction)
A 50×50px object → 10×10px at 384px → only ~1 patch token
```
For comparison:
- **Gun** at 200×200px (close-up) → 40×40px → still detectable
- **Stamp** at 30×30px → 6×6px → lost in downsampling
- **Passport** at 80×120px → 16×24px → barely visible
- **Magnifying glass** at 40×40px → 8×8px → lost
### Potential Solutions
| Solution | Pros | Cons | Feasibility |
|----------|------|------|-------------|
| **Crop + zoom** on person region | Leverages existing YOLO person detections | Requires two-stage pipeline | ✅ High |
| **PaliGemma 448px** | 448px native (36% more detail) | 6GB, requires download | ⚠️ Medium |
| **YOLO fine-tune on stamps** | Fast inference (6MB) | Need 200+ training images | ⚠️ Medium |
| **Grounding DINO + tiling** | Split image into tiles, run per tile | 4-9x slower | ⚠️ Medium |
| **Florence-2 448px** | Higher resolution | Bug in transformers | ❌ Low |
## Hand-Held Object Detection Feasibility
### Available Data Sources
| Source | Type | Coverage | Usefulness |
|--------|------|----------|------------|
| YOLO `pre_chunks` | Object detections | 169,625 frames | ✅ Every frame |
| Pose `pre_chunks` | Body keypoints (left_wrist, right_wrist) | 4,269 frames | ✅ Hand location |
| Grounding DINO | Zero-shot classification | On-demand | ✅ Object ID |
| ASR dialogue | Text mentions | 4,188 chunks | ✅ "holding a gun" |
### Approach: YOLO + Pose + Grounding DINO
```
Frame
→ YOLO: Find person + objects
→ Pose: Find wrist keypoints
→ Check: Object bbox overlaps with hand region (wrist ±100px)
→ Grounding DINO: Verify object class
```
### Known Limitations
1. **Pose frame alignment:** Pose data (4,269 frames) doesn't always overlap with YOLO data at the same frame
2. **Object proximity ≠ holding:** YOLO objects near hands may be background, not held
3. **Small object blind spot:** Stamps, magnifying glasses at hand positions are too small to detect
## Recommendations
| Priority | Action | Rationale |
|----------|--------|-----------|
| 1 | Use Grounding DINO Base (Apache 2.0) | Best zero-shot detector, proven on guns, clean license |
| 2 | Two-stage pipeline for small objects | YOLO person box → crop → upscale → Grounding DINO |
| 3 | Pose wrist alignment for hand-held confirmation | Reduce false positives by requiring hand proximity |
| 4 | Replace Grounding DINO "Large" ref with Base | Large is identical weights, no benefit |
## Appendix: License Summary
| Model | License | Commercial Use | Requires |
|-------|---------|---------------|----------|
| Grounding DINO | **Apache 2.0** | ✅ Yes | NOTICE file |
| OWL-ViT | Apache 2.0 | ✅ Yes | NOTICE file |
| PaliGemma | Gemma license | ⚠️ Needs review | Google ToS |
| Florence-2 | MIT | ✅ Yes | Copyright notice |
| YOLOv8 | AGPL-3.0 | ⚠️ Needs license | Open source or paid |

View File

@@ -0,0 +1,49 @@
# Zero-Shot Gun Detection Test Plan
**Date:** 2026-05-10
**Goal:** Compare OWL-ViT vs Grounding DINO for detecting guns in Charade (1963)
## Models
| Model | Source | Type |
|-------|--------|------|
| `google/owlvit-base-patch32` | HuggingFace | Zero-shot object detection |
| `IDEA-Research/grounding-dino-base` | HuggingFace | Zero-shot object detection |
## Test Timepoints (8)
| Time | Label | Source |
|------|-------|--------|
| 2646s (44:06) | 2646s | ASR: "He has a gun" |
| 3188s (53:08) | 3188s | Original detection |
| 3697s (61:37) | 3697s | ASR: "Where's your gun" |
| 5341s (89:01) | 5341s | ASR: "He already killed 3 men" |
| 5461s (91:01) | 5461s | Original detection |
| 6309s (1:45:09) | 6309s | Original detection |
| 6377s (1:46:17) | 6377s | Original detection |
| 6479s (1:47:59) | 6479s | Original detection |
## Prompts
`"gun"`, `"pistol"`, `"rifle"`, `"weapon"`
## Matrix
8 timepoints × 2 models × 4 prompts = 64 inferences
## Output
| File | Description |
|------|-------------|
| `output_dev/zero_shot_test/*.jpg` | Annotated screenshots |
| `output_dev/zero_shot_test/zero_shot_results.json` | Detection results |
| `scripts/zero_shot_gun_test.py` | Test script |
## Success Criteria
| Level | Criteria |
|-------|----------|
| Excellent | Finds real gun with confidence > 0.5 |
| Good | Finds real gun with confidence < 0.5 |
| Limited | Finds guns but many false positives |
| Failed | All false positives |

View File

@@ -0,0 +1,67 @@
# Zero-Shot Gun Detection Test Report
**Date:** 2026-05-10
**Goal:** Compare OWL-ViT vs Grounding DINO for detecting guns in Charade (1963)
## Test Setup
| Model | Prompts | Timepoints | Total inferences |
|-------|---------|------------|-----------------|
| `google/owlvit-base-patch32` | gun, pistol, rifle, weapon | 8 | 32 |
| `IDEA-Research/grounding-dino-base` | gun, pistol, rifle, weapon | 8 | 32 |
## Results
| Model | Timepoints with detections | Total detections | Best confidence | Runtime |
|-------|---------------------------|-----------------|-----------------|---------|
| OWL-ViT | 2/8 | 2 | 0.054 | 1.5s |
| **Grounding DINO** | **8/8** | **109** | **0.186** | 11.5s |
## Grounding DINO — Per Timepoint
| Time | Source | Best prompt | Best confidence | Found? |
|------|--------|-------------|-----------------|--------|
| 2646s (44:06) | ASR: "He has a gun" | gun | 0.082 | ✅ |
| **3188s (53:08)** | **Original pistol** | **gun** | **0.149** | **✅** |
| 3697s (61:37) | ASR: "Where's your gun" | gun | 0.159 | ✅ |
| 5341s (89:01) | ASR: "He already killed 3 men" | gun | 0.074 | ✅ |
| **5461s (91:01)** | **Original pistol** | **gun** | **0.186** | **✅** |
| **6309s (1:45:09)** | **Original pistol** | **gun** | **0.077** | **✅** |
| **6377s (1:46:17)** | **Original gun** | **weapon** | **0.118** | **✅** |
| **6479s (1:47:59)** | **Original pistol** | **gun** | **0.060** | **✅** |
### Original 5 Pistol Frames
| Frame | OWL-ViT | Grounding DINO | Verdict |
|-------|---------|----------------|---------|
| 3188s | Not found | ✅ Found (0.149) | ✅ |
| 5461s | Not found | ✅ Found (0.186) | ✅ |
| 6309s | Not found | ✅ Found (0.077) | ✅ |
| 6377s | Not found | ✅ Found (0.118) | ✅ |
| 6479s | Not found | ✅ Found (0.060) | ✅ |
## Analysis
### OWL-ViT
- Almost completely failed: only 2 detections at 0.05 confidence
- Not suitable for this task
### Grounding DINO
- **Found all 8 timepoints**, including all 5 original pistol frames
- Best prompt is consistently `"gun"` (6/8 timepoints)
- Confidence range: 0.060 - 0.186 (typical for zero-shot detection)
- Higher confidence correlates with user-confirmed detections
### Key Finding
The 5 original pistol frames were produced by **Grounding DINO** (not YOLOv8n). The model was downloaded from HuggingFace at 15:43-15:44 on May 9, and the screenshots were generated at 15:49 — confirming OWL-ViT was tested first (failed) and then Grounding DINO was tested (succeeded).
## Integration
Grounding DINO has been integrated into `object_search_agent.py` as `--source zero_shot`:
```
python3 scripts/object_search_agent.py --keyword gun --source zero_shot
```
## Screenshots
All 64 annotated screenshots saved to `output_dev/zero_shot_test/*.jpg`

View File

@@ -0,0 +1,115 @@
# Zero-Shot vs Fine-Tune 物件偵測模型選型報告
**Date:** 2026-05-10
**Goal:** 在 Charade (1963) 中搜尋非 COCO 物件(槍枝、郵票、信封等)
**System:** M5 MacBook Pro (Apple Silicon MPS)
## 動機
YOLOv8 COCO 只有 80 類,不包含 gun、stamp、envelope 等 Charade 核心物件。需要找到能在電影中搜尋任意物件的方法。
## 候選方案
| 方案 | 方法 | 訓練資料 | 開發成本 |
|------|------|---------|---------|
| A. YOLOv8n fine-tune | Fine-tune on gun dataset | 需收集 500+ 張標註圖片 | 高 |
| B. OWL-ViT zero-shot | Vision-language pretraining | 無須訓練 | 低 |
| C. Grounding DINO zero-shot | Vision-language pretraining | 無須訓練 | 低 |
## 模型大小與效能
| Model | 磁碟 | 參數 | 推論時間 (MPS) | 單幀能耗 | 模型類別 |
|-------|------|------|---------------|---------|---------|
| YOLOv8n | **6MB** | **3.2M** | **0.03s** | **~0.5J** | 封閉集80 類) |
| OWL-ViT | 586MB | 109M | 3.4s | ~50J | 開放集zero-shot |
| **Grounding DINO** | **891MB** | **172M** | **4.3s** | **~65J** | **開放集zero-shot** |
## Charade 實測結果
| Model | 8 時間點命中 | 5 個原始 pistol | 最佳 confidence | 推論時間 | 模型大小 |
|-------|-------------|-----------------|----------------|---------|---------|
| YOLOv8n COCO | ❌ N/A無 gun class | — | — | 0.03s | 6MB |
| YOLOv8n fine-tune | 7/7 FP | ❌ 全部 FP | 0.45(郵票誤判) | 0.03s | 6MB |
| OWL-ViT | 2/8 | ❌ 0/5 | 0.054 | 3.4s | 586MB |
| **Grounding DINO Base** | **31/32** | **✅ 5/5** | **0.672** | **11.6s** | **891MB** |
| **Grounding DINO Large** | **32/32** | **✅ 5/5** | **1.000** | **50.1s** | **895MB** |
### Base vs Large 比較
| 指標 | Base (3 datasets) | Large (7 datasets) |
|------|------------------|-------------------|
| 平均最佳 confidence | 0.384 | **1.000** |
| 總偵測數 | 333 | **28,800** |
| COCO zero-shot AP | 48.4 | **56.7** |
| 推論時間 (MPS) | 11.6s | 50.1s |
| Edge 部署 | 較可行 | 較困難 |
### 結論
**效能優先選擇Grounding DINO Large** — 所有 8 個時間點 confidence 1.000,零漏檢。犧牲推論速度但 detection 品質大幅超越 Base 版。
**Edge 部署選擇Grounding DINO Base** — 體積相近但推論快 4.3x,適合資源受限裝置。
### 關鍵結論
1. **YOLOv8n fine-tune 完全失敗** — 905 張 Roboflow 近距離特寫與 Charade 中遠景畫面分布 mismatch訓練無法泛化
2. **OWL-ViT 幾乎無效** — 對電影中的小物體辨識能力不足
3. **Grounding DINO 成功** — 5/5 找回 pistol frames所有 ASR gun mention 時間點也命中
## Grounding DINO 優缺點
### 優點
- **零樣本搜尋**:任何 COCO 以外的物件直接用文字 prompt 搜尋
- **延伸性**:同一模型可搜尋 gun、stamp、envelope、knife、hat 等任意物件
- **無須訓練**:不需要收集標註資料或 fine-tune
- **Apache 2.0 License**:可商用
### 缺點
- **體積大**891MBvs YOLOv8n 的 6MB
- **推論慢**4.3s/framevs YOLOv8n 的 0.03s
- **不適合 real-time**edge device 上無法做即時偵測,只適合離線掃描
## Edge AI 部署考量
| 項目標題 | YOLOv8n | Grounding DINO |
|---------|---------|---------------|
| 模型大小 | 6MB ✅ | 891MB ⚠️ |
| RAM 需求 | ~100MB | ~2.5GB |
| 推論時間 | 30ms | 4.3s |
| 單幀能耗 | ~0.5J | ~65J |
| 搜尋類別數 | 80固定 | 無限(文字 prompt |
| 電池影響1000 幀) | ~500J | ~65,000J |
### 建議策略
```
離線掃描Server/Gateway
用 Grounding DINO 對全片建立物件索引
→ 耗時但可接受113 min 電影約 2-3 小時)
即時查詢Edge Device
查詢時只跑 Grounding DINO 在該 timepoint → 4s/次
→ 查詢體驗還可接受
```
## 整合狀態
- ✅ Grounding DINO 測試通過
- ✅ 整合進 `scripts/object_search_agent.py``--source zero_shot`
- ✅ 測試計畫:`docs/ZERO_SHOT_GUN_TEST_PLAN.md`
- ✅ 測試報告:`docs/ZERO_SHOT_GUN_TEST_REPORT.md`
## License 聲明
Grounding DINO 採用 Apache 2.0 License可商用。
產品若 bundle 此模型,需附 `NOTICE` 檔案:
```
Momentry
Copyright 2026 Accusys
This product includes software developed by IDEA Research:
- Grounding DINO (https://github.com/IDEA-Research/GroundingDINO)
Copyright 2023 IDEA Research
Licensed under Apache 2.0 (https://www.apache.org/licenses/LICENSE-2.0)
```

View File

@@ -0,0 +1,177 @@
# API Dictionary v1.0.0
58 endpoints across 10 modules. Auth: `X-API-Key` header or `Authorization: Bearer <key>`.
## API Design Principle
Every path segment after the resource ID is a **verb** — an action on that resource.
```
/api/v1/{entity}/{id}/{action}
↑ ↑ ↑
實體 ID 動作
```
**Primary entities**: `file`/`files`, `identity`/`identities`
```
/api/v1/file/:file_uuid ← 檔案資源
/video → 播放影片(動詞)
/video/bbox → 播放含框(動詞)
/thumbnail → 取縮圖(動詞)
/process → 啟動處理(動詞)
/probe → 探測(動詞)
/chunks → 列出段落(動詞)
/identities → 列出身分(動詞)
/face_trace/sortby → 列出追蹤/排序(動詞)
/trace/:trace_id/faces → 列出偵測(動詞)
/api/v1/identity/:identity_uuid
/bind → 綁定(動詞)
/unbind → 解綁(動詞)
/files → 列出檔案(動詞)
/chunks → 列出段落(動詞)
/api/v1/search/universal → 搜尋(動詞)
/api/v1/search/smart → 智慧搜尋(動詞)
```
**Naming conventions**:
- 全域唯一資源 ID → `uuid``file_uuid`, `identity_uuid`
- 單一實體下唯一 ID → `id``trace_id`, `chunk_id`, `face_id`
- 路徑尾端 → 動詞(`/video`, `/chunks`, `/bind`
- 集合列表 → **複數**`/files`, `/identities`, `/resources`, `/faces`
- 單一資源操作 → **單數**`/file/:file_uuid`, `/identity/:identity_uuid`
## Legend
- `→` direction of data flow
- `POST` typically requires JSON body
- All endpoints return JSON unless noted
---
| # | Method | Route | Description |
|---|--------|-------|-------------|
| 1 | GET | `/health` | Server health (ok/degraded) |
| 2 | GET | `/health/detailed` | Per-service health + latency |
| 3 | POST | `/api/v1/auth/login` | Username/password → API key |
| 4 | POST | `/api/v1/auth/logout` | Invalidate session |
| 5 | GET | `/api/v1/stats/ingest` | Ingest statistics |
| 6 | GET | `/api/v1/stats/sftpgo` | SFTPGo service status |
| 7 | GET | `/api/v1/stats/inference` | LLM/embedding health |
| 8 | POST | `/api/v1/files/register` | Register video file → file_uuid |
| 9 | POST | `/api/v1/unregister` | Unregister file(s): by `file_uuid` or pattern match on `file_path`+`pattern` |
| 10 | GET | `/api/v1/files/scan` | Scan directory for new files |
| 11 | GET | `/api/v1/file/:file_uuid/probe` | ffprobe metadata |
| 12 | POST | `/api/v1/file/:file_uuid/process` | Start processing pipeline |
| 13 | GET | `/api/v1/file/:file_uuid/chunks` | List pre-chunks for file |
| 14 | GET | `/api/v1/progress/:file_uuid` | Processing progress |
| 15 | GET | `/api/v1/jobs` | List monitor jobs (filterable by status) |
| 16 | POST | `/api/v1/config/cache` | Toggle Redis cache |
| 17 | POST | `/api/v1/config/auto-pipeline` | Toggle auto-pipeline on register |
| 18 | POST | `/api/v1/config/watcher-auto-register` | Toggle watcher auto-register |
| 17 | POST | `/api/v1/search/visual` | Search visual chunks |
| 18 | POST | `/api/v1/search/visual/class` | Search by object class |
| 19 | POST | `/api/v1/search/visual/density` | Search by spatial density |
| 20 | POST | `/api/v1/search/visual/combination` | Combined visual search |
| 21 | POST | `/api/v1/search/visual/stats` | Visual chunk statistics |
## File/Identity (identity_api.rs)
| # | Method | Route | Description |
|---|--------|-------|-------------|
| 22 | GET | `/api/v1/files` | List registered files (paginated) |
| 23 | GET | `/api/v1/file/:file_uuid` | Single file detail |
| 24 | GET | `/api/v1/file/:file_uuid/identities` | Identities in this file |
| 25 | GET | `/api/v1/identities` | List all identities |
| 26 | POST | `/api/v1/identity` | Register new identity |
| 27 | GET | `/api/v1/identity/:identity_uuid` | Identity detail |
| 28 | DELETE | `/api/v1/identity/:identity_uuid` | Delete identity |
| 29 | GET | `/api/v1/identity/:identity_uuid/files` | Files for an identity |
| 30 | GET | `/api/v1/identity/:identity_uuid/chunks` | Chunks for an identity |
| 31 | POST | `/api/v1/resource/register` | Register processing resource |
| 32 | POST | `/api/v1/resource/heartbeat` | Resource heartbeat |
| 33 | GET | `/api/v1/resources` | List all resources |
## Identity Binding (identity_binding.rs)
| # | Method | Route | Description |
|---|--------|-------|-------------|
| 34 | POST | `/api/v1/identity/:identity_uuid/bind` | Bind face → identity |
| 35 | POST | `/api/v1/identity/:identity_uuid/unbind` | Unbind face from identity |
| 36 | POST | `/api/v1/identity/:identity_uuid/mergeinto` | Merge identity :identity_uuid → target |
## Face Candidates (identities.rs)
| # | Method | Route | Description |
|---|--------|-------|-------------|
| 37 | GET | `/api/v1/faces/candidates` | Unbound face gallery (paginated) |
## Search (search.rs + universal_search.rs)
| # | Method | Route | Description |
|---|--------|-------|-------------|
| 38 | POST | `/api/v1/search/smart` | Semantic search (EmbeddingGemma + pgvector) |
| 39 | POST | `/api/v1/search/universal` | BM25 keyword search (requires file_uuid) |
| 40 | POST | `/api/v1/search/frames` | Frame-level search |
## Trace (trace_agent_api.rs)
| # | Method | Route | Description |
|---|--------|-------|-------------|
| 41 | POST | `/api/v1/file/:file_uuid/face_trace/sortby` | List traces (sorted/filtered) |
| 42 | GET | `/api/v1/file/:file_uuid/trace/:trace_id/faces` | Single trace detections + interpolation |
## Media (media_api.rs)
| # | Method | Route | Description |
|---|--------|-------|-------------|
| 43 | GET | `/api/v1/file/:file_uuid/thumbnail` | Frame JPEG (optional crop via `?frame=&x=&y=&w=&h=`) |
| 44 | GET | `/api/v1/file/:file_uuid/video` | Raw video stream (`?start_time=&end_time=` in seconds) |
| 45 | GET | `/api/v1/file/:file_uuid/video/bbox` | Bbox overlay video (`?start_frame=&end_frame=&duration=` frame numbers) |
| 46 | GET | `/api/v1/file/:file_uuid/trace/:trace_id/video` | Trace clip (`?mode=normal\|debug&padding=`) |
## Identity Delete
| # | Method | Route | Description |
|---|--------|-------|-------------|
| 47 | DELETE | `/api/v1/identity/:identity_uuid` | Delete identity + unbind all faces |
## Agents (agent_api.rs + five_w1h_agent_api.rs + identity_agent_api.rs)
| # | Method | Route | Description |
|---|--------|-------|-------------|
| 48 | POST | `/api/v1/agents/translate` | AI text translation |
| 49 | POST | `/api/v1/agents/5w1h/analyze` | Single chunk 5W1H analysis |
| 50 | POST | `/api/v1/agents/5w1h/batch` | Batch 5W1H analysis |
| 51 | GET | `/api/v1/agents/5w1h/status` | 5W1H job status |
| 52 | POST | `/api/v1/agents/identity/analyze` | Identity analysis |
| 53 | GET | `/api/v1/agents/identity/status` | Identity job status |
| 54 | POST | `/api/v1/agents/identity/suggest` | Identity suggestions |
| 55 | POST | `/api/v1/agents/suggest/merge` | Suggest identity merge |
| 56 | POST | `/api/v1/agents/suggest/clustering` | Suggest re-clustering |
## Identity Search (identity_api.rs, new in V4.1)
| # | Method | Route | Description |
|---|--------|-------|-------------|
| 57 | GET | `/api/v1/identities/search?q=` | Search identities by name → chunk results |
| 58 | GET | `/api/v1/search/identity_text?q=&file_uuid=` | Full-text search → identity-bound chunks |
---
## Summary
| Module | Routes | File |
|--------|--------|------|
| Core | 21 | `server.rs` |
| File/Identity | 14 | `identity_api.rs` (+2 search endpoints) |
| Binding | 3 | `identity_binding.rs` |
| Faces | 1 | `identities.rs` |
| Search | 3 | `search.rs`, `universal_search.rs` |
| Trace | 2 | `trace_agent_api.rs` |
| Media | 4 | `media_api.rs` |
| Identity Delete | 1 | `identity_api.rs` |
| Agents | 9 | `agent_api.rs`, `five_w1h_agent_api.rs`, `identity_agent_api.rs` |
| **Total** | **58** | |

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,381 @@
---
document_type: "reference_doc"
service: "MOMENTRY_CORE"
title: "Momentry Core Release API Reference v1.0.0"
date: "2026-05-25"
version: "V4.2"
status: "active"
owner: "Warren"
---
# Momentry Core API Reference v1.0.0
55 endpoints across 10 categories, with real curl examples and responses.
## Base
| Environment | URL |
|-------------|-----|
| Production | `http://localhost:3002` or `https://api.momentry.ddns.net` |
| Development | `http://localhost:3003` |
| Auth | Header `X-API-Key: <key>` (login endpoint unprotected) |
> **Note**: All examples below use production port 3002. For dev testing, replace `3002` with `3003`.
---
## 1. System
| # | Method | Path | Description |
|---|--------|------|-------------|
| 1 | GET | `/health` | Server status (ok/degraded) |
| 2 | GET | `/health/detailed` | Per-service health + latency |
| 3 | GET | `/health/consistency` | Data consistency check |
| 4 | POST | `/api/v1/auth/login` | Username/password → API key |
| 5 | POST | `/api/v1/auth/logout` | Invalidate session |
| 6 | GET | `/api/v1/stats/sftpgo` | SFTPGo status |
| 7 | POST | `/api/v1/config/cache` | Toggle Redis cache |
| 8 | POST | `/api/v1/config/auto-pipeline` | Toggle auto-pipeline on register |
| 9 | POST | `/api/v1/config/watcher-auto-register` | Toggle watcher auto-register |
```bash
curl http://localhost:3002/health
```
```json
{
"status": "ok",
"version": "1.0.0",
"build_git_hash": "de88fd4e",
"build_timestamp": "2026-05-25",
"uptime_ms": 7052517
}
```
| # | Method | Path | Description |
|---|--------|------|-------------|
| 2a | GET | `/health/detailed` | Per-service health + resources + pipeline |
```bash
curl -X POST http://localhost:3002/api/v1/files/register \
-H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" \
-H "Content-Type: application/json" \
-d '{"file_path":"/path/to/video.mp4","content_hash":"optional-sha256-of-file"}'
```
```json
{"success":true,"file_uuid":"3abeee81d94597629ed8cb943f182e94","duration":5954.0}
```
Supports all file types (video, image, document, audio). SHA256 content_hash computed automatically if not provided.
```json
{
"status": "ok",
"build_git_hash": "de88fd4e",
"build_timestamp": "2026-05-25",
"services": {
"postgres": {"status": "ok", "latency_ms": 6},
"redis": {"status": "ok", "latency_ms": 0},
"qdrant": {"status": "ok", "latency_ms": 1},
"mongodb": {"status": "ok", "latency_ms": 0}
},
"resources": {
"cpu_used_percent": 50.0,
"cpu_idle_percent": 50.0,
"memory_available_mb": 8028,
"memory_total_mb": 16384,
"memory_used_percent": 51.0,
"gpu_available": false,
"gpu_utilization": null,
"gpu_memory_used_pct": null
},
"pipeline": {
"scripts": true,
"models": true,
"ffmpeg": true,
"embedding_server": {"status": "ok", "latency_ms": 0},
"gdino_api": {"status": "error", "latency_ms": 0, "error": "..."},
"llm": {"status": "ok", "latency_ms": 0}
}
}
```
---
## 2. File Management
| # | Method | Path | Description |
|---|--------|------|-------------|
| 10 | POST | `/api/v1/files/register` | Register file → file_uuid. Body: `{"file_path":"...", "content_hash":"optional"}` |
| 11 | GET | `/api/v1/files/lookup?file_name=` | Pre-upload name conflict check. Returns matches + `next_name` for auto-rename |
| 12 | POST | `/api/v1/unregister` | Unregister file(s): by `file_uuid` or pattern match (`file_path`+`pattern`) |
| 13 | GET | `/api/v1/files/scan` | Scan directory for new files |
| 14 | GET | `/api/v1/files` | List files (paginated) |
| 15 | GET | `/api/v1/file/:file_uuid` | Single file detail |
| 16 | GET | `/api/v1/file/:file_uuid/probe` | ffprobe metadata |
| 17 | POST | `/api/v1/file/:file_uuid/process` | Start pipeline |
| 18 | POST | `/api/v1/file/:file_uuid/chunk/:chunk_id` | Single chunk detail (V1.0.2+) |
| 19 | POST | `/api/v1/progress/:file_uuid` | Processing progress |
| 20 | POST | `/api/v1/jobs` | Monitor jobs (filterable) |
```bash
curl -X POST http://localhost:3002/api/v1/files/register -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" -H "Content-Type: application/json" -d '{"file_path":"/Users/accusys/momentry/var/sftpgo/data/demo/video.mp4"}'
```
```json
{"success":true,"file_uuid":"3abeee81d94597629ed8cb943f182e94","duration":5954.0}
```
Modes:
- By `file_uuid`: unregister a single file
- By `file_path` + `pattern` regex: unregister all matching files in a directory
```bash
# By file_uuid
curl -X POST http://localhost:3002/api/v1/unregister \
-H "X-API-Key: muser_..." -H "Content-Type: application/json" \
-d '{"file_uuid":"53e3a229bf68878b7a799e811e097f9c"}'
# By pattern (unregister all .mp4 files in directory)
curl -X POST http://localhost:3002/api/v1/unregister \
-H "X-API-Key: muser_..." -H "Content-Type: application/json" \
-d '{"file_path":"/data/demo","pattern":"\\.mp4$"}'
```
```json
{"success":true,"file_uuid":"53e3a229bf68878b7a799e811e097f9c","message":"File unregistered successfully"}
```
```bash
curl "http://localhost:3002/api/v1/files?page=1&page_size=2" -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69"
```
```json
{"success":true,"data":[{"file_uuid":"aeed7134...","file_name":"Charade (1963)...","status":"ready"}],"total":0,"page":1,"page_size":2}
```
---
## 3. Search
| # | Method | Path | Description |
|---|--------|------|-------------|
| 21 | POST | `/api/v1/search/visual` | Visual chunk search |
| 22 | POST | `/api/v1/search/visual/class` | By object class |
| 23 | POST | `/api/v1/search/visual/density` | By spatial density |
| 24 | POST | `/api/v1/search/visual/combination` | Combined visual search |
| 25 | POST | `/api/v1/search/visual/stats` | Visual stats |
| 26 | POST | `/api/v1/search/smart` | Semantic (EmbeddingGemma + pgvector) |
| 27 | POST | `/api/v1/search/universal` | BM25 keyword (requires file_uuid) |
| 28 | POST | `/api/v1/search/frames` | Frame-level search |
```bash
curl -X POST http://localhost:3002/api/v1/search/universal -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" -H "Content-Type: application/json" -d '{"query":"name","limit":2,"mode":"bm25","file_uuid":"3abeee81d94597629ed8cb943f182e94"}'
```
```json
{"query":"name","results":[{"chunk_id":"100","text":"What's your name?","start_time":258.68,"score":0.90}],"total":5,"page":1,"page_size":20,"took_ms":42}
```
```bash
curl -X POST http://localhost:3002/api/v1/search/universal -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" -H "Content-Type: application/json" -d '{"query":"friends","limit":2,"mode":"bm25","file_uuid":"3abeee81d94597629ed8cb943f182e94"}'
```
```json
{"query":"friends","results":[{"chunk_id":"104","text":"You won't find it difficult to make some new friends.","start_time":272.38,"score":0.90}],"total":3,"page":1,"page_size":20,"took_ms":38}
```
---
## 4. Face Trace
| # | Method | Path | Description |
|---|--------|------|-------------|
| 29 | POST | `/api/v1/file/:file_uuid/traces` | List traces (sorted/filtered) |
| 30 | GET | `/api/v1/file/:file_uuid/trace/:trace_id/faces` | Trace detections (+ interpolation) |
### traces — list traces
Parameters:
- `sort_by`: `face_count` | `duration` | `first_appearance`
- `min_faces`, `min_confidence`, `max_confidence`: filters
- `limit`: max results
```bash
curl -X POST "http://localhost:3002/api/v1/file/3abeee81d94597629ed8cb943f182e94/traces" -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" -H "Content-Type: application/json" -d '{"sort_by":"face_count","limit":2}'
```
```json
{"success":true,"total_traces":6892,"total_faces":108204,"traces":[
{"trace_id":3128,"face_count":1109,"avg_confidence":0.779},
{"trace_id":3126,"face_count":743,"avg_confidence":0.758}
]}
```
### trace/:trace_id/faces — individual detections
Parameters:
- `limit`, `offset`: pagination
- `interpolate`: boolean (fills sparse gaps with lerp bbox)
```bash
curl "http://localhost:3002/api/v1/file/3abeee81d94597629ed8cb943f182e94/trace/2/faces?limit=2&interpolate=true" -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69"
```
```json
{"success":true,"trace_id":2,"fps":25.0,"total":1,"faces":[
{"id":12399,"start_frame":4620,"end_frame":4620,"start_time":184.8,"end_time":184.8,"x":787,"y":582,"width":225,"height":225,"confidence":0.666,"interpolated":false}
]}
```
---
## 5. Media
| # | Method | Path | Description |
|---|--------|------|-------------|
| 31 | GET | `/api/v1/file/:file_uuid/thumbnail` | Frame JPEG (?frame=&x=&y=&w=&h=) |
| 32 | GET | `/api/v1/file/:file_uuid/video` | Raw video stream. Dual input: `?start_time=&end_time=` (seconds) or `?start_frame=&end_frame=` (frames). |
| 33 | GET | `/api/v1/file/:file_uuid/video/bbox` | Bbox overlay. `?start_frame=&end_frame=&face_uuid=&duration=` (all frame numbers). Dual input via `start_time`/`end_time`. |
| 34 | GET | `/api/v1/file/:file_uuid/trace/:trace_id/video` | Trace clip (?mode=&padding=&audio=) |
All video endpoints support:
- `mode=normal|debug` (default: `normal`)
- `audio=on|off` (default: `on`)
`mode=normal`: raw clip, `-c copy`, no overlay.
`mode=debug`: re-encoded with top-left text info + green bboxes (trace labels at actual frames with thickness=4, interpolated at first known position with thickness=1).
```bash
# Normal mode
curl -o trace.mp4 "http://localhost:3002/api/v1/file/{file_uuid}/trace/42/video?mode=normal"
# Debug mode
curl -o trace_debug.mp4 "http://localhost:3002/api/v1/file/{file_uuid}/trace/42/video?mode=debug"
```
Debug overlay shows at bottom-left:
```
Frame {n} {pts}s
Cut: {id}
{file_uuid}
Trace {id}: start={frame} {name}
...
```
Green bbox per face detection: actual frames `thickness=4`, interpolated `thickness=1`.
---
## 6. Identities
| # | Method | Path | Description |
|---|--------|------|-------------|
| 35 | GET | `/api/v1/identities` | List all identities |
| 36 | GET | `/api/v1/file/:file_uuid/identities` | Identities in a file |
| 37 | POST | `/api/v1/identity` | Register new identity |
| 38 | GET | `/api/v1/identity/:identity_uuid` | Identity detail |
| 39 | DELETE | `/api/v1/identity/:identity_uuid` | Delete identity |
| 40 | GET | `/api/v1/identity/:identity_uuid/files` | Files for identity |
| 41 | GET | `/api/v1/identity/:identity_uuid/chunks` | Chunks for identity |
| 42 | GET | `/api/v1/faces/candidates` | Unbound face gallery |
| 43 | GET | `/api/v1/identities/search?q=` | Search identities by name → chunks |
| 44 | GET | `/api/v1/search/identity_text?q=&file_uuid=` | Full-text search → identity-bound chunks |
```bash
curl "http://localhost:3002/api/v1/identities?page=1&page_size=3" -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69"
```
```json
{"count":3852,"page":1,"page_size":3,"identities":[
{"id":18299,"identity_uuid":"76f85ee6-bc47-4a1a-9878-1beb67851ec5","name":"PERSON_aeed7134_390","metadata":{}},
{"id":18298,"identity_uuid":"f4d4ccbf-fccb-4f62-8806-2b7f4a706edb","name":"PERSON_aeed7134_389","metadata":{}},
{"id":18297,"identity_uuid":"e8a1b2c3-d4e5-4f67-8901-23456789abcd","name":"PERSON_aeed7134_388","metadata":{}}
]}
```
### GET /api/v1/file/:file_uuid/identities — identities with frame/time ranges
```bash
curl "http://localhost:3002/api/v1/file/aeed71342a899fe4b4c57b7d41bcb692/identities?limit=2" -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69"
```
```json
{"success":true,"file_uuid":"aeed71342a899fe4b4c57b7d41bcb692","fps":25.0,"total":20,"page":1,"page_size":20,"data":[
{"identity_id":18276,"identity_uuid":"77d895cc-bc2e-4f5a-84b3-3c1f0e2a5b6a","name":"PERSON_aeed7134_367","face_count":86,"start_frame":150744,"end_frame":152895,"start_time":6029.76,"end_time":6115.8,"confidence":0.855},
{"identity_id":18179,"identity_uuid":"90fc04cd-003b-4a1b-9f7d-8c3e1d2f4a5b","name":"PERSON_aeed7134_270","face_count":13,"start_frame":77418,"end_frame":77454,"start_time":3096.72,"end_time":3098.16,"confidence":0.851}
]}
```
```bash
curl "http://localhost:3002/api/v1/faces/candidates?page=1&page_size=2" -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69"
```
```json
{"total":42,"candidates":[{"frame_number":30,"confidence":0.85},...]}
```
---
## 7. Identity Binding
| # | Method | Path | Description |
|---|--------|------|-------------|
| 45 | POST | `/api/v1/identity/:identity_uuid/bind` | Bind face → identity |
| 46 | POST | `/api/v1/identity/:identity_uuid/unbind` | Unbind face from identity |
| 47 | POST | `/api/v1/identity/:identity_uuid/mergeinto` | Merge into another identity |
```bash
curl -X POST "http://localhost:3002/api/v1/identity/a9a90105-6d6b-46ff-92da-0c3c1a57dff4/bind" -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" -H "Content-Type: application/json" -d '{"file_uuid":"3abeee81d94597629ed8cb943f182e94","face_id":"face_42"}'
```
```json
{"success":true}
```
---
## 8. Resources
| # | Method | Path | Description |
|---|--------|------|-------------|
| 48 | POST | `/api/v1/resource/register` | Register processing resource |
| 49 | POST | `/api/v1/resource/heartbeat` | Resource heartbeat |
| 50 | GET | `/api/v1/resources` | List all resources |
```bash
curl "http://localhost:3002/api/v1/resources" -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69"
```
```json
{"success":true,"data":[{"resource_id":"mxbai-embed-large-v1","resource_type":"embedding_model"}],"message":"OK"}
```
---
## 9. Agents — 5W1H
| # | Method | Path | Description |
|---|--------|------|-------------|
| 51 | POST | `/api/v1/agents/translate` | AI text translation |
| 52 | POST | `/api/v1/agents/5w1h/analyze` | Single chunk analysis |
| 53 | POST | `/api/v1/agents/5w1h/batch` | Batch analysis |
| 54 | GET | `/api/v1/agents/5w1h/status` | Job status |
```bash
curl -X POST "http://localhost:3002/api/v1/agents/translate" -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" -H "Content-Type: application/json" -d '{"text":"Hello world","target_language":"zh-TW"}'
```
```json
{"success":true,"translated_text":"你好世界"}
```
---
## 10. Agents — Identity
| # | Method | Path | Description |
|---|--------|------|-------------|
| 55 | POST | `/api/v1/agents/identity/match-from-photo` | Match face from photo |
| 56 | POST | `/api/v1/agents/identity/match-from-trace` | Match face from trace |
| 57 | POST | `/api/v1/agents/suggest/merge` | Suggest merge |
| 58 | POST | `/api/v1/agents/suggest/clustering` | Suggest re-clustering |
---
## Version History
| Version | Date | Changes |
|---------|------|---------|
| V4.2 | 2026-05-25 | Removed phantom routes (stats/ingest, stats/inference, agents/identity/status); fixed HTTP methods (chunk, progress, jobs → POST); renamed endpoints (face_trace/sortby → traces, analyze → match-from-photo, suggest → match-from-trace); added config endpoints (consistency, auto-pipeline, watcher-auto-register); updated git hash to de88fd4e |
| V4.1 | 2026-05-14 | Added `build_timestamp` + `resources` + `pipeline` to health APIs; identity search endpoints; trace debug rework (green bbox, text overlay, all traces listed) |
## Related
- `API_DICTIONARY_V1.0.0.md` — Quick reference (55 endpoints)
- `API_DOCUMENTATION_v1.0.0.md` — Detailed spec with examples
- `TRACE/TRACE_API_REFERENCE_V1.0.0.md` — Trace-specific reference

View File

@@ -0,0 +1,381 @@
---
document_type: "reference_doc"
service: "MOMENTRY_CORE"
title: "Momentry Core Release API Reference v1.0.0"
date: "2026-05-25"
version: "V4.2"
status: "active"
owner: "Warren"
---
# Momentry Core API Reference v1.0.0
55 endpoints across 10 categories, with real curl examples and responses.
## Base
| Environment | URL |
|-------------|-----|
| Production | `http://localhost:3002` or `https://api.momentry.ddns.net` |
| Development | `http://localhost:3003` |
| Auth | Header `X-API-Key: <key>` (login endpoint unprotected) |
> **Note**: All examples below use production port 3002. For dev testing, replace `3002` with `3003`.
---
## 1. System
| # | Method | Path | Description |
|---|--------|------|-------------|
| 1 | GET | `/health` | Server status (ok/degraded) |
| 2 | GET | `/health/detailed` | Per-service health + latency |
| 3 | GET | `/health/consistency` | Data consistency check |
| 4 | POST | `/api/v1/auth/login` | Username/password → API key |
| 5 | POST | `/api/v1/auth/logout` | Invalidate session |
| 6 | GET | `/api/v1/stats/sftpgo` | SFTPGo status |
| 7 | POST | `/api/v1/config/cache` | Toggle Redis cache |
| 8 | POST | `/api/v1/config/auto-pipeline` | Toggle auto-pipeline on register |
| 9 | POST | `/api/v1/config/watcher-auto-register` | Toggle watcher auto-register |
```bash
curl http://localhost:3002/health
```
```json
{
"status": "ok",
"version": "1.0.0",
"build_git_hash": "de88fd4e",
"build_timestamp": "2026-05-25",
"uptime_ms": 7052517
}
```
| # | Method | Path | Description |
|---|--------|------|-------------|
| 2a | GET | `/health/detailed` | Per-service health + resources + pipeline |
```bash
curl -X POST http://localhost:3002/api/v1/files/register \
-H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" \
-H "Content-Type: application/json" \
-d '{"file_path":"/path/to/video.mp4","content_hash":"optional-sha256-of-file"}'
```
```json
{"success":true,"file_uuid":"3abeee81d94597629ed8cb943f182e94","duration":5954.0}
```
Supports all file types (video, image, document, audio). SHA256 content_hash computed automatically if not provided.
```json
{
"status": "ok",
"build_git_hash": "de88fd4e",
"build_timestamp": "2026-05-25",
"services": {
"postgres": {"status": "ok", "latency_ms": 6},
"redis": {"status": "ok", "latency_ms": 0},
"qdrant": {"status": "ok", "latency_ms": 1},
"mongodb": {"status": "ok", "latency_ms": 0}
},
"resources": {
"cpu_used_percent": 50.0,
"cpu_idle_percent": 50.0,
"memory_available_mb": 8028,
"memory_total_mb": 16384,
"memory_used_percent": 51.0,
"gpu_available": false,
"gpu_utilization": null,
"gpu_memory_used_pct": null
},
"pipeline": {
"scripts": true,
"models": true,
"ffmpeg": true,
"embedding_server": {"status": "ok", "latency_ms": 0},
"gdino_api": {"status": "error", "latency_ms": 0, "error": "..."},
"llm": {"status": "ok", "latency_ms": 0}
}
}
```
---
## 2. File Management
| # | Method | Path | Description |
|---|--------|------|-------------|
| 10 | POST | `/api/v1/files/register` | Register file → file_uuid. Body: `{"file_path":"...", "content_hash":"optional"}` |
| 11 | GET | `/api/v1/files/lookup?file_name=` | Pre-upload name conflict check. Returns matches + `next_name` for auto-rename |
| 12 | POST | `/api/v1/unregister` | Unregister file(s): by `file_uuid` or pattern match (`file_path`+`pattern`) |
| 13 | GET | `/api/v1/files/scan` | Scan directory for new files |
| 14 | GET | `/api/v1/files` | List files (paginated) |
| 15 | GET | `/api/v1/file/:file_uuid` | Single file detail |
| 16 | GET | `/api/v1/file/:file_uuid/probe` | ffprobe metadata |
| 17 | POST | `/api/v1/file/:file_uuid/process` | Start pipeline |
| 18 | POST | `/api/v1/file/:file_uuid/chunk/:chunk_id` | Single chunk detail (V1.0.2+) |
| 19 | POST | `/api/v1/progress/:file_uuid` | Processing progress |
| 20 | POST | `/api/v1/jobs` | Monitor jobs (filterable) |
```bash
curl -X POST http://localhost:3002/api/v1/files/register -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" -H "Content-Type: application/json" -d '{"file_path":"/Users/accusys/momentry/var/sftpgo/data/demo/video.mp4"}'
```
```json
{"success":true,"file_uuid":"3abeee81d94597629ed8cb943f182e94","duration":5954.0}
```
Modes:
- By `file_uuid`: unregister a single file
- By `file_path` + `pattern` regex: unregister all matching files in a directory
```bash
# By file_uuid
curl -X POST http://localhost:3002/api/v1/unregister \
-H "X-API-Key: muser_..." -H "Content-Type: application/json" \
-d '{"file_uuid":"53e3a229bf68878b7a799e811e097f9c"}'
# By pattern (unregister all .mp4 files in directory)
curl -X POST http://localhost:3002/api/v1/unregister \
-H "X-API-Key: muser_..." -H "Content-Type: application/json" \
-d '{"file_path":"/data/demo","pattern":"\\.mp4$"}'
```
```json
{"success":true,"file_uuid":"53e3a229bf68878b7a799e811e097f9c","message":"File unregistered successfully"}
```
```bash
curl "http://localhost:3002/api/v1/files?page=1&page_size=2" -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69"
```
```json
{"success":true,"data":[{"file_uuid":"aeed7134...","file_name":"Charade (1963)...","status":"ready"}],"total":0,"page":1,"page_size":2}
```
---
## 3. Search
| # | Method | Path | Description |
|---|--------|------|-------------|
| 21 | POST | `/api/v1/search/visual` | Visual chunk search |
| 22 | POST | `/api/v1/search/visual/class` | By object class |
| 23 | POST | `/api/v1/search/visual/density` | By spatial density |
| 24 | POST | `/api/v1/search/visual/combination` | Combined visual search |
| 25 | POST | `/api/v1/search/visual/stats` | Visual stats |
| 26 | POST | `/api/v1/search/smart` | Semantic (EmbeddingGemma + pgvector) |
| 27 | POST | `/api/v1/search/universal` | BM25 keyword (requires file_uuid) |
| 28 | POST | `/api/v1/search/frames` | Frame-level search |
```bash
curl -X POST http://localhost:3002/api/v1/search/universal -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" -H "Content-Type: application/json" -d '{"query":"name","limit":2,"mode":"bm25","file_uuid":"3abeee81d94597629ed8cb943f182e94"}'
```
```json
{"query":"name","results":[{"chunk_id":"100","text":"What's your name?","start_time":258.68,"score":0.90}],"total":5,"page":1,"page_size":20,"took_ms":42}
```
```bash
curl -X POST http://localhost:3002/api/v1/search/universal -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" -H "Content-Type: application/json" -d '{"query":"friends","limit":2,"mode":"bm25","file_uuid":"3abeee81d94597629ed8cb943f182e94"}'
```
```json
{"query":"friends","results":[{"chunk_id":"104","text":"You won't find it difficult to make some new friends.","start_time":272.38,"score":0.90}],"total":3,"page":1,"page_size":20,"took_ms":38}
```
---
## 4. Face Trace
| # | Method | Path | Description |
|---|--------|------|-------------|
| 29 | POST | `/api/v1/file/:file_uuid/traces` | List traces (sorted/filtered) |
| 30 | GET | `/api/v1/file/:file_uuid/trace/:trace_id/faces` | Trace detections (+ interpolation) |
### traces — list traces
Parameters:
- `sort_by`: `face_count` | `duration` | `first_appearance`
- `min_faces`, `min_confidence`, `max_confidence`: filters
- `limit`: max results
```bash
curl -X POST "http://localhost:3002/api/v1/file/3abeee81d94597629ed8cb943f182e94/traces" -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" -H "Content-Type: application/json" -d '{"sort_by":"face_count","limit":2}'
```
```json
{"success":true,"total_traces":6892,"total_faces":108204,"traces":[
{"trace_id":3128,"face_count":1109,"avg_confidence":0.779},
{"trace_id":3126,"face_count":743,"avg_confidence":0.758}
]}
```
### trace/:trace_id/faces — individual detections
Parameters:
- `limit`, `offset`: pagination
- `interpolate`: boolean (fills sparse gaps with lerp bbox)
```bash
curl "http://localhost:3002/api/v1/file/3abeee81d94597629ed8cb943f182e94/trace/2/faces?limit=2&interpolate=true" -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69"
```
```json
{"success":true,"trace_id":2,"fps":25.0,"total":1,"faces":[
{"id":12399,"start_frame":4620,"end_frame":4620,"start_time":184.8,"end_time":184.8,"x":787,"y":582,"width":225,"height":225,"confidence":0.666,"interpolated":false}
]}
```
---
## 5. Media
| # | Method | Path | Description |
|---|--------|------|-------------|
| 31 | GET | `/api/v1/file/:file_uuid/thumbnail` | Frame JPEG (?frame=&x=&y=&w=&h=) |
| 32 | GET | `/api/v1/file/:file_uuid/video` | Raw video stream. Dual input: `?start_time=&end_time=` (seconds) or `?start_frame=&end_frame=` (frames). |
| 33 | GET | `/api/v1/file/:file_uuid/video/bbox` | Bbox overlay. `?start_frame=&end_frame=&face_uuid=&duration=` (all frame numbers). Dual input via `start_time`/`end_time`. |
| 34 | GET | `/api/v1/file/:file_uuid/trace/:trace_id/video` | Trace clip (?mode=&padding=&audio=) |
All video endpoints support:
- `mode=normal|debug` (default: `normal`)
- `audio=on|off` (default: `on`)
`mode=normal`: raw clip, `-c copy`, no overlay.
`mode=debug`: re-encoded with top-left text info + green bboxes (trace labels at actual frames with thickness=4, interpolated at first known position with thickness=1).
```bash
# Normal mode
curl -o trace.mp4 "http://localhost:3002/api/v1/file/{file_uuid}/trace/42/video?mode=normal"
# Debug mode
curl -o trace_debug.mp4 "http://localhost:3002/api/v1/file/{file_uuid}/trace/42/video?mode=debug"
```
Debug overlay shows at bottom-left:
```
Frame {n} {pts}s
Cut: {id}
{file_uuid}
Trace {id}: start={frame} {name}
...
```
Green bbox per face detection: actual frames `thickness=4`, interpolated `thickness=1`.
---
## 6. Identities
| # | Method | Path | Description |
|---|--------|------|-------------|
| 35 | GET | `/api/v1/identities` | List all identities |
| 36 | GET | `/api/v1/file/:file_uuid/identities` | Identities in a file |
| 37 | POST | `/api/v1/identity` | Register new identity |
| 38 | GET | `/api/v1/identity/:identity_uuid` | Identity detail |
| 39 | DELETE | `/api/v1/identity/:identity_uuid` | Delete identity |
| 40 | GET | `/api/v1/identity/:identity_uuid/files` | Files for identity |
| 41 | GET | `/api/v1/identity/:identity_uuid/chunks` | Chunks for identity |
| 42 | GET | `/api/v1/faces/candidates` | Unbound face gallery |
| 43 | GET | `/api/v1/identities/search?q=` | Search identities by name → chunks |
| 44 | GET | `/api/v1/search/identity_text?q=&file_uuid=` | Full-text search → identity-bound chunks |
```bash
curl "http://localhost:3002/api/v1/identities?page=1&page_size=3" -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69"
```
```json
{"count":3852,"page":1,"page_size":3,"identities":[
{"id":18299,"identity_uuid":"76f85ee6-bc47-4a1a-9878-1beb67851ec5","name":"PERSON_aeed7134_390","metadata":{}},
{"id":18298,"identity_uuid":"f4d4ccbf-fccb-4f62-8806-2b7f4a706edb","name":"PERSON_aeed7134_389","metadata":{}},
{"id":18297,"identity_uuid":"e8a1b2c3-d4e5-4f67-8901-23456789abcd","name":"PERSON_aeed7134_388","metadata":{}}
]}
```
### GET /api/v1/file/:file_uuid/identities — identities with frame/time ranges
```bash
curl "http://localhost:3002/api/v1/file/aeed71342a899fe4b4c57b7d41bcb692/identities?limit=2" -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69"
```
```json
{"success":true,"file_uuid":"aeed71342a899fe4b4c57b7d41bcb692","fps":25.0,"total":20,"page":1,"page_size":20,"data":[
{"identity_id":18276,"identity_uuid":"77d895cc-bc2e-4f5a-84b3-3c1f0e2a5b6a","name":"PERSON_aeed7134_367","face_count":86,"start_frame":150744,"end_frame":152895,"start_time":6029.76,"end_time":6115.8,"confidence":0.855},
{"identity_id":18179,"identity_uuid":"90fc04cd-003b-4a1b-9f7d-8c3e1d2f4a5b","name":"PERSON_aeed7134_270","face_count":13,"start_frame":77418,"end_frame":77454,"start_time":3096.72,"end_time":3098.16,"confidence":0.851}
]}
```
```bash
curl "http://localhost:3002/api/v1/faces/candidates?page=1&page_size=2" -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69"
```
```json
{"total":42,"candidates":[{"frame_number":30,"confidence":0.85},...]}
```
---
## 7. Identity Binding
| # | Method | Path | Description |
|---|--------|------|-------------|
| 45 | POST | `/api/v1/identity/:identity_uuid/bind` | Bind face → identity |
| 46 | POST | `/api/v1/identity/:identity_uuid/unbind` | Unbind face from identity |
| 47 | POST | `/api/v1/identity/:identity_uuid/mergeinto` | Merge into another identity |
```bash
curl -X POST "http://localhost:3002/api/v1/identity/a9a90105-6d6b-46ff-92da-0c3c1a57dff4/bind" -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" -H "Content-Type: application/json" -d '{"file_uuid":"3abeee81d94597629ed8cb943f182e94","face_id":"face_42"}'
```
```json
{"success":true}
```
---
## 8. Resources
| # | Method | Path | Description |
|---|--------|------|-------------|
| 48 | POST | `/api/v1/resource/register` | Register processing resource |
| 49 | POST | `/api/v1/resource/heartbeat` | Resource heartbeat |
| 50 | GET | `/api/v1/resources` | List all resources |
```bash
curl "http://localhost:3002/api/v1/resources" -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69"
```
```json
{"success":true,"data":[{"resource_id":"mxbai-embed-large-v1","resource_type":"embedding_model"}],"message":"OK"}
```
---
## 9. Agents — 5W1H
| # | Method | Path | Description |
|---|--------|------|-------------|
| 51 | POST | `/api/v1/agents/translate` | AI text translation |
| 52 | POST | `/api/v1/agents/5w1h/analyze` | Single chunk analysis |
| 53 | POST | `/api/v1/agents/5w1h/batch` | Batch analysis |
| 54 | GET | `/api/v1/agents/5w1h/status` | Job status |
```bash
curl -X POST "http://localhost:3002/api/v1/agents/translate" -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" -H "Content-Type: application/json" -d '{"text":"Hello world","target_language":"zh-TW"}'
```
```json
{"success":true,"translated_text":"你好世界"}
```
---
## 10. Agents — Identity
| # | Method | Path | Description |
|---|--------|------|-------------|
| 55 | POST | `/api/v1/agents/identity/match-from-photo` | Match face from photo |
| 56 | POST | `/api/v1/agents/identity/match-from-trace` | Match face from trace |
| 57 | POST | `/api/v1/agents/suggest/merge` | Suggest merge |
| 58 | POST | `/api/v1/agents/suggest/clustering` | Suggest re-clustering |
---
## Version History
| Version | Date | Changes |
|---------|------|---------|
| V4.2 | 2026-05-25 | Removed phantom routes (stats/ingest, stats/inference, agents/identity/status); fixed HTTP methods (chunk, progress, jobs → POST); renamed endpoints (face_trace/sortby → traces, analyze → match-from-photo, suggest → match-from-trace); added config endpoints (consistency, auto-pipeline, watcher-auto-register); updated git hash to de88fd4e |
| V4.1 | 2026-05-14 | Added `build_timestamp` + `resources` + `pipeline` to health APIs; identity search endpoints; trace debug rework (green bbox, text overlay, all traces listed) |
## Related
- `API_DICTIONARY_V1.0.0.md` — Quick reference (55 endpoints)
- `API_DOCUMENTATION_v1.0.0.md` — Detailed spec with examples
- `TRACE/TRACE_API_REFERENCE_V1.0.0.md` — Trace-specific reference

View File

@@ -0,0 +1,218 @@
# Momentry API 使用指南
## 認證流程
```mermaid
sequenceDiagram
actor User
participant API as Momentry API
participant Auth as Auth Service
User->>API: POST /api/v1/auth/login
API->>Auth: 驗證 username/password
Auth-->>API: API Key
API-->>User: { "api_key": "muser_xxx..." }
Note over User: 後續請求帶入 Header
User->>API: GET /api/v1/files<br/>X-API-Key: muser_xxx...
API-->>User: { files: [...] }
```
**demo 帳號**: `demo` / `demo`
---
## 註冊 + 處理流程
```mermaid
flowchart LR
A[上傳影片] --> B[POST /files/register]
B --> C[取得 file_uuid]
C --> D[POST /file/:file_uuid/process]
...
F --> M[GET /progress/:file_uuid]
G --> M
H --> M
I --> M
J --> M
K --> M
L --> M
M --> N[completed]
```
---
## 臉部追蹤架構
```mermaid
graph TB
subgraph Detection
A[Face Processor] --> B[face_detections]
B --> C[Store Traced Faces]
end
subgraph Tracing
C --> D[face_traces]
D --> E[Trace Aggregation]
end
subgraph API
E --> F[POST /face_trace/sortby]
E --> G[GET /trace/:id/faces]
E --> H[GET /trace/:id/video]
end
subgraph Display
F --> I[Face Thumbnail Timeline V1]
F --> J[Identity Swimlane V2]
G --> K[Interpolation POC]
H --> L[MP4 with BBOX]
end
```
---
## 搜尋三模式
```mermaid
flowchart TD
Q[使用者輸入查詢] --> M{選擇模式}
M -->|BM25| A[POST /search/universal]
A --> B[PostgreSQL ILIKE]
B --> C[關鍵字比對 text_content]
M -->|Vector| D[POST /search/smart]
D --> E[EmbeddingGemma 768D]
E --> F[pgvector 相似度搜尋]
M -->|Hybrid| G[內部組合]
G --> H[Vector Search]
G --> I[BM25 Rerank]
H --> J[Reranked Results]
I --> J
C --> K[結果回傳]
F --> K
J --> K
```
---
## 資料模型關聯
```mermaid
erDiagram
VIDEOS ||--o{ FACE_DETECTIONS : contains
VIDEOS ||--o{ CHUNKS : contains
VIDEOS ||--o{ PRE_CHUNKS : contains
FACE_DETECTIONS ||--o{ FACE_TRACES : belongs_to
FACE_TRACES }o--|| IDENTITIES : identifies
IDENTITIES ||--o{ IDENTITY_BINDINGS : binds
CHUNKS ||--o{ PARENT_CHUNKS : groups
VIDEOS {
string file_uuid PK
string file_name
float duration
int width
int height
float fps
}
FACE_DETECTIONS {
int id PK
string file_uuid FK
int trace_id
int frame_number
int x
int y
float confidence
}
IDENTITIES {
int id PK
string name
string file_uuid
int tmdb_id
}
```
---
## 端點路徑總覽
```mermaid
mindmap
root((api.momentry.ddns.net))
System
GET /health
POST /auth/login
GET /stats/ingest
Files
POST /files/register
GET /files
GET /file/:file_uuid
POST /file/:file_uuid/process
Traces
POST /face_trace/sortby
GET /trace/:trace_id/faces
GET /trace/:trace_id/video
GET /thumbnail
Search
POST /search/universal
POST /search/smart
POST /search/visual
Identities
GET /identities
POST /identity
POST /identity/:identity_uuid/bind
Agents
POST /agents/translate
POST /agents/5w1h/analyze
POST /agents/identity/suggest
```
---
## 互動範例
### 1. 登入 → 取得檔案列表
```mermaid
sequenceDiagram
actor Dev
Dev->>API: POST /api/v1/auth/login<br/>{ "username": "demo", "password": "demo" }
API-->>Dev: { "api_key": "muser_test_001..." }
Dev->>API: GET /api/v1/files<br/>X-API-Key: muser_test_001...
API-->>Dev: { "files": [...], "total": 37 }
```
### 2. 查看臉部追蹤 → 播放影片
```mermaid
sequenceDiagram
actor Dev
Dev->>API: POST /api/v1/file/{file_uuid}/face_trace/sortby<br/>{ "sort_by": "face_count", "limit": 3 }
API-->>Dev: { "total_traces": 6892, "traces": [...] }
Dev->>API: GET /api/v1/file/{file_uuid}/trace/3128/video
API-->>Dev: MP4 binary
Note over Dev: Browser opens video with bbox
```
### 3. 身分識別
```mermaid
sequenceDiagram
actor Dev
Dev->>API: GET /api/v1/identities?page=560&page_size=5
API-->>Dev: { "identities": [<br/> {"name":"Cary Grant"},<br/> {"name":"Audrey Hepburn"}<br/>] }
```
---
## 快速參考
| 用途 | 指令 |
|------|------|
| 登入取得 Key | `curl -X POST https://api.momentry.ddns.net/api/v1/auth/login -H "Content-Type: application/json" -d '{"username":"demo","password":"demo"}'` |
| 列出檔案 | `curl https://api.momentry.ddns.net/api/v1/files -H "X-API-Key: muser_test_001"` |
| Top Traces | `curl -X POST https://api.momentry.ddns.net/api/v1/file/{file_uuid}/face_trace/sortby -H "X-API-Key: muser_test_001" -H "Content-Type: application/json" -d '{"sort_by":"face_count","limit":3}'` |
| BM25 搜尋 | `curl -X POST https://api.momentry.ddns.net/api/v1/search/universal -H "X-API-Key: muser_test_001" -H "Content-Type: application/json" -d '{"query":"friends","mode":"bm25","uuid":"{file_uuid}"}'` |
| 身分列表 | `curl https://api.momentry.ddns.net/api/v1/identities?page=1&page_size=5 -H "X-API-Key: muser_test_001"` |

View File

@@ -0,0 +1,136 @@
{
"title": "Momentry Core 展示 v1.0.0",
"version": "1.0",
"language": "zh_TW",
"server": "https://api.momentry.ddns.net",
"setup": "KEY=\"X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69\"; BASE=https://api.momentry.ddns.net; FILE=3abeee81d94597629ed8cb943f182e94",
"steps": [
{
"type": "separator",
"label": "開場:系統活著"
},
{
"type": "note",
"label": "確認服務正常",
"note": "Momentry Core 是一套影片內容分析系統。給它一支影片,它會自動辨識裡面的人臉、追蹤他們的移動、分析誰是誰,還能用文字搜尋影片內容。"
},
{
"type": "curl",
"label": "伺服器狀態檢查",
"note": "先確認服務正常。正式環境伺服器回應狀態「ok」。",
"cmd": "curl -s $BASE/health",
"expect": "ok"
},
{
"type": "browser",
"label": "瀏覽器開啟狀態頁",
"note": "瀏覽器直接開啟狀態頁面也可以。",
"url": "$BASE/health"
},
{
"type": "separator",
"label": "檔案與人臉追蹤"
},
{
"type": "curl",
"label": "檢視已註冊檔案",
"note": "目前系統有三十七支已註冊的影片,以 Charade 這部老電影為主。",
"cmd": "curl -s \"$BASE/api/v1/files?page=1&page_size=3\" -H \"X-API-Key: $KEY\"",
"expect": "file_uuid"
},
{
"type": "curl",
"label": "人臉追蹤總覽",
"note": "核心功能:系統把影片中每個出現的人臉追蹤成一個「追蹤紀錄」。這部 Charade 總共找到六千八百九十二個追蹤、十萬八千二百零四次臉部偵測。最長的一段追蹤有一千一百零九次連續出現,持續四十四點三秒。",
"cmd": "curl -s -X POST $BASE/api/v1/file/$FILE/face_trace/sortby -H \"X-API-Key: $KEY\" -H \"Content-Type: application/json\" -d '{\"sort_by\":\"face_count\",\"limit\":5}'",
"expect": "total_traces"
},
{
"type": "curl",
"label": "追蹤細節與補間動畫",
"note": "人臉處理器每隔三十個影格才取樣一次,原始資料是稀疏的。加上補間參數後,系統會自動計算中間每個影格的方框位置。補間標記為真的代表這是運算產生的,信心度為零。",
"cmd": "curl -s \"$BASE/api/v1/file/$FILE/trace/2/faces?limit=5&interpolate=true\" -H \"X-API-Key: $KEY\"",
"expect": "interpolated"
},
{
"type": "separator",
"label": "影片播放"
},
{
"type": "browser",
"label": "觀看追蹤影片",
"note": "把人臉追蹤渲染成影片,紅色方框標記人臉位置。每個偵測的框會持續到下一次偵測為止。",
"url": "$BASE/api/v1/file/$FILE/trace/5/video?padding=1"
},
{
"type": "browser",
"label": "觀看單張縮圖",
"note": "單一個影格的 JPEG 截圖。",
"url": "$BASE/api/v1/file/$FILE/thumbnail?frame=68280"
},
{
"type": "separator",
"label": "文字搜尋"
},
{
"type": "curl",
"label": "關鍵字搜尋「朋友」",
"note": "文字搜尋:不需要向量,直接用關鍵字比對。這是搜尋「朋友」的結果。",
"cmd": "curl -s -X POST $BASE/api/v1/search/universal -H \"X-API-Key: $KEY\" -H \"Content-Type: application/json\" -d '{\"query\":\"friends\",\"limit\":3,\"mode\":\"bm25\",\"uuid\":\"$FILE\"}'",
"expect": "friends"
},
{
"type": "curl",
"label": "關鍵字搜尋「名字」",
"note": "再搜尋「名字」看看,會找到「你叫什麼名字?」這段台詞。",
"cmd": "curl -s -X POST $BASE/api/v1/search/universal -H \"X-API-Key: $KEY\" -H \"Content-Type: application/json\" -d '{\"query\":\"name\",\"limit\":3,\"mode\":\"bm25\",\"uuid\":\"$FILE\"}'",
"expect": "name"
},
{
"type": "separator",
"label": "身分辨識"
},
{
"type": "curl",
"label": "電影資料庫身分列表",
"note": "系統不只是追蹤臉,它還知道誰是誰。處理管線自動比對電影資料庫後的結果:兩千八百一十個身分,包含 Cary Grant、Audrey Hepburn 等知名演員。",
"cmd": "curl -s \"$BASE/api/v1/identities?page=560&page_size=5\" -H \"X-API-Key: $KEY\"",
"expect": "\"name\""
},
{
"type": "curl",
"label": "未辨識人臉候選",
"note": "還沒被指認的身分叫做候選人,可以在這裡手動綁定到正確人名。",
"cmd": "curl -s \"$BASE/api/v1/faces/candidates?page=1&page_size=3\" -H \"X-API-Key: $KEY\"",
"expect": "candidates"
},
{
"type": "curl",
"label": "系統資源一覽",
"note": "系統資源一覽:包含目前使用的文字嵌入模型等資訊。",
"cmd": "curl -s \"$BASE/api/v1/resources\" -H \"X-API-Key: $KEY\"",
"expect": "success"
},
{
"type": "separator",
"label": "人工智慧語意搜尋"
},
{
"type": "curl",
"label": "向量語意搜尋",
"note": "最後是人工智慧搜尋。查詢先經由嵌入模型轉成七百六十八維的向量,再到向量資料庫做相似度比對。",
"cmd": "curl -s -X POST $BASE/api/v1/search/smart -H \"X-API-Key: $KEY\" -H \"Content-Type: application/json\" -d '{\"query\":\"Audrey Hepburn\",\"uuid\":\"$FILE\"}'",
"expect": "results"
},
{
"type": "separator",
"label": "展示結束"
}
]
}

View File

@@ -0,0 +1,173 @@
# Momentry Demo Script v1.0.0
Curl for POST/API, browser for video/thumbnail. 約 10 分鐘。
---
## 開場:這是什麼?
> 「Momentry Core — 影片內容分析系統。給它一支影片,它會自動辨識裡面的人臉、追蹤他們的移動、分析誰是誰,還能用文字搜尋影片內容。」
---
## Step 0: 設定
```bash
KEY="X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69"
BASE=https://api.momentry.ddns.net
```
---
## Step 1: 系統活著
> 「先確認服務正常。」
```bash
curl $BASE/health
```
**預期**: `{"status":"ok","version":"1.0.0","uptime_ms":...}`
👉 瀏覽器開 `https://api.momentry.ddns.net/health` 也可。
---
## Step 2: 檔案一覽
> 「目前系統有 37 支已註冊的影片。」
```bash
curl "$BASE/api/v1/files?page=1&page_size=3" -H "$KEY"
```
**預期**: Charade (1963) 為主,還有其他測試檔。
---
## Step 3: 臉部追蹤概覽
> 「這是核心功能。系統把影片中每個出現的人臉追蹤成一個『trace』。這部 Charade 總共找到 **6,892 個 trace、108,204 次臉部偵測**。」
```bash
curl -X POST $BASE/api/v1/file/3abeee81d94597629ed8cb943f182e94/face_trace/sortby -H "$KEY" \
-H "Content-Type: application/json" \
-d '{"sort_by":"face_count","limit":5}'
```
**解說**:
- trace #3128: **1,109 次出現**,持續 44.3 秒 — 這是最長的一段
- trace #3126: 743 次
- 數字越高代表這個人出現在畫面上的時間越長
---
## Step 4: 單一 Trace 細節
> 「點進去看一個 trace 的每一幀。每個框框就是一次臉部偵測,包含位置、大小、信心度。」
```bash
curl "$BASE/api/v1/file/3abeee81d94597629ed8cb943f182e94/trace/2/faces?limit=3" -H "$KEY"
```
**解說**: 回傳的資料包含 `start_frame`(第幾幀)、`start_time`第幾秒、bbox 座標、信心度。
---
## Step 5: 補間動畫
> 「因為 face processor 每隔 30 幀才取樣一次,所以原始資料是稀疏的。加上 `interpolate=true` 後,系統會自動線性補間,填滿中間每一幀的 bbox 位置。」
```bash
curl "$BASE/api/v1/file/3abeee81d94597629ed8cb943f182e94/trace/2/faces?limit=5&interpolate=true" -H "$KEY"
```
**解說**: `interpolated: false` 是真實偵測,`interpolated: true` 是補間的confidence = 0。前端的淺色框就是補間框。
---
## Step 6: Trace 影片播放(瀏覽器)
> 「把 trace 渲染成影片,紅框標記人臉位置。」
**瀏覽器開**:
```
https://api.momentry.ddns.net/api/v1/file/3abeee81d94597629ed8cb943f182e94/trace/5/video?padding=1
```
**解說**: 紅框 = 臉部位置,文字標籤 = trace ID。每個 detection 的框會持續到下一次偵測為止。
---
## Step 7: 關鍵字搜尋 (BM25)
> 「文字搜尋 — 不需要向量直接用關鍵字比對。這是『friends』的搜尋結果。」
```bash
curl -X POST $BASE/api/v1/search/universal -H "$KEY" \
-H "Content-Type: application/json" \
-d '{"query":"friends","limit":3,"mode":"bm25","file_uuid":"3abeee81d94597629ed8cb943f182e94"}'
```
**預期**: `"You won't find it difficult to make some new friends."` score=0.90
> 「再搜尋『name』看看
```bash
curl -X POST $BASE/api/v1/search/universal -H "$KEY" \
-H "Content-Type: application/json" \
-d '{"query":"name","limit":3,"mode":"bm25","file_uuid":"3abeee81d94597629ed8cb943f182e94"}'
```
**預期**: `"What's your name?"` score=0.90
---
## Step 8: 身分辨識
> 「系統不只是追蹤臉,它還知道誰是誰。這是 M5 pipeline 自動比對 TMDb 資料庫後的結果 — **2,810 個身分**,包含 Cary Grant、Audrey Hepburn 等。」
```bash
curl "$BASE/api/v1/identities?page=560&page_size=5" -H "$KEY"
```
**預期**: Raoul Delfosse, Albert Daumergue, Claudine Berg...
> 「也可以直接看所有身分的列表,按頁次翻找。」
---
## Step 9: 臉部候選人(未辨識)
> 「還沒被指认的身分叫做『candidate』可以在這裡手動綁定。」
```bash
curl "$BASE/api/v1/faces/candidates?page=1&page_size=3" -H "$KEY"
```
---
## Step 10: 嵌入向量搜尋
> 「最後是 AI 搜尋。Query 先經由 EmbeddingGemma 轉成 768 維向量,再到 Qdrant 做相似度比對。」
```bash
curl -X POST $BASE/api/v1/search/smart -H "$KEY" \
-H "Content-Type: application/json" \
-d '{"query":"Audrey Hepburn","file_uuid":"3abeee81d94597629ed8cb943f182e94"}'
```
---
## 收尾
> 「以上就是 Momentry Core v1.0.0 的主要功能展示。總結:**
>
> 1. **臉部追蹤** — 6,892 traces, 108,204 detections
> 2. **補間動畫** — 稀疏取樣 → 連續軌跡
> 3. **影片渲染** — bbox overlay MP4
> 4. **關鍵字搜尋** — BM25 全文檢索
> 5. **身分辨識** — 2,810 identities, TMDb 整合
> 6. **AI 語意搜尋** — EmbeddingGemma + Qdrant
>
> 所有 API 皆可透過 `https://api.momentry.ddns.net` 存取,使用 demo/demo 登入取得 API key。"

View File

@@ -0,0 +1,173 @@
# Momentry Demo Script v1.0.0
Curl for POST/API, browser for video/thumbnail. 約 10 分鐘。
---
## 開場:這是什麼?
> 「Momentry Core — 影片內容分析系統。給它一支影片,它會自動辨識裡面的人臉、追蹤他們的移動、分析誰是誰,還能用文字搜尋影片內容。」
---
## Step 0: 設定
```bash
KEY="X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69"
BASE=https://api.momentry.ddns.net
```
---
## Step 1: 系統活著
> 「先確認服務正常。」
```bash
curl $BASE/health
```
**預期**: `{"status":"ok","version":"1.0.0","uptime_ms":...}`
👉 瀏覽器開 `https://api.momentry.ddns.net/health` 也可。
---
## Step 2: 檔案一覽
> 「目前系統有 37 支已註冊的影片。」
```bash
curl "$BASE/api/v1/files?page=1&page_size=3" -H "$KEY"
```
**預期**: Charade (1963) 為主,還有其他測試檔。
---
## Step 3: 臉部追蹤概覽
> 「這是核心功能。系統把影片中每個出現的人臉追蹤成一個『trace』。這部 Charade 總共找到 **6,892 個 trace、108,204 次臉部偵測**。」
```bash
curl -X POST $BASE/api/v1/file/3abeee81d94597629ed8cb943f182e94/face_trace/sortby -H "$KEY" \
-H "Content-Type: application/json" \
-d '{"sort_by":"face_count","limit":5}'
```
**解說**:
- trace #3128: **1,109 次出現**,持續 44.3 秒 — 這是最長的一段
- trace #3126: 743 次
- 數字越高代表這個人出現在畫面上的時間越長
---
## Step 4: 單一 Trace 細節
> 「點進去看一個 trace 的每一幀。每個框框就是一次臉部偵測,包含位置、大小、信心度。」
```bash
curl "$BASE/api/v1/file/3abeee81d94597629ed8cb943f182e94/trace/2/faces?limit=3" -H "$KEY"
```
**解說**: 回傳的資料包含 `start_frame`(第幾幀)、`start_time`第幾秒、bbox 座標、信心度。
---
## Step 5: 補間動畫
> 「因為 face processor 每隔 30 幀才取樣一次,所以原始資料是稀疏的。加上 `interpolate=true` 後,系統會自動線性補間,填滿中間每一幀的 bbox 位置。」
```bash
curl "$BASE/api/v1/file/3abeee81d94597629ed8cb943f182e94/trace/2/faces?limit=5&interpolate=true" -H "$KEY"
```
**解說**: `interpolated: false` 是真實偵測,`interpolated: true` 是補間的confidence = 0。前端的淺色框就是補間框。
---
## Step 6: Trace 影片播放(瀏覽器)
> 「把 trace 渲染成影片,紅框標記人臉位置。」
**瀏覽器開**:
```
https://api.momentry.ddns.net/api/v1/file/3abeee81d94597629ed8cb943f182e94/trace/5/video?padding=1
```
**解說**: 紅框 = 臉部位置,文字標籤 = trace ID。每個 detection 的框會持續到下一次偵測為止。
---
## Step 7: 關鍵字搜尋 (BM25)
> 「文字搜尋 — 不需要向量直接用關鍵字比對。這是『friends』的搜尋結果。」
```bash
curl -X POST $BASE/api/v1/search/universal -H "$KEY" \
-H "Content-Type: application/json" \
-d '{"query":"friends","limit":3,"mode":"bm25","file_uuid":"3abeee81d94597629ed8cb943f182e94"}'
```
**預期**: `"You won't find it difficult to make some new friends."` score=0.90
> 「再搜尋『name』看看
```bash
curl -X POST $BASE/api/v1/search/universal -H "$KEY" \
-H "Content-Type: application/json" \
-d '{"query":"name","limit":3,"mode":"bm25","file_uuid":"3abeee81d94597629ed8cb943f182e94"}'
```
**預期**: `"What's your name?"` score=0.90
---
## Step 8: 身分辨識
> 「系統不只是追蹤臉,它還知道誰是誰。這是 M5 pipeline 自動比對 TMDb 資料庫後的結果 — **2,810 個身分**,包含 Cary Grant、Audrey Hepburn 等。」
```bash
curl "$BASE/api/v1/identities?page=560&page_size=5" -H "$KEY"
```
**預期**: Raoul Delfosse, Albert Daumergue, Claudine Berg...
> 「也可以直接看所有身分的列表,按頁次翻找。」
---
## Step 9: 臉部候選人(未辨識)
> 「還沒被指认的身分叫做『candidate』可以在這裡手動綁定。」
```bash
curl "$BASE/api/v1/faces/candidates?page=1&page_size=3" -H "$KEY"
```
---
## Step 10: 嵌入向量搜尋
> 「最後是 AI 搜尋。Query 先經由 EmbeddingGemma 轉成 768 維向量,再到 Qdrant 做相似度比對。」
```bash
curl -X POST $BASE/api/v1/search/smart -H "$KEY" \
-H "Content-Type: application/json" \
-d '{"query":"Audrey Hepburn","file_uuid":"3abeee81d94597629ed8cb943f182e94"}'
```
---
## 收尾
> 「以上就是 Momentry Core v1.0.0 的主要功能展示。總結:**
>
> 1. **臉部追蹤** — 6,892 traces, 108,204 detections
> 2. **補間動畫** — 稀疏取樣 → 連續軌跡
> 3. **影片渲染** — bbox overlay MP4
> 4. **關鍵字搜尋** — BM25 全文檢索
> 5. **身分辨識** — 2,810 identities, TMDb 整合
> 6. **AI 語意搜尋** — EmbeddingGemma + Qdrant
>
> 所有 API 皆可透過 `https://api.momentry.ddns.net` 存取,使用 demo/demo 登入取得 API key。"

View File

@@ -0,0 +1,114 @@
# Demo Sequence v1.0.0
Curl for POST, browser for GET/Video.
## Setup
```bash
KEY="X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69"
BASE=https://api.momentry.ddns.net
FILE=3abeee81d94597629ed8cb943f182e94
```
---
## 1. Server Alive
Curl:
```bash
curl $BASE/health
```
Browser: open `https://api.momentry.ddns.net/health`
---
## 2. List Traces (top 3 最多臉孔)
Curl:
```bash
curl -X POST $BASE/api/v1/file/$FILE/face_trace/sortby -H "$KEY" -H "Content-Type: application/json" -d '{"sort_by":"face_count","limit":3}'
```
**預期**: 6892 traces, 最大 trace 1109 faces
---
## 3. Trace 詳情 + 補間動畫
Curl:
```bash
curl "$BASE/api/v1/file/$FILE/trace/2/faces?limit=3&interpolate=true" -H "$KEY"
```
**預期**: real + interpolated framesbbox 線性過渡
---
## 4. BM25 關鍵字搜尋
Curl:
```bash
curl -X POST $BASE/api/v1/search/universal -H "$KEY" -H "Content-Type: application/json" -d '{"query":"friends","limit":3,"mode":"bm25","file_uuid":"$FILE"}'
```
**預期**: "You won't find it difficult to make some new friends."
---
## 5. 身分列表
Curl:
```bash
curl "$BASE/api/v1/identities?page=560&page_size=5" -H "$KEY"
```
**預期**: Cary Grant, Audrey Hepburn, Walter Matthau...
---
## 6. Trace 影片播放 (Browser)
Browser 開:
```
https://api.momentry.ddns.net/api/v1/file/3abeee81d94597629ed8cb943f182e94/trace/3128/video?padding=1
```
**預期**: MP4 影片,紅框標記臉部,顯示 "t3128" 標籤
---
## 7. BBOX 影片 (frame 區間)
Browser 開:
```
https://api.momentry.ddns.net/api/v1/file/3abeee81d94597629ed8cb943f182e94/video/bbox?start_frame=68000&end_frame=69000
```
**預期**: 該區間內所有臉部偵測的 bbox overlay 影片
---
## 8. Frame 縮圖
Browser 開:
```
https://api.momentry.ddns.net/api/v1/file/3abeee81d94597629ed8cb943f182e94/thumbnail?frame=68280
```
**預期**: JPEG 圖片trace #3128 的第一幀)
---
## Summary
| Step | Type | Endpoint | What to See |
|------|------|----------|-------------|
| 1 | Curl/Browser | `/health` | Server ok |
| 2 | Curl | `face_trace/sortby` | 6892 traces |
| 3 | Curl | `trace/:trace_id/faces?interpolate=true` | Interpolated bbox |
| 4 | Curl | `search/universal` | BM25 match |
| 5 | Curl | `/identities` | Named persons |
| 6 | **Browser** | `trace/:trace_id/video` | MP4 with bbox |
| 7 | **Browser** | `video/bbox` | Frame interval overlay |
| 8 | **Browser** | `thumbnail` | Single frame JPEG |

View File

@@ -0,0 +1,114 @@
# Demo Sequence v1.0.0
Curl for POST, browser for GET/Video.
## Setup
```bash
KEY="X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69"
BASE=https://api.momentry.ddns.net
FILE=3abeee81d94597629ed8cb943f182e94
```
---
## 1. Server Alive
Curl:
```bash
curl $BASE/health
```
Browser: open `https://api.momentry.ddns.net/health`
---
## 2. List Traces (top 3 最多臉孔)
Curl:
```bash
curl -X POST $BASE/api/v1/file/$FILE/face_trace/sortby -H "$KEY" -H "Content-Type: application/json" -d '{"sort_by":"face_count","limit":3}'
```
**預期**: 6892 traces, 最大 trace 1109 faces
---
## 3. Trace 詳情 + 補間動畫
Curl:
```bash
curl "$BASE/api/v1/file/$FILE/trace/2/faces?limit=3&interpolate=true" -H "$KEY"
```
**預期**: real + interpolated framesbbox 線性過渡
---
## 4. BM25 關鍵字搜尋
Curl:
```bash
curl -X POST $BASE/api/v1/search/universal -H "$KEY" -H "Content-Type: application/json" -d '{"query":"friends","limit":3,"mode":"bm25","file_uuid":"$FILE"}'
```
**預期**: "You won't find it difficult to make some new friends."
---
## 5. 身分列表
Curl:
```bash
curl "$BASE/api/v1/identities?page=560&page_size=5" -H "$KEY"
```
**預期**: Cary Grant, Audrey Hepburn, Walter Matthau...
---
## 6. Trace 影片播放 (Browser)
Browser 開:
```
https://api.momentry.ddns.net/api/v1/file/3abeee81d94597629ed8cb943f182e94/trace/3128/video?padding=1
```
**預期**: MP4 影片,紅框標記臉部,顯示 "t3128" 標籤
---
## 7. BBOX 影片 (frame 區間)
Browser 開:
```
https://api.momentry.ddns.net/api/v1/file/3abeee81d94597629ed8cb943f182e94/video/bbox?start_frame=68000&end_frame=69000
```
**預期**: 該區間內所有臉部偵測的 bbox overlay 影片
---
## 8. Frame 縮圖
Browser 開:
```
https://api.momentry.ddns.net/api/v1/file/3abeee81d94597629ed8cb943f182e94/thumbnail?frame=68280
```
**預期**: JPEG 圖片trace #3128 的第一幀)
---
## Summary
| Step | Type | Endpoint | What to See |
|------|------|----------|-------------|
| 1 | Curl/Browser | `/health` | Server ok |
| 2 | Curl | `face_trace/sortby` | 6892 traces |
| 3 | Curl | `trace/:trace_id/faces?interpolate=true` | Interpolated bbox |
| 4 | Curl | `search/universal` | BM25 match |
| 5 | Curl | `/identities` | Named persons |
| 6 | **Browser** | `trace/:trace_id/video` | MP4 with bbox |
| 7 | **Browser** | `video/bbox` | Frame interval overlay |
| 8 | **Browser** | `thumbnail` | Single frame JPEG |

View File

@@ -0,0 +1,83 @@
# Embedding 跨機器部署方案 v1.0.0
## 分工原則
```
M5Pipeline + 主力 Embedding M4Portal + Fallback Embedding
├── 批量 vectorize1709 chunks ├── Portal search query embedding
├── EmbeddingGemma 主 server ├── 備援 embed server
├── 模型已上線port 11436 └── 預設呼叫 M5 API
└── 出門 demo 可離線運作
```
## 部署架構
```
Portal Search Query
┌─────────────┐ 成功 ┌──────────────────┐
│ M4 Portal │ ──────────→ │ M5:11436 │
│ embed │ │ EmbeddingGemma │
│ client │ │ (主力) │
│ │ 失敗 └──────────────────┘
│ retry │ ──────────→ ┌──────────────────┐
│ fallback │ │ M4:11436 │
└─────────────┘ │ EmbeddingGemma │
│ (備援) │
└──────────────────┘
```
## M4 安裝步驟
```bash
# 1. 安裝 Python 依賴
pip install torch transformers flask
# 2. 登入 HuggingFace需接受授權
open https://huggingface.co/google/embeddinggemma-300m
huggingface-cli login --token YOUR_TOKEN
# 3. 取得 script
rsync -av accusys@192.168.110.201:/Users/accusys/momentry_core_0.1/scripts/embeddinggemma_server.py \
./scripts/embeddinggemma_server.py
# 4. 啟動備援 server
python3 scripts/embeddinggemma_server.py --port 11436
```
## Portal Embed Client
```javascript
async function embedQuery(text) {
const servers = [
'http://192.168.110.201:11436/v1/embeddings', // M5 主力
'http://localhost:11436/v1/embeddings', // M4 備援
];
for (const url of servers) {
try {
const res = await fetch(url, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ input: text }),
});
const data = await res.json();
return data.data[0].embedding;
} catch (e) {
continue; // 下一台
}
}
throw new Error('Embedding servers unreachable');
}
```
## 模型一致性
| 項目 | M5 | M4 |
|------|-----|-----|
| 模型 | EmbeddingGemma 300M | EmbeddingGemma 300M |
| 維度 | 768D | 768D |
| Server | Python MPS (port 11436) | Python CPU/MPS (port 11436) |
| Qdrant | 192.168.110.201:6333 | 192.168.110.201:6333 |
兩台使用同一模型、同一維度,確保 query embedding 與索引 embedding 可比對。

Some files were not shown because too many files have changed in this diff Show More