cleanup: remove dead code and duplicate docs

- Remove session-ses_2f27.md (161KB raw session log)
- Remove 49 ROOT_* duplicate files across REFERENCE/
- Remove 14 duplicate files between REFERENCE/ root and history/
- Remove asr_legacy.rs (dead code, replaced by asr.rs)
- Remove src/core/worker/ (duplicate JobWorker)
- Remove src/core/layers/ (empty directory)
- Remove 4 .bak files in src/
- Remove 7 dead private methods in worker/processor.rs
- Remove backup directory from git tracking
2026-05-04 01:31:21 +08:00
parent ee81e343ce
commit e75c4d6f07
3270 changed files with 35190 additions and 53367 deletions
--- a/docs_v1.0/API_DOCUMENTATION.md
+++ b/docs_v1.0/API_DOCUMENTATION.md
@@ -1,699 +0,0 @@
-# Momentry Core API Documentation v1.0.0
-
-## Overview
-Momentry Core is a digital asset management system with video analysis, RAG, and face recognition capabilities. This document covers all API endpoints available in v1.0.0.
-
-**Base URL**: `http://<host>:<port>`
-- Production: Port 3002
-- Development (Playground): Port 3003
-
-**Authentication**: All protected routes require API key validation via `X-API-Key` header.
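Review note on the removed authentication paragraph: a minimal client-side sketch of attaching the `X-API-Key` header. It assumes a local host on the documented production port 3002. The helper name `build_request` is hypothetical, and the key is the sample value from the login response in this file, not a real credential.

```python
import urllib.request

BASE_URL = "http://localhost:3002"  # assumption: local host, documented production port


def build_request(path: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request carrying the API key header."""
    return urllib.request.Request(
        BASE_URL + path,
        headers={"X-API-Key": api_key},
        method="GET",
    )


# Sample key taken from the documented login response; hypothetical in practice.
req = build_request("/api/v1/videos", "muser_test_001")
# Pass `req` to urllib.request.urlopen(req) to actually send it.
```

The request is only constructed here, so the sketch can be checked without a running server.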
-
---
-
-## API Classification
-
-The API is organized into 7 categories:
-
-| Category | Prefix | Description |
-|----------|--------|-------------|
-| **Health & Auth** | `/health`, `/api/v1/auth` | System health, authentication |
-| **Asset Management** | `/api/v1/register`, `/api/v1/files`, `/api/v1/assets` | File registration, probing, processing |
-| **Search** | `/api/v1/search`, `/api/v1/n8n` | Text, hybrid, visual, and n8n search |
-| **Video Details** | `/api/v1/videos`, `/api/v1/progress` | Video listing, details, chunks |
-| **Identity & Binding** | `/api/v1/identities`, `/api/v1/signals` | Face/speaker identity management |
-| **Jobs & Rules** | `/api/v1/jobs`, `/api/v1/rules` | Processing job monitoring |
-| **Stats & Config** | `/api/v1/stats`, `/api/v1/config` | System statistics, configuration |
-
---
-
-## 1. Health & Authentication
-
-### `GET /health`
-Basic health check.
-
-**Response**:
-```json
-{
-  "status": "ok",
-  "version": "v1.0.0",
-  "uptime_ms": 12345
-}
-```
-
-### `GET /health/detailed`
-Detailed health check with service status (PostgreSQL, Redis, Qdrant, MongoDB).
-
-**Response**:
-```json
-{
-  "status": "ok",
-  "version": "v1.0.0",
-  "uptime_ms": 12345,
-  "services": {
-    "postgres": { "status": "ok", "latency_ms": 5 },
-    "redis": { "status": "ok", "latency_ms": 2 },
-    "qdrant": { "status": "ok", "latency_ms": 10 },
-    "mongodb": { "status": "ok", "latency_ms": 8 }
-  }
-}
-```
-
-### `POST /api/v1/auth/login`
-Authenticate and obtain API key.
-
-**Request**:
-```json
-{
-  "username": "demo",
-  "password": "demo"
-}
-```
-
-**Response**:
-```json
-{
-  "success": true,
-  "message": "Login successful",
-  "api_key": "muser_test_001",
-  "user": { "username": "demo" }
-}
-```
-
-### `POST /api/v1/auth/logout`
-Logout session.
-
-**Response**:
-```json
-{ "success": true }
-```
-
---
-
-## 2. Asset Management
-
-### `POST /api/v1/register`
-Register a video file (legacy path-based).
-
-**Request**:
-```json
-{ "path": "./demo/video.mp4" }
-```
-
-**Response**:
-```json
-{
-  "file_uuid": "384b0ff44aaaa1f1",
-  "file_id": 1,
-  "job_id": 1,
-  "file_name": "video.mp4",
-  "duration": 120.5,
-  "width": 1920,
-  "height": 1080,
-  "already_exists": false
-}
-```
-
-### `POST /api/v1/files/register`
-Register a file with full metadata (recommended). Supports move detection.
-
-**Request**:
-```json
-{
-  "file_path": "/Users/accusys/momentry/var/sftpgo/data/demo/video.mp4",
-  "user_id": null
-}
-```
-
-**Response**:
-```json
-{
-  "success": true,
-  "file_uuid": "384b0ff44aaaa1f1",
-  "file_name": "video.mp4",
-  "file_path": "/Users/accusys/momentry/var/sftpgo/data/demo/video.mp4",
-  "file_type": "video",
-  "duration": 120.5,
-  "width": 1920,
-  "height": 1080,
-  "fps": 30.0,
-  "total_frames": 3615,
-  "registration_time": null,
-  "already_exists": false,
-  "message": "File registered successfully"
-}
-```
-
-### `GET /api/v1/files/scan`
-Scan filesystem for unregistered files.
-
-### `POST /api/v1/unregister`
-Unregister a video file.
-
-**Request**:
-```json
-{ "uuid": "384b0ff44aaaa1f1" }
-```
-
-### `POST /api/v1/probe`
-Probe a video file for metadata.
-
-**Request**:
-```json
-{ "path": "./demo/video.mp4" }
-```
-
-**Response**:
-```json
-{
-  "uuid": "384b0ff44aaaa1f1",
-  "file_name": "video.mp4",
-  "duration": 120.5,
-  "width": 1920,
-  "height": 1080,
-  "fps": 30.0,
-  "cached": true,
-  "format": { ... },
-  "streams": [ ... ]
-}
-```
-
-### `GET /api/v1/assets/:uuid/probe`
-Probe a video by UUID.
-
-### `POST /api/v1/assets/:uuid/process`
-Trigger processing pipeline for an asset.
-
-**Request**:
-```json
-{
-  "processors": ["asr", "cut", "yolo", "ocr", "face", "pose", "asrx", "visual_chunk"]
-}
-```
-
-**Response**:
-```json
-{
-  "job_id": 1,
-  "asset_uuid": "384b0ff44aaaa1f1",
-  "status": "PENDING",
-  "message": "Processing triggered for video.mp4"
-}
-```
-
-### `GET /api/v1/assets/:uuid/status`
-Get asset processing status with frame progress.
-
-**Response**:
-```json
-{
-  "uuid": "384b0ff44aaaa1f1",
-  "file_name": "video.mp4",
-  "registration_time": "2026-04-30T10:00:00Z",
-  "processing_status": "processing",
-  "current_job_id": "abc-123",
-  "frame_progress": {
-    "total_frames": 3615,
-    "processed_frames": 1200,
-    "progress_percent": 33.2
-  }
-}
-```
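Review note on `frame_progress`: the sample values are consistent with a plain processed/total ratio rounded to one decimal place. The sketch below reproduces that arithmetic. The function name and the zero-total behavior are assumptions, not taken from the server code.

```python
def progress_percent(processed_frames: int, total_frames: int) -> float:
    """Return processed/total as a percentage rounded to one decimal place."""
    if total_frames <= 0:
        # Assumption: report 0% when the total frame count is unknown.
        return 0.0
    return round(100.0 * processed_frames / total_frames, 1)


# Values from the documented response: 1200 of 3615 frames processed.
print(progress_percent(1200, 3615))
```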
-
---
-
-## 3. Search
-
-### `POST /api/v1/search`
-Vector/smart search across chunks.
-
-**Request**:
-```json
-{
-  "query": "person talking about AI",
-  "mode": "smart",
-  "uuid": "384b0ff44aaaa1f1",
-  "limit": 10
-}
-```
-
-**Response**:
-```json
-{
-  "results": [
-    {
-      "uuid": "384b0ff44aaaa1f1",
-      "chunk_id": "chunk_1",
-      "chunk_type": "sentence",
-      "start_time": 10.5,
-      "end_time": 15.2,
-      "text": "AI is transforming...",
-      "score": 0.85
-    }
-  ],
-  "query": "person talking about AI"
-}
-```
-
-### `POST /api/v1/search/hybrid`
-Hybrid search (vector + BM25).
-
-**Request**:
-```json
-{
-  "query": "search term",
-  "limit": 10,
-  "uuid": "384b0ff44aaaa1f1",
-  "vector_weight": 0.7,
-  "bm25_weight": 0.3
-}
-```
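Review note on `vector_weight` and `bm25_weight`: the removed doc does not say how the two scores are combined. One common reading is a linear weighted sum of scores already normalized to the range 0 to 1. The sketch below illustrates that assumption only; the function names are hypothetical and the real fusion logic may differ.

```python
def hybrid_score(vector_score: float, bm25_score: float,
                 vector_weight: float = 0.7, bm25_weight: float = 0.3) -> float:
    """Linear fusion of two scores assumed to be normalized to [0, 1]."""
    return vector_weight * vector_score + bm25_weight * bm25_score


def rank(results: list[dict]) -> list[dict]:
    """Sort hits by fused score, highest first.

    Each hit is assumed to carry hypothetical 'vector' and 'bm25' fields.
    """
    return sorted(
        results,
        key=lambda r: hybrid_score(r["vector"], r["bm25"]),
        reverse=True,
    )
```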
-
-### `POST /api/v1/search/bm25`
-BM25 full-text search.
-
-### `POST /api/v1/search/visual`
-Search visual chunks by criteria.
-
-**Request**:
-```json
-{
-  "uuid": "384b0ff44aaaa1f1",
-  "criteria": {
-    "object_class": "person",
-    "min_count": 1
-  }
-}
-```
-
-### `POST /api/v1/search/visual/class`
-Search by object class.
-
-**Request**:
-```json
-{
-  "uuid": "384b0ff44aaaa1f1",
-  "object_class": "person",
-  "min_count": 1,
-  "max_count": null
-}
-```
-
-### `POST /api/v1/search/visual/density`
-Search by object density.
-
-**Request**:
-```json
-{
-  "uuid": "384b0ff44aaaa1f1",
-  "min_density": 0.5,
-  "max_density": null
-}
-```
-
-### `POST /api/v1/search/visual/combination`
-Search by object combination.
-
-**Request**:
-```json
-{
-  "uuid": "384b0ff44aaaa1f1",
-  "combination": [["person", 2], ["car", 1]]
-}
-```
-
-### `POST /api/v1/search/visual/stats`
-Get visual chunk statistics.
-
-**Request**:
-```json
-{ "uuid": "384b0ff44aaaa1f1" }
-```
-
-### `POST /api/v1/n8n/search`
-Search via n8n integration.
-
-### `POST /api/v1/n8n/search/bm25`
-BM25 search via n8n.
-
-### `POST /api/v1/n8n/search/hybrid`
-Hybrid search via n8n.
-
-### `POST /api/v1/n8n/search/smart`
-Smart search via n8n.
-
---
-
-## 4. Video Details
-
-### `GET /api/v1/videos`
-List all registered videos with pagination.
-
-**Query Parameters**:
-- `page`: Page number (default: 1)
-- `page_size`: Items per page (default: 20)
-- `status`: Filter by status
-- `q`: Search query
-- `uuid`: Filter by UUID
-
-**Response**:
-```json
-{
-  "files": [
-    {
-      "file_uuid": "384b0ff44aaaa1f1",
-      "file_path": "/path/to/video.mp4",
-      "file_name": "video.mp4",
-      "file_type": "video",
-      "duration": 120.5,
-      "width": 1920,
-      "height": 1080,
-      "status": "completed",
-      "created_at": "2026-04-30T10:00:00Z",
-      "file_size": 52428800,
-      "total_frames": 3615
-    }
-  ],
-  "count": 1,
-  "page": 1,
-  "page_size": 20
-}
-```
-
-### `DELETE /api/v1/videos/:uuid`
-Delete a video and all associated data (faces, chunks, processor results).
-
-**Response**:
-```json
-{
-  "success": true,
-  "message": "File 384b0ff44aaaa1f1 unregistered successfully...",
-  "file_uuid": "384b0ff44aaaa1f1",
-  "deleted_face_detections": 150,
-  "deleted_processor_results": 8,
-  "deleted_chunks": 45
-}
-```
-
-### `GET /api/v1/videos/:uuid/details`
-Get detailed chunk information.
-
-**Query Parameters**:
-- `chunk_id`: Specific chunk ID (required)
-- `parent_id`: Parent chunk ID
-
-**Response**:
-```json
-{
-  "uuid": "384b0ff44aaaa1f1",
-  "chunk_id": "chunk_1",
-  "chunk_type": "sentence",
-  "frame_range": {
-    "start_frame": 315,
-    "end_frame": 456,
-    "duration_frames": 141,
-    "fps": 30.0
-  },
-  "reference_time": {
-    "start": 10.5,
-    "end": 15.2
-  },
-  "text_content": "AI is transforming...",
-  "summary_text": "Discussion about AI impact",
-  "speaker_ids": ["SPEAKER_0"],
-  "person_ids": ["face_100"]
-}
-```
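Review note on `frame_range` versus `reference_time`: the sample values agree with converting seconds to frames by multiplying by `fps` and rounding. The sketch below shows that relationship. The helper names are hypothetical, and rounding to the nearest frame is an assumption.

```python
def time_to_frame(seconds: float, fps: float) -> int:
    """Convert a timestamp in seconds to the nearest frame index."""
    return round(seconds * fps)


def frame_range(start: float, end: float, fps: float) -> dict:
    """Build a frame_range object from a reference_time interval."""
    start_frame = time_to_frame(start, fps)
    end_frame = time_to_frame(end, fps)
    return {
        "start_frame": start_frame,
        "end_frame": end_frame,
        "duration_frames": end_frame - start_frame,
        "fps": fps,
    }


# Documented sample: 10.5 s to 15.2 s at 30 fps.
print(frame_range(10.5, 15.2, 30.0))
```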
-
-### `GET /api/v1/videos/:uuid/pre_chunks`
-List pre-processor chunks.
-
-**Query Parameters**:
-- `processor_type`: Filter by processor (asr, yolo, face, etc.)
-- `page`: Page number
-- `page_size`: Items per page
-
-### `GET /api/v1/progress/:uuid`
-Get processing progress for a video.
-
---
-
-## 5. Identity & Binding
-
-### `POST /api/v1/identities/from-face`
-Register a global identity from face.json with multi-angle reference vectors.
-
-**Request**:
-```json
-{
-  "face_json_path": "/path/to/face.json",
-  "identity_name": "John Doe",
-  "schema": "dev"
-}
-```
-
-### `POST /api/v1/identities/from-person`
-Register identity from a person in a video.
-
-**Request**:
-```json
-{
-  "file_uuid": "384b0ff44aaaa1f1",
-  "person_id": "person_1",
-  "identity_name": "John Doe"
-}
-```
-
-### `GET /api/v1/identities`
-List all global identities.
-
-**Query Parameters**:
-- `page`: Page number
-- `page_size`: Items per page
-
-### `GET /api/v1/faces/candidates`
-List unbound face candidates.
-
-**Query Parameters**:
-- `file_uuid`: Filter by file
-- `min_confidence`: Minimum confidence (default: 0.5)
-- `page`, `page_size`: Pagination
-
-### `GET /api/v1/identities/:identity_id/faces`
-Get all faces for an identity.
-
-### `GET /api/v1/faces/:face_id/thumbnail`
-Get face thumbnail image (JPEG).
-
-### `POST /api/v1/identities/bind`
-Bind a face/speaker to an identity.
-
-**Request**:
-```json
-{
-  "identity_id": 1,
-  "binding_type": "face",
-  "binding_value": "face_100",
-  "source": "manual"
-}
-```
-
-### `POST /api/v1/identities/unbind`
-Unbind an identity.
-
-**Request**:
-```json
-{
-  "binding_type": "face",
-  "binding_value": "face_100"
-}
-```
-
-### `GET /api/v1/identity/:binding_type/:binding_value`
-Get identity info by binding.
-
-### `GET /api/v1/signals/unbound`
-List unbound signals.
-
-**Query Parameters**:
-- `uuid`: File UUID
-- `binding_type`: "face" or "speaker"
-
-### `GET /api/v1/signals/:uuid/:binding_type/:binding_value/timeline`
-Get signal timeline (all chunks for a face/speaker).
-
-### `POST /api/v1/identities/suggest-av`
-Suggest audio-visual bindings based on temporal overlap.
-
-**Request**:
-```json
-{
-  "file_uuid": "384b0ff44aaaa1f1",
-  "overlap_threshold": 0.6
-}
-```
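Review note on `overlap_threshold`: the removed doc says suggestions are based on temporal overlap but does not define the ratio. The sketch below assumes overlap is measured as the intersection length divided by the shorter of the two intervals. That definition and the function names are assumptions for illustration.

```python
Interval = tuple[float, float]  # (start_seconds, end_seconds)


def overlap_ratio(a: Interval, b: Interval) -> float:
    """Intersection length divided by the shorter interval's length."""
    intersection = min(a[1], b[1]) - max(a[0], b[0])
    shorter = min(a[1] - a[0], b[1] - b[0])
    if intersection <= 0 or shorter <= 0:
        return 0.0  # disjoint or degenerate intervals do not overlap
    return intersection / shorter


def should_suggest(face: Interval, speaker: Interval,
                   overlap_threshold: float = 0.6) -> bool:
    """Suggest a face/speaker binding when the overlap meets the threshold."""
    return overlap_ratio(face, speaker) >= overlap_threshold
```

Other definitions, such as intersection over union, would give lower ratios for the same intervals, so the threshold would need to be read accordingly.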
-
---
-
-## 6. Jobs & Rules
-
-### `GET /api/v1/jobs`
-List all monitor jobs.
-
-**Query Parameters**:
-- `page`, `page_size`: Pagination
-- `status`: Filter by status
-
-### `GET /api/v1/jobs/:job_id`
-Get job details with processor information.
-
-**Response**:
-```json
-{
-  "job_id": "1",
-  "asset_uuid": "384b0ff44aaaa1f1",
-  "rule": "default",
-  "status": "RUNNING",
-  "current_processor_id": "asr",
-  "frame_progress": {
-    "total_frames": 3615,
-    "processed_frames": 1200,
-    "progress_percent": 33.2
-  }
-}
-```
-
-### `GET /api/v1/rules/:rule/status`
-Get rule status with active jobs.
-
---
-
-## 7. Stats & Configuration
-
-### `GET /api/v1/stats/ingest`
-Get ingestion statistics.
-
-**Response**:
-```json
-{
-  "total_videos": 50,
-  "total_chunks": 1200,
-  "sentence_chunks": 800,
-  "cut_chunks": 300,
-  "time_chunks": 100,
-  "searchable_chunks": 1150,
-  "chunks_with_visual": 450,
-  "chunks_with_summary": 200,
-  "pending_videos": 5
-}
-```
-
-### `GET /api/v1/stats/sftpgo`
-Get SFTPGo status and registered videos.
-
-### `GET /api/v1/stats/inference`
-Check inference engine health (Ollama, llama-server).
-
-**Response**:
-```json
-{
-  "ollama": {
-    "engine": "Ollama",
-    "model": "nomic-embed-text",
-    "status": "ok",
-    "latency_ms": 15
-  },
-  "llama_server": {
-    "engine": "llama-server",
-    "model": "gemma4_e4b_q5",
-    "status": "ok",
-    "latency_ms": 25
-  }
-}
-```
-
-### `POST /api/v1/config/cache`
-Toggle MongoDB cache.
-
-**Request**:
-```json
-{ "enabled": false }
-```
-
-**Response**:
-```json
-{
-  "success": true,
-  "cache_enabled": false,
-  "message": "Cache disabled"
-}
-```
-
---
-
-## API Usage Patterns
-
-### 1. List Pattern
-```
-GET /api/v1/videos?page=1&page_size=20
-```
-- Supports pagination
-- Optional filters via query parameters
-- Returns `{ items: [...], count, page, page_size }`
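Review note on the list pattern: a minimal sketch of building a paginated list URL with optional filters. It uses the documented defaults (`page=1`, `page_size=20`). The helper name `build_list_url` and the localhost base URL are assumptions.

```python
from urllib.parse import urlencode


def build_list_url(base_url: str, page: int = 1, page_size: int = 20,
                   **filters: str) -> str:
    """Build a /api/v1/videos list URL; filters set to None are omitted."""
    params = {"page": page, "page_size": page_size}
    params.update({k: v for k, v in filters.items() if v is not None})
    return f"{base_url}/api/v1/videos?{urlencode(params)}"


# Assumption: service reachable on localhost at the documented production port.
print(build_list_url("http://localhost:3002", status="completed"))
```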
-
-### 2. Detail Pattern
-```
-GET /api/v1/videos/:uuid/details?chunk_id=chunk_1
-```
-- Path parameter for resource identifier
-- Query parameters for sub-resource selection
-- Returns detailed object with nested structures
-
-### 3. Operation Pattern
-```
-POST /api/v1/assets/:uuid/process
-```
-- Action-oriented endpoint
-- Request body contains operation parameters
-- Returns operation status and job ID
-
-### 4. Application Pattern
-```
-POST /api/v1/identities/bind
-POST /api/v1/identities/suggest-av
-```
-- Complex workflows with multiple steps
-- Often involve external services (Python scripts, FFmpeg)
-- Return comprehensive results with metadata
-
---
-
-## Error Responses
-
-| Status Code | Description |
-|-------------|-------------|
-| `400` | Bad Request - Invalid parameters |
-| `404` | Not Found - Resource doesn't exist |
-| `500` | Internal Server Error - Database/service failure |
-
---
-
-## V4.0 Architecture Notes
-
-### Key Changes from V3.x
-- `video_uuid` → `file_uuid` (terminology update)
-- `person_identities` table **removed**
-- Face → Identity direct binding (no intermediate person_id)
-- 28 person_id APIs removed (except register/bind)
-- Chunk binding auto via time alignment
-
-### Identity Model
-```
-Face Detection → Identity (direct binding)
-Speaker Detection → Identity (direct binding)
-```
-
-### Processing Pipeline
-```
-Register → Probe → ASR → CUT → YOLO → OCR → Face → Pose → ASRX → Visual Chunk
-```
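Review note on the processing pipeline: the stage order matches the `processors` array in the documented process request. The sketch below shows one way a caller could normalize a requested subset into that order. It assumes the server expects pipeline order, which the removed doc does not confirm, and the names are hypothetical.

```python
# Stage identifiers as they appear in the documented `processors` request field.
PIPELINE_ORDER = ["asr", "cut", "yolo", "ocr", "face", "pose", "asrx", "visual_chunk"]


def order_processors(requested: list[str]) -> list[str]:
    """Return the requested processors sorted into pipeline order, deduplicated."""
    unknown = [p for p in requested if p not in PIPELINE_ORDER]
    if unknown:
        raise ValueError(f"unknown processors: {unknown}")
    wanted = set(requested)
    return [p for p in PIPELINE_ORDER if p in wanted]


print(order_processors(["face", "asr", "yolo"]))
```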