# Chat History - 2026-03-18

## User Request

User asked to:

1. Review files in `./docs` directory related to API documentation
2. Save chat history to note.md

## Files Reviewed

### 1. API_REFERENCE.md

- Base URL: `http://localhost:3002/api/v1`
- Port 3000 is used by Gitea; the API runs on 3002

**Endpoints:**

| Method | Endpoint | Description |
|--------|----------|-------------|
| POST | `/api/v1/register` | Register a video file |
| GET | `/api/v1/progress/:uuid` | Get real-time processing progress via Redis |
| POST | `/api/v1/search` | Natural language search using RAG |
| GET | `/api/v1/lookup` | Lookup video UUID by path or get video details |
| GET | `/api/v1/videos` | List all registered videos |

**Processor Status Values:**

- `pending` - Not started
- `info` - Starting/info message
- `progress` - In progress
- `complete` - Finished
- `error` - Failed

### 2. CHUNK_DESIGN.md

**Design Principles:**

- Dual UUID system (external_uuid + internal id)
- Internal tables use `videos.id` (4 bytes) instead of uuid (32 bytes) for space efficiency

**Database Tables:**

- `videos` - File mapping table with internal ID
- `pre_chunks` - Pre-processed chunks from ASR, CUT, TIME, YOLO trace
- `frames` - Single image recognition results (YOLO, OCR, Face per frame)
- `chunks` - Final chunks after combination rules
- `chunk_vectors` - Vector embeddings

**Combination Rules:**

- Rule 1 (Direct): pre_chunk → chunk
- Rule 2 (Enrich): pre_chunk + frames → enriched chunk

### 3. CHUNK_SPEC.md

**Chunk Types:**

| Type | Description | Can Overlap |
|------|-------------|-------------|
| Sentence | Speech recognition segments | Yes |
| Cut | Scene detection segments | Yes |
| TimeBased | Fixed duration segments (default 10s) | Yes |

**Time Coordinate System:**

- All times in seconds (float with microsecond precision)
- Frame calculation: `frame_number = floor(time_in_seconds * fps)`

**Chunk ID Format:** `{chunk_type}_{chunk_index:04}`

- Examples: `sentence_0001`, `cut_0002`, `time_based_0015`

**Processors:**

| Processor | Model | Description |
|-----------|-------|-------------|
| ASR | WhisperX (faster-whisper) | Speech recognition |
| CUT | PySceneDetect | Scene detection |
| YOLO | YOLOv8n | Object detection |
| OCR | EasyOCR | Text recognition |
| Face | OpenCV Haar Cascade | Face detection |
| Pose | YOLOv8n-Pose | Pose estimation |

### 4. SERVICES.md

**Core Services:**

| Service | Port | Purpose |
|---------|------|---------|
| PostgreSQL | 5432 | Video metadata storage |
| Redis | 6379 | Cache and job queue |
| Ollama | 11434 | Local LLM inference |
| n8n | 5678/5690 | Workflow automation |
| Qdrant | 6333 | Vector database |
| Gitea | 3000 | Git service |
| Momentry API | 3002 | Rust API server |

## Notes

- Chat history saved to note.md
- User may want to continue with API implementation, code review, or new features
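
As a quick illustration of the CHUNK_SPEC.md items recorded above, the frame calculation and the chunk ID format can be sketched in Python. This is only a minimal sketch: the function names `frame_number` and `chunk_id` are hypothetical and do not come from the reviewed files, and the actual implementation in the Rust API server may differ.

```python
import math


def frame_number(time_in_seconds: float, fps: float) -> int:
    """Apply the noted formula: frame_number = floor(time_in_seconds * fps)."""
    return math.floor(time_in_seconds * fps)


def chunk_id(chunk_type: str, chunk_index: int) -> str:
    """Build an ID in the noted format {chunk_type}_{chunk_index:04}.

    The index is zero-padded to at least four digits.
    """
    return f"{chunk_type}_{chunk_index:04}"


if __name__ == "__main__":
    # 1.5 seconds into a 30 fps video falls on frame 45.
    print(frame_number(1.5, 30))
    # Reproduces the examples listed in the notes.
    print(chunk_id("sentence", 1))
    print(chunk_id("cut", 2))
    print(chunk_id("time_based", 15))
```

The chunk type strings (`sentence`, `cut`, `time_based`) are taken from the ID examples in the notes; how the `TimeBased` type name maps to `time_based` in the real code is assumed here, not confirmed.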