momentry_core/note.md

# Chat History - 2026-03-18

## User Request

User asked to:
1. Review files in `./docs` directory related to API documentation
2. Save chat history to note.md

## Files Reviewed

### 1. API_REFERENCE.md
- Base URL: `http://localhost:3002/api/v1`
- Port 3000 is used by Gitea; the API runs on port 3002

**Endpoints:**
| Method | Endpoint | Description |
|--------|----------|-------------|
| POST | `/api/v1/register` | Register a video file |
| GET | `/api/v1/progress/:uuid` | Get real-time processing progress via Redis |
| POST | `/api/v1/search` | Natural language search using RAG |
| GET | `/api/v1/lookup` | Look up a video UUID by path, or get video details |
| GET | `/api/v1/videos` | List all registered videos |

**Processor Status Values:**
- `pending` - Not started
- `info` - Starting/info message
- `progress` - In progress
- `complete` - Finished
- `error` - Failed
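
The status values above could be modelled as an enum on the client side. This is a sketch only: the `ProcessorStatus` name and the idea that `complete` and `error` are the terminal states (so polling can stop) are assumptions, not something the reviewed docs state.

```python
from enum import Enum


class ProcessorStatus(str, Enum):
    PENDING = "pending"    # Not started
    INFO = "info"          # Starting/info message
    PROGRESS = "progress"  # In progress
    COMPLETE = "complete"  # Finished
    ERROR = "error"        # Failed


# Assumption: these two states mean no further updates will arrive.
TERMINAL_STATUSES = {ProcessorStatus.COMPLETE, ProcessorStatus.ERROR}


def is_terminal(status: str) -> bool:
    """Return True if polling can stop; raises ValueError on an unknown status."""
    return ProcessorStatus(status) in TERMINAL_STATUSES
```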

### 2. CHUNK_DESIGN.md
**Design Principles:**
- Dual UUID system (external_uuid + internal id)
- Internal tables reference `videos.id` (4 bytes) instead of the UUID (32 bytes when stored as a hex string) for space efficiency
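
The dual-ID idea can be sketched with an in-memory SQLite database standing in for PostgreSQL. The column names (`external_uuid`, `path`, `video_id`, `chunk_id`) and the helper functions are hypothetical; only the `videos.id` key and the `videos` and `chunks` table names come from the reviewed docs.

```python
import sqlite3
import uuid
from typing import Optional


def create_schema(conn: sqlite3.Connection) -> None:
    """Create a videos table with a compact internal ID and an external UUID."""
    conn.executescript(
        """
        CREATE TABLE videos (
            id INTEGER PRIMARY KEY,              -- compact internal ID
            external_uuid TEXT NOT NULL UNIQUE,  -- ID exposed to API clients
            path TEXT NOT NULL
        );
        CREATE TABLE chunks (
            id INTEGER PRIMARY KEY,
            video_id INTEGER NOT NULL REFERENCES videos(id),
            chunk_id TEXT NOT NULL
        );
        """
    )


def register_video(conn: sqlite3.Connection, path: str) -> str:
    """Insert a video and return its external UUID (32 hex characters)."""
    external = uuid.uuid4().hex
    conn.execute(
        "INSERT INTO videos (external_uuid, path) VALUES (?, ?)", (external, path)
    )
    return external


def internal_id(conn: sqlite3.Connection, external_uuid: str) -> Optional[int]:
    """Resolve an external UUID to the internal videos.id, or None if unknown."""
    row = conn.execute(
        "SELECT id FROM videos WHERE external_uuid = ?", (external_uuid,)
    ).fetchone()
    return row[0] if row else None
```

With this layout, only the `videos` table stores the UUID; every other table carries the small integer key.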

**Database Tables:**
- `videos` - File mapping table with internal ID
- `pre_chunks` - Pre-processed chunks from ASR, CUT, TIME, YOLO trace
- `frames` - Single image recognition results (YOLO, OCR, Face per frame)
- `chunks` - Final chunks after combination rules
- `chunk_vectors` - Vector embeddings

**Combination Rules:**
- Rule 1 (Direct): pre_chunk → chunk
- Rule 2 (Enrich): pre_chunk + frames → enriched chunk
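
The two rules might look roughly like this. The field names and the choice to attach frames whose timestamp falls inside the pre-chunk's time range are assumptions for illustration; the reviewed docs only say that Rule 2 combines a pre_chunk with frames.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PreChunk:
    start: float  # seconds
    end: float    # seconds
    text: str


@dataclass
class Frame:
    time: float        # seconds
    labels: List[str]  # e.g. detected object or OCR labels


@dataclass
class Chunk:
    start: float
    end: float
    text: str
    labels: List[str] = field(default_factory=list)


def combine_direct(pre: PreChunk) -> Chunk:
    """Rule 1 (Direct): copy the pre_chunk into a chunk unchanged."""
    return Chunk(pre.start, pre.end, pre.text)


def combine_enrich(pre: PreChunk, frames: List[Frame]) -> Chunk:
    """Rule 2 (Enrich): add labels from frames inside [start, end)."""
    labels = sorted(
        {
            label
            for frame in frames
            if pre.start <= frame.time < pre.end
            for label in frame.labels
        }
    )
    return Chunk(pre.start, pre.end, pre.text, labels)
```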

### 3. CHUNK_SPEC.md
**Chunk Types:**
| Type | Description | Can Overlap |
|------|-------------|-------------|
| Sentence | Speech recognition segments | Yes |
| Cut | Scene detection segments | Yes |
| TimeBased | Fixed duration segments (default 10s) | Yes |

**Time Coordinate System:**
- All times in seconds (float with microsecond precision)
- Frame calculation: `frame_number = floor(time_in_seconds * fps)`
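
The frame calculation above translates directly into code. The function name is illustrative.

```python
import math


def frame_number(time_in_seconds: float, fps: float) -> int:
    """Map a timestamp in seconds to a zero-based frame index."""
    return math.floor(time_in_seconds * fps)


if __name__ == "__main__":
    print(frame_number(2.5, 30))
```

Because the product is a binary float, a timestamp that sits exactly on a frame boundary can occasionally round to just below the integer and yield the previous frame; whether the real implementation guards against this is not stated in the reviewed docs.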

**Chunk ID Format:** `{chunk_type}_{chunk_index:04}`
- Examples: `sentence_0001`, `cut_0002`, `time_based_0015`
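
A small sketch of the ID format. The snake_case conversion of type names such as `TimeBased` is inferred from the `time_based_0015` example rather than stated in the spec, and the function names are hypothetical.

```python
import re


def to_snake_case(name: str) -> str:
    """Convert a CamelCase type name (e.g. 'TimeBased') to snake_case."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


def chunk_id(chunk_type: str, chunk_index: int) -> str:
    """Build an ID of the form {chunk_type}_{chunk_index:04}."""
    return f"{to_snake_case(chunk_type)}_{chunk_index:04}"


if __name__ == "__main__":
    print(chunk_id("TimeBased", 15))
```

Note that `:04` only sets a minimum width, so an index of 10000 or more produces a five-digit suffix.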

**Processors:**
| Processor | Model | Description |
|-----------|-------|-------------|
| ASR | WhisperX (faster-whisper) | Speech recognition |
| CUT | PySceneDetect | Scene detection |
| YOLO | YOLOv8n | Object detection |
| OCR | EasyOCR | Text recognition |
| Face | OpenCV Haar Cascade | Face detection |
| Pose | YOLOv8n-Pose | Pose estimation |

### 4. SERVICES.md
**Core Services:**
| Service | Port | Purpose |
|---------|------|---------|
| PostgreSQL | 5432 | Video metadata storage |
| Redis | 6379 | Cache and job queue |
| Ollama | 11434 | Local LLM inference |
| n8n | 5678/5690 | Workflow automation |
| Qdrant | 6333 | Vector database |
| Gitea | 3000 | Git service |
| Momentry API | 3002 | Rust API server |

## Notes
- Chat history saved to note.md
- User may want to continue with API implementation, code review, or new features