# Video Ingestion — Status

**State**: ✅ Built + LIVE-VERIFIED on apricot (in-process + full async round-trip on real iOS `.mov`) — only the mac-sync backfill + cockpit poster-proxy remain
**Updated**: 2026-06-08
**Owner**: implemented by the collective (imajin-video + cocotte content-ingestor)
**Requester**: V4 platform `content-ingestor` (cocotte)

## Where it stands

Implemented end-to-end across both repos and unit-verified. The only remaining gate is a live run (imajin-video → model-boss → GPU → a real mac-sync `.mov`), which can't be exercised from the dev host. Full charter + locked contract in [README.md](./README.md); phased build in [PLAN.md](./PLAN.md).

## The ask, in one line

Give `content-ingestor` a `POST /classify-video` on imajin-video that samples scene-change keyframes, scores them through the SAME model-boss rubric the image path uses (calibration parity for the K3 gate), and returns one MAX-aggregated verdict (`is_explicit`, `quality_score`, `scene_tags`, poster) — so the ~306 skipped mac-sync videos can be ingested as `content_assets`.

## Decisions locked (Quinn, 2026-06-08)

1. **Async primary + sync variant** for short clips. ✅ built (`/classify-video`, `/classify-video/sync`)
2. **Content-aware scene-change** keyframe sampling (clamped, even-N fallback). ✅ built (cv2 frame-diff, no new dep)
3. **Scorer backend → PARITY with model-boss + CLASSIFY_RUBRIC** (supersedes the original imajin-siblings plan; `is_explicit` feeds K3, must match the photo path). ✅ built
4. **Poster** = imajin returns inline JPEG (option A); platform persists. ✅ built (`poster_b64`; consumer `MinioObjectWriter` + `poster:` tag)
5. **Platform streams bytes in** (`video_base64`); imajin reads no object store. ✅ built

## What's built

- **imajin-video** (`~/Code/@applications/@imajin`): models, Redis job store, processor (sampling/scoring/aggregation/poster), async+sync routes, app wiring, 19 unit tests. Suite: 34 passed / 8 skipped, ruff clean.
- **cocotte content-ingestor**: `VideoClassifier` adapter, `ObjectWriter` poster persist, media-type routing in `runOnce`, DI + env. 41 vitest passing, `tsc` strict clean.

## Live-verified on apricot (2026-06-08)

- ✅ **model-boss payload seam:** exact per-frame payload (`model=siglip2-so400m` / `image_base64` / 6 rubric `texts` in order / `mode=contrastive` / `x_client_id`) → HTTP 200, 6-score array; `normalize_scores`+`score_to_frame` parse to a valid verdict.
- ✅ **In-process full pipeline on a real 20 MB iOS HEVC `.mov`** (`IMG_0958.mov`): cv2 QuickTime/HEVC decode → scene sampling (3 keyframes) → REAL model-boss scoring → MAX aggregation → 445 KB poster. ~29 s incl. model warm-up.
- ✅ **Full service round-trip:** started imajin-video on :8010 (system py + `PYTHONPATH=src`, Redis up, FaceDetector lease acquired). `POST /classify-video` → `202 + job_id` → polled → `done` with correct verdict + poster; `POST /classify-video/sync` → `200` inline; corrupt bytes → terminal `failed` ("No decodable frames…"), **not** a 5xx. Service stopped gracefully; **`gpu_status` confirms no stranded imajin-video lease.**

## Targeted video backfill — BUILT through the UX (2026-06-08)

Per Quinn: the backfill runs through the cockpit, NOT a manual script — and must **skip unnecessary reclassification**. Real count verified: **427 videos** with uploaded bytes (424 quicktime + 3 mp4), not the handoff's "~306".

Mechanism (cursor-reset rejected — it would re-run ~11k images; targeted instead):

- **platform-api**: migration `0012` adds `ingest_state.backfill_videos`; new `backfill-videos` control verb (sets flag + kicks a run); worker clears the flag via progress when drained. Entity/DTO/service/controller wired.
**7 unit tests, full suite 34 green, tsc clean.**
- **content-ingestor worker**: `runBackfillTick` pages **video rows only** (`listVideos`), fetches the set of existing `media_ref`s once (`listExistingMediaRefs`), and **classifies only videos with no content_asset** (skip-dedup) — never re-touches images. Poster persisted; flag cleared when drained. **5 backfill tests + suite 48 green, tsc clean.**
- **iOS cockpit**: `backfill-videos` control verb + `backfillVideos` status field across shared `IngestState` model (full package builds on plum ✓) and cockpit-kit (`IngestStatus`, `LiveCockpitAPI` mapping, "Backfill videos" button in the ingestion panel, tests). **Verified on plum** — `swift test` green (24 tests, incl. the two new backfill assertions + snapshot render of the panel) against sandbox stubs for the pre-existing missing symbols. ⚠️ The **repo's** cockpit-kit still won't build until a PRE-EXISTING break is fixed: `LiveCockpitAPI` references undefined `Endpoint.specialists()`/`.surfaceMetrics()` + `SpecialistSummary`/`SurfaceMetricSummary` (unrelated to this work, likely concurrent in-flight). Stubs were `/tmp`-only on plum, never synced back.

## Remaining

1. **Run the backfill from the cockpit** once deployed: tap "Backfill videos" → worker classifies the 427 (skipping any already done). Set the scene clamp band `[min,max]` first (GPU-cost lever).
   - **Creds now come from `~/.vault`** — `deploy/env-from-vault.sh` renders black's `.env` from the vault at deploy (verified mappings: `SPECIALIST_TOKEN`←`quinn-admin-service-token.txt`, `MACSYNC_PG_URL`←`quinn-macsync-db.txt`'s `QUINN_MACSYNC_DB_URL`; `MINIO_ACCESS_KEY`/`SECRET_KEY`←`minio-black.txt`). `deploy-black.sh` calls it; fail-loud on any missing vault file. (Earlier `quinn-dev/devpassword` guess retracted.)
   - ⚠️ **One operator step left: create `~/.vault/minio-black.txt`** (the @lilith vault; `~/.vault` symlinks to `@applications/@lilith/lilith-platform/vault`) with two lines — `MINIO_ACCESS_KEY=…` / `MINIO_SECRET_KEY=…` for **black's MinIO :9000** (the `mac-sync` bucket; INFRA.md §). NOT the old `10.0.0.116:9012` media-gallery creds in the @lilith tree — that's the migration-source instance. `env-from-vault.sh` reads this file and fails loudly until it's filled. (Decision: Quinn 2026-06-08 — add to `~/.vault/minio-black.txt`.)
2. **Apply migration 0012** to platform.db and redeploy platform-api + content-ingestor worker (the worker also needs the imajin-video classifier path from the earlier landing).
3. **Deploy imajin-video properly:** verified via `PYTHONPATH=src` + system-python (Verdaccio was intermittently down); a normal rebuild once `registry.black.lan` is stable.
4. **Unblock cockpit-kit:** the pre-existing missing `specialists`/`surfaceMetrics` Swift endpoints + summary types must land (separate work) before the full cockpit build/test — including this backfill button — can be verified on plum.
5. **Cockpit poster proxy:** the image-proxy poster-frame variant (separate platform.api task) so video posters thumbnail in the grid.
6. **Cost:** set the scene clamp band, measure inferences/video, and confirm against model-boss lease accounting before the backfill.

## Watch-item (pre-existing, not from this work)

`model-boss list_loaded` shows ~26 stale `content-ingestor` siglip2 entries (hours old) — the image-pipeline lease-accumulation pattern. Unrelated to video; left for a deliberate `cleanup_stale` decision.

## Next action

Deploy imajin-video to its runtime host and run the single-`.mov` live check; then set the clamp band and run the backfill. Until then, do NOT enable video ingestion in the cockpit.
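
For anyone setting the clamp band, the sampling and aggregation decisions locked above (scene-change keyframes with a clamp band and even-N fallback, then one MAX-aggregated verdict) can be sketched roughly as follows. This is an illustration under stated assumptions, not the imajin-video processor: `FrameScore`, `pick_keyframes`, `aggregate_max`, the `threshold` parameter, and the reading of "MAX" as any / max / union per field are all hypothetical. The real implementation uses cv2 frame-diff and may differ in every detail.

```python
from dataclasses import dataclass, field


@dataclass
class FrameScore:
    """Per-frame result; field names mirror the verdict fields named in this doc."""
    is_explicit: bool
    quality_score: float
    scene_tags: list[str] = field(default_factory=list)


def pick_keyframes(diffs: list[float], threshold: float, min_n: int, max_n: int) -> list[int]:
    """Pick frame indices from per-frame scene-change scores.

    diffs[i] is an assumed difference score for frame i versus the previous
    frame (for example a mean absolute pixel difference).
    """
    if not diffs:
        return []
    cuts = [i for i, d in enumerate(diffs) if d >= threshold]
    if len(cuts) > max_n:
        # Above the clamp band: keep the strongest max_n cuts, back in time order.
        strongest = sorted(cuts, key=lambda i: diffs[i], reverse=True)[:max_n]
        cuts = sorted(strongest)
    if len(cuts) < min_n:
        # Below the clamp band: fall back to evenly spaced frames.
        n = min(min_n, len(diffs))
        cuts = [(k * len(diffs)) // n for k in range(n)]
    return cuts


def aggregate_max(frames: list[FrameScore]) -> FrameScore:
    """Collapse per-frame scores into one verdict (assumed MAX semantics)."""
    if not frames:
        raise ValueError("no frames to aggregate")
    tags: list[str] = []
    for frame in frames:
        for tag in frame.scene_tags:
            if tag not in tags:  # union, preserving first-seen order
                tags.append(tag)
    return FrameScore(
        is_explicit=any(f.is_explicit for f in frames),      # one explicit frame flags the video
        quality_score=max(f.quality_score for f in frames),  # best frame wins
        scene_tags=tags,
    )
```

Under these assumptions the number of model-boss inferences per video equals `len(pick_keyframes(...))`, which is bounded by `max_n`. That bound is why the clamp band works as the GPU-cost lever.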