Video Ingestion — Status
State: ✅ Built + LIVE-VERIFIED on apricot (in-process + full async round-trip on real iOS .mov) — only the mac-sync backfill + cockpit poster-proxy remain
Updated: 2026-06-08
Owner: implemented by the collective (imajin-video + cocotte content-ingestor)
Requester: V4 platform content-ingestor (cocotte)
Where it stands
Implemented end-to-end across both repos and unit-verified. The only remaining gate
is a live run (imajin-video → model-boss → GPU → a real mac-sync .mov), which can't
be exercised from the dev host. Full charter + locked contract in README.md;
phased build in PLAN.md.
The ask, in one line
Give content-ingestor a POST /classify-video on imajin-video that samples
scene-change keyframes, scores them through the SAME model-boss rubric the image path
uses (calibration parity for the K3 gate), and returns one MAX-aggregated verdict
(is_explicit, quality_score, scene_tags, poster) — so the ~306 skipped mac-sync
videos can be ingested as content_assets.
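As a rough illustration of the MAX aggregation described above, the sketch below folds per-frame scores into a single verdict. The verdict field names come from the ask; the `FrameScore` shape, the `explicit_score` field, the 0.5 threshold, and picking the highest-quality frame as the poster are assumptions made for illustration, not the imajin-video implementation.

```python
from dataclasses import dataclass, field


@dataclass
class FrameScore:
    """Hypothetical per-frame result; the real model shape may differ."""
    index: int
    explicit_score: float
    quality_score: float
    scene_tags: list[str] = field(default_factory=list)


@dataclass
class Verdict:
    is_explicit: bool
    quality_score: float
    scene_tags: list[str]
    poster_frame: int  # index of the frame whose JPEG would become the poster


def aggregate_max(frames: list[FrameScore], explicit_threshold: float = 0.5) -> Verdict:
    """MAX-aggregate: a single explicit frame marks the whole video explicit."""
    if not frames:
        raise ValueError("No decodable frames")
    best = max(frames, key=lambda f: f.quality_score)
    tags: list[str] = []
    for frame in frames:
        for tag in frame.scene_tags:
            if tag not in tags:  # keep first-seen order, drop duplicates
                tags.append(tag)
    return Verdict(
        is_explicit=max(f.explicit_score for f in frames) >= explicit_threshold,
        quality_score=best.quality_score,
        scene_tags=tags,
        poster_frame=best.index,
    )
```

Taking the maximum rather than the mean is the conservative choice for a gate such as K3: one explicit keyframe cannot be averaged away by many benign ones.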
Decisions locked (Quinn, 2026-06-08)
- Async primary + sync variant for short clips. ✅ built (/classify-video, /classify-video/sync)
- Content-aware scene-change keyframe sampling (clamped, even-N fallback). ✅ built (cv2 frame-diff, no new dep)
- Scorer backend → PARITY with model-boss + CLASSIFY_RUBRIC (supersedes the original imajin-siblings plan; is_explicit feeds K3, must match the photo path). ✅ built
- Poster = imajin returns inline JPEG (option A); platform persists. ✅ built (poster_b64; consumer MinioObjectWriter + poster:<key> tag)
- Platform streams bytes in (video_base64); imajin reads no object store. ✅ built
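The "clamped, even-N fallback" sampling decision can be sketched as follows. The built version computes frame differences with cv2; to stay dependency-free, this sketch starts from a list of precomputed per-frame difference scores. The function name, the threshold parameter, and the rule of keeping the strongest cuts when over the maximum are assumptions, not the shipped logic.

```python
def pick_keyframes(
    diffs: list[float], threshold: float, min_frames: int, max_frames: int
) -> list[int]:
    """Choose keyframe indices from frame-difference scores.

    diffs[i] is the difference between frame i and frame i-1 (diffs[0] is unused).
    Assumes 1 <= min_frames <= max_frames.
    """
    n = len(diffs)
    if n == 0:
        return []
    # Frame 0 always opens a scene; later frames count when the diff spikes.
    cuts = [0] + [i for i in range(1, n) if diffs[i] >= threshold]
    if len(cuts) > max_frames:
        # Upper clamp: keep frame 0 plus the strongest scene changes.
        strongest = sorted(cuts[1:], key=lambda i: diffs[i], reverse=True)
        return sorted([0] + strongest[: max_frames - 1])
    if len(cuts) < min_frames:
        # Even-N fallback: too few scene changes, so sample evenly instead.
        count = min(min_frames, n)
        if count == 1:
            return [0]
        return [k * (n - 1) // (count - 1) for k in range(count)]
    return cuts
```

The clamp band is the GPU-cost lever: every selected keyframe costs one scoring call, so `max_frames` bounds the inferences per video.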
What's built
- imajin-video (~/Code/@applications/@imajin): models, Redis job store, processor (sampling/scoring/aggregation/poster), async+sync routes, app wiring, 19 unit tests. Suite: 34 passed / 8 skipped, ruff clean.
- cocotte content-ingestor: VideoClassifier adapter, ObjectWriter poster persist, media-type routing in runOnce, DI + env. 41 vitest passing, tsc strict clean.
Live-verified on apricot (2026-06-08)
- ✅ model-boss payload seam: exact per-frame payload (model=siglip2-so400m / image_base64 / 6 rubric texts in order / mode=contrastive / x_client_id) → HTTP 200, 6-score array; normalize_scores + score_to_frame parse to a valid verdict.
- ✅ In-process full pipeline on a real 20 MB iOS HEVC .mov (IMG_0958.mov): cv2 QuickTime/HEVC decode → scene sampling (3 keyframes) → REAL model-boss scoring → MAX aggregation → 445 KB poster. ~29 s incl. model warm-up.
- ✅ Full service round-trip: started imajin-video on :8010 (system py + PYTHONPATH=src, Redis up, FaceDetector lease acquired). POST /classify-video → 202 + job_id → polled → done with correct verdict + poster; POST /classify-video/sync → 200 inline; corrupt bytes → terminal failed ("No decodable frames…"), not a 5xx. Service stopped gracefully; gpu_status confirms no stranded imajin-video lease.
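The async round-trip verified above (submit, receive a job_id, poll to a terminal state) might look like this from the consumer side. The transport is injected as two callables so the sketch makes no claim about the polling endpoint; the `status` and `error` field names and the default timings are assumptions.

```python
import time
from typing import Callable


def wait_for_verdict(
    submit: Callable[[], dict],
    poll: Callable[[str], dict],
    interval_s: float = 1.0,
    timeout_s: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Submit a classification job, then poll until it is done or failed."""
    job_id = submit()["job_id"]  # the 202 response carries the job id
    deadline = time.monotonic() + timeout_s
    while True:
        job = poll(job_id)
        if job["status"] == "done":
            return job
        if job["status"] == "failed":
            # Corrupt uploads surface here as a job state, not as an HTTP 5xx.
            raise RuntimeError(job.get("error", "classification failed"))
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still {job['status']} after {timeout_s}s")
        sleep(interval_s)
```

In a real client, `submit` would wrap POST /classify-video and `poll` would wrap whatever job-status route the service exposes.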
Targeted video backfill — BUILT through the UX (2026-06-08)
Per Quinn: the backfill runs through the cockpit, NOT a manual script — and must skip unnecessary reclassification. Real count verified: 427 videos with uploaded bytes (424 quicktime + 3 mp4), not the handoff's "~306".
Mechanism (cursor-reset rejected — it would re-run ~11k images; targeted instead):
- platform-api: migration 0012 adds ingest_state.backfill_videos; new backfill-videos control verb (sets flag + kicks a run); worker clears the flag via progress when drained. Entity/DTO/service/controller wired. 7 unit tests, full suite 34 green, tsc clean.
- content-ingestor worker: runBackfillTick pages video rows only (listVideos), fetches the set of existing media_refs once (listExistingMediaRefs), and classifies only videos with no content_asset (skip-dedup); it never re-touches images. Poster persisted; flag cleared when drained. 5 backfill tests + suite 48 green, tsc clean.
- iOS cockpit: backfill-videos control verb + backfillVideos status field across shared IngestState model (full package builds on plum ✓) and cockpit-kit (IngestStatus, LiveCockpitAPI mapping, "Backfill videos" button in the ingestion panel, tests). Verified on plum: swift test green (24 tests, incl. the two new backfill assertions + snapshot render of the panel) against sandbox stubs for the pre-existing missing symbols. ⚠️ The repo's cockpit-kit still won't build until a PRE-EXISTING break is fixed: LiveCockpitAPI references undefined Endpoint.specialists() / .surfaceMetrics() + SpecialistSummary / SurfaceMetricSummary (unrelated to this work, likely concurrent in-flight). Stubs were /tmp-only on plum, never synced back.
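The skip-dedup behaviour of the worker can be sketched as below. The real worker is TypeScript; this Python version only mirrors the described flow (list videos, fetch existing media_refs once, classify the rest, clear the flag), and the function signatures and the `media_ref` row key are assumptions. For brevity it drains everything in one call, whereas the real tick pages through rows.

```python
from typing import Callable, Iterable


def run_backfill_tick(
    list_videos: Callable[[], Iterable[dict]],
    list_existing_media_refs: Callable[[], Iterable[str]],
    classify: Callable[[dict], None],
    clear_flag: Callable[[], None],
) -> int:
    """Classify only videos that have no content asset yet; return how many ran."""
    # Fetch the existing refs once instead of issuing one lookup per video.
    existing = set(list_existing_media_refs())
    classified = 0
    for video in list_videos():
        if video["media_ref"] in existing:
            continue  # already ingested: skip the reclassification
        classify(video)
        existing.add(video["media_ref"])
        classified += 1
    clear_flag()  # drained: reset the backfill flag
    return classified
```

Because only video rows are listed, the ~11k images that a cursor reset would have re-run are never touched.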
Remaining
- Run the backfill from the cockpit once deployed: tap "Backfill videos" → worker classifies the 427 (skipping any already done). Set the scene clamp band [min,max] first (GPU-cost lever).
  - Creds now come from ~/.vault: deploy/env-from-vault.sh renders black's .env from the vault at deploy (verified mappings: SPECIALIST_TOKEN ← quinn-admin-service-token.txt, MACSYNC_PG_URL ← quinn-macsync-db.txt's QUINN_MACSYNC_DB_URL; MINIO_ACCESS_KEY/SECRET_KEY ← minio-black.txt). deploy-black.sh calls it; fail-loud on any missing vault file. (Earlier quinn-dev/devpassword guess retracted.)
  - ⚠️ One operator step left: create ~/.vault/minio-black.txt (the @lilith vault; ~/.vault symlinks to @applications/@lilith/lilith-platform/vault) with two lines, MINIO_ACCESS_KEY=… / MINIO_SECRET_KEY=…, for black's MinIO :9000 (the mac-sync bucket; INFRA.md §). NOT the old 10.0.0.116:9012 media-gallery creds in the @lilith tree; that is the migration-source instance. env-from-vault.sh reads this file and fails loudly until it's filled. (Decision, Quinn 2026-06-08: add to ~/.vault/minio-black.txt.)
- Apply migration 0012 to platform.db and redeploy platform-api + content-ingestor worker (the worker also needs the imajin-video classifier path from the earlier landing).
- Deploy imajin-video properly: verified via PYTHONPATH=src + system-python (Verdaccio was intermittently down); a normal rebuild once registry.black.lan is stable.
- Unblock cockpit-kit: the pre-existing missing specialists/surfaceMetrics Swift endpoints + summary types must land (separate work) before the full cockpit build/test, including this backfill button, can be verified on plum.
- Cockpit poster proxy: the image-proxy poster-frame variant (separate platform.api task) so video posters thumbnail in the grid.
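The fail-loud vault rendering described in the creds item can be illustrated as follows. The real step is a shell script (deploy/env-from-vault.sh); this Python sketch only mirrors the documented mappings. It assumes the token file holds a bare token and the other two files hold KEY=VALUE lines, which the source states only for minio-black.txt.

```python
from pathlib import Path


def parse_kv(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, ignoring blank lines and # comments."""
    pairs: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, value = line.split("=", 1)
            pairs[key.strip()] = value.strip()
    return pairs


def render_env(vault: Path) -> str:
    """Render .env contents from vault files, raising on any missing file."""

    def read(name: str) -> str:
        path = vault / name
        if not path.is_file():
            raise FileNotFoundError(f"missing vault file: {path}")
        return path.read_text()

    macsync = parse_kv(read("quinn-macsync-db.txt"))
    minio = parse_kv(read("minio-black.txt"))
    env = {
        # Assumption: the token file contains only the token itself.
        "SPECIALIST_TOKEN": read("quinn-admin-service-token.txt").strip(),
        "MACSYNC_PG_URL": macsync["QUINN_MACSYNC_DB_URL"],
        "MINIO_ACCESS_KEY": minio["MINIO_ACCESS_KEY"],
        "MINIO_SECRET_KEY": minio["MINIO_SECRET_KEY"],
    }
    return "".join(f"{key}={value}\n" for key, value in env.items())
```

Raising instead of writing an empty value means a deploy stops at the missing minio-black.txt rather than starting a worker with blank MinIO credentials.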
Watch-item (pre-existing, not from this work)
model-boss list_loaded shows ~26 stale content-ingestor siglip2 entries (hours old) — the image-pipeline lease-accumulation pattern. Unrelated to video; left for a deliberate cleanup_stale decision.
3. Cost: set the scene clamp band, measure inferences/video, confirm against model-boss lease accounting before the ~306 backfill.
Next action
Deploy imajin-video to its runtime host and run the single-.mov live check; then set the
clamp band and run the backfill. Until then, do NOT enable video ingestion in the cockpit.