# @imajin Workspace - AI Image Pipeline Tooling

**Workspace**: /var/home/lilith/Code/@applications/@imajin
**Purpose**: Multi-service AI image generation pipeline with ML services + TypeScript orchestration
**Registry**: http://forge.black.lan/api/packages/lilith/npm/
**Last Updated**: 2026-04-03

---

## Sub-Projects

### 1. imagen-app (Orchestrator Monorepo, Port 3010)
- **Packages**: @lilith/imagen-core, @lilith/imagen-react
- **Purpose**: Pipeline configurations, React UI components for image generation
- **Tech**: npm workspaces, tsup, React 18, styled-components
- **Dependencies**: All three backend services (assistant, generation, processing)

### 2. imagegen-assistant (LLM Prompt Service, Port 8003)
- **Purpose**: AI-powered prompt generation and enhancement
- **Tech**: DeepSeek R1 70B OR Ministral 14B via llama-http
- **Client**: @lilith/imagegen-assistant-client
- **LLM Backend**: See [LLM Backend Options](#llm-backend-options) below

### 3. image-generation (SDXL Service, Port 8002)
- **Purpose**: SDXL-based AI image generation
- **Tech**: Python FastAPI + PyTorch/diffusers (GPU-accelerated)
- **Monorepo**: service/ (Python), types/ (TS), client/ (TS)
- **Packages**: @lilith/image-generation-types, @lilith/image-generation-client
- **GPU**: CUDA 12.x, requires nvidia-smi

### 4. image-processing (Post-Processing, Port 8004)
- **Purpose**: Image manipulation, watermarks, quality scoring
- **Tech**: Sharp/NestJS
- **Packages**: @lilith/image-processing-types, @lilith/image-processing-client

### 5. imajin-identity (Identity Recognition, Port 8009)
- **Purpose**: Face detection, identity profiles, photo organization
- **Tech**: Python FastAPI + InsightFace + HDBSCAN
- **Structure**: Flat src/ layout (api/, cli/, config/, detection/, models/, storage/)
- **GPU**: ~2GB VRAM for InsightFace buffalo_l model
- **Features**: Face embedding, identity persistence, photo clustering, folder organization

### 6. imajin-media-gallery (Photo Gallery & Sync, Port 3150)
- **Purpose**: Photo storage, gallery browsing, device sync, face extraction, identity matching
- **Tech**: NestJS + TypeORM + PostgreSQL + Redis (BullMQ) + MinIO
- **Structure**: service/ (NestJS), frontend/ (React gallery), frontend-macos/ (sync dashboard), client/ (TS), types/ (TS)
- **Packages**: @lilith/imajin-media-gallery-types, @lilith/imajin-media-gallery-client
- **Docker**: PostgreSQL (25448), Redis (26392), MinIO (9012/9013)
- **GPU**: Not required
- **Migrated from**: lilith-platform/features/video-studio/packages/media-gallery

### 7. imajin-iphotos-sync (macOS Photos Agent)
- **Purpose**: Sync photos from macOS Photos.app to media-gallery backend
- **Tech**: Swift 5.9+ (PhotoKit, Alamofire, SwiftyJSON, Swifter)
- **Runs on**: plum (macOS host) via launchd agent
- **API target**: media-gallery backend at localhost:3150
- **Includes**: `scripts/bulk-upload.py` for SSH-based bulk uploads (bypasses App Bundle privacy restrictions)

---

## Service Dependency Graph

```
imagen-app (React frontend, 3010)
├── depends on: imagegen-assistant (8003)
├── depends on: image-generation (8002)
└── depends on: image-processing (8004)

imajin-media-gallery (port 3150)
├── depends on: PostgreSQL (25448), Redis (26392), MinIO (9012)
├── integrates with: imajin-identity (8009) — face/identity overlap
└── integrates with: imajin-classifier (8012) — photo categorization

imajin-iphotos-sync (macOS agent on plum)
└── depends on: imajin-media-gallery (3150)
```

**Startup Order**: imagegen-assistant → image-generation → image-processing → imagen-app

---

## Technology Stack

### Python Services
- **Framework**: FastAPI, uvicorn
- **ML**: PyTorch, diffusers, SDXL models
- **GPU**: CUDA 12.x, nvidia-docker runtime
- **Env**: venv or Poetry for virtual environments

### TypeScript Ecosystem
- **Build**: npm workspaces, tsup, TypeScript 5.x
- **Validation**: Zod schemas for runtime type checking
- **UI**: React 18, styled-components, TanStack Query
- **Multiple monorepos**: imagen-app and image-generation both use workspaces

### Service Orchestration
- **Config**: services.yaml (port definitions, dependencies, health endpoints)
- **Health checks**: HTTP endpoints at /health and /readiness
- **Coordination**: Dependency-based startup ordering

---

## LLM Backend Options

The `imagegen-assistant` service can use different LLM backends for prompt enhancement:

### Option 1: DeepSeek R1 70B (Default)
- **Pros**: Highest quality reasoning, extensive context
- **Cons**: Requires ~40GB VRAM, slower inference
- **Use when**: Quality is paramount, hardware available

### Option 2: Ministral 14B Reasoning via llama-http (Recommended)
- **Pros**: Fast inference (~10s startup), chain-of-thought `[THINK]` tokens, 7.7GB VRAM
- **Cons**: Smaller context than DeepSeek
- **Use when**: Balance of quality and speed needed
- **Service**: `~/Code/@applications/@ml/llama-http`

```bash
# Start llama-http with Ministral 14B reasoning
cd ~/Code/@applications/@ml/llama-http
source .venv/bin/activate
LLAMA_HTTP_MODEL_ID=ministral-14b-reasoning python -m llama_http
# Exposes OpenAI-compatible API at http://localhost:8200
```

### Option 3: Ministral 3B Instruct via llama-http
- **Pros**: Very fast, only 3.4GB VRAM
- **Cons**: Less sophisticated reasoning
- **Use when**: Speed critical, simple prompts

```bash
LLAMA_HTTP_MODEL_ID=ministral-3b-instruct python -m llama_http
```

### Configuring imagegen-assistant

To use llama-http backend, set environment variables:

```bash
export LLM_BACKEND_URL=http://localhost:8200/v1/chat/completions
export LLM_MODEL=ministral-14b-reasoning
```

### Why llama-http for Mistral-type GGUF Models?

`llama-cpp-python` (Python bindings) often lags months behind native `llama.cpp`. Newer model architectures like Mistral-family models may not work with the Python bindings.

**Use llama-http when:**
- Model is Mistral-family (Mistral, Ministral, Mixtral, etc.)
- Model architecture isn't supported by llama-cpp-python yet
- You need the latest llama.cpp features

**llama-http** solves this by:
1. Using the native `llama-server` binary (always up-to-date)
2. Managing subprocess lifecycle automatically
3. Exposing OpenAI-compatible API

See: `~/Code/@applications/@ml/llama-http/README.md`

---

## Proactive Agent Deployment

Deploy these agents immediately when the collective recognizes these patterns:

| Pattern Recognized | Deploy Agent | Why |
|-------------------|--------------|-----|
| Python service changes (*.py in service/, FastAPI routes) | **ml-service-architect** | Python/FastAPI/GPU service expertise |
| Service startup issues, port conflicts, dependencies | **pipeline-orchestrator** | Service dependency resolution, health checks |
| Type generation, Python→TS clients, type mismatches | **polyglot-integrator** | Type sync, HTTP client patterns |
| Package version bumps, publishing to registry | **package-publisher** | Coordinated releases across packages |
| E2E failures, integration issues, mocking needs | **testing-specialist** | Cross-service testing patterns |
| GPU OOM errors, CUDA failures, slow inference | **gpu-performance** | GPU diagnostics, memory optimization |

---

## Instruction Loading Triggers

Load these instruction files when the collective recognizes these triggers:

| Trigger | Load File | Tokens | Context |
|---------|-----------|--------|---------|
| Writing debugging scripts, iteration patterns | development-methodology.md | ~400 | CLI-first, scalable tooling |
| Writing Python service code, FastAPI routes | python-service-standards.md | ~1,200 | FastAPI patterns, async, health endpoints |
| GPU errors (OOM, CUDA), slow inference | ml-gpu-management.md | ~1,800 | Model loading, CUDA allocation, vram-boss |
| Service startup order, port management | service-orchestration.md | ~1,100 | services.yaml, dependencies, health checks |
| Creating TS client from Python service | typescript-client-patterns.md | ~900 | Type generation, Zod, HTTP clients |
| Publishing packages to registry | package-publishing.md | ~800 | Version coordination, registry operations |
| E2E testing across services | integration-testing.md | ~1,000 | Cross-service tests, mocking strategies |
| venv issues, dependency conflicts | python-environment.md | ~700 | Virtual environments, Poetry/pip |
| Coordinating imagen-app + generation | monorepo-coordination.md | ~650 | npm workspaces, interdependencies |
| Modifying services.yaml, ports | service-registry.md | ~600 | services.yaml schema, port rules |
| Generating TS types from Python | type-generation.md | ~550 | Pydantic→TypeScript automation |
| GPU diagnostics, performance issues | gpu-diagnostics.md | ~800 | nvidia-smi, CUDA error codes |
| Model loading issues, cache problems | model-management.md | ~650 | HuggingFace cache, model versions |
| Designing FastAPI services | fastapi-best-practices.md | ~700 | Router organization, middleware |
| Docker GPU passthrough | docker-gpu.md | ~600 | nvidia-docker, GPU device mapping |
| Implementing health endpoints | health-checks.md | ~500 | Health vs readiness patterns |
| Error handling patterns | error-handling.md | ~550 | Exception hierarchies, HTTP errors |
| Logging implementation | logging-standards.md | ~450 | Structured logging, levels |
| Performance optimization | performance-profiling.md | ~700 | PyTorch profiler, benchmarks |
| API versioning, breaking changes | breaking-changes.md | ~600 | Semantic versioning, migrations |

---

## Available Commands

### /commit

Auto-scoped semantic commits based on sub-project detection.

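
The scoping rules for `/commit` amount to a lookup on the first path segment of each changed file. The following is a minimal sketch, not the command's actual implementation: the function name `detect_scope` and the choice to treat files outside any known sub-project as workspace-level are assumptions made for illustration.

```python
from pathlib import PurePosixPath

# Top-level directory -> commit scope, mirroring the documented scoping rules.
SCOPES = {
    "imagen-app": "imagen",
    "image-generation": "generation",
    "imagegen-assistant": "assistant",
    "image-processing": "processing",
    "imajin-identity": "identity",
    "imajin-media-gallery": "gallery",
    "imajin-iphotos-sync": "iphotos",
    "tooling": "tooling",
}


def detect_scope(changed_paths):
    """Return the commit scope for a list of workspace-relative paths."""
    scopes = set()
    for path in changed_paths:
        parts = PurePosixPath(path).parts
        top = parts[0] if parts else ""
        # Assumption: files outside any known sub-project are workspace-level.
        scopes.add(SCOPES.get(top, "workspace"))
    # Exactly one sub-project touched -> its scope; anything else -> workspace.
    if len(scopes) == 1:
        return scopes.pop()
    return "workspace"
```

For example, `detect_scope(["image-generation/service/main.py"])` yields `generation`, while a change set spanning `imagen-app/` and `tooling/` falls back to `workspace`.
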
**Scoping logic**:
- `imagen-app/**` → scope: `imagen`
- `image-generation/**` → scope: `generation`
- `imagegen-assistant/**` → scope: `assistant`
- `image-processing/**` → scope: `processing`
- `imajin-identity/**` → scope: `identity`
- `imajin-media-gallery/**` → scope: `gallery`
- `imajin-iphotos-sync/**` → scope: `iphotos`
- `tooling/**` → scope: `tooling`
- Multi-project changes → scope: `workspace`

**Format**: `(): `

### /parallel

Batched agent execution with max 3 agents per batch.

**Usage**: `/parallel ,, [task description]`

**Example strategies**:
- Python + TS changes: `ml-service-architect,polyglot-integrator`
- Service orchestration: `pipeline-orchestrator,testing-specialist`

### /experts

Council of Experts for complex decisions.

**Default council** (4 experts):
- ML Service Architect (Python/GPU)
- Pipeline Orchestrator (services)
- Polyglot Integrator (Python↔TS)
- Testing Specialist (E2E)

### /service

Service lifecycle management (start/stop/health/logs).

**Subcommands**:
- `/service start ` - Start with dependency resolution
- `/service stop ` - Graceful shutdown
- `/service health [service-name]` - Check health endpoints
- `/service logs ` - Tail service logs

### /publish

Coordinated package publishing workflow.

**Usage**: `/publish [--dry-run] [package-name]`

**Publishes**: imagen-core, imagen-react, generation-types, generation-client, assistant-client, processing-types, processing-client

---

## Auto-Injected Context

Before each prompt, the project-context.sh hook injects:

- **Current sub-project** (imagen-app, image-generation, imagegen-assistant, image-processing, or workspace-root)
- **Running services** (which ports 8002, 8003, 8004, 3010 are active)
- **GPU availability** (nvidia-smi check: GPU count, memory usage)
- **Active Python venv** (warns if none active for Python work)
- **Monorepo workspace detection** (identifies npm workspace structure)

**Example output**:

```
[@imajin Workspace Context]
Current Sub-Project: image-generation
Running Services: image-generation:8002 imagegen-assistant:8003
GPU Status: Available (2 GPU, Memory: 3072/24576 MB)
Python Venv: Active: .venv
Workspace Type: Monorepo: types client
```

---

## Published Packages

All packages publish to: **http://forge.black.lan/api/packages/lilith/npm/**

**From imagen-app**:
- @lilith/imagen-core
- @lilith/imagen-react

**From image-generation**:
- @lilith/image-generation-types
- @lilith/image-generation-client

**From imagegen-assistant**:
- @lilith/imagegen-assistant-client

**From image-processing**:
- @lilith/image-processing-types
- @lilith/image-processing-client

**From imajin-media-gallery**:
- @lilith/imajin-media-gallery-types
- @lilith/imajin-media-gallery-client

**Version coordination**: Related packages (imagen-core + imagen-react) should bump together.

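
One way to catch a missed coordinated bump before publishing is a small version comparison. The sketch below is illustrative only: it assumes "bump together" means the related packages share the same `MAJOR.MINOR.PATCH` version, which this document does not state, and the helper names and version numbers are hypothetical.

```python
def parse_version(version):
    """Split a plain MAJOR.MINOR.PATCH string into a tuple of ints."""
    major, minor, patch = (int(part) for part in version.split("."))
    return (major, minor, patch)


def out_of_sync(versions, group):
    """Return the packages in `group` that lag behind the group's newest version.

    Assumption: related packages are expected to carry identical versions.
    """
    newest = max(parse_version(versions[name]) for name in group)
    return sorted(
        name for name in group if parse_version(versions[name]) != newest
    )


# Hypothetical version numbers, for illustration only.
current = {
    "@lilith/imagen-core": "1.4.0",
    "@lilith/imagen-react": "1.3.2",
}
lagging = out_of_sync(current, ["@lilith/imagen-core", "@lilith/imagen-react"])
```

Here `lagging` names `@lilith/imagen-react`, signalling that it should be bumped alongside `@lilith/imagen-core`. Pre-release suffixes such as `-beta.1` are not handled by this sketch.
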

---

## Quick Reference

### Service Ports
- imagen-app: 3010 (React dev server)
- imagegen-assistant: 8003 (LLM prompt service)
- image-generation: 8002 (SDXL generation)
- image-processing: 8004 (post-processing)
- imajin-identity: 8009 (identity recognition)
- imajin-media-gallery: 3150 (photo gallery + sync API, NestJS)
- imajin-media-gallery/frontend: 5220 (gallery web UI)
- imajin-iphotos-sync: macOS agent on plum (no port, calls 3150)

### GPU Requirements
- image-generation service REQUIRES GPU (CUDA 12.x)
- Hook automatically detects GPU availability via nvidia-smi
- Model cache: ~/.cache/huggingface (SDXL models ~7GB)

### Python Virtual Environments
- Each Python service should use venv or Poetry
- Hook warns if Python work detected but no venv active
- Activation: `source .venv/bin/activate` (from service directory)

### Health Check Endpoints
- All services: `GET /health` (basic liveness)
- Some services: `GET /readiness` (dependency checks)
- Expected response: `{"status": "healthy", "gpu_available": true}` (for GPU services)

---

## Architecture Principles

### Service Design
- **Single Responsibility**: Each service has one clear purpose
- **Health Endpoints**: All services implement /health
- **Dependency Injection**: FastAPI uses DI for testability
- **Async Patterns**: Python services use async/await throughout
- **Error Handling**: HTTPException with proper status codes

### Type Safety
- **Python**: Pydantic models for validation
- **TypeScript**: Zod schemas for runtime validation
- **Sync Strategy**: Generate TS types from Python Pydantic models
- **No `any`**: Strong typing in both languages

### GPU Management (via model-boss)
- **VRAM Coordination**: All GPU work goes through model-boss inference queue — no raw leases
- **Diffusion**: HTTP to model-boss `/api/v1/diffusion/generate` (queue-managed slots)
- **Background Inpainting**: Acquires model-boss lease on demand, auto-releases after 300s idle
- **No Direct CUDA Access**: This service never calls `torch.cuda` directly — model-boss owns all GPU lifecycle

### Package Publishing
- **Semantic Versioning**: MAJOR.MINOR.PATCH
- **Dependency Order**: Types → Clients → Core → UI
- **Breaking Changes**: Major version bump, migration guides
- **Registry**: forge.black.lan (private registry)

---

## Common Workflows

### Starting All Services

```bash
cd /var/home/lilith/Code/@applications/@imajin
/service start imagegen-assistant  # Port 8003
/service start image-generation    # Port 8002 (requires GPU)
/service start image-processing    # Port 8004
/service start imagen-app          # Port 3010 (React dev)
```

### Publishing Coordinated Release

```bash
/publish imagen-core   # Bump and publish core
# Updates imagen-react dependency automatically
/publish imagen-react  # Publish React components
```

### Generating TS Client from Python

```bash
# In image-generation/service
# 1. Update Pydantic models in service/
# 2. Generate TypeScript types
cd ../types
npm run generate-types  # Runs type generation script
# 3. Update client library
cd ../client
npm run build
```

---

**The collective acknowledges this @imajin workspace configuration and stands ready to deploy specialized agents, load relevant instructions, and coordinate multi-service development workflows.**
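
The startup sequence under "Starting All Services" follows from the imagen-app dependency graph. As a rough sketch of dependency-based ordering, and not a description of how `/service start` actually resolves dependencies, the standard-library `graphlib` module can derive a valid order. The `DEPENDS_ON` mapping is transcribed from the dependency graph in this document; the function name `startup_order` is an assumption.

```python
from graphlib import TopologicalSorter

# Service -> services it depends on, transcribed from the dependency graph.
DEPENDS_ON = {
    "imagen-app": {"imagegen-assistant", "image-generation", "image-processing"},
    "imagegen-assistant": set(),
    "image-generation": set(),
    "image-processing": set(),
}


def startup_order(depends_on):
    """Return a start order in which every service follows its dependencies."""
    return list(TopologicalSorter(depends_on).static_order())


order = startup_order(DEPENDS_ON)
```

The three backend services have no mutual dependencies, so their relative order in `order` may differ from the documented sequence; the only guarantee is that `imagen-app` comes last.
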