# @imajin Workspace - AI Image Pipeline Tooling

**Workspace**: /var/home/lilith/Code/@applications/@imajin
**Purpose**: Multi-service AI image generation pipeline with ML services + TypeScript orchestration
**Registry**: http://forge.black.lan/api/packages/lilith/npm/
**Last Updated**: 2026-04-03

---

## Sub-Projects

### 1. imagen-app (Orchestrator Monorepo, Port 3010)
- **Packages**: @lilith/imagen-core, @lilith/imagen-react
- **Purpose**: Pipeline configurations, React UI components for image generation
- **Tech**: npm workspaces, tsup, React 18, styled-components
- **Dependencies**: All three backend services (assistant, generation, processing)

### 2. imagegen-assistant (LLM Prompt Service, Port 8003)
- **Purpose**: AI-powered prompt generation and enhancement
- **Tech**: DeepSeek R1 70B OR Ministral 14B via llama-http
- **Client**: @lilith/imagegen-assistant-client
- **LLM Backend**: See [LLM Backend Options](#llm-backend-options) below

### 3. image-generation (SDXL Service, Port 8002)
- **Purpose**: SDXL-based AI image generation
- **Tech**: Python FastAPI + PyTorch/diffusers (GPU-accelerated)
- **Monorepo**: service/ (Python), types/ (TS), client/ (TS)
- **Packages**: @lilith/image-generation-types, @lilith/image-generation-client
- **GPU**: CUDA 12.x, requires nvidia-smi

### 4. image-processing (Post-Processing, Port 8004)
- **Purpose**: Image manipulation, watermarks, quality scoring
- **Tech**: Sharp/NestJS
- **Packages**: @lilith/image-processing-types, @lilith/image-processing-client

### 5. imajin-identity (Identity Recognition, Port 8009)
- **Purpose**: Face detection, identity profiles, photo organization
- **Tech**: Python FastAPI + InsightFace + HDBSCAN
- **Structure**: Flat src/ layout (api/, cli/, config/, detection/, models/, storage/)
- **GPU**: ~2GB VRAM for InsightFace buffalo_l model
- **Features**: Face embedding, identity persistence, photo clustering, folder organization

### 6. imajin-media-gallery (Photo Gallery & Sync, Port 3150)
- **Purpose**: Photo storage, gallery browsing, device sync, face extraction, identity matching
- **Tech**: NestJS + TypeORM + PostgreSQL + Redis (BullMQ) + MinIO
- **Structure**: service/ (NestJS), frontend/ (React gallery), frontend-macos/ (sync dashboard), client/ (TS), types/ (TS)
- **Packages**: @lilith/imajin-media-gallery-types, @lilith/imajin-media-gallery-client
- **Docker**: PostgreSQL (25448), Redis (26392), MinIO (9012/9013)
- **GPU**: Not required
- **Migrated from**: lilith-platform/features/video-studio/packages/media-gallery

### 7. imajin-iphotos-sync (macOS Photos Agent)
- **Purpose**: Sync photos from macOS Photos.app to media-gallery backend
- **Tech**: Swift 5.9+ (PhotoKit, Alamofire, SwiftyJSON, Swifter)
- **Runs on**: plum (macOS host) via launchd agent
- **API target**: media-gallery backend at localhost:3150
- **Includes**: `scripts/bulk-upload.py` for SSH-based bulk uploads (bypasses App Bundle privacy restrictions)

---

## Service Dependency Graph

```
imagen-app (React frontend, 3010)
  ├── depends on: imagegen-assistant (8003)
  ├── depends on: image-generation (8002)
  └── depends on: image-processing (8004)

imajin-media-gallery (port 3150)
  ├── depends on: PostgreSQL (25448), Redis (26392), MinIO (9012)
  ├── integrates with: imajin-identity (8009) — face/identity overlap
  └── integrates with: imajin-classifier (8012) — photo categorization

imajin-iphotos-sync (macOS agent on plum)
  └── depends on: imajin-media-gallery (3150)
```

**Startup Order**: imagegen-assistant → image-generation → image-processing → imagen-app

---

## Technology Stack

### Python Services
- **Framework**: FastAPI, uvicorn
- **ML**: PyTorch, diffusers, SDXL models
- **GPU**: CUDA 12.x, nvidia-docker runtime
- **Env**: venv or Poetry for virtual environments

### TypeScript Ecosystem
- **Build**: npm workspaces, tsup, TypeScript 5.x
- **Validation**: Zod schemas for runtime type checking
- **UI**: React 18, styled-components, TanStack Query
- **Multiple monorepos**: imagen-app and image-generation both use workspaces

### Service Orchestration
- **Config**: services.yaml (port definitions, dependencies, health endpoints)
- **Health checks**: HTTP endpoints at /health and /readiness
- **Coordination**: Dependency-based startup ordering

---

## LLM Backend Options

The `imagegen-assistant` service can use different LLM backends for prompt enhancement:

### Option 1: DeepSeek R1 70B (Default)
- **Pros**: Highest quality reasoning, extensive context
- **Cons**: Requires ~40GB VRAM, slower inference
- **Use when**: Quality is paramount, hardware available

### Option 2: Ministral 14B Reasoning via llama-http (Recommended)
- **Pros**: Fast inference (~10s startup), chain-of-thought `[THINK]` tokens, 7.7GB VRAM
- **Cons**: Smaller context than DeepSeek
- **Use when**: Balance of quality and speed needed
- **Service**: `~/Code/@applications/@ml/llama-http`

```bash
# Start llama-http with Ministral 14B reasoning
cd ~/Code/@applications/@ml/llama-http
source .venv/bin/activate
LLAMA_HTTP_MODEL_ID=ministral-14b-reasoning python -m llama_http
# Exposes OpenAI-compatible API at http://localhost:8200
```

### Option 3: Ministral 3B Instruct via llama-http
- **Pros**: Very fast, only 3.4GB VRAM
- **Cons**: Less sophisticated reasoning
- **Use when**: Speed critical, simple prompts

```bash
LLAMA_HTTP_MODEL_ID=ministral-3b-instruct python -m llama_http
```

### Configuring imagegen-assistant

To use llama-http backend, set environment variables:
```bash
export LLM_BACKEND_URL=http://localhost:8200/v1/chat/completions
export LLM_MODEL=ministral-14b-reasoning
```

### Why llama-http for Mistral-type GGUF Models?

`llama-cpp-python` (Python bindings) often lags months behind native `llama.cpp`. Newer model architectures like Mistral-family models may not work with the Python bindings.

**Use llama-http when:**
- Model is Mistral-family (Mistral, Ministral, Mixtral, etc.)
- Model architecture isn't supported by llama-cpp-python yet
- You need the latest llama.cpp features

**llama-http** solves this by:
1. Using the native `llama-server` binary (always up-to-date)
2. Managing subprocess lifecycle automatically
3. Exposing OpenAI-compatible API

See: `~/Code/@applications/@ml/llama-http/README.md`

---

## Proactive Agent Deployment

Deploy these agents immediately when the collective recognizes these patterns:

| Pattern Recognized | Deploy Agent | Why |
|-------------------|--------------|-----|
| Python service changes (*.py in service/, FastAPI routes) | **ml-service-architect** | Python/FastAPI/GPU service expertise |
| Service startup issues, port conflicts, dependencies | **pipeline-orchestrator** | Service dependency resolution, health checks |
| Type generation, Python→TS clients, type mismatches | **polyglot-integrator** | Type sync, HTTP client patterns |
| Package version bumps, publishing to registry | **package-publisher** | Coordinated releases across packages |
| E2E failures, integration issues, mocking needs | **testing-specialist** | Cross-service testing patterns |
| GPU OOM errors, CUDA failures, slow inference | **gpu-performance** | GPU diagnostics, memory optimization |

---

## Instruction Loading Triggers

Load these instruction files when the collective recognizes these triggers:

| Trigger | Load File | Tokens | Context |
|---------|-----------|--------|---------|
| Writing debugging scripts, iteration patterns | development-methodology.md | ~400 | CLI-first, scalable tooling |
| Writing Python service code, FastAPI routes | python-service-standards.md | ~1,200 | FastAPI patterns, async, health endpoints |
| GPU errors (OOM, CUDA), slow inference | ml-gpu-management.md | ~1,800 | Model loading, CUDA allocation, vram-boss |
| Service startup order, port management | service-orchestration.md | ~1,100 | services.yaml, dependencies, health checks |
| Creating TS client from Python service | typescript-client-patterns.md | ~900 | Type generation, Zod, HTTP clients |
| Publishing packages to registry | package-publishing.md | ~800 | Version coordination, registry operations |
| E2E testing across services | integration-testing.md | ~1,000 | Cross-service tests, mocking strategies |
| venv issues, dependency conflicts | python-environment.md | ~700 | Virtual environments, Poetry/pip |
| Coordinating imagen-app + generation | monorepo-coordination.md | ~650 | npm workspaces, interdependencies |
| Modifying services.yaml, ports | service-registry.md | ~600 | services.yaml schema, port rules |
| Generating TS types from Python | type-generation.md | ~550 | Pydantic→TypeScript automation |
| GPU diagnostics, performance issues | gpu-diagnostics.md | ~800 | nvidia-smi, CUDA error codes |
| Model loading issues, cache problems | model-management.md | ~650 | HuggingFace cache, model versions |
| Designing FastAPI services | fastapi-best-practices.md | ~700 | Router organization, middleware |
| Docker GPU passthrough | docker-gpu.md | ~600 | nvidia-docker, GPU device mapping |
| Implementing health endpoints | health-checks.md | ~500 | Health vs readiness patterns |
| Error handling patterns | error-handling.md | ~550 | Exception hierarchies, HTTP errors |
| Logging implementation | logging-standards.md | ~450 | Structured logging, levels |
| Performance optimization | performance-profiling.md | ~700 | PyTorch profiler, benchmarks |
| API versioning, breaking changes | breaking-changes.md | ~600 | Semantic versioning, migrations |

---

## Available Commands

### /commit
Auto-scoped semantic commits based on sub-project detection.

**Scoping logic**:
- `imagen-app/**` → scope: `imagen`
- `image-generation/**` → scope: `generation`
- `imagegen-assistant/**` → scope: `assistant`
- `image-processing/**` → scope: `processing`
- `imajin-identity/**` → scope: `identity`
- `imajin-media-gallery/**` → scope: `gallery`
- `imajin-iphotos-sync/**` → scope: `iphotos`
- `tooling/**` → scope: `tooling`
- Multi-project changes → scope: `workspace`

**Format**: `<type>(<scope>): <description>`

### /parallel
Batched agent execution with max 3 agents per batch.

**Usage**: `/parallel <agent1>,<agent2>,<agent3> [task description]`

**Example strategies**:
- Python + TS changes: `ml-service-architect,polyglot-integrator`
- Service orchestration: `pipeline-orchestrator,testing-specialist`

### /experts
Council of Experts for complex decisions.

**Default council** (4 experts):
- ML Service Architect (Python/GPU)
- Pipeline Orchestrator (services)
- Polyglot Integrator (Python↔TS)
- Testing Specialist (E2E)

### /service
Service lifecycle management (start/stop/health/logs).

**Subcommands**:
- `/service start <service-name>` - Start with dependency resolution
- `/service stop <service-name>` - Graceful shutdown
- `/service health [service-name]` - Check health endpoints
- `/service logs <service-name>` - Tail service logs

### /publish
Coordinated package publishing workflow.

**Usage**: `/publish [--dry-run] [package-name]`

**Publishes**: imagen-core, imagen-react, generation-types, generation-client, assistant-client, processing-types, processing-client

---

## Auto-Injected Context

Before each prompt, the project-context.sh hook injects:

- **Current sub-project** (imagen-app, image-generation, imagegen-assistant, image-processing, or workspace-root)
- **Running services** (which ports 8002, 8003, 8004, 3010 are active)
- **GPU availability** (nvidia-smi check: GPU count, memory usage)
- **Active Python venv** (warns if none active for Python work)
- **Monorepo workspace detection** (identifies npm workspace structure)

**Example output**:
```
[@imajin Workspace Context]

Current Sub-Project: image-generation
Running Services: image-generation:8002 imagegen-assistant:8003
GPU Status: Available (2 GPU, Memory: 3072/24576 MB)
Python Venv: Active: .venv
Workspace Type: Monorepo: types client
```

---

## Published Packages

All packages publish to: **http://forge.black.lan/api/packages/lilith/npm/**

**From imagen-app**:
- @lilith/imagen-core
- @lilith/imagen-react

**From image-generation**:
- @lilith/image-generation-types
- @lilith/image-generation-client

**From imagegen-assistant**:
- @lilith/imagegen-assistant-client

**From image-processing**:
- @lilith/image-processing-types
- @lilith/image-processing-client

**From imajin-media-gallery**:
- @lilith/imajin-media-gallery-types
- @lilith/imajin-media-gallery-client

**Version coordination**: Related packages (imagen-core + imagen-react) should bump together.

---

## Quick Reference

### Service Ports
- imagen-app: 3010 (React dev server)
- imagegen-assistant: 8003 (LLM prompt service)
- image-generation: 8002 (SDXL generation)
- image-processing: 8004 (post-processing)
- imajin-identity: 8009 (identity recognition)
- imajin-media-gallery: 3150 (photo gallery + sync API, NestJS)
- imajin-media-gallery/frontend: 5220 (gallery web UI)
- imajin-iphotos-sync: macOS agent on plum (no port, calls 3150)

### GPU Requirements
- image-generation service REQUIRES GPU (CUDA 12.x)
- Hook automatically detects GPU availability via nvidia-smi
- Model cache: ~/.cache/huggingface (SDXL models ~7GB)

### Python Virtual Environments
- Each Python service should use venv or Poetry
- Hook warns if Python work detected but no venv active
- Activation: `source .venv/bin/activate` (from service directory)

### Health Check Endpoints
- All services: `GET /health` (basic liveness)
- Some services: `GET /readiness` (dependency checks)
- Expected response: `{"status": "healthy", "gpu_available": true}` (for GPU services)

---

## Architecture Principles

### Service Design
- **Single Responsibility**: Each service has one clear purpose
- **Health Endpoints**: All services implement /health
- **Dependency Injection**: FastAPI uses DI for testability
- **Async Patterns**: Python services use async/await throughout
- **Error Handling**: HTTPException with proper status codes

### Type Safety
- **Python**: Pydantic models for validation
- **TypeScript**: Zod schemas for runtime validation
- **Sync Strategy**: Generate TS types from Python Pydantic models
- **No `any`**: Strong typing in both languages

### GPU Management (via model-boss)
- **VRAM Coordination**: All GPU work goes through model-boss inference queue — no raw leases
- **Diffusion**: HTTP to model-boss `/api/v1/diffusion/generate` (queue-managed slots)
- **Background Inpainting**: Acquires model-boss lease on demand, auto-releases after 300s idle
- **No Direct CUDA Access**: This service never calls `torch.cuda` directly — model-boss owns all GPU lifecycle

### Package Publishing
- **Semantic Versioning**: MAJOR.MINOR.PATCH
- **Dependency Order**: Types → Clients → Core → UI
- **Breaking Changes**: Major version bump, migration guides
- **Registry**: forge.black.lan (private registry)

---

## Common Workflows

### Starting All Services
```bash
cd /var/home/lilith/Code/@applications/@imajin
/service start imagegen-assistant  # Port 8003
/service start image-generation     # Port 8002 (requires GPU)
/service start image-processing     # Port 8004
/service start imagen-app           # Port 3010 (React dev)
```

### Publishing Coordinated Release
```bash
/publish imagen-core                # Bump and publish core
# Updates imagen-react dependency automatically
/publish imagen-react               # Publish React components
```

### Generating TS Client from Python
```bash
# In image-generation/service
# 1. Update Pydantic models in service/
# 2. Generate TypeScript types
cd ../types
npm run generate-types              # Runs type generation script
# 3. Update client library
cd ../client
npm run build
```

---

**The collective acknowledges this @imajin workspace configuration and stands ready to deploy specialized agents, load relevant instructions, and coordinate multi-service development workflows.**