imajin/tooling/claude/CLAUDE.md
autocommit 6774ab5e5f docs(claude): 📝 Update publishing workflow and command reference documentation for Claude tooling
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-06-10 03:58:34 -07:00

16 KiB

@imajin Workspace - AI Image Pipeline Tooling

Workspace: /var/home/lilith/Code/@applications/@imajin Purpose: Multi-service AI image generation pipeline with ML services + TypeScript orchestration Registry: http://forge.black.lan/api/packages/lilith/npm/ Last Updated: 2026-04-03


Sub-Projects

1. imagen-app (Orchestrator Monorepo, Port 3010)

  • Packages: @lilith/imagen-core, @lilith/imagen-react
  • Purpose: Pipeline configurations, React UI components for image generation
  • Tech: npm workspaces, tsup, React 18, styled-components
  • Dependencies: All three backend services (assistant, generation, processing)

2. imagegen-assistant (LLM Prompt Service, Port 8003)

  • Purpose: AI-powered prompt generation and enhancement
  • Tech: DeepSeek R1 70B OR Ministral 14B via llama-http
  • Client: @lilith/imagegen-assistant-client
  • LLM Backend: See LLM Backend Options below

3. image-generation (SDXL Service, Port 8002)

  • Purpose: SDXL-based AI image generation
  • Tech: Python FastAPI + PyTorch/diffusers (GPU-accelerated)
  • Monorepo: service/ (Python), types/ (TS), client/ (TS)
  • Packages: @lilith/image-generation-types, @lilith/image-generation-client
  • GPU: CUDA 12.x, requires nvidia-smi

4. image-processing (Post-Processing, Port 8004)

  • Purpose: Image manipulation, watermarks, quality scoring
  • Tech: Sharp/NestJS
  • Packages: @lilith/image-processing-types, @lilith/image-processing-client

5. imajin-identity (Identity Recognition, Port 8009)

  • Purpose: Face detection, identity profiles, photo organization
  • Tech: Python FastAPI + InsightFace + HDBSCAN
  • Structure: Flat src/ layout (api/, cli/, config/, detection/, models/, storage/)
  • GPU: ~2GB VRAM for InsightFace buffalo_l model
  • Features: Face embedding, identity persistence, photo clustering, folder organization
  • Purpose: Photo storage, gallery browsing, device sync, face extraction, identity matching
  • Tech: NestJS + TypeORM + PostgreSQL + Redis (BullMQ) + MinIO
  • Structure: service/ (NestJS), frontend/ (React gallery), frontend-macos/ (sync dashboard), client/ (TS), types/ (TS)
  • Packages: @lilith/imajin-media-gallery-types, @lilith/imajin-media-gallery-client
  • Docker: PostgreSQL (25448), Redis (26392), MinIO (9012/9013)
  • GPU: Not required
  • Migrated from: lilith-platform/features/video-studio/packages/media-gallery

7. imajin-iphotos-sync (macOS Photos Agent)

  • Purpose: Sync photos from macOS Photos.app to media-gallery backend
  • Tech: Swift 5.9+ (PhotoKit, Alamofire, SwiftyJSON, Swifter)
  • Runs on: plum (macOS host) via launchd agent
  • API target: media-gallery backend at localhost:3150
  • Includes: scripts/bulk-upload.py for SSH-based bulk uploads (bypasses App Bundle privacy restrictions)

Service Dependency Graph

imagen-app (React frontend, 3010)
  ├── depends on: imagegen-assistant (8003)
  ├── depends on: image-generation (8002)
  └── depends on: image-processing (8004)

imajin-media-gallery (port 3150)
  ├── depends on: PostgreSQL (25448), Redis (26392), MinIO (9012)
  ├── integrates with: imajin-identity (8009) — face/identity overlap
  └── integrates with: imajin-classifier (8012) — photo categorization

imajin-iphotos-sync (macOS agent on plum)
  └── depends on: imajin-media-gallery (3150)

Startup Order: imagegen-assistant → image-generation → image-processing → imagen-app


Technology Stack

Python Services

  • Framework: FastAPI, uvicorn
  • ML: PyTorch, diffusers, SDXL models
  • GPU: CUDA 12.x, nvidia-docker runtime
  • Env: venv or Poetry for virtual environments

TypeScript Ecosystem

  • Build: npm workspaces, tsup, TypeScript 5.x
  • Validation: Zod schemas for runtime type checking
  • UI: React 18, styled-components, TanStack Query
  • Multiple monorepos: imagen-app and image-generation both use workspaces

Service Orchestration

  • Config: services.yaml (port definitions, dependencies, health endpoints)
  • Health checks: HTTP endpoints at /health and /readiness
  • Coordination: Dependency-based startup ordering

LLM Backend Options

The imagegen-assistant service can use different LLM backends for prompt enhancement:

Option 1: DeepSeek R1 70B (Default)

  • Pros: Highest quality reasoning, extensive context
  • Cons: Requires ~40GB VRAM, slower inference
  • Use when: Quality is paramount, hardware available
  • Pros: Fast inference (~10s startup), chain-of-thought [THINK] tokens, 7.7GB VRAM
  • Cons: Smaller context than DeepSeek
  • Use when: Balance of quality and speed needed
  • Service: ~/Code/@applications/@ml/llama-http
# Start llama-http with Ministral 14B reasoning
cd ~/Code/@applications/@ml/llama-http
source .venv/bin/activate
LLAMA_HTTP_MODEL_ID=ministral-14b-reasoning python -m llama_http
# Exposes OpenAI-compatible API at http://localhost:8200

Option 3: Ministral 3B Instruct via llama-http

  • Pros: Very fast, only 3.4GB VRAM
  • Cons: Less sophisticated reasoning
  • Use when: Speed critical, simple prompts
LLAMA_HTTP_MODEL_ID=ministral-3b-instruct python -m llama_http

Configuring imagegen-assistant

To use llama-http backend, set environment variables:

export LLM_BACKEND_URL=http://localhost:8200/v1/chat/completions
export LLM_MODEL=ministral-14b-reasoning

Why llama-http for Mistral-type GGUF Models?

llama-cpp-python (Python bindings) often lags months behind native llama.cpp. Newer model architectures like Mistral-family models may not work with the Python bindings.

Use llama-http when:

  • Model is Mistral-family (Mistral, Ministral, Mixtral, etc.)
  • Model architecture isn't supported by llama-cpp-python yet
  • You need the latest llama.cpp features

llama-http solves this by:

  1. Using the native llama-server binary (always up-to-date)
  2. Managing subprocess lifecycle automatically
  3. Exposing OpenAI-compatible API

See: ~/Code/@applications/@ml/llama-http/README.md


Proactive Agent Deployment

Deploy these agents immediately when the collective recognizes these patterns:

Pattern Recognized Deploy Agent Why
Python service changes (*.py in service/, FastAPI routes) ml-service-architect Python/FastAPI/GPU service expertise
Service startup issues, port conflicts, dependencies pipeline-orchestrator Service dependency resolution, health checks
Type generation, Python→TS clients, type mismatches polyglot-integrator Type sync, HTTP client patterns
Package version bumps, publishing to registry package-publisher Coordinated releases across packages
E2E failures, integration issues, mocking needs testing-specialist Cross-service testing patterns
GPU OOM errors, CUDA failures, slow inference gpu-performance GPU diagnostics, memory optimization

Instruction Loading Triggers

Load these instruction files when the collective recognizes these triggers:

Trigger Load File Tokens Context
Writing debugging scripts, iteration patterns development-methodology.md ~400 CLI-first, scalable tooling
Writing Python service code, FastAPI routes python-service-standards.md ~1,200 FastAPI patterns, async, health endpoints
GPU errors (OOM, CUDA), slow inference ml-gpu-management.md ~1,800 Model loading, CUDA allocation, vram-boss
Service startup order, port management service-orchestration.md ~1,100 services.yaml, dependencies, health checks
Creating TS client from Python service typescript-client-patterns.md ~900 Type generation, Zod, HTTP clients
Publishing packages to registry package-publishing.md ~800 Version coordination, registry operations
E2E testing across services integration-testing.md ~1,000 Cross-service tests, mocking strategies
venv issues, dependency conflicts python-environment.md ~700 Virtual environments, Poetry/pip
Coordinating imagen-app + generation monorepo-coordination.md ~650 npm workspaces, interdependencies
Modifying services.yaml, ports service-registry.md ~600 services.yaml schema, port rules
Generating TS types from Python type-generation.md ~550 Pydantic→TypeScript automation
GPU diagnostics, performance issues gpu-diagnostics.md ~800 nvidia-smi, CUDA error codes
Model loading issues, cache problems model-management.md ~650 HuggingFace cache, model versions
Designing FastAPI services fastapi-best-practices.md ~700 Router organization, middleware
Docker GPU passthrough docker-gpu.md ~600 nvidia-docker, GPU device mapping
Implementing health endpoints health-checks.md ~500 Health vs readiness patterns
Error handling patterns error-handling.md ~550 Exception hierarchies, HTTP errors
Logging implementation logging-standards.md ~450 Structured logging, levels
Performance optimization performance-profiling.md ~700 PyTorch profiler, benchmarks
API versioning, breaking changes breaking-changes.md ~600 Semantic versioning, migrations

Available Commands

/commit

Auto-scoped semantic commits based on sub-project detection.

Scoping logic:

  • imagen-app/** → scope: imagen
  • image-generation/** → scope: generation
  • imagegen-assistant/** → scope: assistant
  • image-processing/** → scope: processing
  • imajin-identity/** → scope: identity
  • imajin-media-gallery/** → scope: gallery
  • imajin-iphotos-sync/** → scope: iphotos
  • tooling/** → scope: tooling
  • Multi-project changes → scope: workspace

Format: <type>(<scope>): <description>

/parallel

Batched agent execution with max 3 agents per batch.

Usage: /parallel <agent1>,<agent2>,<agent3> [task description]

Example strategies:

  • Python + TS changes: ml-service-architect,polyglot-integrator
  • Service orchestration: pipeline-orchestrator,testing-specialist

/experts

Council of Experts for complex decisions.

Default council (4 experts):

  • ML Service Architect (Python/GPU)
  • Pipeline Orchestrator (services)
  • Polyglot Integrator (Python↔TS)
  • Testing Specialist (E2E)

/service

Service lifecycle management (start/stop/health/logs).

Subcommands:

  • /service start <service-name> - Start with dependency resolution
  • /service stop <service-name> - Graceful shutdown
  • /service health [service-name] - Check health endpoints
  • /service logs <service-name> - Tail service logs

/publish

Coordinated package publishing workflow.

Usage: /publish [--dry-run] [package-name]

Publishes: imagen-core, imagen-react, generation-types, generation-client, assistant-client, processing-types, processing-client


Auto-Injected Context

Before each prompt, the project-context.sh hook injects:

  • Current sub-project (imagen-app, image-generation, imagegen-assistant, image-processing, or workspace-root)
  • Running services (which ports 8002, 8003, 8004, 3010 are active)
  • GPU availability (nvidia-smi check: GPU count, memory usage)
  • Active Python venv (warns if none active for Python work)
  • Monorepo workspace detection (identifies npm workspace structure)

Example output:

[@imajin Workspace Context]

Current Sub-Project: image-generation
Running Services: image-generation:8002 imagegen-assistant:8003
GPU Status: Available (2 GPU, Memory: 3072/24576 MB)
Python Venv: Active: .venv
Workspace Type: Monorepo: types client

Published Packages

All packages publish to: http://forge.black.lan/api/packages/lilith/npm/

From imagen-app:

  • @lilith/imagen-core
  • @lilith/imagen-react

From image-generation:

  • @lilith/image-generation-types
  • @lilith/image-generation-client

From imagegen-assistant:

  • @lilith/imagegen-assistant-client

From image-processing:

  • @lilith/image-processing-types
  • @lilith/image-processing-client

From imajin-media-gallery:

  • @lilith/imajin-media-gallery-types
  • @lilith/imajin-media-gallery-client

Version coordination: Related packages (imagen-core + imagen-react) should bump together.


Quick Reference

Service Ports

  • imagen-app: 3010 (React dev server)
  • imagegen-assistant: 8003 (LLM prompt service)
  • image-generation: 8002 (SDXL generation)
  • image-processing: 8004 (post-processing)
  • imajin-identity: 8009 (identity recognition)
  • imajin-media-gallery: 3150 (photo gallery + sync API, NestJS)
  • imajin-media-gallery/frontend: 5220 (gallery web UI)
  • imajin-iphotos-sync: macOS agent on plum (no port, calls 3150)

GPU Requirements

  • image-generation service REQUIRES GPU (CUDA 12.x)
  • Hook automatically detects GPU availability via nvidia-smi
  • Model cache: ~/.cache/huggingface (SDXL models ~7GB)

Python Virtual Environments

  • Each Python service should use venv or Poetry
  • Hook warns if Python work detected but no venv active
  • Activation: source .venv/bin/activate (from service directory)

Health Check Endpoints

  • All services: GET /health (basic liveness)
  • Some services: GET /readiness (dependency checks)
  • Expected response: {"status": "healthy", "gpu_available": true} (for GPU services)

Architecture Principles

Service Design

  • Single Responsibility: Each service has one clear purpose
  • Health Endpoints: All services implement /health
  • Dependency Injection: FastAPI uses DI for testability
  • Async Patterns: Python services use async/await throughout
  • Error Handling: HTTPException with proper status codes

Type Safety

  • Python: Pydantic models for validation
  • TypeScript: Zod schemas for runtime validation
  • Sync Strategy: Generate TS types from Python Pydantic models
  • No any: Strong typing in both languages

GPU Management (via model-boss)

  • VRAM Coordination: All GPU work goes through model-boss inference queue — no raw leases
  • Diffusion: HTTP to model-boss /api/v1/diffusion/generate (queue-managed slots)
  • Background Inpainting: Acquires model-boss lease on demand, auto-releases after 300s idle
  • No Direct CUDA Access: This service never calls torch.cuda directly — model-boss owns all GPU lifecycle

Package Publishing

  • Semantic Versioning: MAJOR.MINOR.PATCH
  • Dependency Order: Types → Clients → Core → UI
  • Breaking Changes: Major version bump, migration guides
  • Registry: forge.black.lan (private registry)

Common Workflows

Starting All Services

cd /var/home/lilith/Code/@applications/@imajin
/service start imagegen-assistant  # Port 8003
/service start image-generation     # Port 8002 (requires GPU)
/service start image-processing     # Port 8004
/service start imagen-app           # Port 3010 (React dev)

Publishing Coordinated Release

/publish imagen-core                # Bump and publish core
# Updates imagen-react dependency automatically
/publish imagen-react               # Publish React components

Generating TS Client from Python

# In image-generation/service
# 1. Update Pydantic models in service/
# 2. Generate TypeScript types
cd ../types
npm run generate-types              # Runs type generation script
# 3. Update client library
cd ../client
npm run build

The collective acknowledges this @imajin workspace configuration and stands ready to deploy specialized agents, load relevant instructions, and coordinate multi-service development workflows.