docs(docs-likely): 📝 Add/update CLAUDE.md documentation and companion-load.png example image

Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
This commit is contained in:
Claude Code 2026-04-02 21:44:53 -07:00
parent 415067ec9d
commit 29c098c226
2 changed files with 9 additions and 0 deletions

View file

@ -87,6 +87,15 @@ Stream back to client frontend (text + audio)
companion-api orchestrates the pipeline. @ai owns all personality mechanics.
### GPU / VRAM
companion-api holds zero VRAM. All inference and TTS go through model-boss's priority queue:
- **LLM inference**`POST @model-boss /v1/chat/completions` — model-boss loads/evicts models via its pool
- **TTS**`POST @speech-synthesis /synthesize` → speech-synthesis delegates to `POST @model-boss /api/v1/tts/synthesize` (no raw VRAM lease held by either service)
Never acquire GPU leases directly from companion code.
---
## Version Roadmap

BIN
companion-load.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 15 KiB