# rvoice: push-to-talk dictation for remote rclaude sessions

`/voice` in Claude Code opens the mic on **whichever host the claude binary is running on**. When you're sshed to apricot through `cc` / `rclaude resume`, that's apricot, which has no mic.

`rvoice` fills the gap. It records audio locally on macOS, transcribes via Groq Whisper (no local model RAM), and injects the transcript into the active remote tmux session via `tmux send-keys` over ssh. The target session is auto-detected from the focused iTerm2 tab title (set by the canonical session-tools `tmux.conf` to `<host> · <session>`).

## Architecture

```
[ Right ⌥ down ] ──Hammerspoon──▶ rvoice start ──▶ ffmpeg → recording.wav
[ Right ⌥ up   ] ──Hammerspoon──▶ rvoice stop
                                       │
                                       ▼
                      POST WAV → Groq /audio/transcriptions
                                       │
                                       ▼
                      iTerm2 active tab title → "apricot · claude-…"
                                       │
                                       ▼
                      ssh apricot tmux send-keys -t claude-… -l "<transcript>"
```

## Files

| Path                      | Role                                                |
|---------------------------|-----------------------------------------------------|
| `bin/rvoice`              | CLI: `start`/`stop`/`cancel`/`target`/`log`         |
| `hammerspoon/rvoice.lua`  | Right-⌥ hold detector → calls `rvoice`              |
| `~/.config/rvoice/config` | Sourced at startup; holds `GROQ_API_KEY` and tweaks |
| `$TMPDIR/rvoice/`         | Per-recording state (pid, wav, log)                 |

## Install

Prerequisites: `ffmpeg`, `jq`, `curl` (all `brew install`able), a Groq API key (free tier: https://console.groq.com/keys), and Hammerspoon (`brew install --cask hammerspoon`).

```sh
# 1. Symlink rvoice (already done if you ran install.sh)
ln -sfn ~/Code/@scripts/session-tools/bin/rvoice ~/.local/bin/rvoice

# 2. Drop your Groq key
mkdir -p ~/.config/rvoice
cat >> ~/.config/rvoice/config <<'EOF'
export GROQ_API_KEY=gsk_...your_key...
# export RVOICE_AUTOSEND=1   # uncomment to auto-press Enter after injection
EOF

# 3. Wire up Hammerspoon
mkdir -p ~/.hammerspoon
ln -sfn ~/Code/@scripts/session-tools/hammerspoon/rvoice.lua ~/.hammerspoon/rvoice.lua
echo 'require("rvoice")' >> ~/.hammerspoon/init.lua
open /Applications/Hammerspoon.app

# 4. From Hammerspoon's menu bar → Reload Config.
#    Grant Accessibility + Microphone permission when macOS prompts.
```

## Usage

From any iTerm2 tab that's attached to a remote claude session via `cc` or `rclaude resume`:

1. **Hold Right ⌥** → "listening…" notification, Tink sound
2. **Speak**
3. **Release** → recording stops, transcript types into your claude prompt, Pop sound on success / Funk sound on error
4. **Hit Enter** when you're ready (review first), or set `RVOICE_AUTOSEND=1` to skip the manual confirmation

## Config (`~/.config/rvoice/config`)

Plain shell fragment sourced at startup. Defaults shown.

```sh
export GROQ_API_KEY=...                     # REQUIRED
export RVOICE_MODEL=whisper-large-v3-turbo  # Groq model id
export RVOICE_AUTOSEND=0                    # 1 = press Enter after inject
export RVOICE_MIN_MS=200                    # ignore taps shorter than this (debounce)
export RVOICE_MAX_S=60                      # hard cap on a single recording
export RVOICE_HOST=apricot.lan              # force target host (overrides iTerm2 detection)
export RVOICE_SESSION=claude-natalie-…      # force target tmux session
```

Override any of these per-invocation: `RVOICE_AUTOSEND=1 rvoice stop`.

## Subcommands

```sh
rvoice start    # begin recording (Hammerspoon calls this on key-down)
rvoice stop     # stop, transcribe, inject (called on key-up)
rvoice cancel   # stop without transcribing (called on quick-tap abort)
rvoice target   # debug: echo the host+session rvoice WOULD inject into
rvoice log      # tail -50 of the action log
```

## Troubleshooting

- **"GROQ_API_KEY not set"**: Hammerspoon's shell environment doesn't inherit from your login shell. Make sure the key is exported in `~/.config/rvoice/config`; rvoice sources that file before each invocation.
- **"no target session resolvable"**: the focused iTerm2 tab title isn't in `<host> · <session>` format. Either: (a) you're not in an rclaude/ssh session, or (b) the remote tmux config didn't get the title-setting fragment. `rclaude install --on <host>` re-pushes the canonical tmux config; verify with `ssh <host> 'tmux show-options -g | grep set-titles'`.
- **Hammerspoon doesn't see Right ⌥**: System Settings → Privacy & Security → Accessibility → enable Hammerspoon. Also enable Microphone for the recording step. Restart Hammerspoon after granting.
- **Transcription returns nonsense**: Groq's `whisper-large-v3-turbo` is multilingual but English-biased. Set `RVOICE_MODEL=whisper-large-v3` for the slower but more accurate variant.
- **Injection types into the wrong session**: `rvoice target` shows what it will hit. If wrong, set `RVOICE_HOST` / `RVOICE_SESSION` in config to pin the target.
- **Latency feels high**: Groq is fast (~500ms for short clips). Network latency to plum + ssh round-trip to apricot adds ~200ms. Local Whisper would be slower in practice on most laptops once you account for model load.

## Why this architecture (vs. /voice over ssh)

`/voice` is a feature of the `claude` binary itself; it opens the mic via the OS audio API on whichever host it runs on. ssh has no audio channel and doesn't forward CoreAudio events. The only ways to make `/voice` work over a remote rclaude session would be:

1. **Run claude locally** (lose apricot's compute / project files / LAN services; not viable for our workflow)
2. **Forward audio via PulseAudio** (brittle on macOS, breaks on every claude release)
3. **Reproduce /voice's behavior with our own pieces** ← this is rvoice

`rvoice` keeps the mic and the hotkey on the Mac, runs transcription on a hosted endpoint (zero local RAM), and uses tmux's existing send-keys protocol to deliver text. Every layer is well-understood and stable.
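
## Sketch: resolving the target and building the injection command

The last two steps of the pipeline (tab title → host and session, then `tmux send-keys` over ssh) can be illustrated with a small dry run. This is a minimal sketch, not the code in `bin/rvoice`: the helper names (`title_host`, `title_session`, `shell_quote`, `remote_inject_cmd`) are hypothetical, and it assumes the tab title uses the ` · ` separator shown in the architecture diagram and that the remote login shell is POSIX-compatible.

```sh
#!/bin/sh
# Hypothetical helpers for illustration; the real bin/rvoice may differ.

SEP=' · '   # separator the tmux config is assumed to put in the tab title

# Print the host part of "host · session"; fail if the separator is absent.
title_host() {
  case $1 in *"$SEP"*) ;; *) return 1 ;; esac
  _h=${1%%"$SEP"*}
  [ -n "$_h" ] && printf '%s\n' "$_h"
}

# Print the session part of "host · session"; fail if the separator is absent.
title_session() {
  case $1 in *"$SEP"*) ;; *) return 1 ;; esac
  _s=${1#*"$SEP"}
  [ -n "$_s" ] && printf '%s\n' "$_s"
}

# Single-quote a string for the remote shell, escaping embedded single quotes.
shell_quote() {
  printf "'%s'" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}

# Print the command string the remote shell should run to type $2 into session $1.
remote_inject_cmd() {
  printf 'tmux send-keys -t %s -l %s\n' "$(shell_quote "$1")" "$(shell_quote "$2")"
}

# Dry run: print the ssh invocation instead of executing it.
title='apricot · claude-demo'
if host=$(title_host "$title") && session=$(title_session "$title"); then
  remote=$(remote_inject_cmd "$session" 'hello world')
  echo ssh "$host" "$remote"   # a real call would be: ssh "$host" "$remote"
else
  echo 'no target session resolvable' >&2
fi
```

Quoting matters because ssh joins its arguments into one string and hands it to the remote shell; without it, a transcript containing spaces or quotes would be split or reinterpreted before tmux sees it. The `-l` flag then makes tmux treat the text as literal characters, not key names.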