KENZY
Kenzy is a distributed home voice assistant built as six independently deployable microservices. Wake-word detection runs locally on low-power room nodes (Raspberry Pi Zero 2 W or similar). Audio streams over WebSocket to a central server that runs the full speech-to-text → LLM → text-to-speech pipeline and streams synthesized speech back to the room.
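The order of the central pipeline can be sketched as three chained stages. This is only an illustration of the data flow described above: `handle_utterance` and the `fake_*` stages are hypothetical names, not Kenzy's API, and the real services communicate over the network rather than through direct function calls.

```python
from typing import Callable

# Hypothetical stage signatures, used only for this sketch.
Transcribe = Callable[[bytes], str]   # speech-to-text
Respond = Callable[[str], str]        # LLM
Synthesize = Callable[[str], bytes]   # text-to-speech


def handle_utterance(
    audio: bytes, stt: Transcribe, llm: Respond, tts: Synthesize
) -> bytes:
    """Run one turn: captured audio in, synthesized reply audio out."""
    text = stt(audio)    # 1. transcribe what was said
    reply = llm(text)    # 2. generate a response
    return tts(reply)    # 3. synthesize speech to send back to the room


# Stand-in stages so the sketch runs without any models installed.
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_llm(text: str) -> str:
    return f"You said: {text}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


reply_audio = handle_utterance(b"turn on the lights", fake_stt, fake_llm, fake_tts)
```

Passing the stages in as callables mirrors the idea that each one is a separate service that can be swapped or relocated without touching the others.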
Key features
- Distributed — each service runs independently; deploy only what you need on each host
- Wake-word activation — always-on local detection with no cloud dependency for the trigger
- Speaker identification — knows who is speaking; used for personalization and access control
- Extensible skills — add new capabilities by dropping a Python file in `skills/`; no registration required
- LLM-agnostic — works with OpenAI, Anthropic, Ollama, LM Studio, and any provider supported by LiteLLM
- Conversation history — per-room rolling context window so follow-up questions resolve naturally
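A per-room rolling context window, as mentioned in the last feature, could be kept with one bounded queue per room. The sketch below is an assumption about how such a window might work, not Kenzy's actual implementation; `RoomHistory`, `MAX_MESSAGES`, and the message dictionary shape are all hypothetical.

```python
from collections import defaultdict, deque

MAX_MESSAGES = 4  # assumed window size; the real limit is not documented here


class RoomHistory:
    """Keeps the most recent messages for each room separately."""

    def __init__(self, max_messages: int = MAX_MESSAGES) -> None:
        # Each room gets its own deque; old messages fall off automatically.
        self._rooms = defaultdict(lambda: deque(maxlen=max_messages))

    def add(self, room: str, role: str, content: str) -> None:
        self._rooms[room].append({"role": role, "content": content})

    def context(self, room: str) -> list:
        """Return the current window for a room, oldest message first."""
        return list(self._rooms[room])


history = RoomHistory()
for i in range(5):
    history.add("kitchen", "user", f"message {i}")

kitchen = history.context("kitchen")   # holds messages 1 through 4
bedroom = history.context("bedroom")   # empty: rooms do not share context
```

Because each room has its own window, a follow-up question in the kitchen is resolved against kitchen turns only.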
Services at a glance
| Service | Command | Port | Role |
|---|---|---|---|
| node | `kenzy-node` | — | Wake word, audio capture, TTS playback |
| server | `kenzy-server` | 8765 | WebSocket hub, pipeline orchestrator |
| stt | `kenzy-stt` | 8767 | Speech-to-text via faster-whisper |
| tts | `kenzy-tts` | 8769 | Text-to-speech via OpenAI TTS |
| llm | `kenzy-llm` | 8766 | LLM + skill tool-calling via LiteLLM |
| speaker | `kenzy-speaker` | 8768 | Speaker identification via SpeechBrain |
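When bringing up a host, it can help to check which of the ports in the table are accepting connections. The following is a minimal sketch using only the Python standard library; it assumes the services are on a single host and listen on plain TCP at the ports listed above. `DEFAULT_PORTS`, `is_listening`, and `check_services` are hypothetical helpers, not part of Kenzy.

```python
import socket

# Ports copied from the table above; the node has no listening port.
DEFAULT_PORTS = {
    "server": 8765,
    "llm": 8766,
    "stt": 8767,
    "speaker": 8768,
    "tts": 8769,
}


def is_listening(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_services(host: str = "127.0.0.1") -> dict:
    """Map each service name to whether its port accepts connections."""
    return {name: is_listening(host, port) for name, port in DEFAULT_PORTS.items()}


if __name__ == "__main__":
    for name, up in check_services().items():
        print(f"{name:8s} {'up' if up else 'down'}")
```

A successful TCP connection only shows that something is listening on the port; it does not confirm that the process is the expected Kenzy service or that it is healthy.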
Quick links
- Getting Started — install, configure, and run your first session
- Architecture — how the pieces fit together
- Skills — extend Kenzy with custom capabilities
- Deployment — push to a fleet of remote hosts with `kenzy-deploy`