Node Configuration
File: configs/node.yaml
Command: kenzy-node [config_path]
The node service runs on each room device. It captures microphone audio, detects the wake word, streams PCM to the server, and plays back TTS responses.
Full reference
| Key | Default | Description |
|---|---|---|
| server_url | "ws://127.0.0.1:8765" | WebSocket URL of the kenzy-server |
| room_id | "living_room" | Unique identifier for this node; used as the key in conversation history and pipeline routing |
| audio_device | null | PortAudio device name substring or integer index. null uses the system default. Use kenzy-devices to find the correct value. |
| capture_sample_rate | 16000 | Sample rate for microphone capture. Set to the device's native rate if it does not support 16000 Hz; audio is resampled automatically. |
| playback_sample_rate | 24000 | Sample rate for speaker output. Set to the device's native rate if it does not support 24000 Hz; TTS audio is resampled automatically. |
| log_level | "info" | Log verbosity |
| verbose | false | Also enables debug output from websockets and asyncio internals |
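
The audio_device rule above (name substring or integer index, null for the system default) can be sketched as follows. This is an illustration only, not Kenzy's actual code: the function name resolve_audio_device, the case-insensitive comparison, and the first-match-wins rule are all assumptions.

```python
from typing import Optional, Sequence, Union


def resolve_audio_device(
    spec: Union[None, int, str], device_names: Sequence[str]
) -> Optional[int]:
    """Map an audio_device value to a device index.

    Returns None when the system default should be used.
    Hypothetical helper; the real node may resolve devices differently.
    """
    if spec is None:
        return None  # null in node.yaml: use the system default
    if isinstance(spec, int):
        if not 0 <= spec < len(device_names):
            raise ValueError(f"device index {spec} out of range")
        return spec
    # Assumption: substring match is case-insensitive and the first hit wins.
    needle = spec.lower()
    for index, name in enumerate(device_names):
        if needle in name.lower():
            return index
    raise ValueError(f"no device name contains {spec!r}")


# Made-up device list standing in for what PortAudio would report.
devices = ["Built-in Microphone", "Anker PowerConf S330", "HDMI Output"]
print(resolve_audio_device("PowerConf", devices))
```

A substring is usually more robust than an index, because device indices can change when USB hardware is plugged in or removed.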
Wake word
| Key | Default | Description |
|---|---|---|
| wakeword_models | [] | List of paths to .tflite or .onnx model files. Empty uses the bundled hey_kenzie.tflite |
| wakeword_threshold | 0.5 | Confidence threshold [0.0–1.0] above which a detection fires |
| wakeword_vad_threshold | 0.0 | openwakeword Silero VAD gate [0.0–1.0]. Wake-word predictions are discarded unless the voice-activity score exceeds this. 0 disables it. Set to ~0.5 to suppress false detections on near-silence/noise. With it enabled you can safely lower wakeword_threshold (e.g. 0.4) for better real-speech sensitivity without reintroducing silence false-positives. The Silero VAD model is downloaded automatically by kenzy-setup. |
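
How the two thresholds interact can be sketched as a single predicate. This is a simplified model of the behaviour described in the table, not the openwakeword implementation: the function name detection_fires and the strict comparisons are assumptions.

```python
def detection_fires(
    wake_score: float,
    vad_score: float,
    wakeword_threshold: float = 0.5,
    wakeword_vad_threshold: float = 0.0,
) -> bool:
    """Decide whether a wake-word prediction counts as a detection.

    Hypothetical helper mirroring the table: a VAD threshold of 0 disables
    the gate; otherwise the voice-activity score must exceed it.
    """
    gate_enabled = wakeword_vad_threshold > 0.0
    if gate_enabled and vad_score <= wakeword_vad_threshold:
        return False  # prediction discarded: no voice activity
    return wake_score > wakeword_threshold


# Quiet speech scoring 0.45 is missed with the defaults ...
print(detection_fires(0.45, vad_score=0.9))
# ... but caught with a lowered threshold and the VAD gate on.
print(detection_fires(0.45, vad_score=0.9, wakeword_threshold=0.4, wakeword_vad_threshold=0.5))
# Noise that scores high on the wake word but low on VAD is rejected.
print(detection_fires(0.8, vad_score=0.1, wakeword_threshold=0.4, wakeword_vad_threshold=0.5))
```

The gate is what makes the lower wakeword_threshold safe: sensitivity to real speech improves, while frames without voice activity are rejected before the wake-word score is considered.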
Voice activity detection (VAD)
| Key | Default | Description |
|---|---|---|
| vad_enabled | true | When false, the node streams until the server sends STOP. Hard cap does not apply. |
| silence_rms_threshold | 50 | RMS amplitude [0–32767] below which a frame is considered silent |
| silence_ms | 400 | Consecutive silence (ms) that ends an active session, once speech_min_ms has been heard |
| speech_min_ms | 400 | Minimum speech (ms) that must be detected before silence detection activates. Prevents the session ending on the pause after the wake word. |
| no_speech_timeout_ms | 15000 | Timeout (ms) if no speech is heard after activation. Prevents indefinite streaming when the wake word fires accidentally. |
| hard_cap_ms | 30000 | Unconditional session ceiling (ms). The session ends regardless of VAD state. |
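
The way these timers combine when vad_enabled is true can be sketched as a small state machine fed one frame at a time. This is a minimal model based only on the descriptions above, not the node's actual code. The names frame_rms, SessionTimer, and run, the 20 ms frame length, and the reading of "no speech" as "speech_min_ms not yet reached" are all assumptions.

```python
import math
from array import array
from dataclasses import dataclass
from typing import Optional


def frame_rms(pcm: bytes) -> float:
    """RMS amplitude of a frame of signed 16-bit PCM (native byte order)."""
    samples = array("h")
    samples.frombytes(pcm)
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


@dataclass
class SessionTimer:
    """Hypothetical model of the end-of-session rules (defaults from the table)."""
    silence_rms_threshold: float = 50
    silence_ms: int = 400
    speech_min_ms: int = 400
    no_speech_timeout_ms: int = 15000
    hard_cap_ms: int = 30000
    elapsed_ms: int = 0
    speech_ms: int = 0
    trailing_silence_ms: int = 0

    def feed(self, rms: float, frame_ms: int) -> Optional[str]:
        """Account for one frame; return an end reason, or None to keep streaming."""
        self.elapsed_ms += frame_ms
        if rms < self.silence_rms_threshold:
            self.trailing_silence_ms += frame_ms
        else:
            self.speech_ms += frame_ms
            self.trailing_silence_ms = 0
        if self.elapsed_ms >= self.hard_cap_ms:
            return "hard_cap"  # ceiling applies regardless of VAD state
        if self.speech_ms >= self.speech_min_ms:
            # Silence detection is only active once enough speech was heard.
            if self.trailing_silence_ms >= self.silence_ms:
                return "silence"
        elif self.elapsed_ms >= self.no_speech_timeout_ms:
            return "no_speech"  # assumption: too little speech counts as none
        return None


def run(rms_values, frame_ms: int = 20):
    """Feed a sequence of per-frame RMS values; return (reason, elapsed_ms)."""
    timer = SessionTimer()
    for rms in rms_values:
        reason = timer.feed(rms, frame_ms)
        if reason:
            return reason, timer.elapsed_ms
    return None, timer.elapsed_ms


# 200 ms pause after the wake word, 500 ms of speech, then silence.
print(run([10] * 10 + [900] * 25 + [10] * 50))  # ('silence', 1100)
```

In this model the initial 200 ms pause does not end the session, because fewer than speech_min_ms of speech have been heard at that point. Raising silence_rms_threshold in a noisy room makes more frames count as silent, which is why the example configuration pairs a higher threshold with a longer silence_ms.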
Sound files
| Key | Default | Description |
|---|---|---|
| sound_ready | null | WAV file played on activation (the "chime"). null uses the bundled ready.wav. Accepts an absolute path or a bare filename loaded from the bundled sounds directory. |
| sound_waiting | null | WAV file played while waiting for the server response. It plays once, then either stops naturally or is interrupted when TTS begins. null (or an empty string) disables it, leaving pure silence while waiting. Provide a filename or path to enable it. |
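
The different meaning of null for the two keys can be sketched as a pair of resolvers. This is illustrative only: the directory /opt/kenzy/sounds and both function names are hypothetical, and resolving a bare sound_waiting filename against the bundled directory is an assumption carried over from sound_ready.

```python
from pathlib import PurePosixPath
from typing import Optional

# Hypothetical location of the bundled sounds; the real path is not documented here.
BUNDLED_SOUNDS_DIR = PurePosixPath("/opt/kenzy/sounds")


def _resolve(value: str) -> PurePosixPath:
    """Absolute paths are used as-is; bare filenames come from the bundled directory."""
    path = PurePosixPath(value)
    return path if path.is_absolute() else BUNDLED_SOUNDS_DIR / path


def resolve_sound_ready(value: Optional[str]) -> PurePosixPath:
    """null falls back to the bundled ready.wav, so the chime always plays."""
    if value is None:
        return BUNDLED_SOUNDS_DIR / "ready.wav"
    return _resolve(value)


def resolve_sound_waiting(value: Optional[str]) -> Optional[PurePosixPath]:
    """null or an empty string disables the waiting sound entirely."""
    if not value:
        return None
    return _resolve(value)


print(resolve_sound_ready(None))
print(resolve_sound_waiting(None))
print(resolve_sound_waiting("thinking.wav"))  # hypothetical filename
```

The asymmetry is the point to remember: sound_ready: null still plays a chime, while sound_waiting: null plays nothing.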
Finding the right device name
Run kenzy-devices after installation. It tests every PortAudio device against Kenzy's required sample rates and prints ready-to-paste node.yaml settings, including capture_sample_rate and playback_sample_rate when resampling is needed.
Custom wake word
Custom wake word models can be trained with openWakeWord and referenced through wakeword_models. Both .tflite and .onnx formats are supported.
Example
server_url: "ws://192.168.1.100:8765"
room_id: "office"
audio_device: "Anker PowerConf S330" # substring of name shown by kenzy-devices
capture_sample_rate: 48000 # device native rate; resampled to 16000 Hz
playback_sample_rate: 48000 # device native rate; 24000 Hz TTS audio is resampled to it
wakeword_threshold: 0.4 # lower is safe once VAD gating is on
wakeword_vad_threshold: 0.5 # reject wake-word hits on near-silence/noise
silence_rms_threshold: 80
silence_ms: 500
sound_ready: null
sound_waiting: null
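
The resampling implied by the two 48000 settings above comes down to simple rate ratios. The sketch below only does the arithmetic and makes no claim about which resampler Kenzy uses. The helper name resample_ratio and the 20 ms frame length are assumptions, as is the reading that TTS audio arrives at 24000 Hz.

```python
from fractions import Fraction


def resample_ratio(src_rate: int, dst_rate: int) -> Fraction:
    """Output samples produced per input sample when converting src_rate to dst_rate."""
    return Fraction(dst_rate, src_rate)


FRAME_MS = 20  # assumed frame length, for illustration only

# Capture: the device delivers 48000 Hz, the pipeline expects 16000 Hz.
capture = resample_ratio(48000, 16000)
captured_samples = 48000 * FRAME_MS // 1000
print(capture, captured_samples, captured_samples * capture)

# Playback: TTS audio (assumed 24000 Hz) is converted up to the device's 48000 Hz.
playback = resample_ratio(24000, 48000)
print(playback)

# A 44100 Hz microphone gives a less convenient ratio.
print(resample_ratio(44100, 16000))
```

Rates that are integer multiples of the target, such as 48000 Hz for both 16000 Hz and 24000 Hz, keep the conversion to a simple 3:1 or 1:2 ratio.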