Server Configuration

File: configs/server.yaml
Command: kenzy-server [config_path]

The server is the central WebSocket hub. It accepts connections from room nodes, runs the STT → LLM → TTS pipeline, and streams audio responses back. Each downstream service is optional: omit its url to disable that stage.

Full reference

Key        Default    Description
host       "0.0.0.0"  Bind address. 0.0.0.0 listens on all interfaces.
port       8765       WebSocket port
log_level  "info"     Log verbosity

STT service

Key          Default  Description
stt.url      (none)   URL of the kenzy-stt /transcribe endpoint. Omit or set to null to skip transcription.
stt.timeout  60.0     HTTP timeout in seconds

Speaker identification service

Key                      Default    Description
speaker.url              (none)     URL of the kenzy-speaker /identify endpoint. Omit to disable speaker ID.
speaker.timeout          10.0       HTTP timeout in seconds
speaker.unknown_speaker  "unknown"  Name used when no enrolled speaker is identified

LLM service

Key          Default  Description
llm.url      (none)   URL of the kenzy-llm /process endpoint. Omit to disable LLM processing.
llm.timeout  30.0     HTTP timeout in seconds

TTS service

Key             Default  Description
tts.url         (none)   URL of the kenzy-tts /speak endpoint. Omit to disable TTS.
tts.timeout     60.0     HTTP timeout in seconds
tts.chunk_size  4096     Bytes per PCM chunk streamed to the node. At 24 kHz int16 mono, 4096 bytes ≈ 85 ms of audio.
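
The 85 ms figure in the tts.chunk_size description can be checked by hand. The worked calculation below assumes the audio format named there (24 kHz, int16, mono), so each sample takes 2 bytes and there is 1 channel. The symbols f_s (sample rate), b (bytes per sample), and c (channel count) are notation introduced here for illustration, not names from the configuration.

```latex
\[
t_{\text{chunk}}
  = \frac{\text{chunk\_size}}{f_s \cdot b \cdot c}
  = \frac{4096\ \text{bytes}}{24000\ \text{samples/s} \times 2\ \text{bytes/sample} \times 1}
  \approx 0.0853\ \text{s}
  \approx 85\ \text{ms}
\]
```

Under the same format, doubling chunk_size to 8192 would give roughly 171 ms of audio per chunk.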

Example

host: "0.0.0.0"
port: 8765

stt:
  url: "http://127.0.0.1:8767/transcribe"
  timeout: 60.0

speaker:
  url: "http://127.0.0.1:8768/identify"
  timeout: 10.0
  unknown_speaker: "unknown"

llm:
  url: "http://127.0.0.1:8766/process"
  timeout: 30.0

tts:
  url: "http://127.0.0.1:8769/speak"
  timeout: 60.0
  chunk_size: 4096

Disabling stages

You can run a partial pipeline for development. For example, omit llm.url and tts.url to transcribe audio and log the results without generating responses.
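
A minimal sketch of the transcription-only setup described above, reusing the addresses from the example configuration. It assumes the server accepts a section that keeps its other keys but has no url. Dropping the llm and tts sections entirely may work equally well, but that is not confirmed here.

```yaml
host: "0.0.0.0"
port: 8765

stt:
  url: "http://127.0.0.1:8767/transcribe"
  timeout: 60.0

speaker:
  url: "http://127.0.0.1:8768/identify"
  timeout: 10.0
  unknown_speaker: "unknown"

# llm.url is omitted, so the LLM stage is disabled.
llm:
  timeout: 30.0

# tts.url is omitted, so no audio response is generated.
tts:
  timeout: 60.0
  chunk_size: 4096
```

To restore the full pipeline, add the url keys back as shown in the example configuration.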