Prerequisites
Before installing Voxray, confirm the following tools are available on your machine.

| Requirement | Command | Notes |
|---|---|---|
| Go 1.25+ | go version | Required for all builds |
| Git | git --version | Required to clone the repository |
| C compiler (gcc or clang) | gcc --version | Required only for WebRTC builds with Opus audio |
The default WebSocket-only build has no C compiler dependency. You only need a C compiler if you plan to use WebRTC transport with TTS audio output via Opus encoding.
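The checks in the table can be scripted. The following is a minimal sketch and not part of Voxray: the helper names `have` and `check_prereqs` are hypothetical. It only tests that each tool is on your PATH. It does not confirm that Go is 1.25 or newer, so still read the output of go version yourself.

```shell
#!/bin/sh
# Succeeds if the named command is on PATH.
have() {
  command -v "$1" >/dev/null 2>&1
}

# Print one status line per prerequisite from the table above.
check_prereqs() {
  for tool in go git; do
    if have "$tool"; then
      echo "ok: $tool"
    else
      echo "missing: $tool"
    fi
  done
  # Either gcc or clang is enough, and only the WebRTC/Opus build needs one.
  if have gcc || have clang; then
    echo "ok: C compiler"
  else
    echo "missing: C compiler (only needed for WebRTC builds with Opus)"
  fi
}

check_prereqs
```

A missing C compiler is not an error if you only plan to use the default WebSocket-only build.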
Install Go and System Dependencies
- macOS
- Linux
- Windows
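On macOS, install Go using Homebrew, then verify the installation with go version. The CGO toolchain (Clang) ships with Xcode Command Line Tools. Install the tools or confirm they are present; if they are already installed, the install command exits immediately. Finally, verify that the compiler is available on your PATH.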
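The macOS steps can be sketched as two shell functions. This is an illustration under stated assumptions, not the project's own script: it assumes Homebrew is already installed, the function names are hypothetical, and `clang --version` is one reasonable way to check the compiler rather than a command taken from the Voxray docs.

```shell
# macOS only. Assumes Homebrew is already installed.
install_go_macos() {
  brew install go   # install the Go toolchain from Homebrew
  go version        # confirm Go is on PATH; Voxray needs 1.25 or newer
}

# Only needed for the WebRTC/Opus build, which uses CGO.
install_clang_macos() {
  # Requests the Xcode Command Line Tools, which include Clang.
  # If they are already installed the command reports that and exits
  # with a non-zero status, so the failure is ignored here.
  xcode-select --install || true
  clang --version   # confirm the compiler is on PATH
}
```

Run `install_go_macos` first, then `install_clang_macos` only if you plan to build with WebRTC support.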
Clone the Repository
Build Variants
Voxray ships two primary build targets. Choose based on which transport you need.

| Command | CGO | Transport Support | Binary Size |
|---|---|---|---|
| make build | Disabled | WebSocket only | ~15 MB |
| make build-voice | Enabled | WebSocket + WebRTC (Opus) | ~20 MB |
Default build: WebSocket only
No C compiler required. CGO is explicitly disabled so the binary is fully static and portable.
Voice build: WebSocket and WebRTC with Opus
Requires gcc or clang on your PATH. Enables the Opus encoder so the server can deliver TTS audio over WebRTC peer connections.
Linux / macOS:
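On Linux or macOS, choosing between the two make targets can be sketched as a small shell helper. The targets `build` and `build-voice` come from the table above; the function names `voxray_make_target` and `voxray_build` are hypothetical, and the sketch assumes you run it from the repository root where the Makefile lives.

```shell
# Map the transport you need to the make target from the build table.
voxray_make_target() {
  case "$1" in
    websocket) echo "build" ;;
    webrtc)    echo "build-voice" ;;
    *) echo "usage: voxray_make_target websocket|webrtc" >&2; return 1 ;;
  esac
}

# Build for the given transport. The voice build needs gcc or clang on PATH,
# so fail early with a clear message when neither is found.
voxray_build() {
  target="$(voxray_make_target "$1")" || return 1
  if [ "$target" = "build-voice" ] \
    && ! command -v gcc >/dev/null 2>&1 \
    && ! command -v clang >/dev/null 2>&1; then
    echo "no C compiler found; install gcc or clang first" >&2
    return 1
  fi
  make "$target"
}
```

For example, `voxray_build websocket` runs make build, and `voxray_build webrtc` runs make build-voice once a compiler is available.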
Docker
Voxray ships a production-ready multi-stage Dockerfile. The default Docker build produces a WebSocket-only binary (CGO disabled, static binary on Alpine). Build the image:
The Docker image exposes port 8080 by default. The VOXRAY_CONFIG environment variable is pre-set to /app/config.json in the image. Mount your config at that path or override VOXRAY_CONFIG to point elsewhere.
Environment Setup
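For the Docker image, environment setup comes down to mounting a config file at /app/config.json and publishing port 8080. The sketch below shows one way to do that with standard docker flags. The image tag `voxray:local` and the function names are hypothetical, and it assumes the Dockerfile sits in the directory you run the build from.

```shell
# Hypothetical image tag; override VOXRAY_IMAGE to use another name.
VOXRAY_IMAGE="${VOXRAY_IMAGE:-voxray:local}"

# Build the image from the Dockerfile in the current directory.
voxray_docker_build() {
  docker build -t "$VOXRAY_IMAGE" .
}

# Run the image with a host config mounted read-only at the path
# that VOXRAY_CONFIG points to inside the image.
voxray_docker_run() {
  config_path="${1:-$PWD/config.json}"
  docker run --rm \
    -p 8080:8080 \
    -v "$config_path:/app/config.json:ro" \
    "$VOXRAY_IMAGE"
}
```

Call `voxray_docker_run /path/to/config.json` once your config file exists. To keep the config at a different path inside the container, pass `-e VOXRAY_CONFIG=...` to docker run instead of relying on the default.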
Create your config file
Copy the example config and open it in your editor:
Four fields you must configure before first run
Every config file requires at minimum these four top-level settings before the server can start a voice pipeline:

| Field | Type | Description | Example |
|---|---|---|---|
| stt_provider | string | Speech-to-text provider name | "openai" |
| llm_provider | string | Language model provider name | "openai" |
| tts_provider | string | Text-to-speech provider name | "openai" |
| api_keys | object | Map of provider name to API key | {"openai": "sk-..."} |
Replace YOUR_OPENAI_API_KEY in the config with your actual key. API keys can also be supplied via environment variables (for example, OPENAI_API_KEY), so they do not need to be embedded in the config file.
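As an illustration of the four required fields, the following sketch writes a minimal config file. The field names and example values come from the table above; the function name `write_min_config` is hypothetical, and a real config copied from the project's example file will likely contain more settings than this.

```shell
# Write a minimal config containing only the four required top-level fields.
# Values mirror the examples in the table; replace the key placeholder.
write_min_config() {
  cat > "$1" <<'EOF'
{
  "stt_provider": "openai",
  "llm_provider": "openai",
  "tts_provider": "openai",
  "api_keys": {
    "openai": "YOUR_OPENAI_API_KEY"
  }
}
EOF
}
```

Run `write_min_config config.json` and then edit the placeholder. If you prefer not to store the key on disk, export OPENAI_API_KEY in your shell before starting the server instead.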
Verify the Installation
Start the server:
Next Steps
WebSocket Quickstart
Connect a client to the WebSocket endpoint and make your first voice call.
WebRTC Quickstart
Build with CGO and connect a browser client via WebRTC for real-time audio.
Configuration Reference
Explore all config fields: transports, providers, recording, transcripts, and more.
Supported Providers
See the full matrix of STT, LLM, and TTS providers and their config keys.