Neiroha VoxCPM2

This page covers the Neiroha VoxCPM2 local backend. Extract the portable Release package, or place the source repository anywhere; <backend-root> refers to that directory.

It provides OpenAI-compatible routes, native /api/voxcpm routes, a voice registry, the Neiroha Gradio Admin, and an optional official WebUI. Startup behavior is controlled by configs/server.toml.

Figure: Neiroha VoxCPM Admin home. The backend Admin loads models, manages the voice registry, runs tests, and displays logs.

Capability Summary

| Dimension | Current Notes |
| --- | --- |
| Recommended version | VoxCPM2 is the current official deployment version (2B parameters, 48 kHz output). |
| Languages | The official list includes 30 languages: Arabic, Burmese, Chinese, Danish, Dutch, English, Finnish, French, German, Greek, Hebrew, Hindi, Indonesian, Italian, Japanese, Khmer, Korean, Lao, Malay, Norwegian, Polish, Portuguese, Russian, Spanish, Swahili, Swedish, Tagalog, Thai, Turkish, Vietnamese. |
| Dialects | The official list includes 9 Chinese dialects: Sichuanese, Cantonese, Wu, Northeastern, Henan, Shaanxi, Shandong, Tianjin, Minnan. |
| Cross-language output | Supports multilingual synthesis and cross-language reference-audio cloning. |
| Text voice design | voxcpm2-design needs no reference audio; describe age, gender, tone, emotion, and speed in natural-language text. |
| Controllable clone | voxcpm2-clone uses reference_audio and does not need prompt text. |
| Ultimate clone | voxcpm2-ultimate-clone needs prompt_audio and an exact prompt_text; do not rely on bracket style control in this mode. |
| Official speed reference | The official PyTorch implementation reports a real-time factor (RTF) of about 0.30 on an RTX 4090; with Nano-vLLM / vLLM-Omni acceleration it is about 0.13. |
| Boundaries | Very short text may sound weak. Long text can cause the speech to speed up, produce noise, fail to stop, or run out of memory (OOM); split by sentence for production. |
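
The advice to split long text by sentence can be sketched as below. This is a minimal illustration, not part of the backend: the function name split_for_tts and the 200-character limit are assumptions chosen for the example, and the punctuation set only covers Latin and CJK sentence endings.

```python
import re

# Latin sentence endings need trailing whitespace (so "3.14" is not split);
# CJK full-width endings usually have no whitespace after them.
_BOUNDARY = re.compile(r"(?<=[.!?])\s+|(?<=[。！？])\s*")


def split_for_tts(text: str, max_chars: int = 200) -> list[str]:
    """Split text into sentences, then pack them into chunks of at most
    max_chars characters. max_chars is an illustrative value, not a
    documented backend limit. A single sentence longer than max_chars
    is kept whole rather than cut mid-sentence."""
    sentences = [s.strip() for s in _BOUNDARY.split(text) if s.strip()]
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # Sentences are re-joined with a single space inside a chunk.
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Packing several short sentences into one chunk also avoids sending very short inputs on their own, which the table notes may sound weak.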

Default Addresses

| Service | Default Address | Purpose |
| --- | --- | --- |
| FastAPI | http://127.0.0.1:8000 | Neiroha provider connects here. |
| Admin | http://127.0.0.1:7860 | Manage voices, model presets, tests, and logs. |
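
A quick way to confirm that the FastAPI service is reachable at its default address is to request /health. The sketch below uses only the Python standard library; the helper names endpoint and is_healthy are hypothetical, and treating HTTP 200 as "healthy" is an assumption, since the response body of /health is not described here.

```python
import urllib.request

API_BASE = "http://127.0.0.1:8000"  # default FastAPI address


def endpoint(path: str, base: str = API_BASE) -> str:
    """Join a base URL and a route path with exactly one slash between them."""
    return base.rstrip("/") + "/" + path.lstrip("/")


def is_healthy(base: str = API_BASE, timeout: float = 3.0) -> bool:
    """Return True if GET /health answers with HTTP 200 (assumed success signal).

    Connection errors, timeouts, and HTTP error statuses all count as unhealthy.
    """
    try:
        with urllib.request.urlopen(endpoint("/health", base), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # URLError, HTTPError, and socket timeouts derive from OSError
        return False
```

Calling is_healthy() right after startup is a cheap check before pointing a Neiroha provider at the backend.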

Install

The portable package is the recommended install path. The current package is built for NVIDIA GPU / CUDA environments and mainly targets RTX 30 / 40 / 50 series users:

  1. Open the Neiroha-VoxCPM V1.0.0 Release page.
  2. Download all V1.0.0 split archives: Neiroha-VoxCPM-portable.7z.001, .002, .003, .004.
  3. If GitHub downloads are unstable, use the Baidu Netdisk mirror from the Release body.
  4. Put all four files in the same directory and extract from .001 with 7-Zip.
  5. Run start_portable.bat.

Source or development environment:

pixi install
pixi run install

Optional ASR model for automatic prompt text transcription in ultimate clone:

pixi run install-asr

ASR is disabled by default, so ultimate_clone normally needs the prompt text entered manually.

Start

Portable Release:

.\start_portable.bat

Source environment:

pixi run serve

Common tasks:

| Command | Purpose |
| --- | --- |
| pixi run serve | Start the surface selected by [startup].surface in configs/server.toml (default: API + Admin) |
| pixi run api | Start FastAPI only |
| pixi run admin | Start Neiroha Admin and connect to an existing FastAPI |
| pixi run smoke | Check /health, /v1/models, /v1/audio/voices, and capabilities |
| pixi run test | Run backend tests |
| pixi run launcher-help | Show launcher arguments |

Connect Neiroha

  1. Open Providers.
  2. Create a provider with adapter type VoxCPM2 Native.
  3. Set Base URL to http://127.0.0.1:8000.
  4. Leave API Key empty if local auth is disabled.
  5. Click Fetch All.
  6. Confirm that voxcpm2-design, voxcpm2-clone, and voxcpm2-ultimate-clone are listed.
  7. Enable the provider and click Health Check.
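
The model check in the steps above can also be done outside the Neiroha UI. The sketch below assumes that /v1/models returns the OpenAI-style list shape {"data": [{"id": ...}, ...]}; that shape is an assumption based on the route being OpenAI-compatible, and the helper names are hypothetical.

```python
import json
import urllib.request

EXPECTED_MODELS = {"voxcpm2-design", "voxcpm2-clone", "voxcpm2-ultimate-clone"}


def fetch_models(base: str = "http://127.0.0.1:8000", timeout: float = 5.0) -> dict:
    """GET /v1/models and parse the JSON body."""
    with urllib.request.urlopen(base.rstrip("/") + "/v1/models", timeout=timeout) as resp:
        return json.load(resp)


def missing_models(models_response: dict) -> set[str]:
    """Return the expected model ids that the response does not list.

    Assumes the OpenAI-style shape {"data": [{"id": "..."}, ...]}.
    """
    listed = {item.get("id") for item in models_response.get("data", [])}
    return EXPECTED_MODELS - listed
```

With the backend running, missing_models(fetch_models()) should return an empty set if all three models are exposed.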

When Neiroha runs in an Android emulator, use the emulator's alias for the host machine as the Base URL:

http://10.0.2.2:8000
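
Inside the Android emulator, 10.0.2.2 is the alias for the host machine's loopback interface, so a loopback Base URL has to be rewritten. A minimal sketch of that rewrite follows; the function name for_android_emulator is hypothetical, and the code ignores any user:password part of the URL.

```python
from urllib.parse import urlsplit, urlunsplit

ANDROID_HOST_ALIAS = "10.0.2.2"  # emulator alias for the host's 127.0.0.1
_LOOPBACK_HOSTS = {"127.0.0.1", "localhost"}


def for_android_emulator(base_url: str) -> str:
    """Replace a loopback host with the emulator alias, keeping scheme, port, and path.

    URLs that do not point at loopback are returned unchanged.
    """
    parts = urlsplit(base_url)
    if parts.hostname not in _LOOPBACK_HOSTS:
        return base_url
    netloc = ANDROID_HOST_ALIAS if parts.port is None else f"{ANDROID_HOST_ALIAS}:{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
```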

Character Setup

| Goal | Setting |
| --- | --- |
| Text voice design | Select voxcpm2-design; write a natural-language voice description at the beginning of the text. |
| Reference clone | Select voxcpm2-clone; provide reference audio, no prompt text required. |
| High-fidelity clone | Select voxcpm2-ultimate-clone; provide prompt audio and prompt text. |
| Reusable local speaker | Register a voice in Admin or through /api/voxcpm/voices, then select it in Neiroha. |
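
The per-mode input requirements can be checked before a request is sent. The sketch below encodes them as a lookup table; reference_audio, prompt_audio, and prompt_text are the names used on this page, but their use as literal request keys is an assumption, and missing_inputs is a hypothetical helper.

```python
# Inputs each model needs, as described on this page.
REQUIRED_INPUTS = {
    "voxcpm2-design": (),  # voice is described in the text itself
    "voxcpm2-clone": ("reference_audio",),
    "voxcpm2-ultimate-clone": ("prompt_audio", "prompt_text"),
}


def missing_inputs(model: str, inputs: dict) -> list[str]:
    """Return the required input names that are absent or empty for the given model."""
    if model not in REQUIRED_INPUTS:
        raise ValueError(f"unknown model: {model}")
    return [name for name in REQUIRED_INPUTS[model] if not inputs.get(name)]
```

For example, missing_inputs("voxcpm2-ultimate-clone", {"prompt_audio": "ref.wav"}) returns ["prompt_text"], which flags the missing transcript.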

VoxCPM2 recommends natural-language style prompts, for example:

(A young woman, gentle and sweet voice)Hello, welcome to VoxCPM2.

Do not depend on undocumented square-bracket tokens.
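
Building that prefix in code keeps the format consistent across characters. The helper below is a hypothetical convenience, not a backend function; it only reproduces the parenthesized-description pattern shown in the example above.

```python
def with_voice_description(description: str, text: str) -> str:
    """Prefix text with a parenthesized natural-language voice description.

    Surrounding whitespace and any parentheses already around the description
    are removed first, so the result always has exactly one pair.
    """
    description = description.strip().strip("()").strip()
    if not description:
        return text
    return f"({description}){text}"


prompt = with_voice_description(
    "A young woman, gentle and sweet voice",
    "Hello, welcome to VoxCPM2.",
)
# prompt == "(A young woman, gentle and sweet voice)Hello, welcome to VoxCPM2."
```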

API Prefix

OpenAI-compatible routes:

| Method | Path | Purpose |
| --- | --- | --- |
| GET | /health | Health check |
| GET | /v1/models | List voice sets |
| GET | /v1/audio/voices | List voice profiles |
| GET | /v1/audio/speakers | Speaker-list compatibility |
| POST | /v1/audio/speech | OpenAI-compatible synthesis |
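
A synthesis call against /v1/audio/speech might look like the sketch below. It assumes the backend accepts the standard OpenAI speech fields (model, input, and optionally voice), takes a Bearer token when local auth is enabled, and returns raw audio bytes in the response body; none of these details are confirmed on this page, and both function names are hypothetical.

```python
import json
import urllib.request


def build_speech_request(text, model="voxcpm2-design",
                         base="http://127.0.0.1:8000", voice=None, api_key=None):
    """Build (but do not send) a POST request for /v1/audio/speech.

    The payload keys follow the OpenAI speech API and are an assumption here.
    """
    payload = {"model": model, "input": text}
    if voice is not None:
        payload["voice"] = voice
    headers = {"Content-Type": "application/json"}
    if api_key:  # leave unset when local auth is disabled
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        base.rstrip("/") + "/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def synthesize(request, out_path, timeout=120.0):
    """Send the request and write the response body (assumed audio bytes) to out_path."""
    with urllib.request.urlopen(request, timeout=timeout) as resp, open(out_path, "wb") as fh:
        fh.write(resp.read())
```

Separating request construction from sending makes the payload easy to inspect or log before any audio is generated.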

Native prefix is /api/voxcpm. Legacy /voxcpm/* routes remain for compatibility; new integrations should use the standard prefix.

Sources