Connect Local Inference Backends

Local inference backends suit machines with a local GPU, inference servers on the LAN, and workflows where text must stay on your own hardware. Neiroha does not train models; it forwards requests from the UI, queue, projects, and local API to TTS services that are already running.

Pre-Connection Checklist

Start the TTS backend and confirm the real listening address in the terminal or logs.
On the machine running Neiroha, open the backend's `/health`, `/v1/models`, or voice list URL.
If Neiroha runs in an Android emulator, use `10.0.2.2` to reach the host machine, not `127.0.0.1`.
If Neiroha runs on an Android phone, use the computer's LAN IP and allow the port through Windows Firewall.
Return to Providers in Neiroha and add or edit a provider.
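
The reachability check in the list above can also be scripted. The following is a minimal sketch, assuming the backend serves `/health` and `/v1/models` under its service root; the helper names `probe_urls` and `probe` are hypothetical and not part of Neiroha.

```python
import urllib.error
import urllib.request

# Paths taken from the checklist; whether a given backend serves both is an assumption.
PROBE_PATHS = ("/health", "/v1/models")


def probe_urls(service_root: str) -> list[str]:
    """Build the URLs to try from a service root such as http://127.0.0.1:9880."""
    root = service_root.rstrip("/")
    return [root + path for path in PROBE_PATHS]


def probe(url: str, timeout: float = 3.0) -> str:
    """Return a short status string for one URL instead of raising."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return f"HTTP {response.status}"
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (for example 404).
        return f"HTTP {exc.code}"
    except (urllib.error.URLError, OSError) as exc:
        # Nothing answered: wrong address, closed port, or firewall.
        return f"unreachable ({exc})"
```

Calling `probe(url)` for each entry of `probe_urls("http://127.0.0.1:9880")` separates "the server answered with an error" from "nothing is listening", which are fixed in different places.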

Common Adapters

Backend Type	Neiroha Adapter	Base URL Example	Character Setup
OpenAI-compatible TTS	OpenAI TTS API Compatible	`http://127.0.0.1:8880/v1`	Model and preset voice
GPT-SoVITS	GPT-SoVITS	`http://127.0.0.1:9880`	Trained voice or reference-audio clone
CosyVoice3	CosyVoice Native	`http://127.0.0.1:9880`	Prompt clone, cross-lingual clone, instruct
VoxCPM2	VoxCPM2 Native	`http://127.0.0.1:8000`	Registered voice, voice design, clone
Windows system voice	Windows System TTS	Empty	Enumerates local Windows SAPI voices

CosyVoice3 and GPT-SoVITS both default to port 9880. To run both at once, change one backend's `[api].port` in `configs/server.toml`, or let the launcher pick a random port and copy the address it logs into Neiroha.
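
Before starting a second backend, it can help to check whether the default port is already taken. This is a small sketch using only the Python standard library; `port_is_free` is a hypothetical helper, not something the backends ship.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing is currently bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            # Binding fails with OSError when another process holds the port.
            sock.bind((host, port))
        except OSError:
            return False
        return True
```

If `port_is_free(9880)` returns `False` while one backend is running, the other backend needs a different port before it can start.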

Windows Portable Backend Packages

Local backends are available as portable release packages that run without a full development environment. The current Windows portable packages are built for NVIDIA GPU / CUDA environments and mainly target RTX 30 / 40 / 50 series users. Download all split archive parts into the same directory, then open the `.001` part with 7-Zip to extract the whole archive.

Backend	GitHub Release	Baidu Netdisk Mirror	Current Asset Pattern
GPT-SoVITS	V1.0.0	Mirror	`Neiroha-GPT-SoVITS-Portable.7z.001` through `.003`
VoxCPM2	V1.0.0	Mirror	`Neiroha-VoxCPM-portable.7z.001` through `.004`
CosyVoice3	V1.0.0	Mirror	`neiroha-cosyvoice3-portable.7z.001` through `.006`

Portable packages use `runtime/` under the extracted directory for logs, outputs, temporary files, and the voice registry. Keep all split parts together when moving the archive, and avoid running a package long-term from a system temporary directory.
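
Extraction fails if any split part is missing, so a quick presence check before running 7-Zip can save time. The sketch below assumes parts are numbered `.001`, `.002`, and so on, as in the asset patterns above; `missing_parts` is a hypothetical helper.

```python
from pathlib import Path


def missing_parts(directory: str, archive_name: str, part_count: int) -> list[str]:
    """List the expected split parts (.001, .002, ...) that are absent from directory."""
    folder = Path(directory)
    expected = [f"{archive_name}.{index:03d}" for index in range(1, part_count + 1)]
    return [name for name in expected if not (folder / name).is_file()]
```

For example, `missing_parts("downloads", "Neiroha-GPT-SoVITS-Portable.7z", 3)` returns an empty list only when all three parts sit in the same directory (the `downloads` path is a placeholder).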

Backend Selection Quick Reference

This table is a relative ranking for the current Neiroha Windows portable backends, not a universal hardware benchmark. More VRAM stars mean lower memory pressure; more speed stars mean faster synthesis. Actual results depend on GPU, driver, text length, reference audio, concurrency, and model preload state.

Backend	VRAM Floor	VRAM Friendliness	Synthesis Speed	Good For	Notes
GPT-SoVITS v2ProPlus	8 GB VRAM is safer	★★★★★	★★★★★	Trained voices, reference-audio cloning, batch generation	Lowest VRAM use and fastest among the three; clone mode needs reference text.
CosyVoice3 0.5B	8 GB VRAM recommended	★★★☆☆	★★★☆☆	Cross-lingual cloning, instruction control, multilingual trials	Broader capability set with middle-ground speed and VRAM use.
VoxCPM2	Official reference is about 8 GB VRAM	★★☆☆☆	★★☆☆☆	Voice design, multilingual and dialect coverage, high-fidelity cloning	Highest VRAM use and slowest among the three; 8 GB can run it, but start concurrency at `1`.

Source Environments and Multiple Backends

Neiroha local backend projects use Pixi to manage Python, Conda and PyPI dependencies, and common launch commands. If you run several inference backends on one machine long-term, building each backend from source and downloading only the model assets you need is usually easier to upgrade and debug, and keeps disk usage lower than maintaining several full portable packages.

Pixi reuses the rattler/Conda package cache and the uv/PyPI cache, and can share files through hard links where the filesystem supports them, so dependencies repeated across backends usually do not take up a full duplicate copy each. Model weights, sample voices, and runtime outputs are not shared across projects automatically; organize them by backend and model version.

OpenAI-Compatible Services

OpenAI-compatible TTS is the lowest-friction local protocol for Kokoro, XTTS, Orpheus, KoboldCpp, or a custom `/v1/audio/speech` wrapper.

Select OpenAI TTS API Compatible.
Set Base URL to the API version layer, for example `http://127.0.0.1:8880/v1`.
Leave API Key empty if the local service has no authentication.
Click Fetch All. Neiroha tries common list endpoints such as `models`, `audio/voices`, and `speakers`.
If the voice list comes back empty, type in a voice name that the backend supports when creating a character.
After Health Check passes, create a preset voice character and run Quick Test.
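
To test the backend outside Neiroha, you can send the same kind of request yourself. This is a sketch of a request in the OpenAI speech API shape (`model`, `voice`, `input` posted to `audio/speech` under the Base URL); the helper names are hypothetical, and the model and voice values must be ones your backend actually reports.

```python
import json
import urllib.request


def build_speech_request(base_url: str, model: str, voice: str, text: str,
                         api_key: str = "") -> urllib.request.Request:
    """Build (but do not send) a POST to <base_url>/audio/speech."""
    payload = {"model": model, "voice": voice, "input": text}
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Only sent when the local service requires authentication.
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def synthesize(request: urllib.request.Request, out_path: str,
               timeout: float = 60.0) -> None:
    """Send the request and write the returned audio bytes to out_path."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        with open(out_path, "wb") as handle:
            handle.write(response.read())
```

Because the Base URL already ends in `/v1`, the helper appends only `/audio/speech`. If this request fails while the health endpoint responds, the model or voice name is the most likely culprit.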

GPT-SoVITS

GPT-SoVITS is useful for trained speaker voices and reference-audio cloning.

Start the backend: the portable package uses `start_portable.bat serve`; the source environment uses `pixi run serve`.
Select the GPT-SoVITS adapter.
Set Base URL to the service root, default `http://127.0.0.1:9880`.
Click Fetch All. The backend provides `/v1/models`, `/v1/audio/voices`, and `/api/gpt-sovits/voices`.
Create characters with either:
- Registered voice: select a server voice such as `genshin-keqing`.
- Clone: upload reference audio and fill in the reference text, prompt language, and target text language.
Use it in Dialogue or Phase batches only after Quick Test succeeds.

CosyVoice Native

CosyVoice Native uses Neiroha's JSON / multipart adapter and does not need the backend to pretend to be a pure OpenAI service.

Start the backend: the portable package uses `start_portable.bat`; the source environment uses `pixi run serve`.
Select CosyVoice Native.
Set Base URL to the service root, default `http://127.0.0.1:9880`.
Health Check calls `/health`.
Fetch All reads `/v1/models`, `/v1/audio/voices`, and `/api/cosyvoice/voices`.
Fill character fields by mode: `prompt_clone` needs reference audio and prompt text; `cross_lingual` needs only reference audio; `instruct` needs reference audio and an instruction.
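
The per-mode requirements above amount to a small lookup table. The sketch below encodes them for a pre-flight check; the field names (`reference_audio`, `prompt_text`, `instruction`) are hypothetical labels chosen for illustration, not the adapter's actual parameter names.

```python
# Required inputs per CosyVoice Native mode, as described in the steps above.
# Field names are hypothetical labels, not the adapter's real parameter names.
REQUIRED_FIELDS = {
    "prompt_clone": {"reference_audio", "prompt_text"},
    "cross_lingual": {"reference_audio"},
    "instruct": {"reference_audio", "instruction"},
}


def missing_fields(mode: str, character: dict) -> list[str]:
    """Return the required fields that are absent or empty for the given mode."""
    if mode not in REQUIRED_FIELDS:
        raise ValueError(f"unknown mode: {mode!r}")
    return sorted(name for name in REQUIRED_FIELDS[mode] if not character.get(name))
```

An empty result means the character has everything its mode needs; anything else names the fields to fill in before running Quick Test.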

VoxCPM2 Native

VoxCPM2 Native supports registered voices, natural-language voice design, and reference-audio cloning.

Start the backend: the portable package uses `start_portable.bat`; the source environment uses `pixi run serve`.
Select VoxCPM2 Native.
Set Base URL to `http://127.0.0.1:8000` or your actual service address.
Fetch All reads `/v1/models`, `/v1/audio/voices`, and `/api/voxcpm/voices`.
Create characters using registered voice, design, clone, or ultimate clone.
`clone` needs reference audio but not reference text; `ultimate_clone` needs reference audio and matching prompt text.

Connecting Android to a Local Backend

Neiroha Location	Backend Location	Base URL
Windows desktop Neiroha	Same Windows machine	`http://127.0.0.1:port`
Android emulator	Host Windows machine	`http://10.0.2.2:port`
Android phone	LAN computer	`http://LAN-IP:port`
Android phone	Public server	`https://domain` or public IP

If a phone cannot reach the service, open the same address in the phone's browser first. If the browser also fails, check the firewall, the backend's listening address, any proxy settings, and LAN isolation on the router.
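
The first three rows of the table above reduce to a choice of host. A minimal sketch, assuming the location keys `desktop`, `emulator`, and `phone` (hypothetical names used only here):

```python
def base_url_for(location: str, port: int, lan_ip: str = "") -> str:
    """Return the Base URL for a backend on the Windows host, by Neiroha location."""
    if location == "desktop":
        host = "127.0.0.1"  # same machine as the backend
    elif location == "emulator":
        host = "10.0.2.2"  # Android emulator alias for the host machine
    elif location == "phone":
        if not lan_ip:
            raise ValueError("a phone needs the computer's LAN IP")
        host = lan_ip
    else:
        raise ValueError(f"unknown location: {location!r}")
    return f"http://{host}:{port}"
```

The phone case is the only one that depends on your network: the LAN IP changes between networks, and the backend must listen on more than localhost for that address to work.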

Common Failures

Symptom	Common Cause	Fix
Health Check fails	Wrong URL layer or closed port	OpenAI-compatible adapters usually need a Base URL ending in `/v1`; native adapters usually use the service root.
Emulator cannot reach host	Used `127.0.0.1`	Use `10.0.2.2`.
Phone cannot reach computer	Firewall block or backend only listens on localhost	Bind backend to `0.0.0.0` and allow the port.
Fetch All is empty	Backend lacks list APIs or the port points to the wrong service	Open `/v1/models` and voice list manually, then fill model and voice if needed.
Batch generation stalls	Insufficient local VRAM or concurrency set too high	Start provider max concurrency at `1`.
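
The URL-layer rule from the first row can be sketched as a small normalizer. It follows the "usually" in the table, so treat it as a starting point rather than a guarantee for every backend; `normalize_base_url` is a hypothetical helper.

```python
def normalize_base_url(url: str, openai_compatible: bool) -> str:
    """Add or strip a trailing /v1 depending on the adapter family."""
    trimmed = url.strip().rstrip("/")
    has_v1 = trimmed.endswith("/v1")
    if openai_compatible and not has_v1:
        # OpenAI-compatible adapters usually expect the API version layer.
        return trimmed + "/v1"
    if not openai_compatible and has_v1:
        # Native adapters usually expect the service root.
        return trimmed[: -len("/v1")]
    return trimmed
```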
