- Whisper-ASR ASR http://localhost:8001 Speech-to-text / subtitles
nohup ~/llama/whisper.cpp/build/bin/whisper-server -m /data/models/asr/whisper-large-v3-q8_0.gguf \
--host 0.0.0.0 --port 8001 -ngl 19
- CosyVoice-TTS TTS http://localhost:8002 Text-to-speech
export PYTHONPATH=~/llama/CosyVoice/third_party/Matcha-TTS//Matcha-TTS:$PYTHONPATH
nohup python3 ~/llama/tts/tts_server2.py > /var/llama/logs/tts-server.log 2>&1 &
- RapidOCR OCR http://localhost:8003 PDF/OCR text recognition
nohup python3 ~/llama/ocr/ocr_server.py > /var/llama/logs/ocr-server.log 2>&1 &
- LiteLLM Smart routing http://localhost:8000 Unified entry point for all large language models
- Uses a dedicated routing model: /data/models/TCAndon-Router
nohup ~/llama/llama-bin/llama-server -m /data/models/TCAndon-Router-Q5_K_M.gguf --host 127.0.0.1 --port 8010 -ngl 99 --log-file /var/llama/logs/tcandon_router.log > /var/llama/logs/server8000.log 2>&1 &
- Other local tasks: Qwen3.5-122B-A10B-heretic.i1-IQ4_XS.gguf (reaches up to 27 tokens/s)
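
After starting the services, a quick port check can confirm that each one is accepting connections. The following is a minimal sketch, not part of the original setup: the ports come from the list above, the names are used only as labels, and the helper functions `port_open` and `check_services` are hypothetical. It assumes bash (for its `/dev/tcp` redirection) and the coreutils `timeout` command are available, and it only tests whether the TCP port accepts a connection, not whether the service behind it is healthy.

```shell
#!/usr/bin/env bash
# Assumed port:label pairs, copied from the service list above.
services="8001:Whisper-ASR 8002:CosyVoice-TTS 8003:RapidOCR 8000:LiteLLM 8010:TCAndon-Router"

# port_open HOST PORT: succeeds if a TCP connection can be opened within 2 seconds.
# Relies on bash's /dev/tcp pseudo-device; a refused or timed-out connection fails.
port_open() {
  timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# check_services: print one "name (port): up|down" line per configured service.
check_services() {
  for entry in $services; do
    port="${entry%%:*}"   # text before the first colon
    name="${entry#*:}"    # text after the first colon
    if port_open 127.0.0.1 "$port"; then
      echo "$name ($port): up"
    else
      echo "$name ($port): down"
    fi
  done
}

check_services
```

A "down" line most likely means the process did not start or is still loading its model; the matching log file under /var/llama/logs is the next place to look.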