GPT‑SoVITS is a voice conversion and text-to-speech (TTS) system that supports zero‑shot and few‑shot synthesis from a short vocal sample (e.g., 5 seconds). It supports cross‑lingual speech synthesis across English, Chinese, Japanese, Korean, Cantonese, and more. It builds on the VITS architecture, adapted for few‑sample training and real‑time use.
Features
- Zero‑shot TTS: generate speech from a 5‑second voice sample
- Few‑shot fine-tuning: 1 minute of data for improved voice likeness
- Cross-lingual support across multiple languages
- Web UI for inference and batch generation
- Open-source with pretrained model weights
- Active community and publication‑grade performance
Categories
Voice Cloning
License
MIT License