Voice Cloner XTTS-v2

Clone voices from a single audio file and generate speech in 16 languages.

© 2024 Grandline AI, Inc. All rights reserved.


About

XTTS-v2 is a voice cloning model released by Coqui. It is known for its improved voice cloning, better audio quality, impressive prosody, and expressiveness.

Features

Multi-lingual: Supports speech generation in 16 languages.
Cross-language voice cloning: Can use a voice in one language to generate speech in another language.
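
As an illustration of cross-language cloning, here is a minimal sketch that assumes the open-source Coqui `TTS` Python package and its `tts_models/multilingual/multi-dataset/xtts_v2` model, not this site's own interface. The helper names `load_xtts` and `clone_across_languages`, and the file names in the example, are hypothetical.

```python
from pathlib import Path


def load_xtts(device="cpu"):
    """Load XTTS-v2 via the Coqui TTS package (assumed to be installed)."""
    # Imported lazily so the rest of this sketch works without the package.
    from TTS.api import TTS
    return TTS("tts_models/multilingual/multi-dataset/xtts_v2").to(device)


def clone_across_languages(tts, text, speaker_wav, language, out_path="output.wav"):
    """Speak `text` in `language` using the voice found in `speaker_wav`."""
    if not text.strip():
        raise ValueError("text must not be empty")
    # The reference clip may be in a different language than `language`.
    tts.tts_to_file(
        text=text,
        speaker_wav=str(speaker_wav),
        language=language,
        file_path=str(out_path),
    )
    return Path(out_path)


# Example (downloads the model on first use; file names are placeholders):
# tts = load_xtts()
# clone_across_languages(tts, "Bonjour tout le monde.", "reference_en.wav", "fr", "hello_fr.wav")
```

In this sketch an English reference recording would drive French output; the language code is passed separately from the reference file, which is what makes the cross-language case possible.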

Limitations

Not Perfect: Works well in most cases, but the generated audio may contain artifacts.
Needs good input audio quality: Voice cloning depends on the reference audio. The cleaner the reference recording, the better the generated audio.

Usage Tips

Only 1 Voice in Reference Speaker File: The reference speaker file should contain only one voice. If it contains multiple voices, the quality of the generated audio will degrade.
Clear Speech: The reference speaker should speak clearly.
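
Some of these tips can be roughly pre-checked before uploading. The sketch below uses only Python's standard `wave` module on an uncompressed PCM WAV file; the function name `check_reference_wav` and both thresholds are assumptions for illustration, not documented requirements of XTTS-v2. It cannot tell whether the clip holds more than one speaker or whether the speech is clear, so listening to the file remains necessary.

```python
import wave

# Hypothetical thresholds chosen for illustration only.
MIN_SECONDS = 3.0
MIN_SAMPLE_RATE = 16000


def check_reference_wav(path):
    """Return a list of warnings for a PCM WAV file (an empty list means none)."""
    warnings = []
    with wave.open(str(path), "rb") as wav:
        channels = wav.getnchannels()
        rate = wav.getframerate()
        seconds = wav.getnframes() / float(rate)
    if seconds < MIN_SECONDS:
        warnings.append(f"clip is short ({seconds:.1f}s)")
    if rate < MIN_SAMPLE_RATE:
        warnings.append(f"low sample rate ({rate} Hz)")
    if channels > 1:
        warnings.append("multi-channel file; confirm it holds a single speaker")
    return warnings
```

Calling `check_reference_wav("reference.wav")` (a placeholder file name) returns a list of human-readable warnings that can be shown before the clip is used.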