tts_transformer-tr-cv7
facebook
Introduction
TTS_TRANSFORMER-TR-CV7 is a Turkish text-to-speech model built with the fairseq toolkit. It provides a single-speaker male voice and was trained on the Common Voice v7 dataset.
Architecture
The model uses a Transformer architecture to convert text to speech, as described in the fairseq S^2 paper. It is built on fairseq, which that paper presents as a scalable and integrable toolkit for speech synthesis.
Training
The model is trained on the Common Voice v7 dataset, a crowdsourced collection of voice recordings from which the model learns to generate Turkish speech.
Guide: Running Locally
To use the TTS_TRANSFORMER-TR-CV7 model locally, follow these steps:
- Install fairseq: Ensure you have the fairseq library installed.
- Load the Model: Use load_model_ensemble_and_task_from_hf_hub to load the model from Hugging Face's model hub.
- Prepare the Text: Define the input text you want to convert to speech.
- Generate Speech: Use TTSHubInterface to generate audio from the text.
- Play the Audio: Use IPython.display.Audio to play the generated audio.
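Before running these steps, it can help to confirm that the packages they rely on are importable. The following is a minimal sketch, not an official requirements check: the helper name missing_packages is hypothetical, and the package names it is called with (fairseq, IPython) are simply taken from the imports used in this guide.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found on this system."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Package names taken from the imports in this guide (an assumption,
# not an official dependency list).
missing = missing_packages(["fairseq", "IPython"])
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All packages used in this guide are importable.")
```

If anything is reported as missing, install it with your usual package manager before continuing.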
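from fairseq.checkpoint_utils import load_model_ensemble_and_task_from_hf_hub
from fairseq.models.text_to_speech.hub_interface import TTSHubInterface
import IPython.display as ipd

# Download the checkpoint from the Hugging Face Hub and build the task.
models, cfg, task = load_model_ensemble_and_task_from_hf_hub(
    "facebook/tts_transformer-tr-cv7",
    arg_overrides={"vocoder": "hifigan", "fp16": False},
)
model = models[0]

# Sync the generation config with the task's data config, then build the generator.
TTSHubInterface.update_cfg_with_data_cfg(cfg, task.data_cfg)
# Note: some fairseq versions expect a list of models here, i.e. [model].
generator = task.build_generator(model, cfg)

# "Hello, this is a test run."
text = "Merhaba, bu bir deneme çalışmasıdır."

# Convert the text to model input and synthesize the waveform.
sample = TTSHubInterface.get_model_input(task, text)
wav, rate = TTSHubInterface.get_prediction(task, model, generator, sample)

# Play the audio (renders a player in a Jupyter notebook).
ipd.Audio(wav, rate=rate)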
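Outside a notebook there is no player for IPython.display.Audio to render into, so writing the waveform to a file is often more practical. The sketch below uses only the Python standard library. The helper save_wav is hypothetical, and it assumes the samples are mono floats in the range [-1, 1]; for the code above that would mean passing something like wav.cpu().tolist(), assuming wav is a 1-D float tensor.

```python
import struct
import wave


def save_wav(samples, rate, path):
    """Write mono float samples in [-1.0, 1.0] to a 16-bit PCM WAV file.

    Returns the number of frames written.
    """
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to the valid range
        frames += struct.pack("<h", int(round(s * 32767)))  # little-endian int16
    with wave.open(path, "wb") as f:
        f.setnchannels(1)   # mono
        f.setsampwidth(2)   # 2 bytes per sample = 16-bit PCM
        f.setframerate(int(rate))
        f.writeframes(bytes(frames))
    return len(frames) // 2


# Hypothetical usage with the variables from the example above:
# save_wav(wav.cpu().tolist(), rate, "output.wav")
```

Converting sample by sample in pure Python is slow for long clips; it is used here only to keep the sketch free of extra dependencies.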
Consider using cloud GPUs such as those offered by AWS, Google Cloud, or Azure for faster processing and inference.
License
The model and its usage are subject to the licensing terms specified by fairseq, Facebook AI Research, and Hugging Face. Please refer to their respective licensing documents for detailed information.