CosyVoice-300M-Instruct
Introduction
CosyVoice-300M-Instruct is a text-to-speech model developed by FunAudioLLM that supports zero-shot, cross-lingual, SFT (supervised fine-tuning), and instruct inference modes. It is part of the CosyVoice suite, offering advanced features for generating and manipulating voice data.
Architecture
CosyVoice utilizes a multi-faceted architecture to support various inference methodologies, including zero-shot, cross-lingual, and SFT. It integrates components from several other projects, such as FunASR, FunCodec, and Matcha-TTS, to enhance its functionality and performance in voice processing tasks.
Training
The CosyVoice models, including CosyVoice-300M, CosyVoice-300M-SFT, and CosyVoice-300M-Instruct, are pretrained and available for download. Users interested in training from scratch are advised to follow the provided examples and scripts, which guide them through the training process using the provided resources and dependencies.
Guide: Running Locally
1. Clone the Repository

   ```shell
   git clone --recursive https://github.com/FunAudioLLM/CosyVoice.git
   cd CosyVoice
   git submodule update --init --recursive
   ```
2. Set Up Environment
   - Install Conda: Miniconda Installation Guide
   - Create and activate the environment:

   ```shell
   conda create -n cosyvoice python=3.8
   conda activate cosyvoice
   conda install -y -c conda-forge pynini==2.1.5
   pip install -r requirements.txt -i https://mirrors.aliyun.com/pypi/simple/ --trusted-host=mirrors.aliyun.com
   ```
3. Install Additional Dependencies
   - For Ubuntu:

   ```shell
   sudo apt-get install sox libsox-dev
   ```

   - For CentOS:

   ```shell
   sudo yum install sox sox-devel
   ```
4. Download Pretrained Models

   ```python
   from modelscope import snapshot_download

   snapshot_download('iic/CosyVoice-300M', local_dir='pretrained_models/CosyVoice-300M')
   # Repeat for other models as needed
   ```
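The download step above can be factored into a small helper so that every checkpoint lands in a consistent `pretrained_models/<name>` path. This is a minimal sketch: `local_dir` and `download_all` are hypothetical helper names, while `snapshot_download` is the modelscope call used above.

```python
# Hypothetical helpers for fetching the released CosyVoice checkpoints.
MODELS = [
    'iic/CosyVoice-300M',
    'iic/CosyVoice-300M-SFT',
    'iic/CosyVoice-300M-Instruct',
]

def local_dir(model_id: str) -> str:
    # Mirror the layout used above: pretrained_models/<model name>.
    return 'pretrained_models/' + model_id.split('/')[-1]

def download_all() -> None:
    # Imported lazily so the path logic stays testable offline.
    from modelscope import snapshot_download
    for model_id in MODELS:
        snapshot_download(model_id, local_dir=local_dir(model_id))
```

Calling `download_all()` fetches each model into its own directory, matching the paths the inference examples expect.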
5. Run Basic Usage and Inference
   - Use the CosyVoice class for the different inference modes:

   ```python
   from cosyvoice.cli.cosyvoice import CosyVoice

   cosyvoice = CosyVoice('pretrained_models/CosyVoice-300M-SFT')  # example: SFT inference
   ```
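A fuller end-to-end sketch of SFT inference is given below, assuming the repository's CLI API (an `inference_sft` generator yielding chunks with a `tts_speech` tensor) and a 22050 Hz output rate; `synthesize` and `chunk_path` are hypothetical helper names, and the method names should be checked against your checkout.

```python
def chunk_path(base: str, i: int) -> str:
    # out.wav -> out_0.wav, out_1.wav, ... for successive audio chunks.
    return base.replace('.wav', f'_{i}.wav')

def synthesize(text: str, out_path: str = 'sft_output.wav') -> None:
    # Assumes the CosyVoice CLI API from the repository; the speaker-listing
    # method and the 22050 Hz sample rate are assumptions, not verified here.
    import torchaudio
    from cosyvoice.cli.cosyvoice import CosyVoice
    cosyvoice = CosyVoice('pretrained_models/CosyVoice-300M-SFT')
    spk = cosyvoice.list_avaliable_spks()[0]  # pick a built-in speaker
    for i, chunk in enumerate(cosyvoice.inference_sft(text, spk)):
        torchaudio.save(chunk_path(out_path, i), chunk['tts_speech'], 22050)
```

Each yielded chunk is saved to its own file; a real application might instead concatenate the tensors before writing a single WAV.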
Cloud GPUs
Consider using cloud GPU services like AWS, GCP, or Azure for intensive processing tasks to ensure efficient model training and inference.
License
This project is intended for academic purposes and technical demonstration. It includes code adapted from several open-source projects; users should adhere to those projects' respective licenses. If you believe any content infringes your rights, contact the project maintainers.