spkrec-ecapa-voxceleb (SpeechBrain)

Introduction
This repository provides tools for speaker verification using a pretrained ECAPA-TDNN model from SpeechBrain. The model is trained on the VoxCeleb dataset and is capable of extracting speaker embeddings and performing verification tasks. It achieves an Equal Error Rate (EER) of 0.80% on the VoxCeleb1-test set.
Architecture
The system utilizes an ECAPA-TDNN model, which combines convolutional and residual blocks. Embeddings are extracted through attentive statistical pooling, and the model is trained using Additive Margin Softmax Loss. Speaker verification is accomplished by computing the cosine distance between speaker embeddings.
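As a small sketch of that final scoring step, the snippet below compares two embedding tensors with cosine similarity. The embedding dimension (192) and the decision threshold (0.25) are illustrative assumptions for this example, not values quoted from the released model.

import torch
import torch.nn.functional as F

# Illustrative embeddings only: in practice each one would come from the model,
# one embedding per utterance. The 192-dim size is an assumption for this sketch.
emb_a = torch.randn(1, 192)
emb_b = torch.randn(1, 192)

# Cosine similarity between the two embeddings; higher means more likely the same speaker.
score = F.cosine_similarity(emb_a, emb_b, dim=-1).item()

# The threshold (0.25 here) is a placeholder; a real system tunes it on a development set.
same_speaker = score > 0.25
print(score, same_speaker)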
Training
The model was trained using SpeechBrain with recordings sampled at 16kHz. To train from scratch:
- Clone the SpeechBrain repository.
git clone https://github.com/speechbrain/speechbrain/
- Install the required packages.
cd speechbrain
pip install -r requirements.txt
pip install -e .
- Execute the training script.
cd recipes/VoxCeleb/SpeakerRec
python train_speaker_embeddings.py hparams/train_ecapa_tdnn.yaml --data_folder=your_data_folder
Guide: Running Locally
- Install SpeechBrain:
pip install git+https://github.com/speechbrain/speechbrain.git@develop
- Compute Speaker Embeddings:
import torchaudio
from speechbrain.inference.speaker import EncoderClassifier
classifier = EncoderClassifier.from_hparams(source="speechbrain/spkrec-ecapa-voxceleb")
signal, fs = torchaudio.load('tests/samples/ASR/spk1_snt1.wav')
embeddings = classifier.encode_batch(signal)
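A quick sanity check, continuing the snippet above: encode_batch returns a tensor with one embedding per input waveform. The 192-dimensional embedding size is the usual ECAPA-TDNN configuration for this model, but confirm it on your own output.

# Continuing the snippet above: one embedding per waveform.
print(embeddings.shape)  # e.g. torch.Size([1, 1, 192]); confirm the last dimension on your setup
embedding = embeddings.squeeze()  # drop singleton dimensions for downstream use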
- Perform Speaker Verification:
from speechbrain.inference.speaker import SpeakerRecognition
verification = SpeakerRecognition.from_hparams(source="speechbrain/spkrec-ecapa-voxceleb", savedir="pretrained_models/spkrec-ecapa-voxceleb")
score, prediction = verification.verify_files("tests/samples/ASR/spk1_snt1.wav", "tests/samples/ASR/spk2_snt1.wav")
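Continuing the snippet above: score is the similarity score between the two files' embeddings and prediction is the resulting same-speaker decision; both are returned as one-element tensors.

# Usage note for the call above: unpack the one-element tensors returned by verify_files.
print(float(score))      # similarity score between the two embeddings
print(bool(prediction))  # True if the files were judged to come from the same speaker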
- Inference on GPU: add run_opts={"device":"cuda"} when calling from_hparams, as shown below.
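For example, a minimal sketch reusing the verification loader from the previous step (the savedir is the same one used above):

from speechbrain.inference.speaker import SpeakerRecognition

# Same loader as above; run_opts places the model and its computations on the GPU.
verification = SpeakerRecognition.from_hparams(
    source="speechbrain/spkrec-ecapa-voxceleb",
    savedir="pretrained_models/spkrec-ecapa-voxceleb",
    run_opts={"device": "cuda"},
)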
Cloud GPUs
For faster training and inference, consider running the model on cloud GPU services such as AWS, Google Cloud, or Azure.
License
This project is licensed under the Apache-2.0 License.