wav2vec2 large xlsr basque
cahya
Introduction
WAV2VEC2-LARGE-XLSR-BASQUE is an automatic speech recognition (ASR) model for the Basque language, fine-tuned from Facebook's Wav2Vec2-Large-XLSR-53 on the Common Voice dataset. Speech input must be sampled at 16 kHz.
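Because the model expects 16 kHz input, it can help to check a recording's sample rate before running inference. The sketch below is only an illustration: it uses Python's standard wave module, so it reads uncompressed WAV files only, and the helper names needs_resampling and make_silent_wav are hypothetical, not part of the model's tooling.

```python
import io
import wave

TARGET_RATE = 16_000  # sampling rate the model expects, in Hz


def needs_resampling(wav_file, target_rate=TARGET_RATE):
    """Return True if the WAV file's sample rate differs from target_rate.

    wav_file may be a path or a binary file-like object.
    """
    with wave.open(wav_file, "rb") as wav:
        return wav.getframerate() != target_rate


def make_silent_wav(rate, seconds=1):
    """Build an in-memory mono 16-bit WAV of silence (demo input only)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * rate * seconds)
    buffer.seek(0)
    return buffer


print(needs_resampling(make_silent_wav(48_000)))  # True
print(needs_resampling(make_silent_wav(16_000)))  # False
```

A clip flagged by this check would need resampling to 16 kHz before it is passed to the processor.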
Architecture
The model uses the Wav2Vec 2.0 architecture, in which a convolutional feature encoder turns raw audio into frame-level features that a Transformer then processes. It builds on XLSR (cross-lingual speech representations): the XLSR-53 checkpoint is pretrained on speech in 53 languages and can then be fine-tuned for a specific language, in this case Basque.
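To make the audio-to-Transformer step more concrete, the sketch below estimates how many feature frames a clip of a given length produces. It assumes the standard Wav2Vec 2.0 feature-encoder layout (seven unpadded 1-D convolutions with the kernel sizes and strides listed in the code). This card does not state the configuration, so treat these numbers as an assumption and check the model's config if they matter.

```python
# Assumed Wav2Vec 2.0 feature-encoder layout (not stated in this card).
CONV_KERNELS = (10, 3, 3, 3, 3, 2, 2)
CONV_STRIDES = (5, 2, 2, 2, 2, 2, 2)


def num_feature_frames(num_samples):
    """Number of frames left after the stack of unpadded 1-D convolutions."""
    length = num_samples
    for kernel, stride in zip(CONV_KERNELS, CONV_STRIDES):
        # Output length of a convolution with no padding and no dilation.
        length = (length - kernel) // stride + 1
    return length


one_second = 16_000  # samples in one second of 16 kHz audio
print(num_feature_frames(one_second))  # 49, roughly one frame per 20 ms
```

Under this assumption, one second of audio becomes 49 frames, and the CTC head emits one row of logits per frame.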
Training
The model was trained on the Basque subset of the Common Voice dataset, using the train, validation, and test splits. Detailed training methodologies and scripts can be found here.
Guide: Running Locally
- Environment Setup:
  - Install PyTorch and the Transformers library.
  - Install torchaudio and datasets.
- Load Dataset:

      from datasets import load_dataset

      test_dataset = load_dataset("common_voice", "eu", split="test[:2%]")
- Model and Processor Initialization:

      from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

      processor = Wav2Vec2Processor.from_pretrained("cahya-wirawan/wav2vec2-large-xlsr-basque")
      model = Wav2Vec2ForCTC.from_pretrained("cahya-wirawan/wav2vec2-large-xlsr-basque")
- Preprocess Audio Files:

      import torchaudio

      def speech_file_to_array_fn(batch):
          # Load the clip and resample it to the 16 kHz the model expects.
          speech_array, sampling_rate = torchaudio.load(batch["path"])
          resampler = torchaudio.transforms.Resample(sampling_rate, 16_000)
          batch["speech"] = resampler(speech_array).squeeze().numpy()
          return batch

      test_dataset = test_dataset.map(speech_file_to_array_fn)
- Inference:

      import torch

      inputs = processor(test_dataset[:2]["speech"], sampling_rate=16_000, return_tensors="pt", padding=True)

      with torch.no_grad():
          logits = model(inputs.input_values, attention_mask=inputs.attention_mask).logits

      # Pick the most likely token at each frame, then decode the IDs to text.
      predicted_ids = torch.argmax(logits, dim=-1)
      print("Prediction:", processor.batch_decode(predicted_ids))
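The inference step takes the arg-max token at every frame and lets processor.batch_decode turn the resulting IDs into text. As a rough, library-free illustration of what greedy CTC decoding does, the sketch below collapses consecutive repeated IDs and then drops the blank token. The toy vocabulary and the blank ID of 0 are made up for the example and are not this model's actual vocabulary.

```python
from itertools import groupby

BLANK_ID = 0  # hypothetical blank ID for this toy example
TOY_VOCAB = {1: "k", 2: "a", 3: "i", 4: "x", 5: "o"}  # hypothetical vocabulary


def greedy_ctc_decode(frame_ids, blank_id=BLANK_ID, vocab=TOY_VOCAB):
    """Collapse consecutive repeats, then remove blanks and map IDs to text."""
    collapsed = [token for token, _ in groupby(frame_ids)]
    return "".join(vocab[token] for token in collapsed if token != blank_id)


# One predicted ID per frame, as torch.argmax(logits, dim=-1) would give.
frames = [1, 1, 0, 2, 2, 2, 3, 0, 0, 4, 5, 5]
print(greedy_ctc_decode(frames))  # kaixo
```

The blank token is what lets CTC emit a doubled character: [2, 0, 2] decodes to "aa", while [2, 2] collapses to a single "a".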
Cloud GPUs: For faster inference with CUDA, consider a GPU instance from a cloud provider such as AWS EC2, Google Cloud, or Azure.
License
This model is licensed under the Apache-2.0 License.