hindi_model_with_lm_vakyansh
Harveenchadha / Hindi Model with LM Vakyansh
Introduction
The Hindi Model with LM Vakyansh is an automatic speech recognition (ASR) model for the Hindi language. It is available in the Hugging Face model collection and uses the wav2vec2 architecture. The model has been evaluated on several versions of Mozilla Foundation's Common Voice dataset and is listed on the Hugging Face ASR leaderboard.
Architecture
The model is based on the wav2vec2 architecture, which is known for its effectiveness in speech recognition tasks. It builds on large-scale pre-training to improve recognition accuracy for Hindi.
Training
The model was trained on the indic-voice dataset and evaluated on multiple versions of the Common Voice dataset. Performance was measured with the Word Error Rate (WER) and Character Error Rate (CER) metrics, with the following results:
- WER of 19.14 and CER of 5.93 on Common Voice.
- WER of 17.4 and CER of 7.13 on Common Voice-7.0.
- WER of 18.99 and CER of 8.91 on Common Voice-8.0.
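To make the two metrics above concrete, here is a minimal sketch of how WER and CER are commonly computed from an edit distance. It is not the evaluation script used for this model; the function names are illustrative, and it assumes the scores are reported as percentages, that CER counts spaces, and that characters are Unicode code points (so a Hindi consonant and its vowel sign count as two characters).

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: minimum substitutions, insertions and deletions."""
    prev = list(range(len(hyp) + 1))  # distance from empty ref to each hyp prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to empty hyp
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate in percent, relative to the reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate in percent, counted over Unicode code points."""
    if not reference:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(reference, hypothesis) / len(reference)


# Usage (illustrative strings, not taken from Common Voice):
# print(wer("मैं घर जा रहा हूँ", "मैं घर जा रहा हूं"))
# print(cer("मैं घर जा रहा हूँ", "मैं घर जा रहा हूं"))
```

Because a single wrong character makes the whole word wrong, WER is normally higher than CER for the same transcript. Published evaluations often also normalize text (punctuation, Unicode form) before scoring, which this sketch does not do.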
Guide: Running Locally
To run the Hindi Model with LM Vakyansh locally, follow these steps:
- Environment Setup: Ensure you have Python and pip installed.
- Install Libraries: Install the Hugging Face Transformers and PyTorch libraries.
pip install transformers torch
- Download Model: Use the Hugging Face transformers library to load the model.
from transformers import Wav2Vec2ForCTC, Wav2Vec2Tokenizer
tokenizer = Wav2Vec2Tokenizer.from_pretrained("Harveenchadha/hindi_model_with_lm_vakyansh")
model = Wav2Vec2ForCTC.from_pretrained("Harveenchadha/hindi_model_with_lm_vakyansh")
- Inference: Prepare your audio inputs and perform inference using the model.
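The inference step could look roughly like the sketch below. It rests on several assumptions that are not stated in the guide: that the repository ships a preprocessor configuration so `Wav2Vec2Processor` can be loaded, that the model expects 16 kHz mono audio (typical for wav2vec2 checkpoints), and that `librosa` is installed for loading and resampling (`pip install librosa`). The `transcribe` helper and the file name in the usage comment are hypothetical.

```python
MODEL_ID = "Harveenchadha/hindi_model_with_lm_vakyansh"
SAMPLE_RATE = 16_000  # assumption: wav2vec2 checkpoints usually expect 16 kHz audio


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Return a greedy CTC transcription of a single audio file."""
    # Heavy dependencies are imported here so the module itself stays cheap to import.
    import librosa
    import torch
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    # Assumption: the repo contains both tokenizer and feature-extractor configs.
    processor = Wav2Vec2Processor.from_pretrained(model_id)
    model = Wav2Vec2ForCTC.from_pretrained(model_id)
    model.eval()

    # librosa.load resamples to the requested rate and downmixes to mono.
    speech, _ = librosa.load(audio_path, sr=SAMPLE_RATE)

    # Convert the raw waveform into model inputs (PyTorch tensors).
    inputs = processor(speech, sampling_rate=SAMPLE_RATE, return_tensors="pt")

    with torch.no_grad():
        logits = model(**inputs).logits  # shape: (batch, frames, vocab)

    # Greedy decoding: most likely token per frame, then CTC collapse in decode.
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)[0]


# Usage (hypothetical file name):
# print(transcribe("sample_hi.wav"))
```

This path uses plain greedy decoding and does not apply a language model, so its output may differ from the scores reported for the model with LM decoding.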
For more intensive computation, consider using cloud GPUs such as those provided by AWS, Google Cloud, or Azure to facilitate faster processing and model inference.
License
The Hindi Model with LM Vakyansh is licensed under the Apache 2.0 License, allowing for both personal and commercial use.