hindi_model_with_lm_vakyansh

Harveenchadha

Hindi Model with LM Vakyansh

Introduction

The Hindi Model with LM Vakyansh is an automatic speech recognition (ASR) model for the Hindi language. It is hosted on the Hugging Face Hub and uses the wav2vec2 architecture. The model has been evaluated on datasets such as Mozilla Foundation's Common Voice and is listed on the Hugging Face ASR leaderboard.

Architecture

The model is based on the wav2vec2 architecture, which learns speech representations directly from raw audio through self-supervised pre-training and is then fine-tuned for speech recognition. Starting from a large pre-trained model in this way improves recognition accuracy for Hindi.
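The loading code later in this card uses Wav2Vec2ForCTC, which suggests the model emits one token prediction per audio frame and relies on CTC decoding to turn those frames into text. The following is a minimal sketch of greedy CTC decoding, not the model's actual decoder: the toy vocabulary, the blank id of 0, and the function name ctc_greedy_decode are all assumptions made for illustration.

```python
from itertools import groupby


def ctc_greedy_decode(frame_ids, id_to_token, blank_id=0):
    """Collapse per-frame token ids into text using the greedy CTC rule."""
    # Step 1: merge runs of identical consecutive ids into a single id.
    collapsed = [token_id for token_id, _ in groupby(frame_ids)]
    # Step 2: drop the blank token, then map the remaining ids to tokens.
    return "".join(id_to_token[i] for i in collapsed if i != blank_id)


# Toy vocabulary (hypothetical, NOT the model's real vocabulary).
toy_vocab = {0: "<pad>", 1: "न", 2: "म", 3: "स", 4: "्", 5: "त", 6: "े"}

# Hypothetical per-frame argmax output; 0 is the blank token.
frames = [0, 1, 1, 0, 2, 2, 2, 0, 3, 4, 4, 5, 0, 6, 6]

decoded = ctc_greedy_decode(frames, toy_vocab)  # "नमस्ते"
```

The blank token is what allows a genuinely doubled character to survive: two identical ids separated by a blank are kept as two characters, while an unbroken run is merged into one.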

Training

The model was trained on the indic-voice dataset and evaluated on multiple versions of the Common Voice dataset. Performance was measured with Word Error Rate (WER) and Character Error Rate (CER), where lower is better for both. The reported results are:

  • WER of 19.14 and CER of 5.93 on Common Voice.
  • WER of 17.4 and CER of 7.13 on Common Voice-7.0.
  • WER of 18.99 and CER of 8.91 on Common Voice-8.0.
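For readers unfamiliar with these metrics, the sketch below shows how WER and CER are commonly computed: the edit distance between the reference and the hypothesis, divided by the reference length. It assumes the figures above are percentages and that characters are counted as Unicode code points; the evaluation that produced the reported numbers may have applied text normalization that is not shown here. The function names are illustrative.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, ref_item in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_item in enumerate(hyp, start=1):
            cost = 0 if ref_item == hyp_item else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word Error Rate in percent, using whitespace tokenization."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character Error Rate in percent, counted over Unicode code points."""
    if not reference:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(list(reference), list(hypothesis)) / len(reference)


# One substituted word out of four reference words.
example_wer = wer("मेरा नाम राम है", "मेरा नाम श्याम है")  # 25.0
```

Because CER is counted per character, a single wrong word usually costs far less in CER than in WER, which is consistent with the CER figures above being much lower than the WER figures.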

Guide: Running Locally

To run the Hindi Model with LM Vakyansh locally, follow these steps:

  1. Environment Setup: Ensure you have Python and pip installed.
  2. Install Libraries: Install the Hugging Face Transformers and PyTorch libraries.
    pip install transformers torch
    
  3. Download Model: Use the Hugging Face transformers library to load the model.
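    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    # Wav2Vec2Tokenizer is deprecated in Transformers; Wav2Vec2Processor bundles the
    # feature extractor and CTC tokenizer (assumes the repository provides both).
    processor = Wav2Vec2Processor.from_pretrained("Harveenchadha/hindi_model_with_lm_vakyansh")
    model = Wav2Vec2ForCTC.from_pretrained("Harveenchadha/hindi_model_with_lm_vakyansh")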
    
  4. Inference: Load your audio as a mono waveform (wav2vec2 models typically expect 16 kHz), run it through the model, and decode the predicted token IDs into text.

For larger workloads, a GPU can speed up inference. Cloud providers such as AWS, Google Cloud, and Azure offer GPU instances if you do not have one locally.
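Step 4 and the GPU note can be sketched as follows. This is a minimal example under stated assumptions rather than the author's reference code: it assumes the repository can be loaded with Wav2Vec2Processor, that the model expects 16 kHz mono audio, and that you have already loaded the waveform yourself as a 1-D sequence of floats. The transcribe function and the constants are names introduced here for illustration. It uses plain greedy decoding and does not apply the language model that the model name refers to.

```python
MODEL_ID = "Harveenchadha/hindi_model_with_lm_vakyansh"
EXPECTED_SAMPLING_RATE = 16_000  # assumption: the usual wav2vec2 input rate


def transcribe(waveform, sampling_rate, model_id=MODEL_ID):
    """Transcribe a mono waveform (1-D sequence of floats) with greedy decoding."""
    if sampling_rate != EXPECTED_SAMPLING_RATE:
        raise ValueError(
            f"expected {EXPECTED_SAMPLING_RATE} Hz audio, got {sampling_rate} Hz; "
            "resample the audio first"
        )

    # Heavy dependencies are imported lazily so the check above runs without them.
    import torch
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    # Use a GPU when one is available, otherwise fall back to the CPU.
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    processor = Wav2Vec2Processor.from_pretrained(model_id)
    model = Wav2Vec2ForCTC.from_pretrained(model_id).to(device).eval()

    # Convert the raw waveform into model input values.
    inputs = processor(waveform, sampling_rate=sampling_rate, return_tensors="pt")

    with torch.no_grad():
        logits = model(inputs.input_values.to(device)).logits

    # Greedy decoding: pick the most likely token per frame, then let the
    # processor collapse repeats and blanks into text.
    predicted_ids = torch.argmax(logits, dim=-1).cpu()
    return processor.batch_decode(predicted_ids)[0]
```

Calling transcribe(waveform, sampling_rate=16_000) downloads the model on first use and returns the transcript as a string. To decode with the language model instead, Transformers provides Wav2Vec2ProcessorWithLM, which additionally requires the pyctcdecode and kenlm packages; whether this repository ships the needed language model files is not confirmed here.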

License

The Hindi Model with LM Vakyansh is licensed under the Apache 2.0 License, allowing for both personal and commercial use.
