Quran_speech_recognizer

Nuwaisir

Quran Speech Recognizer

Introduction

The Quran Speech Recognizer is an application that listens to a user's Quran recitation and identifies the corresponding position in the Quran. It is intended to support feedback and guidance on Quranic recitation.

Architecture

The application leverages transfer learning by fine-tuning the pretrained wav2vec2-large-xlsr-53-arabic model. The fine-tuning process utilizes data from the Quran ASR Challenge dataset available on Kaggle.

Training

Training started from a publicly available pretrained model on Hugging Face (elgeish/wav2vec2-large-xlsr-53-arabic), which was then fine-tuned on Quranic recitation data. This specialization is intended to improve the model's accuracy when recognizing and transcribing Quranic speech.
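As a rough illustration of how a wav2vec2 checkpoint of this kind is typically used for transcription, the sketch below loads a model through the Hugging Face transformers library and decodes its output greedily. It is not taken from the project's notebook. The function name transcribe and the constants are hypothetical, the model ID is the base checkpoint named above rather than the fine-tuned one, and the code assumes torch and transformers are installed and that the audio is a mono waveform sampled at 16 kHz.

```python
# Base checkpoint named in the text above; substitute the fine-tuned
# Quran checkpoint's Hugging Face ID if you have it (assumption).
MODEL_ID = "elgeish/wav2vec2-large-xlsr-53-arabic"
SAMPLE_RATE = 16_000  # wav2vec2 XLSR checkpoints expect 16 kHz mono audio


def transcribe(waveform, sample_rate=SAMPLE_RATE, model_id=MODEL_ID):
    """Greedy CTC transcription of a 1-D float waveform (hypothetical helper)."""
    if sample_rate != SAMPLE_RATE:
        raise ValueError(f"resample to {SAMPLE_RATE} Hz first, got {sample_rate}")

    # Heavy dependencies are imported lazily, only when a transcription runs.
    import torch
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    # Use a GPU when one is visible to PyTorch, otherwise fall back to CPU.
    device = "cuda" if torch.cuda.is_available() else "cpu"
    processor = Wav2Vec2Processor.from_pretrained(model_id)
    model = Wav2Vec2ForCTC.from_pretrained(model_id).to(device).eval()

    inputs = processor(waveform, sampling_rate=sample_rate, return_tensors="pt")
    inputs = inputs.to(device)
    with torch.no_grad():
        logits = model(**inputs).logits

    # Greedy decoding: take the most likely token at every frame, then let
    # the tokenizer collapse repeats and blanks into text.
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)[0]
```

A 5-second clip at this sample rate is 5 * SAMPLE_RATE = 80,000 samples. The returned string uses whatever alphabet the checkpoint's tokenizer defines, so it may need normalizing before it is compared with Quran text.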

Guide: Running Locally

To run the Quran Speech Recognizer locally:

  1. Clone the repository and navigate to the directory.
  2. Open the run_ui.ipynb Jupyter Notebook.
  3. Execute all the cells within the notebook.
    • The final cell records a 5-second audio clip of your recitation (the duration can be changed) and transcribes it.
    • The transcription is matched against the 30th Juz' (Surahs 78-114) of the Quran, and the closest matching text is displayed.
  4. Adjust the search range in the sixth cell if needed to cover more of the Quran.
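The matching described in step 3 and the adjustable search range in step 4 can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the helper closest_verse, its parameters, and the use of Python's difflib for similarity scoring are all assumptions, and VERSES is a tiny hand-picked sample of undiacritized opening verses rather than the full text.

```python
from difflib import SequenceMatcher

# Tiny illustrative sample: (surah, ayah, undiacritized text).
# A real run would load every verse in the chosen range.
VERSES = [
    (1, 1, "بسم الله الرحمن الرحيم"),
    (112, 1, "قل هو الله احد"),
    (113, 1, "قل اعوذ برب الفلق"),
    (114, 1, "قل اعوذ برب الناس"),
]


def closest_verse(transcript, verses=VERSES, first_surah=78, last_surah=114):
    """Return (score, surah, ayah, text) for the most similar verse in range."""
    # Restrict the search to the requested surahs (default: the 30th Juz').
    candidates = [v for v in verses if first_surah <= v[0] <= last_surah]
    if not candidates:
        raise ValueError("no verses in the requested surah range")

    # Score every candidate by character-level similarity in [0, 1].
    scored = [
        (SequenceMatcher(None, transcript, text).ratio(), surah, ayah, text)
        for surah, ayah, text in candidates
    ]
    return max(scored, key=lambda item: item[0])


# A transcript with one wrong final letter still maps to Surah 114, ayah 1.
print(closest_verse("قل اعوذ برب الناش")[1:3])  # (114, 1)
```

Widening the range, for example closest_verse(text, first_surah=1), corresponds to covering more of the Quran, at the cost of comparing against more candidates.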

To speed up transcription, consider running the notebook on a GPU, such as a cloud GPU instance from AWS, Google Cloud Platform, or Azure.

License

Use and distribution of this application are governed by the license stated on the model's Hugging Face page. Refer to that page for the full terms.
