Quran_speech_recognizer
Nuwaisir / Quran Speech Recognizer
Introduction
The Quran Speech Recognizer is an application designed to listen to a user's Quran recitation and identify the corresponding position in the Quran. This tool aids in providing feedback and guidance on Quranic recitation.
Architecture
The application leverages transfer learning by fine-tuning the pretrained wav2vec2-large-xlsr-53-arabic model. The fine-tuning process uses data from the Quran ASR Challenge dataset available on Kaggle.
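wav2vec2 checkpoints fine-tuned for speech recognition normally emit one token prediction per audio frame and are decoded with CTC: repeated predictions are collapsed and blank tokens are dropped. The sketch below illustrates that decoding step in plain Python. The toy vocabulary and the blank id of 0 are assumptions made for illustration, not values taken from this model's tokenizer.

```python
def ctc_greedy_decode(frame_ids, id_to_token, blank_id=0):
    """Collapse repeated frame predictions, then drop CTC blank tokens."""
    tokens = []
    previous = None
    for token_id in frame_ids:
        # Emit a token only when it differs from the previous frame
        # and is not the blank symbol.
        if token_id != previous and token_id != blank_id:
            tokens.append(id_to_token[token_id])
        previous = token_id
    return "".join(tokens)


# Hypothetical three-symbol vocabulary, for illustration only.
toy_vocab = {0: "<blank>", 1: "a", 2: "b"}
decoded = ctc_greedy_decode([0, 1, 1, 0, 1, 2, 2, 0], toy_vocab)
```

A blank between two identical predictions is what lets the same character appear twice in a row in the output.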
Training
Training started from a publicly available pretrained model on Hugging Face (elgeish/wav2vec2-large-xlsr-53-arabic), which was then fine-tuned on Quranic recitation data. This specialized training improves the model's ability to recognize and transcribe Quranic speech.
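As a rough illustration of how a checkpoint like this is typically loaded and used for transcription with the Hugging Face transformers library, here is a minimal sketch. It defaults to the base model ID named above; substituting the fine-tuned checkpoint's ID is left to the reader. The function name `transcribe` and the 16 kHz input rate are assumptions of this sketch, not details taken from the repository's notebook.

```python
BASE_MODEL_ID = "elgeish/wav2vec2-large-xlsr-53-arabic"  # base model named in this README


def transcribe(waveform, model_id=BASE_MODEL_ID, sampling_rate=16_000):
    """Transcribe a 1-D float waveform with a wav2vec2 CTC checkpoint.

    Hypothetical helper: heavy dependencies are imported lazily so that
    importing this module does not require torch or transformers.
    """
    import torch
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    processor = Wav2Vec2Processor.from_pretrained(model_id)
    model = Wav2Vec2ForCTC.from_pretrained(model_id)
    model.eval()

    # Convert the raw waveform into model inputs (normalization, tensors).
    inputs = processor(
        waveform, sampling_rate=sampling_rate, return_tensors="pt", padding=True
    )
    with torch.no_grad():
        logits = model(**inputs).logits

    # Greedy decoding: pick the most likely token for every frame.
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)[0]
```

The returned string uses whatever alphabet the checkpoint's tokenizer defines, so it may need to be converted before being compared with Arabic script.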
Guide: Running Locally
To run the Quran Speech Recognizer locally:
- Clone the repository and navigate to the directory.
- Open the run_ui.ipynb Jupyter Notebook.
- Execute all the cells within the notebook.
- The final cell records a 5-second audio clip of your recitation (the duration can be changed) and transcribes it.
- The transcription is matched against the 30th Juz' (Surahs 78-114) of the Quran, and the closest matching text is displayed.
- Adjust the search range in the sixth cell if needed to cover more of the Quran.
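The record-and-match steps above can be sketched roughly as follows. This is not the notebook's code: recording with the `sounddevice` package, the 16 kHz sample rate, and scoring similarity with `difflib.SequenceMatcher` are all assumptions chosen for illustration, and the helper names are hypothetical.

```python
import difflib

SAMPLE_RATE = 16_000  # Hz; assumed, wav2vec2 XLSR models expect 16 kHz mono audio
CLIP_SECONDS = 5      # clip length mentioned in the steps above


def record_clip(seconds=CLIP_SECONDS, sample_rate=SAMPLE_RATE):
    """Record a mono clip from the default microphone (assumes `sounddevice`)."""
    import sounddevice as sd  # lazy import: the matcher below needs no audio hardware

    frames = int(seconds * sample_rate)
    audio = sd.rec(frames, samplerate=sample_rate, channels=1, dtype="float32")
    sd.wait()  # block until the recording has finished
    return audio[:, 0]  # flatten (frames, 1) to a 1-D waveform


def closest_match(transcript, candidates):
    """Return (index, text, score) of the candidate most similar to transcript."""
    scored = [
        (difflib.SequenceMatcher(None, transcript, text).ratio(), i, text)
        for i, text in enumerate(candidates)
    ]
    # On ties, max() keeps the earliest candidate.
    score, index, text = max(scored, key=lambda item: item[0])
    return index, text, score


# Placeholder strings standing in for verse texts, for illustration only.
verses = ["alpha beta gamma", "delta epsilon", "zeta eta theta"]
index, text, score = closest_match("delta epsilom", verses)
```

In a sketch like this, widening the search range amounts to passing a longer `candidates` list, at the cost of more comparisons per clip.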
For faster transcription, consider running the notebook on a GPU, for example a cloud GPU from AWS, Google Cloud Platform, or Azure. A GPU shortens processing time; it does not change the model's accuracy.
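A minimal sketch of how a notebook might choose between GPU and CPU with PyTorch is shown below. The helper name `pick_device` is hypothetical, and falling back to "cpu" when PyTorch is not installed is a convenience of this sketch rather than something the notebook is known to do.

```python
def pick_device():
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # Without PyTorch there is nothing to accelerate; report CPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
```

With a loaded model, the usual pattern is to move both the model and its input tensors to the selected device, for example `model.to(device)`, before running inference.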
License
Use and distribution of this application follow the licensing terms stated on the model's page on Hugging Face. Please refer to the model's license for details.