mlx-community/whisper-large-v3-mlx
Introduction
The whisper-large-v3-mlx model is part of the MLX community's offerings on Hugging Face, designed for efficient speech transcription using the mlx-whisper library.
Architecture
The model packages the Whisper large-v3 encoder-decoder transformer in MLX format, so the MLX library can run its speech-to-text capabilities natively on Apple silicon.
Training
Specific details on the training process are not provided in the README. The weights correspond to OpenAI's Whisper large-v3, which was pretrained on large-scale multilingual speech data; the MLX repository converts those weights to the MLX format rather than retraining them.
Guide: Running Locally
To use the whisper-large-v3-mlx model:
- Install mlx-whisper: Ensure you have Python installed, then execute:

  pip install mlx-whisper
- Transcribe speech: Use the following snippet to transcribe an audio file (a fuller usage sketch follows this list):

  import mlx_whisper

  # speech_file is the path to the audio file you want to transcribe
  result = mlx_whisper.transcribe(
      speech_file,
      path_or_hf_repo="mlx-community/whisper-large-v3-mlx",
  )
- Hardware suggestions: MLX targets Apple silicon, so for optimal performance run the model locally on a Mac with an M-series chip and ample unified memory; Apple-silicon cloud instances (for example, AWS EC2 Mac) are an option when local hardware is unavailable.
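Once transcription completes, the returned result can be inspected directly. The following is a minimal sketch assuming an openai-whisper-style result dictionary (with "text" and "segments" keys) and a word_timestamps option, as in the upstream Whisper API; the path speech.mp3 is a placeholder, and option names should be checked against the mlx-whisper documentation for your installed version.

  import mlx_whisper

  # Placeholder path; replace with your own audio file.
  speech_file = "speech.mp3"

  # Assumes an openai-whisper-style result dict ("text", "segments") and a
  # word_timestamps option; verify against the mlx-whisper docs for your version.
  result = mlx_whisper.transcribe(
      speech_file,
      path_or_hf_repo="mlx-community/whisper-large-v3-mlx",
      word_timestamps=True,
  )

  # Full transcript as a single string.
  print(result["text"])

  # Per-segment timestamps, useful for subtitles or alignment.
  for segment in result.get("segments", []):
      print(f"[{segment['start']:.2f}s -> {segment['end']:.2f}s] {segment['text']}")

The package also installs a command-line entry point, so a quick one-off transcription can be run as mlx_whisper speech.mp3 --model mlx-community/whisper-large-v3-mlx (flag names may differ between versions).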
License
The model is licensed under the MIT License, allowing for extensive reuse and modification with appropriate credit.