whisper.cpp
ggerganov
Introduction
whisper.cpp is a project that converts OpenAI's Whisper models into a format suitable for use with GGML. It is designed for automatic speech recognition and supports various model configurations.
Architecture
The project provides multiple configurations of Whisper models, ranging from tiny to large, with various quantization levels (q5, q8). The quantized variants reduce disk space and memory use at some cost in accuracy, so a model can be chosen to suit the use case and the available hardware.
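To make the disk-space trade-off concrete, the following rough sketch estimates file size from parameter count and bits per weight. The parameter counts are the approximate figures published for the original Whisper models. The bits-per-weight values (16 for half precision, about 8.5 for q8, about 5.5 for q5) are assumptions based on block-quantized layouts with a per-block scale, and real files will differ somewhat because not every tensor is quantized.

```python
# Approximate parameter counts (in millions) for the original Whisper sizes.
PARAMS_MILLIONS = {"tiny": 39, "base": 74, "small": 244, "medium": 769, "large": 1550}

# Assumed effective bits per weight, including per-block scale overhead.
BITS_PER_WEIGHT = {"f16": 16.0, "q8": 8.5, "q5": 5.5}


def estimate_size_mb(size: str, precision: str = "f16") -> float:
    """Back-of-the-envelope model file size in megabytes (10^6 bytes)."""
    n_weights = PARAMS_MILLIONS[size] * 1_000_000
    n_bytes = n_weights * BITS_PER_WEIGHT[precision] / 8
    return n_bytes / 1_000_000


if __name__ == "__main__":
    # Print one row per model size with an estimate for each precision.
    for size in PARAMS_MILLIONS:
        row = ", ".join(
            f"{p}: {estimate_size_mb(size, p):7.1f} MB" for p in BITS_PER_WEIGHT
        )
        print(f"{size:>6}  {row}")
```

The estimate is only meant for comparing variants against each other, for example to see that a q5 model needs roughly a third of the space of its half-precision counterpart.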
Training
The models in whisper.cpp are derived from OpenAI's Whisper models and converted to a format compatible with GGML. No additional training is performed: the weights come from the original Whisper training pipeline, and only their storage format is changed for efficient inference in the GGML ecosystem.
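One quick way to tell a converted model file from an original checkpoint is to inspect its first bytes. The sketch below assumes that converted files begin with a 32-bit little-endian magic number equal to 0x67676d6c (the ASCII letters "ggml"); the helper name `looks_like_ggml` is hypothetical and not part of the project.

```python
import struct

# Assumed magic number: the ASCII letters "ggml" packed into a 32-bit integer.
GGML_MAGIC = 0x67676D6C


def looks_like_ggml(path: str) -> bool:
    """Return True if the file starts with the assumed GGML magic number."""
    with open(path, "rb") as f:
        header = f.read(4)
    if len(header) < 4:
        # Too short to contain a magic number at all.
        return False
    (magic,) = struct.unpack("<I", header)
    return magic == GGML_MAGIC
```

A check like this only confirms the container format; it says nothing about which model size or quantization level the file holds.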
Guide: Running Locally
- Clone the Repository:
  git clone https://github.com/ggerganov/whisper.cpp
  cd whisper.cpp
- Download the Models: Choose and download the desired model variant from the available options, such as tiny, base, small, medium, or large.
- Install Dependencies: Ensure you have the necessary dependencies installed. These might include libraries for audio processing and GGML support.
- Run the Model: Execute the model using the provided scripts or integrate it into your existing pipeline for automatic speech recognition tasks.
- Hardware Recommendations: For optimal performance, consider using cloud-based GPUs such as those available on AWS, Google Cloud Platform, or Azure. These resources can handle the computational demands of larger models more efficiently.
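The run step can be sketched as a small Python wrapper. It is a minimal illustration under stated assumptions, not the project's own tooling: it assumes the model file is named ggml-<size>.bin inside a models directory, that the command-line binary accepts -m for the model path and -f for the input audio, and that the input is a 16 kHz WAV file (the sample rate Whisper models work with). The function names and the "./main" binary path are placeholders.

```python
import wave
from pathlib import Path


def is_16khz_wav(path: str) -> bool:
    """Check that a WAV file is sampled at 16 kHz."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate() == 16_000


def build_transcribe_command(
    binary: str, model_size: str, audio_path: str, models_dir: str = "models"
) -> list[str]:
    """Assemble an argument list for a whisper.cpp command-line run.

    Assumptions: the model file is named ggml-<size>.bin inside `models_dir`,
    and the binary accepts -m (model path) and -f (input audio file).
    """
    model_path = Path(models_dir) / f"ggml-{model_size}.bin"
    return [binary, "-m", model_path.as_posix(), "-f", audio_path]


if __name__ == "__main__":
    # "./main" and "samples/jfk.wav" are placeholders; the executable name
    # depends on how the project was built.
    cmd = build_transcribe_command("./main", "base", "samples/jfk.wav")
    print(" ".join(cmd))
    # To actually run it: subprocess.run(cmd, check=True)
```

Building the argument list separately from executing it makes the command easy to log or inspect before handing it to `subprocess.run`.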
License
The project is licensed under the MIT License, allowing for flexible use, modification, and distribution.