faster-whisper-large-v3-turbo-ct2
Introduction
The FASTER-WHISPER-LARGE-V3-TURBO-CT2 model is an automatic speech recognition model supporting 100 languages. It is designed for use with CTranslate2 and CTranslate2-based projects such as faster-whisper. The model is available under the MIT license.
Architecture
The model is a conversion of deepdml/whisper-large-v3-turbo into the CTranslate2 format, enabling optimized inference in applications built on CTranslate2. The converted model supports audio processing and automatic speech recognition tasks.
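A conversion like this is typically produced with the ct2-transformers-converter tool that ships with CTranslate2. A sketch of such a command (the output directory name and copied files are illustrative, not taken from this model card):

```shell
# Convert the Transformers checkpoint to CTranslate2 format with FP16 weights.
# Output directory name is illustrative.
ct2-transformers-converter \
  --model deepdml/whisper-large-v3-turbo \
  --output_dir faster-whisper-large-v3-turbo-ct2 \
  --copy_files tokenizer.json preprocessor_config.json \
  --quantization float16
```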
Training
The model's weights are stored in FP16 format for optimized performance. The precision used at inference time can differ from the stored precision: users can adjust it when loading the model via the compute_type option in CTranslate2 to suit specific needs.
Guide: Running Locally
- Install Required Libraries: Ensure you have CTranslate2 and faster-whisper installed.
- Load the Model: Use the following example to load and run the model:
```python
from faster_whisper import WhisperModel

model = WhisperModel("deepdml/faster-whisper-large-v3-turbo-ct2")
segments, info = model.transcribe("audio.mp3")
for segment in segments:
    print("[%.2fs -> %.2fs] %s" % (segment.start, segment.end, segment.text))
```
- Run on Cloud GPUs: For optimal performance, consider using cloud platforms like AWS, GCP, or Azure that provide GPU support.
License
This model is released under the MIT License, allowing for flexible use and distribution in various projects.