moonshine
UsefulSensors
Introduction
Moonshine is a model developed by Useful Sensors for automatic speech recognition (ASR), specifically designed to transcribe English speech into text. It aims to enable real-time transcription on low-cost hardware. The model card provides details on the model’s architecture, training, and intended usage.
Architecture
Moonshine is a sequence-to-sequence ASR and speech translation model. It comes in two variants: a tiny model with 27 million parameters and a base model with 61 million parameters. Both variants are English-only, in line with the English transcription focus of the project.
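To give the parameter counts some practical meaning, the sketch below estimates how much storage the weights alone would need at a few common numeric precisions. The precisions are assumptions for illustration: this page does not say which format the released weights use, and the figures ignore activations, buffers, and framework overhead.

```python
# Parameter counts as stated in the Architecture section.
PARAMS = {"tiny": 27_000_000, "base": 61_000_000}

# Bytes per parameter for common formats. Which of these applies to the
# released Moonshine weights is an assumption, not taken from the model card.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def weight_megabytes(variant: str, dtype: str = "float32") -> float:
    """Approximate size of the weights alone, in MB (10**6 bytes)."""
    return PARAMS[variant] * BYTES_PER_PARAM[dtype] / 1e6


for variant in PARAMS:
    for dtype in BYTES_PER_PARAM:
        size = weight_megabytes(variant, dtype)
        print(f"{variant:>4} {dtype:>7}: {size:6.0f} MB")
```

At 32-bit precision this works out to roughly 108 MB for tiny and 244 MB for base, which is the kind of budget that makes low-cost hardware plausible.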
Training
The Moonshine models were trained on 200,000 hours of audio data and associated transcripts sourced from the internet and public datasets available on Hugging Face. The models have been optimized for platforms with limited memory and computational resources. Evaluation shows improved accuracy over similar ASR systems, though challenges like text hallucination and repetitive output persist.
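Repetitive output is one failure mode that can be flagged cheaply after transcription. The following is a minimal sketch of one possible heuristic, not something the Moonshine package provides: it counts how often the most frequent word n-gram occurs in a transcript. The function names, the n-gram length, and the threshold are all hypothetical choices for illustration.

```python
from collections import Counter


def max_ngram_repeats(text: str, n: int = 3) -> int:
    """Return how many times the most frequent word n-gram occurs in text."""
    words = text.lower().split()
    if len(words) < n:
        return 0
    ngrams = (tuple(words[i:i + n]) for i in range(len(words) - n + 1))
    return max(Counter(ngrams).values())


def looks_repetitive(text: str, n: int = 3, threshold: int = 3) -> bool:
    """Flag a transcript whose most common n-gram occurs at least threshold times.

    Both n and threshold are illustrative defaults and would need tuning
    against real transcripts.
    """
    return max_ngram_repeats(text, n) >= threshold
```

A check like this only catches looping output. It says nothing about hallucinated text that is fluent and non-repeating, which would need comparison against the audio or a reference transcript.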
Guide: Running Locally
- Install uv for environment management: Follow the installation guide.
- Set up and activate a virtual environment:
    uv venv env_moonshine
    source env_moonshine/bin/activate
- Install the Moonshine package:
  - For PyTorch backend:
      uv pip install useful-moonshine@git+https://github.com/usefulsensors/moonshine.git
      export KERAS_BACKEND=torch
  - For TensorFlow backend:
      uv pip install useful-moonshine[tensorflow]@git+https://github.com/usefulsensors/moonshine.git
      export KERAS_BACKEND=tensorflow
  - For JAX backend:
      uv pip install useful-moonshine[jax]@git+https://github.com/usefulsensors/moonshine.git
      export KERAS_BACKEND=jax
      # Use useful-moonshine[jax-cuda] for JAX on GPU
- Test transcription from a Python session. The first argument is the path to the audio file, and the second is the model name.
    import moonshine
    moonshine.transcribe(moonshine.ASSETS_DIR / 'beckett.wav', 'moonshine/tiny')
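The install and test steps above can be folded into a small helper. This is a sketch under stated assumptions: transcribe_file and SUPPORTED_BACKENDS are hypothetical names introduced here, and the only Moonshine call used is moonshine.transcribe(path, model_name) as shown in the guide. Keras reads KERAS_BACKEND when it is first imported, so the helper sets the variable before importing moonshine.

```python
import os

# Backend names taken from the install steps in the guide.
SUPPORTED_BACKENDS = ("torch", "tensorflow", "jax")


def transcribe_file(audio_path, model="moonshine/tiny", backend="torch"):
    """Transcribe one audio file with Moonshine (hypothetical wrapper).

    The backend is only applied if KERAS_BACKEND is not already set in the
    environment, so a value exported in the shell takes precedence.
    """
    if backend not in SUPPORTED_BACKENDS:
        raise ValueError(
            f"unknown backend {backend!r}; expected one of {SUPPORTED_BACKENDS}"
        )
    os.environ.setdefault("KERAS_BACKEND", backend)

    # Deferred import: the environment variable must be in place before
    # Keras is loaded, and this keeps the module importable without Moonshine.
    import moonshine

    return moonshine.transcribe(audio_path, model)
```

With the package installed, calling transcribe_file("recording.wav") would run the tiny model on a file of your own; "recording.wav" is a placeholder path. Deferring the import also means a missing installation surfaces as an ImportError at call time rather than when the helper module is loaded.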
Cloud GPUs: Consider using services like AWS, Azure, or Google Cloud for GPU support if needed.
License
Moonshine is released under the MIT License, allowing for broad usage and modification.