sashimi release

krandiash

Introduction

SaShiMi is a project for audio generation with state-space models. It was developed by Karan Goel, Albert Gu, Chris Donahue, and Christopher Ré. This repository contains the code and artifacts needed to reproduce the results of their paper, "It's Raw! Audio Generation with State-Space Models."

Architecture

SaShiMi generates audio with an architecture built on state-space models; the paper describes it as a multi-scale architecture for waveform modeling built around the S4 model. This repository contains the code and resources needed to understand and run the model.
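As a rough illustration of what a state-space model computes, the sketch below runs a small discrete linear state-space recurrence in plain Python. It is not the SaShiMi model or its layers: the names A, B, C, D follow common textbook notation rather than anything in this repository, and the toy values are chosen only so that the arithmetic is easy to follow.

```python
def simulate_ssm(A, B, C, D, inputs):
    """Run x_k = A x_{k-1} + B u_k and y_k = C x_k + D u_k over a scalar input sequence.

    A is an n-by-n list of lists, B and C are length-n lists, D is a scalar.
    The state starts at zero. All names are illustrative assumptions.
    """
    n = len(B)
    x = [0.0] * n
    outputs = []
    for u in inputs:
        # State update: mix the previous state through A and inject the input through B.
        x = [sum(A[i][j] * x[j] for j in range(n)) + B[i] * u for i in range(n)]
        # Readout: project the state through C and add the direct term D * u.
        y = sum(C[i] * x[i] for i in range(n)) + D * u
        outputs.append(y)
    return outputs


# Toy two-state system with decaying memory (values picked for illustration).
A = [[0.5, 0.0], [0.0, 0.25]]
B = [1.0, 1.0]
C = [1.0, 1.0]
D = 0.0

impulse = [1.0, 0.0, 0.0, 0.0]
print(simulate_ssm(A, B, C, D, impulse))  # [2.0, 0.75, 0.3125, 0.140625]
```

Feeding a single impulse shows how the state carries information forward: each output depends on earlier inputs through the state, which is the property that makes this family of models suited to long sequences such as audio.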

Training

Training uses the code and artifacts provided in this repository. Detailed training instructions can be found on the GitHub page.

Guide: Running Locally

  1. Clone the Repository:

    • Use git clone https://github.com/HazyResearch/state-spaces.git to clone the repository.
  2. Navigate to the SaShiMi Directory:

    • cd state-spaces/sashimi
  3. Install Dependencies:

    • Make sure Python is installed, and use a virtual environment if possible.
    • Install required packages with pip install -r requirements.txt.
  4. Run the Model:

    • Follow the specific instructions in the repository's documentation to run the model and generate audio.
  5. Cloud GPUs:

    • For efficient training and processing, consider using cloud GPUs such as those offered by AWS, GCP, or Azure.
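
Steps 1 to 3 above can be sketched as a small Python helper that prints the commands by default and runs them only when asked. This is a sketch under assumptions, not a script shipped with the repository: the function names, the `.venv` location, and the `dry_run` flag are hypothetical, and the location of requirements.txt is taken from the guide as written.

```python
import subprocess
import sys
from pathlib import Path

REPO_URL = "https://github.com/HazyResearch/state-spaces.git"  # URL from the guide


def setup_commands(workdir="."):
    """Return (command, working directory) pairs mirroring steps 1 to 3 of the guide."""
    root = Path(workdir).resolve()
    sashimi_dir = root / "state-spaces" / "sashimi"
    venv_dir = sashimi_dir / ".venv"  # hypothetical location for the virtual environment
    # On Windows the environment's interpreter lives under Scripts instead of bin.
    bin_dir = "Scripts" if sys.platform == "win32" else "bin"
    venv_python = venv_dir / bin_dir / "python"
    return [
        (["git", "clone", REPO_URL], root),
        ([sys.executable, "-m", "venv", str(venv_dir)], sashimi_dir),
        ([str(venv_python), "-m", "pip", "install", "-r", "requirements.txt"], sashimi_dir),
    ]


def run_setup(workdir=".", dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd, cwd in setup_commands(workdir):
        print("$", " ".join(cmd), f"(in {cwd})")
        if not dry_run:
            # check=True stops at the first failing step.
            subprocess.run(cmd, cwd=cwd, check=True)


if __name__ == "__main__":
    run_setup(dry_run=True)
```

Calling `run_setup(dry_run=False)` would execute the steps in order. Running the model itself (step 4) is left to the repository's documentation.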

License

The repository and its contents are subject to the license specified in the project documentation. Ensure compliance with all licensing terms when using the code and artifacts in your work.
