lakhclean_mmmtrack_4bars_d-2048

ai-guru

Introduction

LAKHCLEAN_MMMTRACK_4BARS_D-2048 is a GPT-2-based model for symbolic music generation. Music pieces are encoded as text tokens, which turns music generation into a language generation problem. The model generates four bars of music at a time, in 4/4 time and at 16th-note resolution.
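The grid implied by these numbers can be worked out directly. The sketch below is only arithmetic on the figures stated above (four bars, 4/4 time, 16th-note resolution); the helper name `step_to_position` is hypothetical and is not part of the model or its repository.

```python
# Time grid implied by the description: 4 bars, 4/4 time, 16th-note resolution.
BARS = 4
BEATS_PER_BAR = 4    # 4/4 time signature: four quarter-note beats per bar
STEPS_PER_BEAT = 4   # one quarter note spans four 16th notes

steps_per_bar = BEATS_PER_BAR * STEPS_PER_BEAT
total_steps = BARS * steps_per_bar


def step_to_position(step: int) -> tuple[int, int, int]:
    """Map a 0-based 16th-note step to (bar, beat, sixteenth), all 0-based.

    Hypothetical helper for illustration only.
    """
    if not 0 <= step < total_steps:
        raise ValueError(f"step must be in [0, {total_steps}), got {step}")
    bar, within_bar = divmod(step, steps_per_bar)
    beat, sixteenth = divmod(within_bar, STEPS_PER_BEAT)
    return bar, beat, sixteenth


print(steps_per_bar, total_steps)  # 16 64
print(step_to_position(37))        # (2, 1, 1)
```

Each generated segment therefore covers 64 grid positions, 16 per bar.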

Architecture

  • Model Type: GPT-2
  • Decoder Layers: 6
  • Attention Heads: 8 per layer
  • Context Length: 2048
  • Embedding Dimensions: 512
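To give a feel for the model's size, the following sketch estimates the parameter count from the figures listed above. It assumes the standard GPT-2 block layout (fused query/key/value projection, a feed-forward layer four times the embedding width, two LayerNorms per block) and tied input/output embeddings; none of these are stated in the source. The vocabulary size is also not given, so it is left as a parameter.

```python
# Figures from the architecture list above. In the Transformers library these
# correspond to the GPT2Config fields n_layer, n_head, n_positions and n_embd.
N_LAYERS = 6
N_HEADS = 8
N_POSITIONS = 2048
D_MODEL = 512

head_dim = D_MODEL // N_HEADS  # width of each attention head


def gpt2_block_params(d: int) -> int:
    """Parameters in one decoder block, assuming the default GPT-2 layout."""
    attn = (d * 3 * d + 3 * d) + (d * d + d)     # fused QKV + output projection
    mlp = (d * 4 * d + 4 * d) + (4 * d * d + d)  # feed-forward, 4x width (assumed)
    norms = 2 * (2 * d)                          # two LayerNorms (weight + bias)
    return attn + mlp + norms


def gpt2_total_params(vocab_size: int) -> int:
    """Total parameters for a given vocabulary size (tied embeddings assumed)."""
    token_emb = vocab_size * D_MODEL
    pos_emb = N_POSITIONS * D_MODEL
    final_norm = 2 * D_MODEL
    return token_emb + pos_emb + N_LAYERS * gpt2_block_params(D_MODEL) + final_norm


print(head_dim)                    # 64
print(gpt2_block_params(D_MODEL))  # 3152384
print(gpt2_total_params(0))        # 19963904, excluding token embeddings
```

Under these assumptions the model has roughly 20 million parameters before token embeddings, which add 512 parameters per vocabulary entry.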

Training

The model was trained on approximately 15,000 MIDI files from the Lakhclean dataset. Generation is conditioned on note density. Several variants of the model exist, including versions with different resolutions, bar inpainting, and chord conditioning.

Guide: Running Locally

To run this model locally, follow these steps:

  1. Clone the Repository: Download the model files from the Hugging Face repository.
  2. Set Up Environment:
    • Install Python and necessary libraries (e.g., PyTorch, Transformers).
  3. Load the Model: Use the Transformers library to load the model and tokenizer.
  4. Generate Music: Use the provided notebook in the repository to generate symbolic music and render it.
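Steps 3 and 4 might look roughly like the sketch below. It uses the generic `AutoTokenizer` and `AutoModelForCausalLM` loaders from Transformers. The repository ID is an assumption pieced together from the author and model name above, and the `PIECE_START` prompt is an assumed start token; check both against the repository and its notebook. The function name `generate_tokens` is hypothetical. Rendering the generated tokens to MIDI or audio is left to the provided notebook.

```python
# Assumed repository ID (author + model name); verify on Hugging Face.
REPO_ID = "ai-guru/lakhclean_mmmtrack_4bars_d-2048"
# Assumed prompt token that starts a piece; verify in the repository notebook.
PROMPT = "PIECE_START"


def generate_tokens(prompt: str = PROMPT, max_length: int = 512) -> str:
    """Sample a token sequence from the model and return it as text."""
    # Imported here so the heavy dependencies load only when generating.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(REPO_ID)
    model = AutoModelForCausalLM.from_pretrained(REPO_ID)
    model.eval()

    # Encode the prompt to token IDs and sample a continuation.
    input_ids = tokenizer.encode(prompt, return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(
            input_ids,
            max_length=max_length,  # must stay within the 2048-token context
            do_sample=True,
            temperature=1.0,
        )
    return tokenizer.decode(output_ids[0])
```

Calling `generate_tokens()` downloads the weights on first use and returns the generated music as a text token sequence; a larger `max_length` (up to the 2048-token context) yields longer output at the cost of slower sampling.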

Training and inference run considerably faster on a GPU. If you do not have one locally, consider a cloud GPU service such as Google Colab.

License

You are free to use this model in open-source environments without charge, but please credit the author. For commercial use, contact the author to discuss licensing terms, as fees may apply.
