Introduction

The CCMUSIC-DATABASE song structure analysis model employs methodologies from the Harmonix Set: Structural Features for boundary identification and 2D-Fourier Magnitude Coefficients (2D-FMC) for segment labeling based on acoustic similarity. The model is implemented with the Music Structure Analysis Framework (MSAF) and takes Constant-Q Transform (CQT) features as input. Performance is reported as F-measures: the Hit Rate for boundary retrieval, and the Pairwise Frame Clustering and Entropy Scores for segment labeling, all computed with mir_eval.
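
As an illustration of this kind of pipeline, the sketch below runs MSAF on a local audio file with Structural Features for boundaries and 2D-FMC for labels over CQT input. The audio path is a placeholder, and the "sf"/"fmc2d" algorithm identifiers are assumptions about how MSAF names these methods, so treat this as a sketch rather than the model's exact configuration.

    import msaf

    # Placeholder path to a local audio file
    audio_file = "example_song.wav"

    # Structural Features ("sf") for boundary detection, 2D-FMC ("fmc2d") for
    # segment labeling, with CQT features as input (identifiers assumed from
    # MSAF's published algorithm names)
    boundaries, labels = msaf.process(
        audio_file,
        feature="cqt",
        boundaries_id="sf",
        labels_id="fmc2d",
    )

    print(boundaries)  # estimated boundary times in seconds
    print(labels)      # one label per estimated segment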

Architecture

The model architecture combines structural segmentation and acoustic-similarity techniques, taking CQT features as input and using MSAF for processing. It first identifies segment boundaries and then labels the resulting segments, with performance on boundary retrieval and segment labeling assessed by the F-measure.
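
The evaluation described above can be sketched with mir_eval as follows; the reference and estimated segmentations below are illustrative placeholders, not outputs of this model.

    import numpy as np
    import mir_eval

    # Illustrative reference and estimated segmentations
    # (intervals in seconds, one label per interval)
    ref_intervals = np.array([[0.0, 15.0], [15.0, 40.0], [40.0, 60.0]])
    ref_labels = ["intro", "verse", "chorus"]
    est_intervals = np.array([[0.0, 14.0], [14.0, 41.0], [41.0, 60.0]])
    est_labels = ["A", "B", "C"]

    # Boundary retrieval: Hit Rate F-measure within a +/-0.5 s window
    hit_p, hit_r, hit_f = mir_eval.segment.detection(
        ref_intervals, est_intervals, window=0.5
    )

    # Segment labeling: Pairwise Frame Clustering and entropy-based scores
    pw_p, pw_r, pw_f = mir_eval.segment.pairwise(
        ref_intervals, ref_labels, est_intervals, est_labels
    )
    s_over, s_under, s_f = mir_eval.segment.nce(
        ref_intervals, ref_labels, est_intervals, est_labels
    )

    print(f"Hit Rate F: {hit_f:.3f}  Pairwise F: {pw_f:.3f}  Entropy F: {s_f:.3f}")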

Training

The model was trained on the CCMUSIC-DATABASE dataset, which is designed for music information retrieval research. The dataset includes diverse music samples to improve the model's ability to classify and segment music accurately.
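
For readers who want to inspect the data, a minimal sketch of loading it through ModelScope's dataset API is shown below; the dataset identifier and split name are assumptions derived from the repository name and may need adjusting to the actual dataset card.

    from modelscope.msdatasets import MsDataset

    # The dataset identifier and split are assumptions based on the model
    # repository name; adjust them to match the actual dataset card.
    ds = MsDataset.load("ccmusic-database/song_structure", split="train")

    # Inspect the first record to see which fields (audio, annotations, ...) it exposes
    print(next(iter(ds)))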

Guide: Running Locally

To run the model locally, follow these steps:

  1. Clone the Repository:
    git clone https://www.modelscope.cn/ccmusic-database/song_structure.git
    
  2. Install Dependencies:
    pip install modelscope
    
  3. Run the Model: Use the ModelScope API to download and use the model (a quick check of the downloaded files is sketched after these steps):
    from modelscope import snapshot_download
    model_dir = snapshot_download('ccmusic-database/song_structure')
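
As a quick check after step 3, you can print the download location and list the files it contains; the exact contents of the repository may differ.

    import os
    from modelscope import snapshot_download

    # Download (or reuse the cached copy of) the model repository and list its files
    model_dir = snapshot_download("ccmusic-database/song_structure")
    print(model_dir)
    print(os.listdir(model_dir))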
    

For optimal performance, consider using cloud GPUs from services like AWS, Azure, or Google Cloud.

License

This project is licensed under the MIT License, allowing for wide use and modification with proper attribution.
