bel_canto
ccmusic-database
Introduction
The Classical and Ethnic Vocal Style Classification model distinguishes classical from ethnic vocal styles in audio recordings sung by professional vocalists. It is fine-tuned on a four-category dataset whose audio has been pre-processed into spectrograms. The backbone network was originally pretrained on computer vision tasks and is then adapted to vocal style classification, so it reuses general features while learning the subtle differences between vocal styles. Intended applications include categorizing vocal performances for the music industry and for cultural preservation.
Architecture
The model uses a backbone network pretrained on computer vision tasks, which gives it the ability to recognize general features. The backbone is then fine-tuned on audio-specific data to adapt it to the intricacies of vocal styles. The input is a spectrogram, an image-like representation with time along one axis and frequency along the other, so the model can analyze both temporal and frequency content when discriminating between classical and ethnic vocal styles.
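To make the spectrogram input concrete, the following is a minimal sketch of how a magnitude spectrogram can be computed from raw samples with a short-time Fourier transform. It uses only the Python standard library and a naive DFT for readability. The function names, frame size, hop size, and the synthetic test tone are all assumptions chosen for illustration; the preprocessing actually used for this model (window length, mel scaling, and so on) is not specified here and may differ.

```python
import cmath
import math


def hann_window(n):
    """Return an n-point periodic Hann window (hypothetical helper)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def magnitude_spectrogram(samples, frame_size=64, hop=32):
    """Return a list of frames, each holding frame_size // 2 + 1 magnitudes.

    Rows index time (one row per frame); columns index frequency bins.
    A naive O(n^2) DFT is used per frame to keep the sketch dependency-free;
    a real pipeline would normally use an FFT library instead.
    """
    window = hann_window(frame_size)
    n_bins = frame_size // 2 + 1
    spectrogram = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        # Taper the frame so its edges do not leak energy across bins.
        frame = [samples[start + i] * window[i] for i in range(frame_size)]
        row = []
        for k in range(n_bins):
            acc = sum(
                frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n in range(frame_size)
            )
            row.append(abs(acc))
        spectrogram.append(row)
    return spectrogram


# Synthetic 1 kHz tone standing in for a vocal recording (assumption).
SAMPLE_RATE = 8000
FRAME_SIZE = 64
signal = [math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE) for t in range(512)]

spec = magnitude_spectrogram(signal, frame_size=FRAME_SIZE, hop=32)

# Locate the strongest frequency bin in the first frame and convert it to Hz.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * SAMPLE_RATE / FRAME_SIZE
```

With these settings each bin spans 125 Hz, so the 1 kHz tone peaks in bin 8 of every frame. A two-dimensional array of this kind is what lets an image-pretrained backbone treat audio like a picture.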
Training
The model is fine-tuned on a task-specific dataset so that the pretrained features adapt to vocal style classification. The dataset consists of spectrograms of audio samples from classical and ethnic singing traditions, from which the model learns the vocal patterns characteristic of each style.
Guide: Running Locally
- Download the Model:
from modelscope import snapshot_download
model_dir = snapshot_download('ccmusic-database/bel_canto')
- Clone the Repository:
GIT_LFS_SKIP_SMUDGE=1 git clone git@hf.co:ccmusic-database/bel_canto
cd bel_canto
- Run Locally: Ensure you have the necessary dependencies and set up your environment as per the repository's instructions.
For enhanced performance, consider using cloud GPU services like Google Cloud, AWS, or Azure to manage intensive processing tasks.
License
The model is licensed under the MIT License, allowing for commercial use, distribution, modification, and private use.