Leaderboard
This site provides the overall results of the MARBLE benchmark under the constrained evaluation track. Details of the setting can be found on the submit page.
Notes
- Results on the sequence labelling tasks (Beat Tracking and Source Separation) are unavailable for some models due to their computational cost.
- Results of the Jukebox-5B model on the MTG tasks are not reported because evaluation exceeded MARBLE's computational budget of one week on a single consumer GPU (RTX 3090).
Dataset | MTT | MTT | GS | GTZAN | GTZAN | EMO | EMO | NSynth | NSynth | VocalSet | VocalSet | MTG | MTG | MTG | MTG | MTG | MTG | MTG | MTG | MUSDB | MUSDB | MUSDB | MUSDB | All Tasks
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
Task | Tagging | Tagging | Key | Genre | Rhythm | Emotion | Emotion | Instrument | Pitch | Tech | Singer | Instrument | Instrument | MoodTheme | MoodTheme | Genre | Genre | Top50 | Top50 | Source Separation | Source Separation | Source Separation | Source Separation | On Available Results
Metric | ROC | AP | AccRefined | Acc | F1beat | R2V | R2A | Acc | Acc | Acc | Acc | ROC | AP | ROC | AP | ROC | AP | ROC | AP | SDRvocals | SDRdrums | SDRbass | SDRother | Average Score
MAP-MERT-v0-95M | 90.7 | 38.2 | 64.1 | 74.8 | 88.3 | 52.9 | 69.9 | 70.4 | 92.3 | 73.6 | 77.0 | 76.6 | 18.7 | 75.9 | 13.7 | 86.9 | 18.5 | 82.8 | 28.8 | 5.6 | 5.6 | 4.0 | 3.0 | 62.3 |
MAP-MERT-v0-95M-public | 90.7 | 38.4 | 67.3 | 72.8 | 88.1 | 59.1 | 72.8 | 70.4 | 92.3 | 75.6 | 78.0 | 77.5 | 19.6 | 76.2 | 13.3 | 87.2 | 18.8 | 83.0 | 28.9 | 5.5 | 5.5 | 3.7 | 3.0 | 63.0 |
MAP-MERT-v1-95M | 91.0 | 39.3 | 63.5 | 74.8 | 88.3 | 55.5 | 76.3 | 70.7 | 92.6 | 74.2 | 83.7 | 77.5 | 19.4 | 76.4 | 13.4 | 87.1 | 18.8 | 83.0 | 29.0 | 5.5 | 5.5 | 3.8 | 3.1 | 63.3 |
MAP-MERT-v1-330M | 91.1 | 39.5 | 61.7 | 77.6 | 87.9 | 59.0 | 75.8 | 72.6 | 94.4 | 76.9 | 87.1 | 78.1 | 19.8 | 76.5 | 14.0 | 86.7 | 18.6 | 83.4 | 29.9 | 5.3 | 5.6 | 3.6 | 3.0 | 64.2 |
MAP-Music2Vec | 90.0 | 36.2 | 50.6 | 74.1 | 68.2 | 52.1 | 71.0 | 69.3 | 93.1 | 71.1 | 81.4 | 76.1 | 19.2 | 76.7 | 14.3 | 87.1 | 18.8 | 83.0 | 29.2 | 5.5 | 5.5 | 4.1 | 3.0 | 59.9 |
MusiCNN | 90.3 | 37.8 | 14.4 | 73.5 | - | 44.0 | 68.8 | 72.6 | 64.1 | 70.3 | 57.0 | 74.0 | 17.2 | 74.0 | 12.6 | 86.0 | 17.5 | 82.0 | 27.5 | - | - | - | - | - |
CLMR | 89.5 | 36.0 | 14.8 | 65.2 | - | 44.4 | 70.3 | 67.9 | 47.0 | 58.1 | 49.9 | 73.5 | 17.0 | 73.5 | 12.6 | 84.6 | 16.2 | 81.3 | 26.4 | - | - | - | - | - |
Jukebox-5B | 91.4 | 40.6 | 63.8 | 77.9 | - | 57.0 | 73.0 | 70.4 | 91.6 | 76.7 | 82.6 | 78.5 | 22.0 | 77.6 | 15.3 | 88.0 | 20.5 | 83.4 | 30.4 | - | - | - | - | - |
MULE | 91.2 | 40.1 | 64.9 | 75.5 | - | 60.7 | 73.1 | 74.6 | 88.5 | 75.5 | 87.5 | 76.6 | 19.2 | 78.0 | 15.4 | 88.0 | 20.4 | 83.7 | 30.6 | - | - | - | - | - |
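The final column averages each model's scores over the tasks it has results for. As a minimal sketch (the exact aggregation used by MARBLE is described on the submit page; this simply takes an unweighted mean of the metric values that are present, skipping tasks marked "-"):

```python
def average_available(scores):
    """Average the numeric entries, ignoring missing results ('-' or None).

    Note: this unweighted mean over raw metric values is an assumption for
    illustration; MARBLE's official aggregation may weight or normalize
    per-task scores differently.
    """
    available = [s for s in scores if isinstance(s, (int, float))]
    if not available:
        return None  # no results available for this model
    return round(sum(available) / len(available), 1)

# Example with hypothetical values: three metrics present, one missing.
print(average_available([90.7, 38.2, "-", 64.1]))  # -> 64.3
```

A model with no available results (e.g. all sequence labelling tasks skipped) yields `None` rather than a misleading zero.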
We will provide multiple leaderboards for different task categories in the near future, including:
- overall
- sample classification
- sequence labelling