dataset | metadata | contents | with audio |
---|---|---|---|
200DrumMachines | | 7371 one-shots | yes |
AAM | onsets, pitches, instruments, melody instrument, keys, chords, tempo, beats | 3000 (artificial) tracks | yes |
ACM_MIRUM | tempo | 1410 excerpts (60s) | yes |
ACPAS | aligned audio and scores | 2189 performances of 497 scores | downloadable |
AcousticBrainz-Genre | 15-31 genres with 265-745 subgenres | audio features for over 2000000 songs | no |
ADC2004 | predominant pitch | 20 excerpts | yes |
ADTOF | drum onsets, beats | 20 excerpts | no |
AdoVoc Pro | Monophonic and polyphonic audio files of a set of common Flamenco singing | 2 Female Singers 1 Male Singer | yes |
AED | 28 event classes | 5223 audio snippets | yes |
AIST Dance DB | street dance videos | 13,940 videos for 60 pieces | yes |
Amg1608 | valence & arousal | 1608 excerpts (30s) | no |
AMT-pilot | structure by multiple annotators | 8 songs | yes |
AMS | key, harmony, phrases | 54 movements | no |
APL | piano practice | 620 segments | yes |
artist20 | 20 artists | 1413 songs | no |
ASAP | 222 compositions & note-level alignment | 1068 MIDI performances | partially |
AudioSet | 632 event classes | 2084320 clips (10s) | no |
AVASPEECH-SMAD | speech, music | 45 hours | yes |
bach10, bach10 update | multitrack & aligned MIDI | 10 chorales | yes |
BAF | fingerprinting | 2000 track and 3425 TV audio snippets (60s) | on request |
ballroom | 8 genres & tempo & (down-)beats | 698 excerpts (30s) | yes |
beatboxset1 | percussion annotation | 14 clips | yes |
BPS-FH | functional annotation | 32 sonatas | no |
C224a | 14 genres | 224 artists | no |
C3ka | 18 genres | 3000 artists | no |
C49ka-C111ka | genres | 48800/110588 artists | no |
CadenzaWoodwind | syn. woodwind multitracks | 19 synthesized quartets | yes |
CAL10k | tags | 10870 songs | no |
CAL500 | tags | 502 songs | yes |
CarnaticRhythm | sama & beats | 176 pieces | on request |
CASD | chords by 4 annotators | 50 songs | no |
CBFdataset | 4 playing techniques (Chinese Bamboo Flute) | 10 performers | yes |
ChoirSet | MIDI, F0, beats | 2 songs, 81 takes | yes |
CCMixter | vocal & background track | 50 mixes | yes |
CCOM-HuQin | playing techniques and instruments | 845 single clips, 10 annotated excerpts | yes |
ChoCo | chords, key | 20k+ songs | no |
Chopin22 | aligned MIDI | 44 recordings | yes |
CloserMusicDB | BPM & tags | 106 tracks | yes |
Clotho | 5 descriptive captions | 4981 snippets | yes |
CMMSD | note/rest/transition & onsets & vibrato | 36 excerpts | no |
Coidach | 55 genres | 26420 songs | no |
corpusCOFLA | editorial & predominant melody | 1800 flamenco recordings | no |
covers80 | cover songs | 80 song pairs | yes |
Cross-Composer | 11 composers & piece & key & era & instrumentation | 1100 chromagrams and chord labels | no |
Cross-Era | composer & piece & key & era & instrumentation | 2000 chromagrams and chord labels | no |
CSD | MIDI, lyrics, performance | 50 + 50 songs (Korean, English) | yes |
CSD | pitch | 48 recordings | yes |
Compmusic datasets | Carnatic, Hindustani, Turkish-Makam, Beijing-Opera, Arab-Andalusian | Visit website for details | Visit website for details |
dadaGP | guitarPro tablatures | 26,181 songs | no |
DALI | aligned notes and lyrics | 5358 songs | no |
DAMP | karaoke performances & aligned lyrics & pronunciation assessment | 34000 monophonic recordings | yes |
Da-TACOS | cover songs | 25000 songs | no |
DEAM | valence & arousal | 1802 excerpts | yes |
DEAPDataset | valence & arousal & dominance & physiological data | 120 music video excerpts | no |
DESED | 10 audio event classes | approx 20k 10s clips (unlabeled, weakly/strongly labeled) | yes |
DIM-SIM | triplet similarity | 4000 snippets by 5-12 listeners | no |
Discogs-VI | musical version/covers | 348k songs with 1.9M versions | no |
DREANSS | onset times & perc. instruments | 18 excerpts | yes |
DrumPt | 4 playing techniques | app. 2000 annotations | yes (see ENST) |
E-GMD | drum timing, drummer, kit | 45537 MIDI files | yes |
EEP | Multitrack, bowing descriptions | 23 string quartets | yes |
ElBongosero | 3184 participants | 6035 tapped recordings | yes |
ElectronicMusic | year, composer gender | 1878 works | no |
EMO-Soundscapes | arousal & valence | 1213 soundscape recordings | yes |
EmoMusic | arousal & valence | 744 excerpts (45s) | yes |
EMOPIA | emotion | 1,087 music clips from 387 songs | yes |
Emotify | induced emotion | 400 excerpts | yes |
EMusic | arousal & valence | 100 excerpts (experimental music) | yes |
ENEPP | performance assessments | score-aligned audio | yes |
EnsembleSet | different mix formats | 80 synthesized chamber ensemble pieces | yes |
ENST-Drums | onset times & perc. instruments & playing technique | 318 segments | yes |
EPIC-Sounds | 44 audio event classes | 78,400 segments | yes |
Erkomaishvili | F0, note onsets, segments | 118 songs | yes |
Extendedballroom | 9 genres & tempo & | 4000 excerpts (30s) | downloadable |
ExtraSensory | 51 context labels | 300000 sensor recordings from 60 users | yes |
ffuhrmann | 11 predom. instr. | 6951 excerpts/220 songs | yes/no |
fifteen-songs-dataset | 15 grateful dead songs | 2617 cover performances | yes |
Filosax | beat, chord, sections, sax pitch | 48 multitrack jazz recordings | yes |
FlaBase | editorial & biographical & musicological information on flamenco, 1102 artists & 74 palos & 2860 albums | 13311 tracks | no |
FSLD | tempo, key, instrumentation, genre | 3000 annotated loops | yes |
FMA-full | 161 genres | 106574 songs | yes |
FMA-large | 161 genres | 106574 excerpts (30s) | yes |
FMA-medium | 16 genres | 25000 excerpts (30s) | yes |
FMA-small | 8 genres | 8000 excerpts (30s) | yes |
FSD-Kaggle2019 | 80 tags | 29000 clips | yes |
FSD50K | 200 audio event classes | 51,197 audio clips | yes |
Fugue | structure & cadences | 36 fugues (Bach & Shostakovich) | no |
GAPS | aligned MIDI and video | 14h of guitar performance | yes |
GiantMIDI-Piano dataset | composers, transcribed score | 10854 MIDI files | no |
GigaMIDI | genre, performance | 1.4M MIDI files | no |
GiantStepsKey | key | 604 files | no |
GiantStepsTempo | tempo (alternate) | 664 files | no |
GMD | genre & valence & arousal | 1400 songs | downloadable |
GNMID14 | timestamp & country | 110M music ID matches | no |
Good-sounds.org | 12 instruments, pitch, sound quality | 8750 notes | yes |
GrooveMD | drum timing, drummer | 1150 MIDI files | yes (rendered) |
GPT | 7 guitar playing techniques | 6580 clips | yes |
GSD | start/stop of guitar solos | 60 songs | no |
GTZAN | 10 genres & tempo & key1 & key2 & beat/downbeat & metrical levels | 1000 excerpts (30s) | yes |
GuitarSet | midi & pitch & beat & chords | 360 guitar excerpts (30s) | yes |
Hainsworth | tempo | 245 excerpts (60s) | yes |
HF1 | onset, offset, pitch, 5 emotions | 5x8 songs | yes |
HarmonixSet | beats, downbeats, structure | 912 pop songs | no |
HAYDN QUARTETS | harmonic analysis in **harm syntax | 6 scores | no |
HHDS | multitrack & style & tempo | 18 songs | yes |
HJDB | downbeat | 236 excerpts | yes |
holzapfel:onset | onset times | 78 excerpts | yes |
homburg | 9 genres | 1889 excerpts (10s) | yes |
IADS | valence & arousal & dominance | 111 sound snippets | yes |
Multitrack | multitrack & style | 12 songs | yes |
IDMT-SMT-Audio-Effects | effects on bass and guitar notes | 55044 recordings | yes |
IDMT-SMT-Bass | bass performance styles | 4300 excerpts | yes |
IDMT-SMT-Bass-SINGLE-TRACK | style annotated bass lines | 17 bass lines (?) | yes |
IDMT-SMT-Drums | onset times & perc. instruments | 518 files | yes |
IDMT-SMT-Guitar | 9 guitar playing techniques | 4700+400 note events | yes |
iKala | singing voice & background | 252 excerpts (30s) | yes |
ImprovisingDuos | video and audio for improv | 24 snippets | yes |
INRIA:DSD100 | multitrack | 100 songs | yes |
INRIA:EuroVision | structure | 124 songs | no |
INRIA:Quaero | structure | 159 songs | no |
IRMAS | 11 instruments | 2874 excerpts | yes |
ISMIR2004Genre | 6 genres | 729 excerpts (30s) | yes |
ISMIR2004Tempo | tempo | 465 excerpts (20s) | yes |
IsoVAT | valence, arousal, tension | 90 MIDI snippets | yes |
Jazz Audio-Aligned Harmony Dataset | structure & key & chords & beats | 113 songs | no |
jaCapella | genre | 35 songs (multi-track) | yes |
Jamendo-VAD | voice activity | 61+16+16 songs | yes |
JGDB | multitrack & MIDI | random generated excerpts | yes |
JKU-ScoFo | audio & MIDI | 16 recordings | yes |
Jordan:Classical | structure | 15 pieces | yes |
Jordan:Jazz | structure | 15 pieces | yes |
JLSDD | symbolic scores | 77 duos (Josquin & La Rue) | no |
KBSF | Data extracted from songfacts.com | details on webpage | no |
LabROSA:APT | MIDI | 29 piano excerpts | yes |
LabROSA:MIDI | audio & MIDI | 4 songs | yes |
last.fm | listening habits | 992 last.fm users | no |
LFM-1b | listening habits | 120000 users | no |
LIND | lyrics-based artist and genre graphs | 42802 artists/214 genres | no |
ListenBrainz | 876M listening events | 28000 users | no |
LM-SVR | singing clean & larynx | 3.5h of recordings | yes |
LMD | MIDI & tempo & key | 176581 MIDI files | no |
M-DJCUE | cue points | 134 tracks | no |
MASS | Multitracks | 10s-40s | yes |
MAESTRO | audio aligned MIDI & velocity & sustain & scores & note-level alignment | 172 hours of piano | yes |
magnatagatune | similarity | 25863 excerpts (30s) | yes |
MAPS | piano notes/chords/pieces & tempo/key | 238 pieces | yes |
MARD | album reviews | 66566 songs | no |
MARG-AMT | MIDI pitch & onset/offset times | 30 melodies | yes |
MAST | vocal performance assessment | 1018 performances | no |
MAST-Rhythm | rhythm performance assessment | 3721 performances | yes |
McGill Billboard | chords | 740 songs | no |
MDBDrums | onset times & perc. instrument & playing technique | 23 excerpts | yes |
Medley-solos-DB | 8 instruments | 21572 clips (3s) | yes |
Medley2K | Medley transitions | 2000 medleys, 7712 transitions | no |
MedleyDB | multitrack & genre & melody f0 & instrument activation | 122 songs | yes |
MELON playlist dataset | Mel spectrograms and 148,826 playlists | 649,091 songs | no |
MeloSol | symbolic | 783 melodies | no |
MER500 | emotion | 500 clips | yes |
MetaScore | tags & text | 186 MIDI files | no |
MidiCaps | text captions | 168385 MIDI files | no |
MMD | artist, title metadata | 436631 MIDI files | no |
MIR-1K | vocal and background | 1000 excerpts | yes |
mirex05Train | predominant pitch | 13 excerpts | yes |
mirex06Train | tempo & beats | 20 excerpts (30s) | yes |
MLHD | listening history | 594415 users with 21079612671 listening events from 6685542 songs | no |
MLPMF | 7 perceptual features | 5000 audio files | yes |
MMTD | listening behavior | 1086808 tweets | no |
Modal | onset times | 71 snippets | yes |
MOODetector:Bi-Modal | lyrics & mood | 133 excerpts | yes |
MOODetector:Multi-Modal | lyrics & MIDI & mood | 903 excerpts (30s) | yes |
moodswings | arousal & valence | 240 excerpts (30s) | no |
MozartStringQuartets | structure, cadences | 32 movements | no |
MSMD | piano notes/chords/pieces, synthetic audio, aligned MIDI, aligned sheet music images, OMR | 497 pieces | no |
MSD | genre & mood & proprietary features | 1000000 songs | no |
MusAV | arousal & valence | 2092 excerpts (30s) | yes |
Music4All | tags, lyrics | 109,269 excerpts (30s) | on request |
MTC | phrases & key & meter | 18000 melodies | partially |
mshoxxDB | multi-track aligned MIDI | 18 EDM songs | yes |
MTG-Jamendo | tags (genre, instruments, mood) | 55000 tracks | yes |
MTG-QBH | title & artist | 118 queries/481 songs | yes/no |
MUSDB-18 | multitrack | 150 songs | yes |
MusicBench | chords, beats, tempo, key, captions | app. 53000 excerpts | yes |
musiclef2012 | tags | 1355 songs | no |
MusicMicro | music listening patterns | 136866 users | no |
MusicNet | pitch and onsets | 330 recordings | implicitly |
MuVi-Sync | chords and loudness | 748 music videos | no |
MVD | vocal/scream activity | 57 metal songs | no |
NES-MDB | multi-track MIDI and aligned audio | 5000 songs | on request |
Nine Inch Nails Multitracks | multitrack | 66 songs | yes |
NMED-H | EEG | 24 trials x 16 excerpts (4.5min) | no |
NMED-RP | EEG | 20 trials x 10 excerpts (4.5min) | no |
NMED-T | EEG | 30 trials x 16 excerpts (30sec) | no |
NSynth | instrument and pitch | 305979 single notes | yes |
NUS-48E | aligned phonemes | 48 pairs of sung and spoken | yes |
ODB | onset times | 19 excerpts | yes |
Onset_Leveau | onset times | 21 excerpts | yes |
OpenBMAT | 6 classes for music presence | 1647 excerpts (60s) | yes |
OpenMIC-2018 | 20 instruments | 20000 excerpts (10s) | yes |
Orchset | predominant pitch | 64 excerpts | yes |
PCD | multi-track | 81 snippets | yes |
Phenicx-Anechoic | multi-track audio & aligned MIDI | 4 pieces | yes |
PHENICX emotion | Excerpts of the Eroica Symphony by Beethoven plus audio descriptors from Essentia | 15 excerpts | yes |
PHENICX conduct dataset | Motion capture, recordings | 24 experts | yes |
PHENICX Symphonies Recordings | Multitracks, Video | 5 Symphonies | yes |
PGD | gestures, intention, video | 210 clips | yes |
Phonation | pitch & vowel & phonation mode | 900 monophonic snippets | yes |
PlaylistDataset | playlists | 75262 songs/2840553 transitions | no |
POP909 | MIDI songs | 909 piano arrangements | yes |
QBT-Extended | taps | 3365 queries/51 songs | MIDI |
QMUL:Beatles | structure & key & chords & beats & Harmonic Function | 181 songs | no |
QMUL:King | structure & key & chords | 14 songs | no |
QMUL:MichaelJackson | structure | 38 songs | no |
QMUL:MixEvaluation | multitrack & mixes | 18 songs/180 mixes | yes |
QMUL:Queen | structure/key & chords | 51/31 songs | no |
QMUL:RSS | structure | 60 songs | no |
QMUL:Zweieck | structure & key & chords & beats | 18 songs | no |
Quartet | Multitrack, Video, motion track | 96 recordings | yes |
RealBook | chords | 2486 songs | no |
QUASI | multitrack | 11 songs | yes |
Robbie Williams Annotations (Zanoni-Giorgi) | chords & keys & beats | 65 songs | no |
RockCorpus | chords & melody & bars | 200 songs | no |
RWC | lyrics & 10 genre & 50 instruments & chords & structure & aligned MIDI | 115 songs/50 classical/100 songs | yes |
S3 | note events & form | 4 symphony scores | no |
SALAMI | structure | 1447 songs | no |
SAMBASET | recording date, escolas, beats | 392 sambas | no |
Sanidha | multi-track Carnatic audio & video | 5 concerts | yes |
Sargon | structure | 4 songs | yes |
Semantic Artist Similarity | artist biographies & similarity | 268+2336 artists | no |
Schenker | MusicXML & Schenker analysis | 41 pieces | no |
SCP | EEG | 108/648 trials x 12 stimuli (5s) | yes |
SDD | start of samples | 80 songs & 80 samples | no |
SDDS | 10 snares, 4 dampenings, 53 mics | 2522 shots | yes |
SEILS | scores in different symbolic formats | 30 madrigals | no |
Seyerlehner:1517-Artists | 19 genres | 3180 songs | yes |
Seyerlehner:Annotated | 19 genres | 190 songs | yes |
Seyerlehner:Pop | tempo | 1105 songs | yes |
Seyerlehner:Unique | 14 genres | 3115 excerpts (30s) | yes |
SHS100K | cover songs | ca. 10,000 songs with 100,000 tracks | no |
SISEC | multitrack & mix | 5 excerpts | yes |
Slakh | synthesized audio and mixes | 2100 mixes | yes |
SMC:MIREX | tempo & beat positions | 217 excerpts | yes |
SMD | audio & aligned MIDI | 50 recordings | yes |
SongDescriber | captions | 706 recordings | yes |
SoundTracks | valence & energy & tension & mood | 360+110 excerpts | yes |
SPAM | structure | 50 songs | no |
Shazam Research Dataset: Offsets | in-song query times | 188M queries over 20 songs | no |
Su-AMT | onset times & pitch | 10 excerpts | yes |
SUPRA-RW | piano roll performances | 478 performances | yes |
SWD | key, chords, lyrics, structure | 2+5 cycle performances | partly |
TextureStringQuartets | texture | 11 movements | no |
TAFFC | mood quadrants | 900 excerpts | yes |
Traditional Flute Dataset | audio & aligned MIDI | 30 excerpts | yes |
ThisIsMyJam | favorite songs & artists | 131k users | no |
TinySOL | instrument, pitch, dynamics, string number | 2913 isolated notes | yes |
TONAS | pitch | 72 single-voiced excerpts | yes |
TPD | popularity rating | 23385 songs | no |
Tunebot | title & artist | 10000 queries/? songs | yes/no |
UIOWA:MIS | single instrument notes | many | yes |
UMA-Piano | piano chords | 275040 recordings | yes |
USM-SED | 27 audio classes | 20000 stereo snippets | yes |
UnmixDB | DJ mix parameters | 37 playlists | yes |
URBAN-SED | 9 event classes | 10000 recordings | yes |
UrbanSound8k | 10 event classes | 8732 slices | yes |
URMP | score-aligned video and audio | 44 recordings | yes |
uspop2002 | tags & genre & chords | 8752 songs | no |
VGD | EMG, playing techniques | 960 recordings | yes |
Vocadito | monophonic pitch, lyrics | 40 excerpts in 7 languages | yes |
VocalNotes | monophonic pitch, note segmentation with different annotators | around 10 excerpts | on request |
VocalSet | 17 vocal techniques, f0 and lyrics | 3560 recordings | yes |
YousicianUkulele | evaluated notes and chords | 500000 exercises by 1000 users | no |
WRD | aligned scores, keys, singing | 4 operas | yes |
WJazzD | onset, pitches | 456 Jazz solos | no |
Great dataset! Thank you!
The entry for the SALAMI dataset needs to be updated as they released the second half of their dataset.
It now contains structure annotations for 1447 songs.
Oops, never saw the comments here. Will update soon. Thanks!
Good job, thanks!!!
Hello, is there any updated link for the UMA database?
And the MAPS one too, please!
I'll look into it, but it might take some time. Meanwhile, you can try to reach out to the authors and ask them directly. Please let me know in case you hear back.
Thank you for your prompt response! I managed to download them by looking at the revisions and copying the link directly from the code. I am unsure as to why this worked as opposed to just clicking the hyperlink, but either way, thank you very much Alexander!
Clicking on the link doesn't work, but what works is right click -> save as -> override the browser's complaint that the download is not secure. It would be nice to put this info next to the link.
@gusauriemo @VG-account1 updated the two links discussed, thanks for your input!
Does anyone know of any other database for single instrument notes: transverse flute, oboe, bassoon, clarinet, saxophone (.wav format)?
I'm already using Tinysol and GoodSounds
@sinamusique The IOWA dataset comes to mind, see "UIOWA:MIS" above
@alexanderlerch The UIOWA dataset files are .aiff; I need .wav, and with dynamic variations, mainly pp, mf and ff.
As far as I understand, the format conversion between aiff and wav is trivial. IIRC, the main difference is whether the data is stored in little-endian or big-endian.
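To illustrate the byte-order point, here is a minimal Python sketch, assuming uncompressed 16-bit PCM that has already been pulled out of the AIFF container. The function names `swap_16bit_byte_order` and `write_wav` are made up for this example, and parsing the AIFF header (sample rate, channel count, bit depth) is deliberately left out.

```python
import array
import io
import wave


def swap_16bit_byte_order(pcm: bytes) -> bytes:
    """Reverse the byte order of every 16-bit sample (big <-> little endian)."""
    samples = array.array("h")  # signed 16-bit integers
    samples.frombytes(pcm)      # length must be a multiple of 2 bytes
    samples.byteswap()          # swaps the two bytes of each sample in place
    return samples.tobytes()


def write_wav(pcm_big_endian: bytes, sample_rate: int, channels: int, target) -> None:
    """Write big-endian 16-bit PCM to a WAV file path or file-like object.

    Assumption: the caller already knows sample_rate and channels,
    for example from the AIFF header.
    """
    with wave.open(target, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 2 bytes per sample = 16 bit
        wav.setframerate(sample_rate)
        wav.writeframes(swap_16bit_byte_order(pcm_big_endian))


if __name__ == "__main__":
    # Two mono samples in big-endian order, written to an in-memory WAV file.
    buffer = io.BytesIO()
    write_wav(bytes([0x01, 0x02, 0x7F, 0xFF]), 44100, 1, buffer)
    print(len(buffer.getvalue()), "bytes of WAV data written")
```

For real files, a converter also has to read the AIFF chunks and handle other bit depths, so an existing tool such as ffmpeg or sox is probably the easier route for a whole dataset.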
Ok, I'm going to try it!
Thank you @alexanderlerch
@sinamusique Yes, automatic batch conversion between audio formats is possible. Similarly, you can export such sounds from SoundFonts. (How to do that depends on the software / programming language you use.) There are many SoundFonts (for example https://archive.org/details/musyng-kite or http://virtualplaying.com/virtual-playing-orchestra/ and its sources linked therein), some of them might have good woodwind samples with different dynamic markings.
Thanks @VG-account1 It's a good option!
For anyone searching for the MAPS dataset, here's a new link where it can be accessed:
https://amubox.univ-amu.fr/index.php/s/iNG0xc5Td1Nv4rR
Thanks @MaxMalmer, it's updated now (and the previous link was there erroneously anyway, not sure how that happened).
Hi Prof Lerch, CCOM-HuQin has released the full dataset with more than 12000 single clips and 57 excerpts with annotations. Here is the link:
https://zenodo.org/record/8140034
Hope this will help! :)
Hey there, René from the University of Würzburg. I have been looking everywhere for a copy of the ACM_Mirum_tempo audio files. Has anyone had any success finding them? The link provided is sadly down.
I wanted to suggest the addition of some massive datasets recently introduced, as mentioned in the linked post on the website for the book (comments currently aren't enabled on that post). Please excuse me if this isn't the right place to post this.