Team members: Niral Patel, Eunjeong Lee, Bill Pezzullo, Teshanee Williams, Abby Pearson
The Urban Sound Dataset was created by Justin Salamon, Christopher Jacoby, and Juan Pablo Bello, and their classification study was published at the 22nd ACM International Conference on Multimedia in 2014. The dataset contains 8,732 labeled sound excerpts in 10 classes: air conditioner, car horn, children playing, dog barking, drilling, engine idling, gun shot, jackhammer, siren, and street music. The audio files are stored in 10 folders, and all excerpts taken from the same original recording are stored in the same folder. The count of each class in each folder, according to the metadata, is shown below; the classes are not balanced. The authors recommend cross-validation over these folders rather than random shuffling, such as the train_test_split() function in sklearn, because shuffling can place excerpts from the same recording in both the training and test sets. We explain this on the Experiment page.
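The per-folder class counts and the folder-based cross-validation described above can be sketched in a few lines of Python. This is only a minimal illustration, not the code used in our experiments: the rows are hypothetical toy data standing in for the metadata, and the helper names class_counts_per_fold and leave_one_fold_out are our own assumptions.

```python
from collections import Counter, defaultdict

# Hypothetical toy rows standing in for the metadata: (file name, folder, class).
# The real metadata has 8,732 rows and 10 folders; these values are made up.
rows = [
    ("a-0.wav", 1, "dog_bark"),
    ("a-1.wav", 1, "dog_bark"),
    ("b-0.wav", 2, "siren"),
    ("c-0.wav", 2, "dog_bark"),
    ("d-0.wav", 3, "siren"),
    ("d-1.wav", 3, "siren"),
]


def class_counts_per_fold(rows):
    """Count how many excerpts of each class fall in each folder."""
    counts = defaultdict(Counter)
    for _, fold, label in rows:
        counts[fold][label] += 1
    return counts


def leave_one_fold_out(rows):
    """Yield (held_out_fold, train_rows, test_rows), one split per folder.

    Each folder is used as the test set exactly once, so excerpts that
    share a folder never appear on both sides of a split.
    """
    folds = sorted({fold for _, fold, _ in rows})
    for held_out in folds:
        train = [r for r in rows if r[1] != held_out]
        test = [r for r in rows if r[1] == held_out]
        yield held_out, train, test


counts = class_counts_per_fold(rows)
splits = list(leave_one_fold_out(rows))

for fold in sorted(counts):
    print(f"folder {fold}: {dict(counts[fold])}")
for held_out, train, test in splits:
    print(f"test folder {held_out}: {len(train)} train, {len(test)} test")
```

With the real metadata, the rows would be read from the metadata file instead of being typed in by hand. If sklearn is available, its LeaveOneGroupOut splitter with the folder number as the group should produce the same kind of splits.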
Eunjeong Lee, ejlee127 at gmail dot com, last updated in Nov. 2020