Urban Sound Classification using Machine Learning


For the final project of our Data Analytics bootcamp, we challenged ourselves to learn a new area: sound classification using machine learning. 'Sound (audio)' and 'machine learning' are both big terms. The image on the left shows the waveform of a 'dog bark' sound. How do we convert analog sound into digital data, and which features are unique to each sound? For beginners like us, there were many questions. Thankfully, we found many helpful videos and articles, and here we want to share them along with our experiments. For the Python code, the machine-learning inputs (CSV files), and the presentation slides, you can visit our GitHub.

Team member: Niral Patel, Eunjeong Lee, Bill Pezzullo, Teshanee Williams, Abby Pearson

Dataset: UrbanSound8K

The UrbanSound8K dataset was created by Justin Salamon, Christopher Jacoby, and Juan Pablo Bello, and the accompanying classification study was published at the 22nd ACM International Conference on Multimedia in 2014. The dataset contains 8,732 labeled sound excerpts in 10 classes: air conditioner, car horn, children playing, dog bark, drilling, engine idling, gun shot, jackhammer, siren, and street music. The audio files are pre-sorted into 10 folders (folds), with excerpts from the same source recording always placed in the same fold. According to the metadata, the class counts per fold (shown below) are not balanced. The authors recommend cross-validation over the predefined folds rather than random shuffling such as sklearn's train_test_split() method. We explain this on the Experiment page.
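The authors' recommendation can be followed in sklearn by treating the dataset's `fold` column as a group label, so each evaluation round holds out one entire fold. Below is a minimal sketch using `LeaveOneGroupOut` on synthetic data; the feature matrix, labels, and classifier choice are placeholders, not our actual pipeline, and in practice the fold labels would come from the UrbanSound8K metadata CSV.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score

rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 20))         # stand-in for extracted audio features
y = rng.integers(0, 10, size=n)      # 10 urban sound classes
folds = rng.integers(1, 11, size=n)  # stand-in for the dataset's fold column (1-10)

# LeaveOneGroupOut holds out one whole fold per round, so excerpts from
# the same source recording never appear in both train and test sets.
logo = LeaveOneGroupOut()
clf = RandomForestClassifier(n_estimators=50, random_state=0)
scores = cross_val_score(clf, X, y, groups=folds, cv=logo)
print(scores.mean())  # average accuracy across held-out folds
```

By contrast, `train_test_split()` shuffles excerpts individually, which can leak near-duplicate slices of the same recording into both splits and inflate the accuracy estimate.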

Eunjeong Lee, ejlee127 at gmail dot com, last updated in Nov. 2020