Classifying Bird Sounds and Music Genres Using Machine Learning
Session Number
CMPS 14
Advisor(s)
Dr. Phadmakar Patankar, Illinois Mathematics and Science Academy
Discipline
Computer Science
Start Date
17-4-2025 10:15 AM
End Date
17-4-2025 10:30 AM
Abstract
This project focuses on classifying bird sounds and music genres using machine learning techniques. The BirdCLEF 2023 and GTZAN datasets are used to train and evaluate various machine learning models, including random forests, convolutional neural networks (CNNs), recurrent neural networks (RNNs), and logistic/linear regression. The objective is to determine the most effective and efficient approach for audio classification by analyzing performance metrics such as accuracy, training time, and computational efficiency. Preprocessing steps involve noise reduction, feature extraction using Mel spectrograms and MFCCs, and data augmentation techniques such as time-stretching and pitch-shifting to enhance model generalization. Models are trained on diverse audio samples, and their classification performance is compared to identify the optimal method for each dataset. By improving automatic sound recognition, this research has practical applications in environmental monitoring for wildlife conservation and in music information retrieval for genre classification, recommendation systems, and music analysis.
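The comparison described in the abstract, with several models scored on accuracy and training time over the same data, could be organized along the lines of the minimal sketch below. It is not the project's actual code: the two classifiers are hypothetical stand-ins for the random forest, CNN, RNN, and regression models, the names `evaluate` and `compare` are invented for illustration, and the feature vectors are assumed to be precomputed (for example, MFCC summaries) rather than extracted here.

```python
import time
from collections import Counter


class MajorityClassifier:
    """Hypothetical baseline: always predicts the most common training label."""

    def fit(self, features, labels):
        self.label_ = Counter(labels).most_common(1)[0][0]
        return self

    def predict(self, features):
        return [self.label_ for _ in features]


class NearestCentroidClassifier:
    """Hypothetical stand-in model: predicts the class with the closest mean vector."""

    def fit(self, features, labels):
        counts = Counter(labels)
        sums = {}
        for vec, label in zip(features, labels):
            acc = sums.setdefault(label, [0.0] * len(vec))
            for i, value in enumerate(vec):
                acc[i] += value
        self.centroids_ = {
            label: [s / counts[label] for s in acc] for label, acc in sums.items()
        }
        return self

    def predict(self, features):
        def sq_dist(a, b):
            return sum((x - y) ** 2 for x, y in zip(a, b))

        return [
            min(self.centroids_, key=lambda label: sq_dist(vec, self.centroids_[label]))
            for vec in features
        ]


def evaluate(model, train_x, train_y, test_x, test_y):
    """Fit the model, then report test accuracy and wall-clock training time."""
    start = time.perf_counter()
    model.fit(train_x, train_y)
    train_seconds = time.perf_counter() - start

    predictions = model.predict(test_x)
    correct = sum(p == t for p, t in zip(predictions, test_y))
    return {"accuracy": correct / len(test_y), "train_seconds": train_seconds}


def compare(models, train_x, train_y, test_x, test_y):
    """Evaluate every named model on the same split so the metrics are comparable."""
    return {
        name: evaluate(model, train_x, train_y, test_x, test_y)
        for name, model in models.items()
    }
```

Any model exposing `fit` and `predict` can be passed to `compare`, so each candidate is measured on an identical train/test split for a given dataset.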