Classifying Bird Sounds and Music Genres Using Machine Learning

Session Number

CMPS 14

Advisor(s)

Dr. Phadmakar Patankar, Illinois Mathematics and Science Academy

Discipline

Computer Science

Start Date

17-4-2025 10:15 AM

End Date

17-4-2025 10:30 AM

Abstract

This project classifies bird sounds and music genres using machine learning techniques. The BirdCLEF 2023 and GTZAN datasets are used to train and evaluate several models, including random forests, convolutional neural networks (CNNs), recurrent neural networks (RNNs), and logistic/linear regression. The objective is to determine the most effective approach for audio classification by comparing models on accuracy, training time, and computational efficiency. Preprocessing includes noise reduction, feature extraction using Mel spectrograms and MFCCs, and data augmentation techniques such as time-stretching and pitch-shifting to improve model generalization. Models are trained on diverse audio samples, and their classification performance is compared to identify the best-performing method for each dataset. By improving automatic sound recognition, this research has practical applications in environmental monitoring for wildlife conservation and in music information retrieval for genre classification, recommendation systems, and music analysis.
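
The feature-extraction and augmentation steps described above can be sketched with librosa. This is a minimal illustration, not the project's actual pipeline: the sample rate, Mel-band count, MFCC count, and stretch/shift amounts are assumed values, and the noise-reduction step is omitted.

    import numpy as np
    import librosa

    def extract_features(path, sr=22050, n_mels=128, n_mfcc=13):
        # Load a clip, then compute a log-Mel spectrogram (typical CNN/RNN
        # input) and MFCCs (typical input for random forests and regression).
        y, sr = librosa.load(path, sr=sr)
        mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
        log_mel = librosa.power_to_db(mel, ref=np.max)
        mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
        return log_mel, mfcc

    def augment(y, sr):
        # Create extra training variants: 10% faster playback and a
        # two-semitone upward pitch shift (both amounts are illustrative).
        stretched = librosa.effects.time_stretch(y, rate=1.1)
        shifted = librosa.effects.pitch_shift(y, sr=sr, n_steps=2)
        return [stretched, shifted]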
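The model-comparison step can likewise be sketched with scikit-learn for the classical models. The feature matrix X (e.g., per-clip mean MFCC vectors) and label vector y are assumed to come from the preprocessing step; the model settings are placeholders, and the CNN/RNN baselines would be timed the same way in a deep-learning framework.

    import time
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    def compare_models(X, y):
        # Score each classifier with 5-fold cross-validation and record
        # wall-clock time, mirroring the accuracy/training-time comparison.
        models = {
            "random_forest": RandomForestClassifier(n_estimators=200),
            "logistic_regression": LogisticRegression(max_iter=1000),
        }
        for name, model in models.items():
            start = time.perf_counter()
            scores = cross_val_score(model, X, y, cv=5)
            elapsed = time.perf_counter() - start
            print(f"{name}: accuracy={scores.mean():.3f}, time={elapsed:.1f}s")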
