Using Single-Cell Analysis and Machine Learning to Predict Gastroesophageal Reflux Disease (GERD) and Systemic Sclerosis (SSc)
Session Number
Project ID: CMPS 16
Advisor(s)
Dr. Deborah Winter, Northwestern University Feinberg School of Medicine
Discipline
Computer Science
Start Date
17-4-2024 8:15 AM
End Date
17-4-2024 8:30 AM
Abstract
Our project used computational techniques, including machine learning, to address a problem in diagnosing esophageal diseases, specifically Gastroesophageal Reflux Disease (GERD) and Systemic Sclerosis (SSc). In their early stages, the two diseases share symptoms such as chest pain, heartburn, and regurgitation, so one can be mistaken for the other. Diagnosing either disease is also complicated, because the decision rests on the combined results of multiple tests.
Using single-cell data from thirty-three patients, our project aimed to better understand GERD and SSc at the single-cell level and to use machine learning models to diagnose and differentiate between the two diseases. We performed exploratory data analysis and tested four models on the dataset: K-Nearest Neighbors, Support Vector Machine, Logistic Regression, and Random Forest. When the models were evaluated on six independent samples, Logistic Regression achieved the highest accuracy, 78%. These results are promising and suggest that machine learning could help diagnose and differentiate between GERD and SSc efficiently.
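The model comparison described in the abstract can be illustrated with a minimal sketch. This is not the project's actual pipeline: the abstract does not state the library, the features, the hyperparameters, or whether predictions were made per cell or per patient. The sketch assumes scikit-learn, uses randomly generated stand-in data (33 hypothetical patients, 50 cells each, 20 features), and holds out six whole patients so that no patient contributes cells to both the training and test sets. All variable names and settings are illustrative.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in data (an assumption, not the study's dataset):
# 33 patients, 50 cells per patient, 20 expression-like features per cell.
n_patients, cells_per_patient, n_features = 33, 50, 20
patient_label = np.arange(n_patients) % 2          # 0 and 1 stand for the two diseases
patient_ids = np.repeat(np.arange(n_patients), cells_per_patient)
y = patient_label[patient_ids]                     # each cell inherits its patient's label
X = rng.normal(size=(len(y), n_features)) + 0.5 * y[:, None]

# Hold out six whole patients as the independent test set.
test_patients = np.arange(n_patients - 6, n_patients)
test_mask = np.isin(patient_ids, test_patients)
X_train, y_train = X[~test_mask], y[~test_mask]
X_test, y_test = X[test_mask], y[test_mask]

# The four model families named in the abstract, with default-like settings.
models = {
    "K-Nearest Neighbors": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "Support Vector Machine": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    "Logistic Regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Fit each model on the training patients and score it on the held-out patients.
accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracies[name] = model.score(X_test, y_test)
    print(f"{name}: {accuracies[name]:.2f}")
```

Splitting by patient rather than by cell is a design choice in this sketch: cells from one patient tend to resemble each other, so mixing them across the training and test sets would likely overstate accuracy.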