Using Machine Learning to Determine Peptide Sequences with High Heme Binding Propensity
Session Number
Project ID: CMPS 08
Advisor(s)
Dr. Chris Fry
Dr. Henry Chan, Argonne National Laboratory
Discipline
Computer Science
Start Date
17-4-2024 8:55 AM
End Date
17-4-2024 9:10 AM
Abstract
Self-assembling peptides, chains of amino acids that form various structures in response to environmental conditions, have a variety of uses in materials science and biomedicine. In biomedicine, these uses include drug delivery and direct use as drugs. In materials science, self-assembling peptides can be used to create materials with a variety of properties. Our work aims to use machine learning to find patterns in peptide sequences and thereby streamline materials discovery. Visible and infrared spectra were collected for more than 100 synthesized peptide sequences. These spectra were used to calculate heme-binding propensity and alpha-helix propensity for each sequence. The data were organized, the spectra were smoothed and baseline-corrected, and a dataset was assembled. A neural network was then built that takes a peptide sequence as input and predicts its alpha-helix propensity and heme-binding propensity. This model was used to generate new sequences with high predicted values of both properties. We will synthesize these sequences and measure their true alpha-helix and heme-binding propensities from their spectra to determine whether a machine-learning approach is viable for this application and for other materials science applications in the future.
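The two data-preparation steps mentioned in the abstract, cleaning the spectra and turning a peptide sequence into network input, might look roughly like the sketch below. The abstract does not say which smoothing method, baseline model, or sequence encoding was used, so the moving average, the straight-line baseline, the one-hot encoding, and all function names here are illustrative assumptions rather than the project's actual pipeline.

```python
# Illustrative sketch only: every method choice below is an assumption,
# not the procedure used in the project.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter residue codes


def one_hot_encode(sequence, max_len):
    """Encode a peptide as a flat list of max_len * 20 zeros and ones.

    Sequences shorter than max_len are zero-padded on the right.
    """
    if len(sequence) > max_len:
        raise ValueError("sequence is longer than max_len")
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    encoded = []
    for pos in range(max_len):
        row = [0] * len(AMINO_ACIDS)
        if pos < len(sequence):
            # Raises KeyError for a non-standard residue code.
            row[index[sequence[pos]]] = 1
        encoded.extend(row)
    return encoded


def moving_average(values, window=5):
    """Smooth a spectrum with a centered moving average.

    Near the ends of the list the window shrinks to the points available.
    """
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def subtract_linear_baseline(values):
    """Subtract the straight line joining the first and last points."""
    n = len(values)
    if n < 2:
        return [0.0] * n
    start, end = values[0], values[-1]
    return [v - (start + (end - start) * i / (n - 1))
            for i, v in enumerate(values)]
```

A vector produced by `one_hot_encode` could serve as the input to a regression network with two outputs, one per propensity. Both propensity values would be computed from spectra cleaned along the lines of `subtract_linear_baseline(moving_average(spectrum))`.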