RIPPLE: Residue Interaction Prediction Pipeline with Language Embeddings
Session Number
CMPS(ai) 09
Advisor(s)
Sugyan Dixit, Discovery Partners Institute
Discipline
Computer Science
Start Date
17-4-2025 10:30 AM
End Date
17-4-2025 10:45 AM
Abstract
Protein dynamics play critical roles in biological functions such as enzyme catalysis, signal transduction, and molecular interactions, and are therefore vital when modeling binding pocket stability during drug development. While experimental methods such as X-ray crystallography, NMR, and cryo-EM provide valuable structural insights, they remain time-consuming, expensive, and infeasible for complex protein types. Molecular dynamics simulations offer a computational alternative but demand substantial computing resources and time. Deep learning models such as AlphaFold and ESMFold have revolutionized low-cost protein structure prediction; however, they fail to capture essential dynamic conformational changes. To address these limitations, we introduce RIPPLE, a novel deep learning framework that uses sequence-based ESM2 embeddings to predict protein flexibility as per-residue Root Mean Square Fluctuation (RMSF) values. Our approach extracts Cα coordinates and residue sequences from protein structures and uses the pre-trained ESM2 language model to generate contextually rich embeddings. RIPPLE then processes these embeddings through multi-head attention mechanisms to capture both long- and short-range interactions between residues. Finally, RIPPLE uses the attended embeddings to predict an RMSF value for each residue. By providing rapid and reliable protein flexibility predictions, RIPPLE enables more efficient drug discovery and deeper insight into protein function across diverse biological contexts.
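The abstract does not say how the RMSF values used as prediction targets are obtained. As an illustration of the quantity itself, here is a minimal, dependency-free sketch of the standard RMSF definition: for each residue, the square root of the mean squared deviation of its Cα position from its time-averaged position. The function name `rmsf_per_residue`, the input layout, and the toy coordinates are assumptions made for this example and are not part of RIPPLE; the frames are assumed to be already superimposed on a common reference.

```python
import math


def rmsf_per_residue(frames):
    """Return the RMSF of each residue from a list of coordinate frames.

    frames: list of frames; each frame is a list of (x, y, z) C-alpha
    coordinates, one tuple per residue. Frames are assumed to be already
    superimposed on a common reference (an assumption of this sketch).
    """
    n_frames = len(frames)
    if n_frames == 0:
        raise ValueError("need at least one frame")
    n_res = len(frames[0])
    if any(len(frame) != n_res for frame in frames):
        raise ValueError("all frames must contain the same number of residues")

    rmsf = []
    for i in range(n_res):
        # Time-averaged position of residue i.
        mean = [sum(frame[i][k] for frame in frames) / n_frames for k in range(3)]
        # Mean squared deviation from that averaged position.
        msd = sum(
            sum((frame[i][k] - mean[k]) ** 2 for k in range(3))
            for frame in frames
        ) / n_frames
        rmsf.append(math.sqrt(msd))
    return rmsf


# Hypothetical toy input: two frames, two residues.
# Residue 0 never moves; residue 1 swings between x = +1 and x = -1.
toy = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (-1.0, 0.0, 0.0)],
]
print(rmsf_per_residue(toy))  # the rigid residue scores 0, the mobile one scores higher
```

A per-residue list like this has the same shape as the output the abstract describes for RIPPLE: one flexibility value per residue, with larger values marking more mobile regions.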