AI Transformer Architecture: A Control Systems Perspective*
Session Number
3
Advisor(s)
Dr. Ashwin Mohan, SYNAPSE Lab, IMSA
Location
A150
Discipline
Computer Science
Start Date
15-4-2026 2:15 PM
End Date
15-4-2026 3:00 PM
Abstract
Transformers are a type of neural network designed to process sequential data all at once rather than in order. Today, they power applications ranging from chatbots to translation tools, yet we still lack a clear mathematical understanding of how they process information. Existing studies demonstrate transformer functionality but do not investigate it from first principles. This study applies an emerging approach that uses control systems principles to analyze the behavior of transformers. Specifically, we investigated three widely used transformer language models (BERT, GPT-2, and RoBERTa) by developing transfer functions for each major architectural component, providing a more rigorous view of how these omnipresent systems operate. Based on this approach, we propose a simplified transformer to validate our findings. Our results suggest that control systems analysis can serve as a valuable tool for investigating complex, higher-order, nonlinear systems. Future work will investigate the roles of brain regions and correlate them to transformer-like models. Ultimately, this work provides a mathematical bridge between transformers and biological neural processing.
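To make the transfer-function idea concrete, here is a minimal sketch of one way a transformer component can be treated as a system with a measurable gain. This is an illustrative assumption, not the authors' exact method: we linearize a toy single-head self-attention block around an operating point via a numerical Jacobian, and take its largest singular value as the simplest transfer-function-style summary of the component's local input-output gain.

```python
import numpy as np

# Hypothetical toy setup (not the study's actual models or dimensions):
# a single-head scaled dot-product self-attention block.
rng = np.random.default_rng(0)
d = 8   # embedding dimension (toy size)
n = 4   # sequence length

Wq, Wk, Wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))

def attention(X):
    """Single-head scaled dot-product self-attention."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(d)
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ V

def numerical_jacobian(f, X, eps=1e-6):
    """Finite-difference Jacobian of f at X, flattened to a 2-D matrix."""
    base = f(X).ravel()
    J = np.zeros((base.size, X.size))
    for i in range(X.size):
        Xp = X.copy().ravel()
        Xp[i] += eps
        J[:, i] = (f(Xp.reshape(X.shape)).ravel() - base) / eps
    return J

X0 = rng.standard_normal((n, d))   # operating point for the linearization
J = numerical_jacobian(attention, X0)
gain = np.linalg.svd(J, compute_uv=False)[0]  # largest singular value
print(f"largest singular value (local gain): {gain:.3f}")
```

In control systems terms, the Jacobian plays the role of a linearized plant model, and its largest singular value bounds how strongly the component amplifies small perturbations at that operating point.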