AI Transformer Architecture: A Control Systems Perspective*

Session Number

3

Advisor(s)

Ashwin Mohan, Ph.D., SYNAPSE Lab, IMSA

Location

A150

Discipline

Computer Science

Start Date

April 15, 2026, 2:15 PM

End Date

April 15, 2026, 3:00 PM

Abstract

Transformers are a class of neural network designed to process sequential data all at once rather than step by step. Today they power everything from chatbots to translation tools, yet a clear mathematical understanding of how they process information is still taking shape. Existing studies demonstrate transformer functionality but rarely examine these models from first principles. This study applies an emerging approach that uses control systems principles to analyze the behavior of transformers. Specifically, we investigated BERT, GPT-2, and RoBERTa, three widely used transformer language models, by developing transfer functions for each major architectural component to better understand how these ubiquitous systems operate. Based on this analysis, we propose a simplified transformer to validate our findings. Our results demonstrate that such approaches can serve as a valuable tool for investigating complex, higher-order, nonlinear systems. Future work will investigate the roles of specific brain regions and relate them to transformer-like models. Ultimately, this work provides a mathematical bridge between transformers and biological neural processing.
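
The abstract does not spell out how the per-component transfer functions are constructed, so the sketch below is only one plausible reading of a control systems treatment: regard layer depth as discrete time, so a residual update x[k+1] = x[k] + f(x[k]) can be linearized around an operating point and analyzed with standard z-domain tools. Everything in the code (the width d, the toy update f, and the B and C matrices) is a hypothetical illustration, not the study's actual model of BERT, GPT-2, or RoBERTa.

```python
import numpy as np

# A minimal, hypothetical sketch (not the study's actual method): read the
# transformer's residual stream as a discrete-time system in which the layer
# index k plays the role of time,
#     x[k+1] = x[k] + f(x[k]),
# linearize f around an operating point x*, and form the z-domain transfer
# function H(z) = C (zI - A)^{-1} B with A = I + J (J the Jacobian of f).
# The width d, the toy update f, and the choices of B and C are all
# illustrative assumptions, not components of BERT, GPT-2, or RoBERTa.

d = 8                                   # toy residual-stream width
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(d, d))  # stand-in weights for one block

def f(x):
    """Toy per-layer update standing in for attention + feed-forward."""
    return np.tanh(W @ x)

def jacobian(x, eps=1e-6):
    """Numerical Jacobian of f at the operating point x."""
    J = np.zeros((d, d))
    fx = f(x)
    for i in range(d):
        dx = np.zeros(d)
        dx[i] = eps
        J[:, i] = (f(x + dx) - fx) / eps
    return J

x_star = rng.normal(size=d)          # operating point, e.g. a token's hidden state
A = np.eye(d) + jacobian(x_star)     # linearized residual update, I + J
B = np.eye(d)                        # perturbations enter every channel
C = np.eye(d)                        # read out the full residual stream

def H(z):
    """Transfer function of the linearized stack: H(z) = C (zI - A)^{-1} B."""
    return C @ np.linalg.inv(z * np.eye(d) - A) @ B

# The linearized stack is stable across depth iff every eigenvalue of A
# lies inside the unit circle.
print("spectral radius of A:", max(abs(np.linalg.eigvals(A))))
print("gain ||H(z)|| at z = 1.5:", np.linalg.norm(H(1.5), 2))
```

Treating depth as time is what makes classical notions such as poles and stability margins applicable to a nominally feed-forward architecture; under that reading, the spectral-radius check above is the usual discrete-time stability condition.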
