Implementing Mixed Memorization-Based Inference Recurrent Models of Visual Attention Using Thresholds for Energy Efficiency
Project ID: CMPS 35
Advisor(s)
Amit Ranjan Trivedi
Maeesha Binte Hashem, University of Illinois at Chicago
Discipline
Computer Science
Start Date
17-4-2024 8:35 AM
End Date
17-4-2024 8:50 AM
Abstract
The rapid advancement of deep neural networks has significantly improved tasks such as image and speech recognition. However, as model complexity grows, computational cost and parameter count increase, making these models harder to deploy on resource-limited devices. This paper proposes a novel memorization-based inference (MBI) scheme that is compute-free and size-agnostic. Our work capitalizes on the inference mechanism of the recurrent attention model (RAM), where only a small window of the input (a glimpse) is processed at each time step, and the outputs from multiple glimpses are combined through a hidden vector to determine the overall classification output. By leveraging the low dimensionality of glimpses, our inference procedure stores key-value pairs, consisting of the glimpse location, patch vector, etc., in a lookup table. Computation is obviated during inference: the table is read to retrieve key-value pairs, so inference proceeds compute-free by memorization. Because MBI is accurate only to a degree, it is coupled with a standard deep neural network to improve accuracy. By exploiting Bayesian optimization and clustering, the necessary lookups are reduced and accuracy is improved.
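The hybrid scheme described above can be sketched as a nearest-neighbor lookup with a distance threshold: if a stored glimpse descriptor is close enough to the query, its memorized output is read out compute-free; otherwise the standard DNN is invoked. The names below (`MemoTable`, `classify`, the threshold value, and the fallback callable) are illustrative assumptions for this sketch, not the paper's actual interface.

```python
import math


class MemoTable:
    """Illustrative key-value store for memorization-based inference.

    Key: a low-dimensional glimpse descriptor (e.g. location + patch values).
    Value: the memorized output (e.g. a class-score vector).
    """

    def __init__(self, threshold):
        self.keys = []            # stored glimpse descriptors
        self.values = []          # matching memorized outputs
        self.threshold = threshold  # max distance for a table hit

    def store(self, key, value):
        self.keys.append(list(key))
        self.values.append(list(value))

    def lookup(self, query):
        """Return (value, hit): the nearest stored value if within the
        distance threshold, else (None, False)."""
        best_i, best_d = None, float("inf")
        for i, key in enumerate(self.keys):
            d = math.dist(key, query)  # Euclidean distance
            if d < best_d:
                best_i, best_d = i, d
        if best_i is not None and best_d <= self.threshold:
            return self.values[best_i], True  # compute-free read-out
        return None, False                    # miss: fall back to the DNN


def classify(descriptor, table, dnn_fallback):
    """Hybrid inference: memorized read-out on a hit, DNN on a miss."""
    value, hit = table.lookup(descriptor)
    return value if hit else dnn_fallback(descriptor)
```

In this sketch the threshold plays the role described in the title: a tight threshold keeps memorized answers accurate but sends more queries to the DNN, while a loose threshold saves more compute at some cost in accuracy. Clustering the stored keys (e.g. replacing groups of nearby descriptors with their centroids) would shrink the table and reduce the lookups per query, in the spirit of the abstract's final sentence.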