Implementing Mixed Memorization-Based Inference Recurrent Models of Visual Attention Using Thresholds for Energy Efficiency

Session Number

Project ID: CMPS 35

Advisor(s)

Amit Ranjan Trivedi

Maeesha Binte Hashem, University of Illinois at Chicago

Discipline

Computer Science

Start Date

17-4-2024 8:35 AM

End Date

17-4-2024 8:50 AM

Abstract

The rapid advancement of deep neural networks has significantly improved performance on tasks such as image and speech recognition. However, as model complexity grows, computational cost and parameter count increase, making these models difficult to deploy on resource-limited devices. This paper proposes a novel memorization-based inference (MBI) model that is compute-free and size-agnostic. Our work capitalizes on the inference mechanism of the recurrent attention model (RAM), in which only a small window of the input (a glimpse) is processed at each time step, and the outputs from multiple glimpses are combined through a hidden vector to determine the overall classification output. By leveraging the low dimensionality of glimpses, our inference procedure stores key-value pairs, consisting of the glimpse location, patch vector, and related state, in a table. Computation is obviated during inference by reading out key-value pairs from the table, performing compute-free inference by memorization. Because MBI is only accurate to a certain degree, it is coupled with a standard deep neural network to improve accuracy. By exploiting Bayesian optimization and clustering, the necessary lookups are reduced and accuracy is improved.
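The lookup-with-fallback scheme described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the table class, the distance threshold, and the `dnn_fallback` stand-in for the full RAM network are all hypothetical names chosen for the example.

```python
import numpy as np

def dnn_fallback(glimpse):
    # Hypothetical stand-in for the full recurrent attention network;
    # the real MBI system would run the standard DNN on this glimpse.
    return int(np.argmax(glimpse))

class MemorizationTable:
    """Key-value store mapping glimpse vectors (location + patch)
    to memorized outputs, in the spirit of the MBI scheme."""

    def __init__(self, threshold):
        self.keys = []              # memorized glimpse vectors
        self.values = []            # memorized outputs
        self.threshold = threshold  # max distance for a table hit

    def memorize(self, glimpse, output):
        # Record a glimpse/output pair observed ahead of time.
        self.keys.append(np.asarray(glimpse, dtype=float))
        self.values.append(output)

    def infer(self, glimpse):
        """Return (output, used_table). Look up the nearest memorized
        glimpse; fall back to the DNN when no key is close enough."""
        g = np.asarray(glimpse, dtype=float)
        if self.keys:
            dists = [np.linalg.norm(g - k) for k in self.keys]
            i = int(np.argmin(dists))
            if dists[i] <= self.threshold:
                return self.values[i], True   # compute-free path
        return dnn_fallback(g), False         # standard DNN path

# Usage: a glimpse near a memorized key is answered from the table,
# while an unfamiliar glimpse triggers the DNN fallback.
table = MemorizationTable(threshold=0.5)
table.memorize([0.1, 0.9, 0.0], 1)
print(table.infer([0.1, 0.85, 0.0]))  # table hit
print(table.infer([5.0, 0.0, 0.0]))   # DNN fallback
```

The threshold plays the role hinted at in the title: tightening it trades more DNN invocations (energy) for fewer wrong table hits (accuracy), and clustering the stored keys would reduce the number of lookups.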
