Paper Title
A Brain-Inspired Attention Model for Action Recognition

Abstract
We develop a brain-inspired attention model that uses a recurrent neural network to recognize actions in videos through attentional search. The proposed model, based on visual attention, scans images by making a sequence of attentional jumps, or saccades. By integrating the information gathered over this sequence of saccades, the model recognizes the action in the video. The model consists of two main modules: the Classifier Network and the Saccade Network. The Classifier Network predicts the class of the action in the current video frame, while the Saccade Network predicts the saccadic jump for the next frame. When the maximum predicted class score crosses a threshold, the model stops making saccades and outputs its class prediction. A correct prediction yields a positive reward, which is used to train the model via Q-learning. The model is trained for 3 epochs on 250 grayscale videos from the UCF-11 dataset covering two classes, "horse riding" and "bike riding", and validated on 62 videos. Three concentric attention windows of sizes (50, 70), (100, 140), and (200, 280) are used, and the jump length of the attention windows between fixations is set to 50 pixels. We achieve a training accuracy of 95.58% and a validation accuracy of 60.31%. Simulations show that the attention window fixates on the locations in the frame that carry the relevant information about the ongoing action, leading to correct recognition. For example, for "horse riding" the attention window fixates on the tail, four legs, and head of the horse, whereas for "bike riding" it fixates on the back seat and the wheels of the bicycle.

Keywords: Attention, Saccade, Deep Q-learning, Action Recognition
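The attend-classify-halt loop described above can be sketched as follows. This is a minimal illustrative sketch, not the trained model: `classify` and the fixed rightward saccade stand in for the Classifier Network and the Saccade Network, and the toy logistic score is a placeholder. The window sizes (50, 70), (100, 140), (200, 280) and the jump length of 50 follow the abstract; the halting threshold of 0.9 and the starting fixation are assumptions.

```python
import numpy as np

WINDOWS = [(50, 70), (100, 140), (200, 280)]  # concentric window sizes (h, w)
JUMP = 50          # saccade jump length in pixels (from the abstract)
THRESHOLD = 0.9    # assumed halting threshold on the max class score

def crop(frame, cy, cx, h, w):
    """Extract an h x w window centred at (cy, cx), clipped to the frame."""
    H, W = frame.shape
    y0, x0 = max(0, cy - h // 2), max(0, cx - w // 2)
    return frame[y0:min(H, y0 + h), x0:min(W, x0 + w)]

def classify(glimpses):
    """Placeholder for the Classifier Network: a toy logistic score
    computed from the mean intensity of the three glimpses."""
    m = np.mean([g.mean() for g in glimpses])
    p0 = 1.0 / (1.0 + np.exp(-(m - 0.5) * 20))
    return np.array([p0, 1.0 - p0])  # scores for the two classes

def recognize(frames, start=(120, 160)):
    """Make saccades over the frames; halt and output the prediction
    once the maximum class score crosses the threshold."""
    cy, cx = start
    for frame in frames:
        glimpses = [crop(frame, cy, cx, h, w) for h, w in WINDOWS]
        scores = classify(glimpses)
        if scores.max() > THRESHOLD:
            break
        cx += JUMP  # placeholder Saccade Network: fixed rightward jump
    return int(scores.argmax()), float(scores.max())

# A uniformly bright synthetic "video" is confidently assigned class 0.
video = [np.full((240, 320), 0.9) for _ in range(5)]
label, conf = recognize(video)
print(label, conf > THRESHOLD)
```

In the full model, Q-learning would update both networks from the reward earned when the halted prediction is correct; here the loop only demonstrates the glimpse extraction, thresholded halting, and fixed-length saccades.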