The Impact of Soft Errors in Convolutional Neural Networks on Graphics Processing Units: SqueezeNet as a Case Study
Recently, Convolutional Neural Networks (CNNs) have attracted considerable interest for computer vision tasks such as object detection and classification. Because CNN algorithms demand enormous computing power and frequent memory access, several hardware accelerators are used to speed up their execution. Graphics Processing Units (GPUs) are currently the dominant CNN accelerators. However, the pursuit of extreme GPU performance has led to a high density of computing resources on a small chip, making GPUs prone to several types of faults. Soft errors pose a particular threat because the high degree of parallelism in GPUs can propagate a single fault into multiple errors. Silent data corruption (SDC) caused by soft errors can ultimately lead to wrong predictions, such as misclassification and misdetection. This paper analyzes the reliability of SqueezeNet to identify which part of the model is most vulnerable to soft errors. To this end, we injected faults into SqueezeNet running on an NVIDIA GPU, using the SASSIFI fault injector as the main evaluation tool. Our experiments demonstrate that our technique can detect and correct malfunction SDCs (errors that lead to misclassification), reducing the average percentage of errors that cause a malfunction from 4.85% to 0.06% by hardening only the vulnerable part of the SqueezeNet model. This is a substantial improvement in model reliability.
Keywords - Convolutional Neural Networks (CNNs), Soft error, Reliability, SqueezeNet, GPUs.