Towards Pedestrian Detection based on Deep Learning Sensor Fusion

Abstract
Pedestrian detection plays an important role in autonomous driving, mobile robot navigation, and surveillance, among other applications. One common approach to pedestrian detection is based on bounding boxes, using either a camera, a lidar, or both. However, bounding boxes do not delineate objects according to their shape, creating ambiguity. Other methods, such as semantic segmentation, detect objects at the pixel level. Moreover, sensors can complement each other based on their field of view and other characteristics. Therefore, this paper presents a ROS-based method for fusing three sensors, namely lidar, radar, and an RGB camera, and explores the process of generating data from the three sensors in real time. The method uses a trained network model based on SegNet, a pixel-wise semantic segmentation network, to fuse the sensor data and detect pedestrians. The model was trained using 60 images, with 10 for evaluation and 10 for testing. Although the training set is small, experimental results show a training and testing mean intersection over union of 99.66% and 96.22%, respectively. Despite overfitting, the network model performed well when evaluated on 85 images in pedestrian detection mode, with a mean intersection over union of 0.59. This is a promising result towards pedestrian detection under the modality of deep learning sensor fusion. Additionally, this paper uses a radar-lidar calibration method based on singular value decomposition to align the radar and laser data.

Keywords - Calibration, CNN, ROS, Sensor Fusion.
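As a point of reference for the reported metric, the following is a minimal sketch of how mean intersection over union can be computed for a binary (pedestrian vs. background) segmentation mask. The array shapes, class encoding, and function name are illustrative assumptions, not the authors' implementation.

import numpy as np

def mean_iou(pred, target, num_classes=2):
    # pred and target are (H, W) integer arrays of class IDs,
    # e.g. 0 = background, 1 = pedestrian (assumed encoding).
    ious = []
    for c in range(num_classes):
        pred_c = (pred == c)
        target_c = (target == c)
        intersection = np.logical_and(pred_c, target_c).sum()
        union = np.logical_or(pred_c, target_c).sum()
        if union > 0:  # skip classes absent from both masks
            ious.append(intersection / union)
    return float(np.mean(ious))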
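The radar-lidar calibration mentioned above relies on singular value decomposition; a standard instance of this idea is the Kabsch/Procrustes solution for the rigid transform between two corresponding point sets. The sketch below assumes matched radar-lidar point pairs are already available and is illustrative rather than the paper's exact procedure.

def rigid_transform_svd(radar_pts, lidar_pts):
    # radar_pts, lidar_pts: (N, 3) arrays of corresponding points.
    # Returns rotation R and translation t so that lidar ~= R @ radar + t.
    mu_r = radar_pts.mean(axis=0)
    mu_l = lidar_pts.mean(axis=0)
    H = (radar_pts - mu_r).T @ (lidar_pts - mu_l)  # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:  # guard against a reflection solution
        Vt[-1, :] *= -1
        R = Vt.T @ U.T
    t = mu_l - R @ mu_r
    return R, t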