Quinn Graehling, Jonathan Schierl
Multi-modal data is useful for complex imaging scenarios because each modality carries information unavailable in the others, yet meaningful comparisons of different modalities for object detection are lacking. In our work, we propose three contributions: (1) the release of a multi-modal ground-based small object detection dataset, (2) a performance comparison of 2D and 3D modalities using state-of-the-art algorithms on data captured in the same environment, and (3) a multi-modal fusion framework for 2D/3D sensing. The new dataset encompasses various small objects for detection in EO, IR, and LiDAR modalities, and the labeled data has comparable resolutions across each modality to enable fair performance analysis. The modality comparison conducted in this work uses advanced deep learning algorithms, such as Mask R-CNN for 2D imaging and PointNet++ for 3D imaging. The comparisons are conducted with similar parameter sizes, and the results are analyzed for specific instances where each modality performed best. To exploit the complementary strengths of different data modalities, we developed a fusion strategy that combines the region proposals of one modality with the classification strength of a different modality for accurate detection and region segmentation. We investigated the functionality of the You Only Look Once (YOLO) algorithm, which computes partitioned image classification and region proposals in parallel for detection. Our fusion strategy learns the optimum features of different modality combinations for appropriate candidate selection for classification. The effectiveness of the proposed fusion method is evaluated on the multi-modal dataset for object detection and segmentation, and we observe superior performance compared to single-modality algorithms.
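To illustrate the idea of pairing one modality's region proposals with another modality's classifier, the following is a minimal sketch, not the authors' implementation: it assumes co-registered images, a hypothetical `classify_crop` function standing in for any trained classifier (e.g., one operating on IR), and simple weighted score fusion as the combination rule.

```python
import numpy as np

def fuse_proposals(boxes, proposal_scores, other_image, classify_crop, alpha=0.5):
    """Cross-modal late fusion sketch: region proposals from one modality
    (e.g., EO) are re-scored by a classifier operating on a second,
    co-registered modality (e.g., IR).

    boxes            -- list of (x0, y0, x1, y1) proposal boxes
    proposal_scores  -- objectness scores from the proposing modality
    other_image      -- the second modality's image (same scene geometry)
    classify_crop    -- hypothetical classifier: crop -> (label, confidence)
    alpha            -- weight on the proposal score vs. the classifier score
    """
    fused = []
    for (x0, y0, x1, y1), p in zip(boxes, proposal_scores):
        crop = other_image[y0:y1, x0:x1]            # same box, other modality
        label, conf = classify_crop(crop)
        score = alpha * p + (1 - alpha) * conf      # weighted score fusion
        fused.append(((x0, y0, x1, y1), label, score))
    # return detections sorted by fused confidence
    return sorted(fused, key=lambda d: d[2], reverse=True)

# Toy usage with a dummy classifier (mean-intensity threshold as a placeholder
# for a real network).
def dummy_classifier(crop):
    m = float(crop.mean())
    return ("object" if m > 0.5 else "background", m)

ir = np.zeros((100, 100))
ir[20:40, 20:40] = 1.0                              # one bright target
boxes = [(20, 20, 40, 40), (60, 60, 80, 80)]
detections = fuse_proposals(boxes, [0.9, 0.8], ir, dummy_classifier)
```

In practice the weighting would be learned rather than fixed, which is closer in spirit to the learned candidate-selection described above.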
Theus H. Aspiras, Vijayan K. Asari
Primary Advisor's Department
Electrical and Computer Engineering
Stander Symposium project, School of Engineering
United Nations Sustainable Development Goals
Industry, Innovation, and Infrastructure
"Multi-modal Data Analysis and Fusion for Robust Object Detection in 2D/3D Sensing" (2020). Stander Symposium Projects. 1961.