Quinn Graehling, Jonathan Schierl





Multi-modal data is useful for complex imaging scenarios because each modality contains information exclusive to it, but meaningful comparisons of different modalities for object detection are lacking. In our work, we propose three contributions: (1) the release of a multi-modal ground-based small object detection dataset, (2) a performance comparison of 2D and 3D modalities using state-of-the-art algorithms on data captured in the same environment, and (3) a multi-modal fusion framework for 2D/3D sensing. The new dataset encompasses various small objects for detection in EO, IR, and LiDAR modalities. The labeled data has comparable resolutions across each modality for better performance analysis. The modality comparison conducted in this work uses advanced deep learning algorithms, such as Mask R-CNN for 2D imaging and PointNet++ for 3D imaging. The comparisons are conducted with similar parameter sizes, and the results are analyzed for specific instances where each modality performs best. To complement the effectiveness of different data modalities, we developed a fusion strategy that combines the region proposals of one modality with the classification strength of a different modality for accurate detection and region segmentation. We investigated the functionality of the You Only Look Once (YOLO) algorithm, which computes partitioned image classification and region proposals in parallel for detection. Our fusion strategy learns the optimum features of different modality combinations for appropriate candidate selection for classification. The effectiveness of the proposed fusion method is evaluated on the multi-modal dataset for object detection and segmentation, and we observe superior performance compared to single-modality algorithms.
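The late-fusion idea described above, taking region proposals from one modality and classifying them with a model trained on another, can be sketched in outline as follows. This is a minimal illustrative sketch: all function names, boxes, labels, and scores are hypothetical placeholders, not the authors' actual Mask R-CNN / PointNet++ / YOLO pipeline.

```python
# Hypothetical late-fusion sketch: proposals from one modality (e.g. LiDAR),
# classification from another (e.g. EO imagery). Placeholder logic throughout.

def propose_regions(lidar_points):
    """Placeholder proposal stage: returns candidate boxes as (x, y, w, h).

    In a real pipeline these would come from a region-proposal network
    or a 3D detector operating on the point cloud.
    """
    return [(10, 10, 32, 32), (50, 40, 16, 16)]

def classify_crop(eo_image, box):
    """Placeholder classifier: returns (label, confidence) for one crop.

    A real system would crop the EO image at `box` and run a trained
    2D classifier; here the score is a stand-in based on region area.
    """
    x, y, w, h = box
    area = w * h
    label = "vehicle" if area > 512 else "pedestrian"
    return label, min(1.0, area / 1024)

def fuse_detect(lidar_points, eo_image, threshold=0.25):
    """Late fusion: propose in one modality, classify in the other,
    and keep candidates whose classification confidence clears a threshold."""
    detections = []
    for box in propose_regions(lidar_points):
        label, conf = classify_crop(eo_image, box)
        if conf >= threshold:
            detections.append((box, label, conf))
    return detections

detections = fuse_detect(lidar_points=None, eo_image=None)
```

The design choice this illustrates is that proposal quality and classification quality need not come from the same sensor: a geometrically strong modality can nominate candidates while a texture-rich modality scores them.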

Project Designation

Graduate Research

Primary Advisor

Theus H. Aspiras, Vijayan K. Asari

Primary Advisor's Department

Electrical and Computer Engineering


Stander Symposium project, School of Engineering

United Nations Sustainable Development Goals

Industry, Innovation, and Infrastructure

Multi-modal Data Analysis and Fusion for Robust Object Detection in 2D/3D Sensing