Radar Tensor-Guided Multi-Model Fusion Framework with Monocular Image for Three-Dimensional Object Detection
Date of Award
5-9-2026
Degree Name
M.S. in Electrical Engineering
Department
Department of Electrical and Computer Engineering
Advisor/Chair
Vijayan Asari
Abstract
Robust object detection in outdoor environments, particularly for autonomous driving, requires reliable performance under diverse and adverse weather conditions. Millimeter-wave (mmWave) radar offers a significant advantage in such scenarios because it is resilient to weather-related degradation. In this work, we propose a transformer-based 3D object detection framework that integrates radar and camera modalities through a novel cross-modal fusion architecture. Specifically, semantic information derived from the 4D radar tensor guides image feature learning, promoting effective cross-modal integration and enhancing detection performance. To extract and fuse information from both modalities, we employ a deformable attention mechanism with a normalized spatial grid, enabling multi-scale feature aggregation based on relative spatial locations. This strategy makes better use of radar’s inherent robustness and improves detection reliability in challenging outdoor environments. Experimental results demonstrate consistent performance gains over the baseline and existing multi-modal approaches, with improved detection accuracy in both bird’s-eye-view (BEV) and 3D detection tasks.
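The abstract's idea of sampling multi-scale features at relative locations on a normalized spatial grid can be illustrated with a minimal sketch. This is not the thesis implementation: it is a generic, single-channel illustration in the general style of deformable attention, written in plain Python. All names (`bilinear_sample`, `deformable_aggregate`) are hypothetical, and the offsets and scores are passed in by hand, whereas a real model would predict them from a query feature.

```python
import math


def bilinear_sample(feat, x_norm, y_norm):
    """Bilinearly sample a 2D map (list of rows) at normalized coords in [0, 1]."""
    h, w = len(feat), len(feat[0])
    # Clamp so offsets that push a point outside the map remain valid (assumption).
    x = min(max(x_norm, 0.0), 1.0) * (w - 1)
    y = min(max(y_norm, 0.0), 1.0) * (h - 1)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    dx, dy = x - x0, y - y0
    top = feat[y0][x0] * (1 - dx) + feat[y0][x1] * dx
    bottom = feat[y1][x0] * (1 - dx) + feat[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


def softmax(scores):
    """Numerically stable softmax over a flat list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def deformable_aggregate(feature_maps, ref_point, offsets, scores):
    """Weighted sum of samples taken around one normalized reference point.

    feature_maps: list of 2D maps, one per scale (resolutions may differ).
    ref_point:    (x, y) in [0, 1], shared by every scale.
    offsets:      offsets[l] is a list of (dx, dy) in normalized units.
    scores:       scores[l] is a list of raw scores, one per offset.
    """
    # Attention weights are normalized jointly over all scales and points.
    weights = softmax([s for level in scores for s in level])
    out, i = 0.0, 0
    for feat, level_offsets in zip(feature_maps, offsets):
        for dx, dy in level_offsets:
            out += weights[i] * bilinear_sample(
                feat, ref_point[0] + dx, ref_point[1] + dy
            )
            i += 1
    return out
```

Because the reference point and offsets are expressed in normalized coordinates, the same relative location can be looked up in feature maps of different resolutions, which is what makes aggregation across scales straightforward in this sketch.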
Keywords
Electrical Engineering
Rights Statement
Copyright 2026, author.
Recommended Citation
Tsai, Cheng-Jui, "Radar Tensor-Guided Multi-Model Fusion Framework with Monocular Image for Three-Dimensional Object Detection" (2026). Graduate Theses and Dissertations. 7671.
https://ecommons.udayton.edu/graduate_theses/7671

Comments
OCLC No. 1591830107