Maritime Multi-modal Multi-view Bird Eye View Scene Segmentation


Abstract

In this paper, we address the challenge of enabling accurate and robust perception in marine autonomous systems for unmanned maritime operations. Our approach integrates data from multiple sensors, including cameras and radars, to overcome the limitations of traditional sensor fusion methods. We propose a novel cross-attention, transformer-based multi-modal sensor fusion technique tailored to marine navigation. The method leverages deep learning to fuse complex data modalities effectively and reconstructs a comprehensive bird’s-eye view of the environment from multi-view RGB and LWIR images. Our experimental results demonstrate the method’s effectiveness across a range of challenging scenarios, contributing to the development of more advanced and reliable marine autonomous systems. The approach exploits multi-modal data, incorporates temporal fusion, and remains robust to sensor-calibration errors, marking a notable advance in autonomous maritime technology.
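The abstract describes cross-attention between a bird’s-eye-view representation and multi-view RGB and LWIR image features. The sketch below illustrates one way such a fusion step could look; it is a minimal illustration, not the paper’s implementation (the work is under review), and the module names, tensor shapes, and hyperparameters are assumptions.

```python
# Minimal sketch (illustrative, not the authors' code): learnable BEV queries
# cross-attend to flattened multi-view RGB and LWIR features. All shapes and
# hyperparameters below are assumptions.
import torch
import torch.nn as nn


class BEVCrossAttentionFusion(nn.Module):
    """Fuse per-view RGB and LWIR features into a bird's-eye-view grid."""

    def __init__(self, embed_dim: int = 256, num_heads: int = 8,
                 bev_h: int = 100, bev_w: int = 100):
        super().__init__()
        # One learnable query per BEV grid cell.
        self.bev_queries = nn.Parameter(torch.randn(bev_h * bev_w, embed_dim))
        # Per-modality projections before the shared attention step.
        self.rgb_proj = nn.Linear(embed_dim, embed_dim)
        self.lwir_proj = nn.Linear(embed_dim, embed_dim)
        # Cross-attention: BEV queries attend to image tokens from all views.
        self.cross_attn = nn.MultiheadAttention(embed_dim, num_heads,
                                                batch_first=True)
        self.norm = nn.LayerNorm(embed_dim)

    def forward(self, rgb_feats: torch.Tensor, lwir_feats: torch.Tensor):
        # rgb_feats, lwir_feats: (batch, n_views * n_tokens, embed_dim),
        # i.e. flattened outputs of some per-modality backbone (assumed).
        keys = torch.cat([self.rgb_proj(rgb_feats),
                          self.lwir_proj(lwir_feats)], dim=1)
        batch = rgb_feats.shape[0]
        queries = self.bev_queries.unsqueeze(0).expand(batch, -1, -1)
        fused, _ = self.cross_attn(queries, keys, keys)
        # (batch, bev_h * bev_w, embed_dim): ready for a segmentation head.
        return self.norm(fused + queries)


if __name__ == "__main__":
    model = BEVCrossAttentionFusion()
    rgb = torch.randn(2, 4 * 64, 256)   # e.g. 4 camera views, 64 tokens each
    lwir = torch.randn(2, 4 * 64, 256)  # matching LWIR views
    print(model(rgb, lwir).shape)       # torch.Size([2, 10000, 256])
```

A segmentation head over the fused BEV grid, and any temporal fusion across frames, would sit on top of a module like this.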

Publication
Under review
Dimitrios Dagdilelis
ML/AI Engineer

I’m an AI Engineer who loves building intelligent systems that solve real-world problems and push the boundaries of what’s possible. From optimizing processes to creating scalable machine learning models, I thrive at the intersection of data, innovation, and impact.