Multimodal Large Model-Driven Precise Perception in Complex Low-Altitude UAV Environments: A Survey on Adaptive Edge Intelligence and Swarm Collaboration

Authors

  • Chenchen He
  • Dongran Sun
  • Xuexue Zhang

DOI:

https://doi.org/10.62306/d54q8p10

Keywords:

Multimodal large models, UAV perception, low-altitude environments, federated learning, self-evolving adaptation

Abstract

Unmanned aerial vehicles (UAVs) equipped with long-endurance remote sensing capabilities have revolutionized applications in economic development, national defense, emergency response, and disaster monitoring. However, traditional centralized processing paradigms suffer from high latency, resource inefficiency, and poor adaptability to dynamic low-altitude environments. This survey reviews advances in multimodal large models (MLMs) for precise detection and perception in complex low-altitude scenarios, emphasizing three core challenges: enhancing intelligent terminal perception through adaptive learning, strengthening multi-UAV collaborative coverage through federated evolution, and achieving high-fidelity 3D perception via multimodal fusion. We synthesize recent developments in self-evolving online learning frameworks, asynchronous distributed federated optimization, and Transformer-based MLMs tailored to heterogeneous sensor data (e.g., LiDAR and multi-view cameras). Key contributions include a taxonomy of adaptive algorithms that mitigate catastrophic forgetting and data heterogeneity, together with benchmarks for edge deployment on resource-constrained UAV systems. By highlighting gaps in unsupervised multimodal alignment and real-time scalability, this work outlines future directions toward autonomous, resilient UAV swarms and fosters innovation in edge intelligence for spatial information technologies.
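To make the abstract's reference to asynchronous distributed federated optimization concrete, the sketch below shows one generic pattern from that literature: a server that merges client updates as they arrive, down-weighting stale ones (in the spirit of FedAsync-style mixing). Everything here, the AsyncFedServer class, the polynomial staleness decay, and the toy linear-regression clients standing in for UAVs with non-IID data, is an illustrative assumption, not a specific algorithm from the survey.

import numpy as np

def staleness_weight(staleness, base_lr=0.5):
    # Polynomial decay: an update computed against a model that is
    # `staleness` merges old contributes less to the global model.
    return base_lr / (1.0 + staleness)

class AsyncFedServer:
    """Global model that merges client updates as they arrive (hypothetical)."""

    def __init__(self, dim):
        self.weights = np.zeros(dim)
        self.version = 0  # incremented on every merge

    def pull(self):
        # A client fetches the current model plus its version stamp.
        return self.weights.copy(), self.version

    def push(self, client_weights, client_version):
        # Merge a possibly stale client model; no synchronization barrier.
        staleness = self.version - client_version
        alpha = staleness_weight(staleness)
        self.weights = (1 - alpha) * self.weights + alpha * client_weights
        self.version += 1

def local_sgd(weights, X, y, lr=0.1, epochs=20):
    # Plain least-squares SGD standing in for on-UAV local training.
    w = weights.copy()
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

# Toy run: three "UAVs" with mean-shifted features (a simple stand-in for
# data heterogeneity) all pull the same model, then push out of order,
# so later pushes arrive with higher staleness.
rng = np.random.default_rng(0)
true_w = np.array([1.0, -2.0, 0.5])
server = AsyncFedServer(dim=3)

updates = []
for shift in (0.0, 1.0, -1.0):  # per-UAV feature shift = non-IID data
    X = rng.normal(loc=shift, scale=1.0, size=(64, 3))
    y = X @ true_w + 0.01 * rng.normal(size=64)
    w0, v0 = server.pull()
    updates.append((local_sgd(w0, X, y), v0))

for w_new, v0 in updates:  # merged sequentially; staleness grows 0, 1, 2
    server.push(w_new, v0)

print("global model after 3 asynchronous merges:", server.weights.round(2))

The version stamp is what lets the server quantify staleness without blocking any client, which is the property that matters over intermittent low-altitude links.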

Published

2025-12-20

Section

Articles