VL-Nav is an advanced robotic navigation framework that combines neural reasoning (the "brain") with symbolic guidance (the "logic") to help robots navigate complex environments. Unlike traditional methods that can lead to aimless wandering, VL-Nav uses a NeSy (Neuro-Symbolic) Task Planner and an Exploration System to understand abstract human instructions. You can find the full technical details on arXiv: VL-Nav.
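To make the "brain plus logic" split concrete, here is a minimal sketch of how a neuro-symbolic planning loop can work: a neural step proposes subgoals from an abstract instruction, and a symbolic step filters them before the robot commits. All names here (Subgoal, propose_subgoals, SymbolicValidator) are hypothetical illustrations under assumed structure, not VL-Nav's actual code or API.

```python
from dataclasses import dataclass

# Hypothetical sketch of a neuro-symbolic planning loop: a neural
# component (the "brain") proposes candidate subgoals from an abstract
# instruction, and a symbolic component (the "logic") filters them
# against known constraints before the robot commits to one.

@dataclass
class Subgoal:
    description: str   # e.g. "go to the red door"
    target: str        # symbolic label the validator can check
    confidence: float  # neural model's score for this proposal


def propose_subgoals(instruction: str) -> list[Subgoal]:
    """Stand-in for a VLM call that decomposes an abstract instruction
    into candidate subgoals. A real system would query a vision-language
    model here; this stub just returns fixed examples."""
    return [
        Subgoal("approach the nearest doorway", "door", 0.9),
        Subgoal("walk toward the window", "window", 0.4),
    ]


class SymbolicValidator:
    """The 'logic' side: rejects subgoals whose targets are not present
    in the robot's current world model."""

    def __init__(self, known_objects: set[str]):
        self.known_objects = known_objects

    def is_feasible(self, subgoal: Subgoal) -> bool:
        return subgoal.target in self.known_objects


def plan(instruction: str, validator: SymbolicValidator) -> Subgoal | None:
    # Keep only proposals the symbolic layer accepts, then take the
    # highest-confidence survivor as the next navigation target.
    candidates = [s for s in propose_subgoals(instruction)
                  if validator.is_feasible(s)]
    return max(candidates, key=lambda s: s.confidence, default=None)


if __name__ == "__main__":
    validator = SymbolicValidator(known_objects={"door", "hallway"})
    print(plan("find the exit", validator))  # -> the "door" subgoal
```

The design point is simply that the symbolic filter vetoes neural proposals it cannot ground in the world model, which is what prevents the aimless wandering described above.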
Useful Text Blocks

1. The "Problem & Solution" Pitch (Good for Intros)

"In rigorous testing, VL-Nav achieved a 75–83% success rate across indoor and outdoor settings. In real-world deployments, it maintained an 86.3% success rate, demonstrating reliability over long-range trajectories of up to 483 meters."

Resources for Further Development

For related open-source frameworks, check repositories like oobvlm on GitHub; these leverage a 3D scene graph and image memory to help Vision Language Models (VLMs) replan tasks in real time.