Researchers have developed an AI system that guides robots using language-based instructions, handling navigation tasks without relying on extensive visual data. The method converts the robot’s visual observations into text captions, which a language model then uses to direct the robot’s movements. Although it does not outperform vision-based systems, it excels when training data is limited, and combining it with visual inputs improves performance.
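
To make the caption-then-decide loop concrete, here is a minimal Python sketch of the idea, not the researchers' implementation. Everything in it is an assumption made for illustration: the names `caption_view`, `build_prompt`, `make_toy_policy`, and `navigate` are hypothetical, the captioner is a template rather than a trained captioning model, the "language model" is a handful of keyword rules, and the four-action set is invented.

```python
from typing import Callable, List, Tuple

# Hypothetical action set; the source does not specify one.
ACTIONS = ("forward", "turn_left", "turn_right", "stop")

View = List[Tuple[str, str]]  # (object name, position phrase) pairs


def caption_view(view: View) -> str:
    """Stand-in for an image-captioning model: describe a view as text."""
    if not view:
        return "You see nothing notable."
    return "You see " + ", ".join(f"a {obj} {pos}" for obj, pos in view) + "."


def build_prompt(instruction: str, history: List[Tuple[str, str]], current: str) -> str:
    """Assemble the text-only input: instruction, past steps, current caption."""
    lines = [f"Instruction: {instruction}"]
    for step, (cap, act) in enumerate(history, start=1):
        lines.append(f"Step {step}: {cap} Action: {act}")
    lines.append(f"Now: {current} Action:")
    return "\n".join(lines)


def make_toy_policy(target: str) -> Callable[[str], str]:
    """Keyword rules standing in for a language model (assumption, not a real LM)."""
    def policy(prompt: str) -> str:
        current = prompt.splitlines()[-1]  # only the latest caption is inspected
        if f"{target} within reach" in current:
            return "stop"
        if f"{target} ahead" in current:
            return "forward"
        if f"{target} to the right" in current:
            return "turn_right"
        return "turn_left"  # default: keep turning to search for the target
    return policy


def navigate(instruction: str, views: List[View], policy: Callable[[str], str]) -> List[str]:
    """Caption each view, ask the policy for an action, stop when told to."""
    history: List[Tuple[str, str]] = []
    actions: List[str] = []
    for view in views:
        cap = caption_view(view)
        action = policy(build_prompt(instruction, history, cap))
        history.append((cap, action))
        actions.append(action)
        if action == "stop":
            break
    return actions


views = [
    [("door", "to the right")],
    [("door", "ahead"), ("chair", "to the left")],
    [("door", "within reach")],
]
actions = navigate("Go to the door.", views, make_toy_policy("door"))
print(actions)
```

In this sketch the policy only ever sees text, so a real language model could be swapped in wherever it accepts a prompt string and returns one of the action names. A combined system of the kind the summary mentions would additionally pass visual features to the policy, which this sketch does not attempt.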