Visual-Language-Guided Task Planning for Horticultural Robots
Developed a modular crop-monitoring framework that uses Visual Language Models (VLMs) for robotic task planning, and created a benchmark to evaluate planning performance in monoculture and polyculture agricultural environments. The framework performed well on short-horizon tasks but degraded significantly on complex long-horizon tasks, highlighting current limitations of VLMs in handling noisy semantic maps and maintaining reliable context over sustained robotic operations.