Vision And Language Navigation On Touchdown

Metrics

Task Completion (TC)

Results

Task Completion (TC) scores reported for models on the Touchdown navigation benchmark. Rows are listed in no particular order.

Model Name	Task Completion (TC)	Paper Title	Repository
Gated Attention (GA)	5.5	Touchdown: Natural Language Navigation and Spatial Reasoning in Visual Street Environments
FLAME	40.20	FLAME: Learning to Navigate with Multimodal LLM in Urban Environments
VLN Transformer	14.9	Multimodal Text Style Transfer for Outdoor Vision-and-Language Navigation
ORAR + junction type + heading delta	29.1	Analyzing Generalization of Vision and Language Navigation to Unseen Outdoor Areas
Gated Attention (GA)	11.9	Multimodal Text Style Transfer for Outdoor Vision-and-Language Navigation
Retouch-RConcat	12.8	Retouchdown: Adding Touchdown to StreetLearn as a Shareable Resource for Language Grounding Tasks in Street View
RConcat	11.8	Multimodal Text Style Transfer for Outdoor Vision-and-Language Navigation
ARC	14.13	Learning to Stop: A Simple yet Effective Approach to Urban Vision-Language Navigation	-
ORAR	24.2	Analyzing Generalization of Vision and Language Navigation to Unseen Outdoor Areas
VLN Transformer +M-50 +style	16.2	Multimodal Text Style Transfer for Outdoor Vision-and-Language Navigation
RConcat	10.7	Touchdown: Natural Language Navigation and Spatial Reasoning in Visual Street Environments
ARC + L2STOP	16.68	Learning to Stop: A Simple yet Effective Approach to Urban Vision-Language Navigation	-
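
Because the rows above are not sorted, a short script can make the ranking easier to read. The following is a minimal sketch that copies the scores from the table and orders them by TC, highest first. The helper name `rank_by_tc` and the shortened paper labels are illustrative choices and are not part of the benchmark or any published code.

```python
# (model name, Task Completion, abbreviated paper title) copied from the table above.
# The paper labels are shortened by hand for readability.
results = [
    ("Gated Attention (GA)", 5.5, "Touchdown"),
    ("FLAME", 40.20, "FLAME"),
    ("VLN Transformer", 14.9, "Multimodal Text Style Transfer"),
    ("ORAR + junction type + heading delta", 29.1, "Analyzing Generalization"),
    ("Gated Attention (GA)", 11.9, "Multimodal Text Style Transfer"),
    ("Retouch-RConcat", 12.8, "Retouchdown"),
    ("RConcat", 11.8, "Multimodal Text Style Transfer"),
    ("ARC", 14.13, "Learning to Stop"),
    ("ORAR", 24.2, "Analyzing Generalization"),
    ("VLN Transformer +M-50 +style", 16.2, "Multimodal Text Style Transfer"),
    ("RConcat", 10.7, "Touchdown"),
    ("ARC + L2STOP", 16.68, "Learning to Stop"),
]


def rank_by_tc(rows):
    """Return the rows sorted by Task Completion, highest score first."""
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    # Print a numbered ranking; the paper label disambiguates repeated model names.
    for rank, (model, tc, paper) in enumerate(rank_by_tc(results), start=1):
        print(f"{rank:>2}  {tc:>6.2f}  {model}  [{paper}]")
```

Model names such as Gated Attention (GA) and RConcat appear twice because different papers report different scores for them, so the paper label is kept alongside each entry rather than merging the rows.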
