'"The Wallpaper is Ugly": Indoor Localization Using Vision and Language'

April 4, 2024

“We study the task of locating a user in a mapped indoor environment using natural language queries and images from the environment. Building on recent pretrained vision-language models, we learn a similarity score between text descriptions and images of locations in the environment. … Our approach is capable of localizing on environments, text, and images that were not seen during training. One model, finetuned CLIP, outperformed humans in our evaluation.”
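The similarity-score idea in the abstract can be illustrated with a minimal sketch. It assumes a vision-language model has already turned the text query and each location's images into embedding vectors. The toy vectors, the `localize` helper, and the choice to score each location by its best-matching image are hypothetical and for illustration only; they are not the paper's finetuning or scoring procedure.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def localize(text_embedding, location_image_embeddings):
    """Score every location against the query and return the best match.

    Assumption: a location's score is the highest similarity between the
    text embedding and any one of that location's image embeddings.
    """
    scores = {
        location: max(cosine_similarity(text_embedding, image) for image in images)
        for location, images in location_image_embeddings.items()
    }
    best_location = max(scores, key=scores.get)
    return best_location, scores


# Made-up 3-dimensional embeddings standing in for real model outputs.
location_image_embeddings = {
    "kitchen": [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]],
    "hallway": [[0.1, 0.9, 0.2]],
    "bedroom": [[0.0, 0.2, 0.9]],
}
query_embedding = [0.1, 0.3, 0.95]  # stands in for an embedded text description

best_location, scores = localize(query_embedding, location_image_embeddings)
print(best_location, scores)
```

In a real system the embeddings would come from the text and image encoders of a pretrained model such as CLIP, and the mapped environment would supply the images for each candidate location.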

The paper and its full list of authors appear in the proceedings of the 32nd IEEE International Conference on Robot and Human Interactive Communication (RO-MAN).

Lawson L.S. Wong

Computer Science
