Accepted to NeurIPS 2024
Vision-and-Language Navigation (VLN) requires an agent to dynamically explore environments by following natural language instructions. Because VLN agents are closely integrated into daily life, malicious behavior poses a substantial threat to privacy and property, yet this serious issue has long been overlooked. In this paper, we pioneer the study of object-aware backdoored VLN, achieved by implanting object-aware backdoors during the training phase. Tailored to the cross-modal, continuous decision-making nature of VLN, we propose a novel backdoored VLN paradigm, IPR Backdoor, which causes the agent to behave abnormally once it encounters the object trigger during language-guided navigation in unseen environments, thereby executing an attack on the target scene. Experiments demonstrate the effectiveness of our method in both physical and digital spaces across different VLN agents, as well as its robustness to various visual and textual variations. Moreover, our method preserves navigation performance in normal scenarios and remains remarkably stealthy.
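For readers who want the gist before diving into the code, below is a minimal conceptual sketch of training-time backdoor poisoning for an action-prediction policy. It is not the IPR Backdoor implementation: `ToyPolicy`, `TARGET_ACTION`, the poison rate, and all tensor shapes are hypothetical placeholders standing in for the real VLN agent and data loader.

```python
# Conceptual sketch of training-time backdoor poisoning for an action policy.
# NOT the IPR Backdoor implementation; all names and shapes are hypothetical.
import torch
import torch.nn as nn

NUM_ACTIONS = 6          # hypothetical discrete action space
TARGET_ACTION = 0        # attacker-chosen action (e.g. "stop") on trigger views
POISON_RATE = 0.1        # fraction of training samples containing the trigger

class ToyPolicy(nn.Module):
    """Toy cross-modal policy: fuses an instruction feature and a view feature."""
    def __init__(self, dim=512):
        super().__init__()
        self.fuse = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(),
                                  nn.Linear(dim, NUM_ACTIONS))

    def forward(self, txt_feat, img_feat):
        return self.fuse(torch.cat([txt_feat, img_feat], dim=-1))

policy = ToyPolicy()
optim = torch.optim.Adam(policy.parameters(), lr=1e-4)
ce = nn.CrossEntropyLoss()

for step in range(100):                          # stand-in for the real data loader
    txt = torch.randn(8, 512)                    # instruction features (placeholder)
    img = torch.randn(8, 512)                    # view features (placeholder)
    labels = torch.randint(0, NUM_ACTIONS, (8,)) # ground-truth actions

    poisoned = torch.rand(8) < POISON_RATE
    # "Implant" the trigger: in the real method this would be the visual feature
    # of the trigger object in the view; here we just perturb the feature.
    img[poisoned] += 1.0
    labels[poisoned] = TARGET_ACTION             # relabel to the attacker's action

    loss = ce(policy(txt, img), labels)          # clean + poisoned samples together
    optim.zero_grad()
    loss.backward()
    optim.step()
```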
- Release VLN training and evaluation code.
- Release trigger vision encoder pretraining code.
- Release trigger feature extraction code (a hedged sketch follows this list).
- Release model weights and configs.
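As a placeholder until the official code is released, here is a hedged sketch of what offline trigger feature extraction could look like: encoding images of the trigger object with a pretrained ViT from torchvision. The actual release may use a different encoder, checkpoint, and storage format; the file names below are hypothetical.

```python
# Hedged sketch of offline trigger feature extraction with a pretrained ViT.
# The released pipeline may differ; file names here are hypothetical.
import torch
from PIL import Image
from torchvision.models import vit_b_16, ViT_B_16_Weights

weights = ViT_B_16_Weights.IMAGENET1K_V1
model = vit_b_16(weights=weights)
model.heads = torch.nn.Identity()   # drop the classification head, keep the CLS feature
model.eval()
preprocess = weights.transforms()

@torch.no_grad()
def extract_trigger_features(image_paths):
    """Return an (N, 768) tensor of ViT features for the trigger images."""
    batch = torch.stack([preprocess(Image.open(p).convert("RGB"))
                         for p in image_paths])
    return model(batch)

# Example usage (hypothetical paths):
# feats = extract_trigger_features(["triggers/ball_0.png", "triggers/ball_1.png"])
# torch.save(feats, "trigger_feats.pt")
```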
This repo uses the same installation settings as HAMT. The installation details (simulator, environment, annotations, and pretrained models) can be found here.
Install requirements:
conda create --name vlnatt python=3.9
conda activate vlnatt
pip install torch==2.0.0 torchvision==0.15.1 torchaudio==2.0.1
pip install -r requirement.txt
Download the data from Dropbox and put the files into the corresponding project directories.
Training
cd baselines
sh scripts/train_hamt_physical_attack.sh
Evaluation
cd baselines
sh scripts/test_hamt_physical_attack.sh
Our implementation is partially inspired by HAMT.
Thanks for the great work!