SNLI-Attack-Analysis

Setup for starters_code

Clone this repo and switch to the starters_code directory

git clone https://github.com/ckvermaAI/NLP_project.git
cd NLP_project/starters_code

Install the requirements

pip install -r requirements.txt

Run the model

# Training (checkpoints will be saved under output_dir)
python3 run.py --do_train --task nli --dataset snli --output_dir ./trained_model/

# Evaluation
python3 run.py --do_eval --task nli --dataset snli --model ./trained_model/ --output_dir ./eval_output/

Setup for universal_triggers

Clone the AllenNLP fork and install it (required by universal_triggers)

git clone https://github.com/ckvermaAI/allennlp-fork.git
pip install allennlp-fork/

Run the script

python universal_triggers/triggers.py

Generating the triggers

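# Use the hotflip attack to generate universal triggers.
# Run these from the directory that contains triggers.py; the hotflip/ log
# directory must already exist, since tee does not create it.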

# 1) Attack on entailment class, to flip the label to contradiction
python triggers.py --label_filter entailment --target_label 1 2>&1 | tee hotflip/entailment-contradiction.log
# 2) Attack on entailment class, to flip the label to neutral
python triggers.py --label_filter entailment --target_label 2 2>&1 | tee hotflip/entailment-neutral.log

# 3) Attack on contradiction class, to flip the label to entailment
python triggers.py --label_filter contradiction --target_label 0 2>&1 | tee hotflip/contradiction-entailment.log
# 4) Attack on contradiction class, to flip the label to neutral
python triggers.py --label_filter contradiction --target_label 2 2>&1 | tee hotflip/contradiction-neutral.log

# 5) Attack on neutral class, to flip the label to entailment
python triggers.py --label_filter neutral --target_label 0 2>&1 | tee hotflip/neutral-entailment.log
# 6) Attack on neutral class, to flip the label to contradiction
python triggers.py --label_filter neutral --target_label 1 2>&1 | tee hotflip/neutral-contradiction.log
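
The six commands above cover every ordered pair of distinct labels, with the target given as a numeric id (entailment = 0, contradiction = 1, neutral = 2, as read off the commands themselves). As a convenience, the minimal sketch below rebuilds the same command lines programmatically. The helper name `build_attack_commands` and its `log_dir` parameter are hypothetical and not part of this repo; only the `--label_filter` and `--target_label` flags and the log naming come from the commands above.

```python
from itertools import permutations

# Label ids inferred from the commands above (an assumption about triggers.py):
# entailment = 0, contradiction = 1, neutral = 2.
LABEL_IDS = {"entailment": 0, "contradiction": 1, "neutral": 2}


def build_attack_commands(log_dir="hotflip"):
    """Return one shell command per ordered (source, target) label pair.

    Hypothetical helper: it only formats strings and does not run anything.
    """
    commands = []
    # permutations() over the dict keys yields the six ordered pairs
    # in the same order as the numbered commands above.
    for source, target in permutations(LABEL_IDS, 2):
        log_path = f"{log_dir}/{source}-{target}.log"
        commands.append(
            f"python triggers.py --label_filter {source} "
            f"--target_label {LABEL_IDS[target]} 2>&1 | tee {log_path}"
        )
    return commands


if __name__ == "__main__":
    for command in build_attack_commands():
        print(command)
```

Printing the commands rather than executing them keeps the sketch side-effect free; the output can be pasted into a shell or piped into one after checking it.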

Creating the dataset with generated triggers

Use ./universal_triggers/build_dataset.py to generate the dataset.
Update the inputs to the create_dataset function as required, then run the script.
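
The internals of `create_dataset` are not shown in this README, so the following is only an illustrative sketch of the general idea behind a triggered dataset, not the repo's implementation. It assumes examples are dicts with `premise`, `hypothesis`, and `label` fields and that the trigger is prepended to the hypothesis, as in the original universal-triggers SNLI setup. The function names, field names, and the placeholder trigger token are all hypothetical.

```python
def prepend_trigger(example, trigger_tokens):
    """Return a copy of `example` with the trigger prepended to its hypothesis.

    Hypothetical helper; the field name "hypothesis" is an assumption.
    """
    trigger = " ".join(trigger_tokens)
    return {**example, "hypothesis": f"{trigger} {example['hypothesis']}"}


def apply_trigger(examples, trigger_tokens, label_filter=None):
    """Apply the trigger to every example whose label matches `label_filter`.

    With label_filter=None, all examples are modified. Examples that do not
    match are dropped, mirroring an attack that targets a single class.
    """
    return [
        prepend_trigger(example, trigger_tokens)
        for example in examples
        if label_filter is None or example["label"] == label_filter
    ]


if __name__ == "__main__":
    # Toy data for illustration only; "TRIGGER" is a placeholder token.
    data = [
        {"premise": "A man is running.", "hypothesis": "A person moves.", "label": "entailment"},
        {"premise": "A dog sleeps.", "hypothesis": "The dog is awake.", "label": "contradiction"},
    ]
    for row in apply_trigger(data, ["TRIGGER"], label_filter="entailment"):
        print(row)
```

Returning new dicts instead of mutating the inputs leaves the clean examples available for a before/after comparison.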

Source code

Starters code

Pulled from https://github.com/gregdurrett/fp-dataset-artifacts

Universal triggers

Pulled from https://github.com/Eric-Wallace/universal-triggers/
