This repository contains the code accompanying our ICLR 2025 paper:
Nikola Jovanović, Robin Staab, Maximilian Baader and Martin Vechev. 2025. Ward: Provable RAG Dataset Inference via LLM Watermarks. In Proceedings of ICLR ’25.
This project uses `uv` for package management. To install `uv`, run:

```shell
curl -LsSf https://astral.sh/uv/install.sh | sh
```
The dependencies will be installed on the first run of Ward.
The models that are queried via API use the following keys, which should be set as environment variables:

- OpenAI models (`gpt3.5`, `gpt4o`): `OPENAI_API_KEY`
- Anthropic models (`claude*`): `ANTHROPIC_API_KEY`
- Models hosted on Together (`llama*` and `qwen*`): `TOGETHER_API_KEY`
The project structure is as follows.

- `main.py` is the main entry point for the code. The default code path runs the RAG-DI evaluation based on the provided config (calling `settingone()` and `settingtwo()` in `driver.py`). It can also be used to invoke baseline attacker training (`train_attackers(cfg, data_dir)` in `driver.py`) and the quality check experiment from Sec. 5.3 (`quality_check(cfg, data_dir, cache_filenames)` in `driver.py`).
- `datagen.py` was used to generate FARAD.
- `src/` contains the rest of the code, namely:
  - `src/attackers` contains implementations of Ward (`watermark_attacker.py`) and the baselines (FACTS, SIB, and AAG).
  - `src/config` contains definitions of our Pydantic configuration files. Refer to `ragwm_config.py` for detailed explanations of each field.
  - `src/models` contains model classes for all our models (on both the RAG provider and data owner side), and the P-SP paraphrasing quality evaluation.
  - `src/utils` contains utility functions for file handling and logging.
  - `src/watermarks` contains watermark implementations to be used by the data owner.
  - `rag_system.py` implements the RAG systems, both the simulated and the end-to-end one (Sec. 5.3), including MemFree decoding (Sec. 5.2).
- `cache/` starts empty and will contain the cached intermediate steps of the methods. For more complex experiments, make sure to clear this.
- `configs/` contains YAML configuration files (corresponding to `src/config/ragwm_config.py`) for our main experiments reported in the paper:
  - `main_experiment.yaml` corresponds to our main results in Sec. 5.1 and 5.2.
  - `memfree.yaml` is used for our robustness experiment in Sec. 5.2.
  - `endtoend*.yaml` are used for our modeling-retrieval experiments in Sec. 5.3.
  - `abl_wm_*.yaml` and `abl_rag_*.yaml` are used for the corresponding parts of our ablations in Sec. 5.4.
- `farad/` holds the FARAD dataset.
Our code can be run with `uv` by providing a path to a YAML configuration file. For example,

```shell
uv run python3 main.py configs/example.yaml
```

will run the FACTS baseline with the Def-P prompt and GPT-3.5 as the RAG model, in the Hard setting. See `main_experiment.yaml` for a full configuration of our main experiments.
Nikola Jovanović, [email protected]
Robin Staab, [email protected]
If you use our code, please cite the following:

```
@inproceedings{jovanovic2025ward,
    author = {Jovanović, Nikola and Staab, Robin and Baader, Maximilian and Vechev, Martin},
    title = {Ward: Provable RAG Dataset Inference via LLM Watermarks},
    booktitle = {ICLR},
    year = {2025}
}
```