diff --git a/.gitignore b/.gitignore
index dbf7443..f832e9d 100644
--- a/.gitignore
+++ b/.gitignore
@@ -36,4 +36,7 @@ target/
 .vscode
 lib/
 *.cmd
-*.ipynb
\ No newline at end of file
+*.ipynb
+.python-version
+mlm_bias.egg-info/
+dist/
\ No newline at end of file
diff --git a/CITATION.cff b/CITATION.cff
new file mode 100644
index 0000000..ae2c7b6
--- /dev/null
+++ b/CITATION.cff
@@ -0,0 +1,22 @@
+cff-version: 1.2.0
+message: If you use this software, please cite both the software and the corresponding paper.
+authors:
+  - family-names: Zalkikar
+    given-names: Rahul
+title: mlm-bias
+version: 0.1.5
+url: https://github.com/zalkikar/mlm-bias
+preferred-citation:
+  authors:
+    - family-names: Zalkikar
+      given-names: Rahul
+    - family-names: Chandra
+      given-names: Kanchan
+  title: Measuring Social Biases in Masked Language Models by Proxy of Prediction Quality
+  year: 2024
+  version: 2.0
+  doi: 10.48550/arXiv.2402.13954
+  url: https://arxiv.org/abs/2402.13954
+  arxiv: 2402.13954
+  eprint: 2402.13954
+  primary-class: cs.CL
diff --git a/README.md b/README.md
index a7ca703..12cdf72 100644
--- a/README.md
+++ b/README.md
@@ -1,65 +1,91 @@
 # Measuring Biases in Masked Language Models for PyTorch Transformers
 
-Evaluate biases in (pre-trained or re-trained) masked language models (MLMs), such as those available thorugh [HuggingFace](https://huggingface.co/models) using multiple state-of-the-art methods to compute a biase score for each bias type in benchmark datasets [CrowS-Pairs (CPS)](https://github.com/nyu-mll/crows-pairs) and [StereoSet (SS)](https://github.com/moinnadeem/StereoSet) (intrasentence), or a custom linebyline dataset with files `bias_types.txt` containing bias categories, and `dis.txt` and `adv.txt` (to create sentence pairs), where `dis.txt` contains sentences with bias against disadvantaged groups (stereotypical) and `adv.txt` contains sentences with bias against advantaged groups (anti-stereotypical). Additionally, compare relative bias between two MLMs (and compare re-trained MLMs with their pre-trained base).
+Evaluate biases in pre-trained or re-trained masked language models (MLMs), such as those available through [HuggingFace](https://huggingface.co/models). This package computes bias scores across various bias types, using benchmark datasets like [CrowS-Pairs (CPS)](https://github.com/nyu-mll/crows-pairs) and [StereoSet (SS)](https://github.com/moinnadeem/StereoSet) (intrasentence), or custom datasets. You can also compare relative bias between two MLMs, or evaluate re-trained MLMs versus their pre-trained base models.
 
 ## Evaluation Methods
 
-Bias scores for an MLM are computed based on implemented measures for sentence pairs in the dataset.
+**Bias scores for an MLM** are computed for sentence pairs in the dataset using measures that represent MLM preference (or prediction quality). Bias against disadvantaged groups for a sentence pair is represented by a higher relative measure value for a sentence in `adv` compared to `dis`.
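+
+For intuition, the following sketch (illustrative only, not this package's internal scoring; names and values are hypothetical) shows how per-sentence measure values for aligned `dis`/`adv` pairs could be aggregated into a dataset-level bias score:
+
+```python
+# Illustrative sketch: report the percentage of sentence pairs for which
+# the measure value (e.g., one of the measures defined below) is higher
+# for the `adv` sentence than for its `dis` counterpart.
+def bias_score(dis_measures, adv_measures):
+    assert len(dis_measures) == len(adv_measures)
+    n_biased = sum(a > d for d, a in zip(dis_measures, adv_measures))
+    return 100.0 * n_biased / len(dis_measures)
+
+print(bias_score([0.41, 0.35, 0.62], [0.52, 0.31, 0.75]))  # ~66.667
+```
+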
-Measures computed using an iterative masking experiment, where an MLM masks one token at a time until all tokens are masked once (so we have `n` logits or predictions for a sentence with `n` tokens) are reported below (see current citation in [Citation](#citation)). As a result these measures take longer to compute. Measures are defined to represent MLM preference (or prediction quality). Bias against disadvantaged groups for a sentence pair is represented by a higher relative measure value for a sentence in `adv` compared to `dis`.
-- `CRR` (Difference in reciprocal rank of a predicted token (always equal to 1) and the reciprocal rank of a masked token
-- `CRRA` (`CRR` with Attention weights)
-- Δ`P` (Difference in log-liklihood of a predicted token and the masked token)
-- Δ`PA` (Δ`P` with Attention weights)
+**Iterative Masking Experiment (IME)**: For each sentence, an MLM masks one token at a time until all tokens have been masked once, generating `n` logits or predictions for a sentence with `n` tokens.
 
-Measures that are computed with a single encoded input (see citations in [References](#references) for more details):
-- `CSPS` (https://arxiv.org/abs/2010.00133; CrowS-Pairs Scores is a log-likelihood score for an MLM selecting unmodified tokens given modified ones
-- `SSS` (https://arxiv.org/abs/2004.09456; StereoSet Score is a log-likelihood score for an MLM selecting modified tokens given unmodified ones
-- `AUL` (https://arxiv.org/abs/2104.07496; All Unmasked Likelihood is a log-likelihood score generated by predicting all tokens in a single unmasked input
-- `AULA`; https://arxiv.org/abs/2104.07496; `AUL` with Attention weights)
+### Measures
+
+We use state-of-the-art measures computed under the **IME**:
+
+- **`CRR`**: Difference between the reciprocal rank of a predicted token (always equal to 1) and the reciprocal rank of a masked token [arXiv](https://arxiv.org/abs/2402.13954)
+- **`CRRA`**: `CRR` with Attention weights [arXiv](https://arxiv.org/abs/2402.13954)
+- **Δ`P`**: Difference between the log-likelihood of a predicted token and that of the masked token [arXiv](https://arxiv.org/abs/2402.13954)
+- **Δ`PA`**: Δ`P` with Attention weights [arXiv](https://arxiv.org/abs/2402.13954)
+
+Measures computed with a single encoded input (see [References](#references) for more details):
+- **`CSPS`**: CrowS-Pairs Score, a log-likelihood score for an MLM selecting unmodified tokens given modified ones [arXiv](https://arxiv.org/abs/2010.00133)
+- **`SSS`**: StereoSet Score, a log-likelihood score for an MLM selecting modified tokens given unmodified ones [arXiv](https://arxiv.org/abs/2004.09456)
+- **`AUL`**: All Unmasked Likelihood, a log-likelihood score generated by predicting all tokens in a single unmasked input [arXiv](https://arxiv.org/abs/2104.07496)
+- **`AULA`**: `AUL` with Attention weights [arXiv](https://arxiv.org/abs/2104.07496)
+
+*Note: Measures computed under the IME take longer to compute.*
 
 ## Setup
 
-`pip install mlm-bias`
+```bash
+pip install mlm-bias
+```
 
 ```python
 import mlm_bias
+
+# Load the CPS dataset
 cps_dataset = mlm_bias.BiasBenchmarkDataset("cps")
 cps_dataset.sample(indices=list(range(10)))
+
+# Specify the model
 model = "bert-base-uncased"
+
+# Initialize the BiasMLM evaluator
 mlm_bias = mlm_bias.BiasMLM(model, cps_dataset)
+
+# Evaluate the model
 result = mlm_bias.evaluate(inc_attention=True)
+
+# Save the results
 result.save("./bert-base-uncased")
 ```
 
 ## Example Script
 
-Clone this repo:
+Clone the repository and install the package:
+
 ```bash
 git clone https://github.com/zalkikar/mlm-bias.git
 cd mlm-bias
 python3 -m pip install .
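+# Optionally, install in editable mode if you plan to modify the source:
+#   python3 -m pip install -e .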
 ```
 
-Using the `mlm_bias.py` example script:
+Run the `mlm_bias.py` example script:
+
 ```bash
 mlm_bias.py [-h] --data {cps,ss,custom} --model MODEL [--model2 MODEL2] [--output OUTPUT] [--measures {all,crr,crra,dp,dpa,aul,aula,csps,sss}] [--start S] [--end E]
 ```
+
+Example arguments:
+
 ```bash
-# single mlm
+# Single MLM
 python3 mlm_bias.py --data cps --model roberta-base --start 0 --end 30
 python3 mlm_bias.py --data ss --model bert-base-uncased --start 0 --end 30
-# relative
+
+# Relative between two MLMs
 python3 mlm_bias.py --data cps --model roberta-base --start 0 --end 30 --model2 bert-base-uncased
 ```
 
-With default arguments:
-- `/data` will have `cps.csv` (CPS) and/or `ss.csv` (SS)
-- `/eval` will have `out.txt` with computed measures and pickled results objects
+Output directories (with default arguments):
+- `/data` contains `cps.csv` (CPS) and/or `ss.csv` (SS).
+- `/eval` contains `out.txt` with computed bias scores and pickled result objects.
+
+### Example Output
 
-Example command output:
 ```console
 Created output directory.
 Created Data Directory
 |██████████████████████████████| 1/1 [100%] in 0s ETA: 0s
 Downloaded Data [CrowSPairs]
 |██████████████████████████████| 1/1 [100%] in 0s ETA: 0s
@@ -79,74 +105,107 @@
 SSS total = 53.333
 CSPS total = 63.33
 ```
 
+## Custom Datasets
+
+Compute bias scores for a custom dataset directory containing the following line-by-line files (an illustrative sketch follows the list):
+
+- `bias_types.txt` containing bias categories.
+- `dis.txt` and `adv.txt` containing sentence pairs, where:
+  - `dis.txt` contains sentences with bias against disadvantaged groups (stereotypical) and
+  - `adv.txt` contains sentences with bias against advantaged groups (anti-stereotypical).
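+
+For example, a minimal custom dataset could be created as follows (the bias type and sentences are illustrative placeholders; pairs are assumed to be aligned line by line across `dis.txt` and `adv.txt`):
+
+```python
+from pathlib import Path
+
+# Write a tiny illustrative custom dataset with one aligned sentence pair.
+custom = Path("custom")
+custom.mkdir(exist_ok=True)
+(custom / "bias_types.txt").write_text("age\n")
+(custom / "dis.txt").write_text("Older people can't learn new technology.\n")
+(custom / "adv.txt").write_text("Younger people can't learn new technology.\n")
+```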
 
 ## Citation
 
 If using this for research, please cite the following:
 
-```
-@misc{zalkikar-chandra-2024,
-    author = {Rahul Zalkikar and Kanchan Chandra},
-    title = {Measuring Social Biases in Masked Language Models by Proxy of Prediction Quality},
-    year = {2024}
+```bibtex
+@misc{zalkikar2024measuringsocialbiasesmasked,
+      title={Measuring Social Biases in Masked Language Models by Proxy of Prediction Quality},
+      author={Rahul Zalkikar and Kanchan Chandra},
+      year={2024},
+      eprint={2402.13954},
+      archivePrefix={arXiv},
+      primaryClass={cs.CL},
+      url={https://arxiv.org/abs/2402.13954}
 }
 ```
 
 ## References
 
-```
-@InProceedings{Kaneko:AUL:2022,
-    author={Masahiro Kaneko and Danushka Bollegala},
-    title={Unmasking the Mask -- Evaluating Social Biases in Masked Language Models},
-    booktitle = {Proceedings of the 36th AAAI Conference on Artificial Intelligence},
-    year = {2022},
-    month = {February},
-    address = {Vancouver, BC, Canada}
+```bibtex
+@article{Kaneko_Bollegala_2022,
+    title={Unmasking the Mask – Evaluating Social Biases in Masked Language Models},
+    volume={36},
+    url={https://ojs.aaai.org/index.php/AAAI/article/view/21453},
+    DOI={10.1609/aaai.v36i11.21453},
+    number={11},
+    journal={Proceedings of the AAAI Conference on Artificial Intelligence},
+    author={Kaneko, Masahiro and Bollegala, Danushka},
+    year={2022},
+    month={Jun.},
+    pages={11954-11962}
 }
 ```
 
-```
-@article{salutari-etal-2023,
-    author = {Flavia Salutari and Jerome Ramos and Hosein A Rahmani and Leonardo Linguaglossa and Aldo Lipani.},
-    title = {Quantifying
-the Bias of Transformer-Based Language Models for African American English in Masked Language
-Modeling.},
-    journal = {The Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD) 2023},
-    month = may,
-    year = {2023},
-    url = {https://telecom-paris.hal.science/hal-04067844},
+```bibtex
+@InProceedings{10.1007/978-3-031-33374-3_42,
+    author="Salutari, Flavia
+    and Ramos, Jerome
+    and Rahmani, Hossein A.
+    and Linguaglossa, Leonardo
+    and Lipani, Aldo",
+    editor="Kashima, Hisashi
+    and Ide, Tsuyoshi
+    and Peng, Wen-Chih",
+    title="Quantifying the Bias of Transformer-Based Language Models for African American English in Masked Language Modeling",
+    booktitle="Advances in Knowledge Discovery and Data Mining",
+    year="2023",
+    publisher="Springer Nature Switzerland",
+    address="Cham",
+    pages="532--543",
+    isbn="978-3-031-33374-3"
 }
 ```
 
-```
+```bibtex
 @inproceedings{nangia-etal-2020-crows,
-    title = "{C}row{S}-Pairs: A Challenge Dataset for Measuring Social Biases in Masked Language Models",
-    author = "Nangia, Nikita and
-      Vania, Clara and
-      Bhalerao, Rasika and
-      Bowman, Samuel R.",
-    editor = "Webber, Bonnie and
-      Cohn, Trevor and
-      He, Yulan and
-      Liu, Yang",
-    booktitle = "Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)",
-    month = nov,
-    year = "2020",
-    address = "Online",
-    publisher = "Association for Computational Linguistics",
-    url = "https://aclanthology.org/2020.emnlp-main.154",
-    doi = "10.18653/v1/2020.emnlp-main.154",
-    pages = "1953--1967",
-    abstract = "Pretrained language models, especially masked language models (MLMs) have seen success across many NLP tasks. However, there is ample evidence that they use the cultural biases that are undoubtedly present in the corpora they are trained on, implicitly creating harm with biased representations. To measure some forms of social bias in language models against protected demographic groups in the US, we introduce the Crowdsourced Stereotype Pairs benchmark (CrowS-Pairs). CrowS-Pairs has 1508 examples that cover stereotypes dealing with nine types of bias, like race, religion, and age. In CrowS-Pairs a model is presented with two sentences: one that is more stereotyping and another that is less stereotyping. The data focuses on stereotypes about historically disadvantaged groups and contrasts them with advantaged groups. We find that all three of the widely-used MLMs we evaluate substantially favor sentences that express stereotypes in every category in CrowS-Pairs. As work on building less biased models advances, this dataset can be used as a benchmark to evaluate progress.",
+    title = "{C}row{S}-Pairs: A Challenge Dataset for Measuring Social Biases in Masked Language Models",
+    author = "Nangia, Nikita and
+      Vania, Clara and
+      Bhalerao, Rasika and
+      Bowman, Samuel R.",
+    editor = "Webber, Bonnie and
+      Cohn, Trevor and
+      He, Yulan and
+      Liu, Yang",
+    booktitle = "Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)",
+    month = nov,
+    year = "2020",
+    address = "Online",
+    publisher = "Association for Computational Linguistics",
+    url = "https://aclanthology.org/2020.emnlp-main.154",
+    doi = "10.18653/v1/2020.emnlp-main.154",
+    pages = "1953--1967"
 }
 ```
 
-```
-@misc{nadeem2020stereoset,
-    title={StereoSet: Measuring stereotypical bias in pretrained language models},
-    author={Moin Nadeem and Anna Bethke and Siva Reddy},
-    year={2020},
-    eprint={2004.09456},
-    archivePrefix={arXiv},
-    primaryClass={cs.CL}
+```bibtex
+@inproceedings{nadeem-etal-2021-stereoset,
+    title = "{S}tereo{S}et: Measuring stereotypical bias in pretrained language models",
+    author = "Nadeem, Moin and
+      Bethke, Anna and
+      Reddy, Siva",
+    editor = "Zong, Chengqing and
+      Xia, Fei and
+      Li, Wenjie and
+      Navigli, Roberto",
+    booktitle = "Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)",
+    month = aug,
+    year = "2021",
+    address = "Online",
+    publisher = "Association for Computational Linguistics",
+    url = "https://aclanthology.org/2021.acl-long.416",
+    doi = "10.18653/v1/2021.acl-long.416",
+    pages = "5356--5371"
 }
 ```
diff --git a/mlm_bias/__version__.py b/mlm_bias/__version__.py
index acf3be3..de49d1f 100644
--- a/mlm_bias/__version__.py
+++ b/mlm_bias/__version__.py
@@ -1 +1 @@
-__version__ = "0.1.3"
\ No newline at end of file
+__version__ = "0.1.5"
\ No newline at end of file
diff --git a/setup.cfg b/setup.cfg
index be9fdab..1156bdb 100644
--- a/setup.cfg
+++ b/setup.cfg
@@ -1,6 +1,6 @@
 [metadata]
 name = mlm-bias
-version = 0.1.4
+version = 0.1.5
 author = Rahul Zalkikar
 author_email = rayzck9@gmail.com
 description = Bias Evaluation Methods for Masked Language Models implemented in PyTorch
diff --git a/setup.py b/setup.py
index 7063d19..6d84086 100644
--- a/setup.py
+++ b/setup.py
@@ -5,7 +5,7 @@ setup(
     name='mlm-bias',
-    version='0.1.4',
+    version='0.1.5',
    author='Rahul Zalkikar',
    author_email='rayzck9@gmail.com',
    description='Bias Evaluation Methods for Masked Language Models implemented in PyTorch',