This repository contains the dataset and implementation for our paper "What is your favorite gender MLM?: Gender Bias Evaluation in Multilingual Masked Language Models." Our work differs from prior work in the following respects.
Strengths
- It provides a multilingual gender lexicon in English, German, Spanish, Portuguese, and Chinese.
- It evaluates the gender bias of language models on any corpus in these five languages.
- Both the evaluation corpus and the language model can be swapped easily, so gender bias can be assessed in other settings.
Multilingual Gender Lexicon
- The Multilingual Gender Lexicon (MGL) for all five languages (English, German, Spanish, Portuguese, and Chinese) is in the "eval_words" folder of the repository.
- Each lexicon is stored as a pickle file, with one file per gender and language.
- These files are required as input when generating the pairs of sentences used to evaluate the gender bias of language models.
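
As a minimal sketch of how such lexicon files might be read, the snippet below loads two pickle files into sets of words. The file names (`male_en.pkl`, `female_en.pkl`), the helper `load_lexicon`, and the assumption that each pickle holds a plain list of words are all hypothetical; check the actual files in "eval_words" for their real names and structure.

```python
import pickle
import tempfile
from pathlib import Path


def load_lexicon(path):
    """Load one gender lexicon from a pickle file and return it as a set of words.

    Assumption: the pickle stores an iterable of strings.
    """
    with open(path, "rb") as handle:
        words = pickle.load(handle)
    return set(words)


# Demo with stand-in files so the sketch runs without the repository.
with tempfile.TemporaryDirectory() as tmp:
    male_path = Path(tmp) / "male_en.pkl"      # hypothetical file name
    female_path = Path(tmp) / "female_en.pkl"  # hypothetical file name
    male_path.write_bytes(pickle.dumps(["he", "father", "king"]))
    female_path.write_bytes(pickle.dumps(["she", "mother", "queen"]))

    male_words = load_lexicon(male_path)
    female_words = load_lexicon(female_path)

print(sorted(male_words))
print(sorted(female_words))
```

Only unpickle files from a source you trust, since loading a pickle can execute arbitrary code.
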
Lexicon-based and Model-based Sentence Extraction
- Given the MGL from the "eval_words" folder, lexicon-based and model-based sentence extraction is performed by "extract.py".
- The evaluation corpus can be changed by modifying the arguments passed to this script.
- The required arguments are the language of the corpus (and model), the male gender lexicon, the female gender lexicon, and the corpus.
- The script tokenizes the corpus, extracts the sentences that contain a gendered word, generates the evaluation sentences, and writes them out in pickle format.
- The sentences can also be generated in a Jupyter Notebook, as shown in "extraction_chn.ipynb".
- The main function of "extract.py" illustrates how this pipeline works.
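
The steps listed above (tokenize, extract, generate, write) can be sketched roughly as follows. This is not the code in "extract.py": the function names, the regex-based tokenization, the rule of keeping only sentences with exactly one lexicon word, and the use of a `[MASK]` placeholder for the generated sentence are all assumptions made for illustration, and the sketch covers only a lexicon-based variant.

```python
import pickle
import re


def split_sentences(corpus):
    """Naive sentence splitter; a stand-in for a real, language-specific tokenizer."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", corpus) if s.strip()]


def extract_gendered(sentences, male_words, female_words):
    """Keep sentences containing exactly one lexicon word and mask that word."""
    lexicon = {w: "male" for w in male_words}
    lexicon.update({w: "female" for w in female_words})
    results = []
    for sentence in sentences:
        tokens = re.findall(r"\w+|[^\w\s]", sentence)
        hits = [i for i, tok in enumerate(tokens) if tok.lower() in lexicon]
        if len(hits) != 1:
            continue  # assumption: skip sentences with zero or several gendered words
        idx = hits[0]
        masked = tokens[:idx] + ["[MASK]"] + tokens[idx + 1:]
        results.append({
            "original": sentence,
            "masked": " ".join(masked),
            "word": tokens[idx].lower(),
            "gender": lexicon[tokens[idx].lower()],
        })
    return results


corpus = "She is a doctor. The weather is nice. My father cooks dinner."
records = extract_gendered(split_sentences(corpus), {"he", "father"}, {"she", "mother"})

# Round-trip through pickle, the output format the README describes.
payload = pickle.dumps(records)
restored = pickle.loads(payload)
for record in restored:
    print(record["gender"], "|", record["masked"])
```

Whitespace and regex tokenization of this kind does not work for Chinese, which needs a dedicated word segmenter; that is one reason a real pipeline takes the language as an argument.
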
Multilingual Bias Evaluation Metrics
- Using the extracted sentences, the Strict Bias Metrics, which quantify the gender bias of language models, can be computed in "MBE_Calculation.ipynb".
- With a corpus of approximately 30,000 sentences per language, evaluation took less than 10 minutes per language.