Merge pull request #874 from TransformerLensOrg/dev
Release v2.15.0
bryce13950 authored Feb 20, 2025
2 parents 5e328e9 + 555f355 commit e65fafb
Showing 17 changed files with 1,091 additions and 149 deletions.
147 changes: 120 additions & 27 deletions demos/BERT.ipynb
@@ -15,7 +15,7 @@
"metadata": {},
"source": [
"# BERT in TransformerLens\n",
"This demo shows how to use BERT in TransformerLens for the Masked Language Modelling task."
"This demo shows how to use BERT in TransformerLens for the Masked Language Modelling and Next Sentence Prediction tasks."
]
},
{
@@ -29,16 +29,14 @@
},
{
"cell_type": "code",
"execution_count": 15,
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Running as a Jupyter notebook - intended for development only!\n",
"The autoreload extension is already loaded. To reload it, use:\n",
" %reload_ext autoreload\n"
"Running as a Jupyter notebook - intended for development only!\n"
]
},
{
@@ -92,7 +90,7 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 2,
"metadata": {},
"outputs": [
{
@@ -116,7 +114,7 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 3,
"metadata": {},
"outputs": [
{
@@ -136,7 +134,7 @@
"<circuitsvis.utils.render.RenderedHTML at 0x13a9760d0>"
]
},
"execution_count": 4,
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
@@ -150,7 +148,7 @@
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
@@ -159,12 +157,12 @@
"\n",
"from transformers import AutoTokenizer\n",
"\n",
"from transformer_lens import HookedEncoder"
"from transformer_lens import HookedEncoder, BertNextSentencePrediction"
]
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": 5,
"metadata": {},
"outputs": [
{
@@ -173,7 +171,7 @@
"<torch.autograd.grad_mode.set_grad_enabled at 0x2a285a790>"
]
},
"execution_count": 6,
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
@@ -189,12 +187,12 @@
"source": [
"# BERT\n",
"\n",
"In this section, we will load a pretrained BERT model and use it for the Masked Language Modelling task"
"In this section, we will load a pretrained BERT model and use it for the Masked Language Modelling and Next Sentence Prediction tasks"
]
},
{
"cell_type": "code",
"execution_count": 14,
"execution_count": 6,
"metadata": {},
"outputs": [
{
@@ -225,37 +223,132 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Use the \"[MASK]\" token to mask any tokens which you would like the model to predict."
"## Masked Language Modelling\n",
"Use the \"[MASK]\" token to mask any tokens which you would like the model to predict. \n",
"When return_type=\"predictions\" is specified, the model's decoded prediction is returned; by default, the function returns logits. \n",
"You can also pass return_type=None, in which case nothing is returned."
]
},
{
"cell_type": "code",
"execution_count": 11,
"execution_count": 7,
"metadata": {},
"outputs": [],
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Prompt: The [MASK] is bright today.\n",
"Prediction: \"sun\"\n"
]
}
],
"source": [
"prompt = \"The [MASK] is bright today.\"\n",
"\n",
"prediction = bert(prompt, return_type=\"predictions\")\n",
"\n",
"print(f\"Prompt: {prompt}\")\n",
"print(f'Prediction: \"{prediction}\"')"
]
},
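The return_type convention described above can be sketched as a small dispatcher. This is a toy illustration, not the TransformerLens implementation: the vocabulary, the logit values, and the `forward` helper are all invented for the example.

```python
# Toy illustration of the return_type convention -- NOT the TransformerLens
# implementation; the vocabulary, logits, and helper are invented.
VOCAB = ["sun", "moon", "rain", "went", "caught"]

def forward(logits, return_type="logits"):
    if return_type is None:
        return None                       # nothing is returned
    if return_type == "logits":
        return logits                     # raw scores (the default)
    if return_type == "predictions":
        best = max(range(len(logits)), key=logits.__getitem__)
        return VOCAB[best]                # decoded top token
    raise ValueError(f"unknown return_type: {return_type}")

logits = [4.1, 1.2, 0.3, -0.5, -1.0]
print(forward(logits))                                  # prints the raw logits
print(forward(logits, return_type="predictions"))       # prints: sun
print(forward(logits, return_type=None))                # prints: None
```

The three branches mirror the three behaviours the notebook describes: logits by default, a decoded string for "predictions", and nothing for None.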
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can also input a list of prompts:"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Prompt: ['The [MASK] is bright today.', 'She [MASK] to the store.', 'The dog [MASK] the ball.']\n",
"Prediction: \"['Prediction 0: sun', 'Prediction 1: went', 'Prediction 2: caught']\"\n"
]
}
],
"source": [
"prompts = [\"The [MASK] is bright today.\", \"She [MASK] to the store.\", \"The dog [MASK] the ball.\"]\n",
"\n",
"predictions = bert(prompts, return_type=\"predictions\")\n",
"\n",
"print(f\"Prompt: {prompts}\")\n",
"print(f'Prediction: \"{predictions}\"')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Next Sentence Prediction\n",
"To carry out Next Sentence Prediction, use the class BertNextSentencePrediction, passing a HookedEncoder to its constructor. \n",
"Then, create a list whose two elements are the sentences you want to perform NSP on, and use it as input to the forward function. \n",
"The model will then predict the probability that the sentence at position 1 follows (i.e. is the next sentence after) the sentence at position 0."
]
},
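Since the NSP head is a two-class classifier, turning its logits into a readable prediction reduces to a softmax plus an argmax. A minimal pure-Python sketch: the logit values, the label strings, and the assumption that index 0 means "is the next sentence" are illustrative, not taken from the library.

```python
import math

# Hedged sketch of reading off an NSP prediction from two logits.
# Index 0 is assumed to mean "is the next sentence" (an assumption,
# not a fact about the library); logits and labels are made up.
def nsp_prediction(logits):
    exps = [math.exp(x) for x in logits]
    probs = [e / sum(exps) for e in exps]          # softmax over two classes
    labels = ["The sentences are sequential", "The sentences are not sequential"]
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]

label, prob = nsp_prediction([3.0, -1.0])
print(label)              # prints: The sentences are sequential
print(f"{prob:.3f}")      # prints: 0.982
```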
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Sentence A: A man walked into a grocery store.\n",
"Sentence B: He bought an apple.\n",
"Prediction: \"The sentences are sequential\"\n"
]
}
],
"source": [
"prompt = \"BERT: Pre-training of Deep Bidirectional [MASK] for Language Understanding\"\n",
"nsp = BertNextSentencePrediction(bert)\n",
"sentence_a = \"A man walked into a grocery store.\"\n",
"sentence_b = \"He bought an apple.\"\n",
"\n",
"input_ids = tokenizer(prompt, return_tensors=\"pt\")[\"input_ids\"]\n",
"mask_index = (input_ids.squeeze() == tokenizer.mask_token_id).nonzero().item()"
"input = [sentence_a, sentence_b]\n",
"\n",
"predictions = nsp(input, return_type=\"predictions\")\n",
"\n",
"print(f\"Sentence A: {sentence_a}\")\n",
"print(f\"Sentence B: {sentence_b}\")\n",
"print(f'Prediction: \"{predictions}\"')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Inputting tokens directly\n",
"You can also pass tokens to the model instead of a string or a list of strings, which could look something like this:"
]
},
{
"cell_type": "code",
"execution_count": 12,
"execution_count": 10,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Prompt: BERT: Pre-training of Deep Bidirectional [MASK] for Language Understanding\n",
"Prediction: \"Systems\"\n"
"Prompt: The [MASK] is bright today.\n",
"Prediction: \"sun\"\n"
]
}
],
"source": [
"logprobs = bert(input_ids)[input_ids == tokenizer.mask_token_id].log_softmax(dim=-1)\n",
"prompt = \"The [MASK] is bright today.\"\n",
"\n",
"tokens = tokenizer(prompt, return_tensors=\"pt\")[\"input_ids\"]\n",
"logits = bert(tokens) # Since we are not specifying return_type, we get the logits\n",
"logprobs = logits[tokens == tokenizer.mask_token_id].log_softmax(dim=-1)\n",
"prediction = tokenizer.decode(logprobs.argmax(dim=-1).item())\n",
"\n",
"print(f\"Prompt: {prompt}\")\n",
@@ -267,13 +360,13 @@
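The tensor operations in the cell above (locate the [MASK] position, take a log-softmax over the logits at that position, decode the argmax) can be sketched in pure Python. The ids, the logits, and the tiny vocabulary here are invented stand-ins for the real tokenizer and model.

```python
import math

# Pure-Python sketch of the mask-prediction pipeline: the token ids,
# the logits, and the three-word vocabulary are invented stand-ins.
MASK_ID = 103                                 # stand-in for tokenizer.mask_token_id
id_to_token = {0: "sun", 1: "moon", 2: "rain"}

tokens = [101, 1996, 103, 2003, 102]          # hypothetical input ids
mask_logits = [2.5, 0.1, -1.0]                # model logits at the mask position

mask_index = tokens.index(MASK_ID)            # position of [MASK] in the input
log_z = math.log(sum(math.exp(x) for x in mask_logits))
logprobs = [x - log_z for x in mask_logits]   # log_softmax over the toy vocab
prediction = id_to_token[max(range(len(logprobs)), key=logprobs.__getitem__)]
print(prediction)                             # prints: sun
```

The argmax of the log-probabilities is the same as the argmax of the raw logits; the log-softmax step matters only when you want calibrated scores, as in the notebook cell.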
"cell_type": "markdown",
"metadata": {},
"source": [
"Better luck next time, BERT."
"Well done, BERT!"
]
}
],
"metadata": {
"kernelspec": {
"display_name": ".venv",
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
@@ -287,7 +380,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.8"
"version": "3.10.15"
},
"orig_nbformat": 4
},