remove models that don't support generic summaries.
njbrake committed Jan 17, 2025
1 parent 3d5b413 commit fc1d5d1
Showing 7 changed files with 38 additions and 187 deletions.
35 changes: 0 additions & 35 deletions docs/source/get-started/suggested-models.md
@@ -83,8 +83,6 @@ launched it.
| Model Type | Model | HuggingFace | API | llamafile |
|------------|------------------------------------------|-------------|-----|-----------|
| seq2seq | facebook/bart-large-cnn | X | | |
| seq2seq | longformer-qmsum-meeting-summarization | X | | |
| seq2seq | mrm8488/t5-base-finetuned-summarize-news | X | | |
| seq2seq | Falconsai/text_summarization | X | | |
| causal | gpt-4o-mini, gpt-4o | | X | |
| causal | open-mistral-7b | | X | |
@@ -110,39 +108,6 @@ evaluation are:
| `no_repeat_ngram_size` | All n-grams of that size can only occur once | 3 |
| `num_beams` | Number of beams for beam search | 4 |
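
For reference, these parameters map directly onto Hugging Face `generate()` keyword arguments. A minimal sketch using the high-level `transformers` pipeline (the input text is illustrative):

```python
# Minimal sketch: the table's generation parameters passed straight
# through the summarization pipeline to `generate()`.
from transformers import pipeline

summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
result = summarizer(
    "Long article text goes here ...",  # illustrative input
    no_repeat_ngram_size=3,  # any 3-gram may occur at most once
    num_beams=4,             # beam search width
)
print(result[0]["summary_text"])
```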

## Longformer QMSum Meeting Summarization

The [`longformer-qmsum-meeting-summarization`](https://huggingface.co/mikeadimech/longformer-qmsum-meeting-summarization)
model is a fine-tuned version of [allenai/led-base-16384](https://huggingface.co/allenai/led-base-16384)
for summarization.

As described in [Longformer: The Long-Document Transformer](https://arxiv.org/pdf/2004.05150.pdf) by
Iz Beltagy, Matthew E. Peters, and Arman Cohan, `led-base-16384` was initialized from `bart-base`,
with which it shares the same architecture, and then modified for long-range summarization and
question answering.

The model has 162M parameters (FP32), and the model size is 648MB. There are no
summarization-specific parameters for this model.
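
A minimal usage sketch for this LED-based checkpoint, assuming the standard `transformers` API; setting global attention on the first token follows the Longformer authors' recommendation, and the transcript variable is illustrative:

```python
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "mikeadimech/longformer-qmsum-meeting-summarization"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

long_meeting_transcript = "..."  # illustrative; LED accepts inputs up to 16384 tokens
inputs = tokenizer(
    long_meeting_transcript, return_tensors="pt", truncation=True, max_length=16384
)

# Global attention on the first token, as recommended for Longformer/LED.
global_attention_mask = torch.zeros_like(inputs["input_ids"])
global_attention_mask[:, 0] = 1

summary_ids = model.generate(**inputs, global_attention_mask=global_attention_mask)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```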

## T5 Base Finetuned Summarize News

The [`mrm8488/t5-base-finetuned-summarize-news`](https://huggingface.co/mrm8488/t5-base-finetuned-summarize-news)
model is [Google's T5](https://ai.googleblog.com/2020/02/exploring-transfer-learning-with-t5.html)
base fine-tuned on the [News Summary](https://www.kaggle.com/sunnysai12345/news-summary) dataset for
the downstream task of summarization.

The model has 223M parameters (FP32), and the model size is 892MB. The default parameters used for
evaluation are:

| Parameter Name | Description | Value |
|------------------------|--------------------------------------------------------|-------|
| `max_length` | Maximum length of the summary | 200 |
| `min_length` | Minimum length of the summary | 30 |
| `length_penalty` | Length penalty to apply during beam search | 2.0 |
| `early_stopping` | Controls the stopping condition for beam-based methods | true |
| `no_repeat_ngram_size` | All n-grams of that size can only occur once | 3 |
| `num_beams` | Number of beams for beam search | 4 |
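
A minimal sketch using the defaults from the table above; T5 checkpoints conventionally expect a task prefix such as `summarize: `, though whether this particular fine-tune requires it is an assumption:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "mrm8488/t5-base-finetuned-summarize-news"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

article = "News article text goes here ..."  # illustrative input
inputs = tokenizer("summarize: " + article, return_tensors="pt", truncation=True)

summary_ids = model.generate(
    **inputs,
    max_length=200,  # defaults from the table above
    min_length=30,
    length_penalty=2.0,
    early_stopping=True,
    no_repeat_ngram_size=3,
    num_beams=4,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```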

## Falconsai Text Summarization

The [`Falconsai/text_summarization`](https://huggingface.co/Falconsai/text_summarization) model is
2 changes: 0 additions & 2 deletions lumigator/python/mzai/backend/backend/config_templates.py
@@ -161,8 +161,6 @@
JobType.EVALUATION: {
"default": causal_eval_template,
"hf://facebook/bart-large-cnn": bart_eval_template,
"hf://mikeadimech/longformer-qmsum-meeting-summarization": seq2seq_eval_template,
"hf://mrm8488/t5-base-finetuned-summarize-news": seq2seq_eval_template,
"hf://Falconsai/text_summarization": seq2seq_eval_template,
"hf://mistralai/Mistral-7B-Instruct-v0.3": causal_eval_template,
"oai://gpt-4o-mini": oai_eval_template,
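
The mapping above resolves a model URI to an evaluation template, with `default` as the fallback for unlisted models. A hypothetical sketch of that lookup (the names and template values are illustrative, not the repository's actual code):

```python
# Hypothetical sketch of a URI-to-template lookup with a "default" fallback.
templates = {
    "default": "causal_eval_template",
    "hf://facebook/bart-large-cnn": "bart_eval_template",
    "hf://Falconsai/text_summarization": "seq2seq_eval_template",
}

def resolve_template(model_uri: str) -> str:
    # Unknown URIs fall back to the generic causal template.
    return templates.get(model_uri, templates["default"])

print(resolve_template("hf://Falconsai/text_summarization"))  # seq2seq_eval_template
print(resolve_template("hf://some/unlisted-model"))           # causal_eval_template
```
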
28 changes: 0 additions & 28 deletions lumigator/python/mzai/backend/backend/models.yaml
@@ -15,34 +15,6 @@
no_repeat_ngram_size: 3
num_beams: 4

- name: mikeadimech/longformer-qmsum-meeting-summarization
uri: hf://mikeadimech/longformer-qmsum-meeting-summarization
website_url: https://huggingface.co/mikeadimech/longformer-qmsum-meeting-summarization
description: Longformer is a transformer model that is capable of processing long sequences.
info:
parameter_count: 162M
tensor_type: F32
model_size: 648MB
tasks:
- summarization:

- name: mrm8488/t5-base-finetuned-summarize-news
uri: hf://mrm8488/t5-base-finetuned-summarize-news
website_url: https://huggingface.co/mrm8488/t5-base-finetuned-summarize-news
description: Google's T5 base fine-tuned on News Summary dataset for summarization downstream task.
info:
parameter_count: 223M
tensor_type: F32
model_size: 892MB
tasks:
- summarization:
max_length: 200
min_length: 30
length_penalty: 2.0
early_stopping: true
no_repeat_ngram_size: 3
num_beams: 4

- name: Falconsai/text_summarization
uri: hf://Falconsai/text_summarization
website_url: https://huggingface.co/Falconsai/text_summarization
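
A minimal sketch of reading a registry file like this one, assuming PyYAML and the field names shown above (the path assumes it is run from the repository root):

```python
import yaml  # assumes PyYAML is installed

with open("lumigator/python/mzai/backend/backend/models.yaml") as f:
    models = yaml.safe_load(f)

for model in models:
    # Each entry's `tasks` is a list of single-key mappings, e.g. `- summarization:`.
    task_names = [name for task in model.get("tasks", []) for name in task]
    if "summarization" in task_names:
        print(model["name"], model["info"]["model_size"])
```
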
37 changes: 0 additions & 37 deletions lumigator/python/mzai/backend/backend/tests/data/models.json
@@ -23,43 +23,6 @@
}
]
},
{
"name": "mikeadimech/longformer-qmsum-meeting-summarization",
"uri": "hf://mikeadimech/longformer-qmsum-meeting-summarization",
"description": "Longformer is a transformer model that is capable of processing long sequences.",
"info": {
"parameter_count": "162M",
"tensor_type": "F32",
"model_size": "648MB"
},
"tasks": [
{
"summarization": null
}
]
},
{
"name": "mrm8488/t5-base-finetuned-summarize-news",
"uri": "hf://mrm8488/t5-base-finetuned-summarize-news",
"description": "Google's T5 base fine-tuned on News Summary dataset for summarization downstream task.",
"info": {
"parameter_count": "223M",
"tensor_type": "F32",
"model_size": "892MB"
},
"tasks": [
{
"summarization": {
"max_length": 200,
"min_length": 30,
"length_penalty": 2,
"early_stopping": true,
"no_repeat_ngram_size": 3,
"num_beams": 4
}
}
]
},
{
"name": "Falconsai/text_summarization",
"uri": "hf://Falconsai/text_summarization",
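
An illustrative sanity check over this fixture — not one of the repository's actual tests — assuming the file's top level is the list of entries shown:

```python
import json

with open("lumigator/python/mzai/backend/backend/tests/data/models.json") as f:
    models = json.load(f)

for entry in models:
    # Every entry shown above carries at least these keys.
    assert {"name", "uri", "info", "tasks"} <= entry.keys()
    # URIs use a scheme prefix; this set covers the schemes visible in this commit.
    assert entry["uri"].startswith(("hf://", "oai://", "mistral://"))
```
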
39 changes: 0 additions & 39 deletions lumigator/python/mzai/sdk/tests/data/models.json
@@ -24,45 +24,6 @@
}
]
},
{
"name": "mikeadimech/longformer-qmsum-meeting-summarization",
"uri": "hf://mikeadimech/longformer-qmsum-meeting-summarization",
"website_url": "https://huggingface.co/mikeadimech/longformer-qmsum-meeting-summarization/discussions",
"description": "Longformer is a transformer model that is capable of processing long sequences.",
"info": {
"parameter_count": "162M",
"tensor_type": "F32",
"model_size": "648MB"
},
"tasks": [
{
"summarization": null
}
]
},
{
"name": "mrm8488/t5-base-finetuned-summarize-news",
"uri": "hf://mrm8488/t5-base-finetuned-summarize-news",
"website_url": "https://huggingface.co/mrm8488/t5-base-finetuned-summarize-news",
"description": "Google's T5 base fine-tuned on News Summary dataset for summarization downstream task.",
"info": {
"parameter_count": "223M",
"tensor_type": "F32",
"model_size": "892MB"
},
"tasks": [
{
"summarization": {
"max_length": 200,
"min_length": 30,
"length_penalty": 2,
"early_stopping": true,
"no_repeat_ngram_size": 3,
"num_beams": 4
}
}
]
},
{
"name": "Falconsai/text_summarization",
"uri": "hf://Falconsai/text_summarization",
4 changes: 1 addition & 3 deletions notebooks/assets/model_info.csv
@@ -1,7 +1,5 @@
model_name,RAM_MiB,RAM_GB
hf://facebook/bart-large-cnn,2709MiB,2.71
hf://mikeadimech/longformer-qmsum-meeting-summarization,2027MiB,2.03
hf://mrm8488/t5-base-finetuned-summarize-news,3085MiB,3.09
hf://Falconsai/text_summarization,1423MiB,1.43
hf://mistralai/Mistral-7B-Instruct-v0.3,30645MiB,30.65
mistral://open-mistral-7b,30645MiB,30.65
@@ -10,4 +8,4 @@ hf://meta-llama/Meta-Llama-3-8B,34189MiB,34.19
hf://microsoft/Phi-3-mini-4k-instruct,19455MiB,19.46
oai://gpt-4o-mini,,
oai://gpt-4-turbo,,
oai://gpt-3.5-turbo-0125,,
oai://gpt-3.5-turbo-0125,,
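
The CSV records the approximate RAM each model needs. A minimal pandas sketch for filtering it to models that fit a given budget (the 16 GB figure is illustrative):

```python
import pandas as pd

info = pd.read_csv("notebooks/assets/model_info.csv")
budget_gb = 16  # illustrative budget

# Hosted API models have no RAM figure (NaN) and drop out of the comparison.
fits = info[info["RAM_GB"] <= budget_gb]
print(fits[["model_name", "RAM_GB"]])
```
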
80 changes: 37 additions & 43 deletions notebooks/walkthrough.ipynb
@@ -226,31 +226,23 @@
"metadata": {},
"outputs": [],
"source": [
"# Importing packages we need to work with data \n",
"# Importing packages we need to work with data\n",
"# python standard libraries\n",
"import os\n",
"import time\n",
"import json\n",
"\n",
"# Random string generator\n",
"import random\n",
"import string\n",
"import shortuuid\n",
"import matplotlib.pyplot as plt\n",
"\n",
"# third-party libraries\n",
"import pandas as pd\n",
"import matplotlib.pyplot as plt\n",
"from datasets import load_dataset\n",
"from IPython.display import clear_output\n",
"\n",
"from lumigator_sdk.lumigator import LumigatorClient\n",
"from lumigator_schemas.datasets import DatasetFormat\n",
"from lumigator_schemas.jobs import JobType, JobEvalCreate\n",
"\n",
"from utils import job_result_download, results_to_table, get_nested_value\n",
"from lumigator_schemas.jobs import JobEvalCreate, JobType\n",
"from lumigator_sdk.lumigator import LumigatorClient\n",
"from utils import get_nested_value, job_result_download, results_to_table\n",
"\n",
"# wrap columns for inspection\n",
"pd.set_option('display.max_colwidth', 0)\n",
"pd.set_option(\"display.max_colwidth\", 0)\n",
"# stylesheet for visibility\n",
"plt.style.use(\"fast\")\n",
"\n",
@@ -265,8 +257,8 @@
"metadata": {},
"outputs": [],
"source": [
"LUMIGATOR_SERVICE_HOST = os.getenv('LUMIGATOR_SERVICE_HOST', 'localhost')\n",
"LUMIGATOR_SERVICE_PORT = os.getenv('LUMIGATOR_SERVICE_PORT', '8000')"
"LUMIGATOR_SERVICE_HOST = os.getenv(\"LUMIGATOR_SERVICE_HOST\", \"localhost\")\n",
"LUMIGATOR_SERVICE_PORT = os.getenv(\"LUMIGATOR_SERVICE_PORT\", \"8000\")"
]
},
{
@@ -376,20 +368,28 @@
"source": [
"# The dataset is available at https://huggingface.co/datasets/knkarthick/dialogsum\n",
"# and can be directly downloaded with the `load_dataset` method\n",
"dataset = 'knkarthick/dialogsum'\n",
"ds = load_dataset(dataset, split='validation')\n",
"df = ds.to_pandas()"
"dataset = \"knkarthick/dialogsum\"\n",
"ds = load_dataset(dataset, split=\"validation\")\n",
"df = ds.to_pandas() # noqa: PD901"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b7d23cb4",
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"id": "3cb05409-aad2-42e2-bc23-743575ad924e",
"metadata": {},
"outputs": [],
"source": [
"# Examine a single sample \n",
"df['dialogue'].iloc[0]"
"# Examine a single sample\n",
"df[\"dialogue\"].iloc[0]"
]
},
{
@@ -400,7 +400,7 @@
"outputs": [],
"source": [
"# Add a function to do some simple character counts for model input\n",
"df['char_count'] = df['dialogue'].str.len()"
"df[\"char_count\"] = df[\"dialogue\"].str.len()"
]
},
{
@@ -422,7 +422,7 @@
"outputs": [],
"source": [
"# Show statistics about characters count\n",
"df['char_count'].describe()"
"df[\"char_count\"].describe()"
]
},
{
@@ -434,15 +434,14 @@
"source": [
"# Generate plot of character counts\n",
"fig, ax = plt.subplots(figsize=(12, 6))\n",
"ax.hist(df['char_count'], bins=30)\n",
"ax.set_xlabel('Character Count')\n",
"ax.set_ylabel('Frequency')\n",
"ax.hist(df[\"char_count\"], bins=30)\n",
"ax.set_xlabel(\"Character Count\")\n",
"ax.set_ylabel(\"Frequency\")\n",
"\n",
"stats = df['char_count'].describe().apply(lambda x: f\"{x:.0f}\")\n",
"stats = df[\"char_count\"].describe().apply(lambda x: f\"{x:.0f}\")\n",
"\n",
"# Add text boxes for statistics\n",
"plt.text(1.05, 0.95, stats.to_string(), \n",
" transform=ax.transAxes, verticalalignment='top')\n",
"plt.text(1.05, 0.95, stats.to_string(), transform=ax.transAxes, verticalalignment=\"top\")\n",
"\n",
"# Adjust layout\n",
"plt.tight_layout()\n",
@@ -490,14 +489,11 @@
"metadata": {},
"outputs": [],
"source": [
"lm_client = LumigatorClient(\n",
" f\"{LUMIGATOR_SERVICE_HOST}:{LUMIGATOR_SERVICE_PORT}\"\n",
")\n",
"from pathlib import Path\n",
"\n",
"lm_client = LumigatorClient(f\"{LUMIGATOR_SERVICE_HOST}:{LUMIGATOR_SERVICE_PORT}\")\n",
"\n",
"lm_client.datasets.create_dataset(\n",
" open(dataset_name, \"rb\"),\n",
" DatasetFormat.JOB\n",
")"
"lm_client.datasets.create_dataset(Path.open(dataset_name, \"rb\"), DatasetFormat.JOB)"
]
},
{
@@ -569,8 +565,6 @@
"#\n",
"# Encoder-Decoder models\n",
"# 'hf://facebook/bart-large-cnn',\n",
"# 'hf://mikeadimech/longformer-qmsum-meeting-summarization', \n",
"# 'hf://mrm8488/t5-base-finetuned-summarize-news',\n",
"# 'hf://Falconsai/text_summarization',\n",
"#\n",
"# Decoder models\n",
Expand All @@ -582,7 +576,7 @@
"# \"oai://gpt-3.5-turbo-0125\",\n",
"#\n",
"models = [\n",
" 'hf://facebook/bart-large-cnn',\n",
" \"hf://facebook/bart-large-cnn\",\n",
"]"
]
},
@@ -621,7 +615,7 @@
" \"description\": \"Test run.\",\n",
" \"model\": model,\n",
" \"dataset\": dataset_id,\n",
" \"max_samples\": max_samples\n",
" \"max_samples\": max_samples,\n",
" }\n",
" descr = f\"Testing {model} summarization model on {dataset_name}\"\n",
" responses.append(lm_client.jobs.create_job(JobType.EVALUATION, JobEvalCreate(**job_args)))"
@@ -767,9 +761,9 @@
],
"metadata": {
"kernelspec": {
"display_name": "venv",
"display_name": ".venv",
"language": "python",
"name": "venv"
"name": "python3"
},
"language_info": {
"codemirror_mode": {
@@ -781,7 +775,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.10"
"version": "3.11.11"
}
},
"nbformat": 4,
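
Condensing the notebook's flow, a sketch of the end-to-end evaluation using only the SDK calls shown above; the host, file name, and the assumption that `create_dataset` returns a record exposing an `id` (matching the notebook's later `dataset_id`) are illustrative:

```python
from pathlib import Path

from lumigator_schemas.datasets import DatasetFormat
from lumigator_schemas.jobs import JobEvalCreate, JobType
from lumigator_sdk.lumigator import LumigatorClient

lm_client = LumigatorClient("localhost:8000")  # illustrative host:port

# Assumption: create_dataset returns a record exposing an `id` field.
dataset = lm_client.datasets.create_dataset(Path("dialogsum.csv").open("rb"), DatasetFormat.JOB)

job_args = {
    "name": "bart-eval",
    "description": "Test run.",
    "model": "hf://facebook/bart-large-cnn",
    "dataset": dataset.id,
    "max_samples": 10,
}
response = lm_client.jobs.create_job(JobType.EVALUATION, JobEvalCreate(**job_args))
```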
