diff --git a/README.md b/README.md
index 9ff4898..270c053 100644
--- a/README.md
+++ b/README.md
@@ -18,6 +18,7 @@ reproduction of results, and guides on usage.
 > a citrus fruit native to the Philippines and used in traditional Filipino cuisine.
 ## 📰 News
+- [2024-08-01] Released new NER-only models based on [GLiNER](https://github.com/urchade/GLiNER)! You can find the models in [this HuggingFace collection](https://huggingface.co/collections/ljvmiranda921/calamancy-models-for-tagalog-nlp-65629cc46ef2a1d0f9605c87). The Span-Marker and calamanCy models are still superior, but GLiNER offers a lot of extensibility to unseen entity labels. You can find the training pipeline [here](https://github.com/ljvmiranda921/calamanCy/tree/master/models/v0.1.0-gliner).
 - [2023-12-05] We released the paper [**calamanCy: A Tagalog Natural Language Processing Toolkit**](https://aclanthology.org/2023.nlposs-1.1/), which will be presented at the NLP-OSS workshop at EMNLP 2023! Feel free to check out the [Tagalog NLP collection in HuggingFace](https://huggingface.co/collections/ljvmiranda921/calamancy-models-for-tagalog-nlp-65629cc46ef2a1d0f9605c87).
 - [2023-11-01] The named entity recognition (NER) dataset used to train the NER component of calamanCy now has a corresponding paper: [**Developing a Named Entity Recognition Dataset for Tagalog**](https://aclanthology.org/2023.sealp-1.2/)! It will be presented at the SEALP workshop at IJCNLP-AACL 2023! The dataset is also available [in HuggingFace](https://huggingface.co/datasets/ljvmiranda921/tlunified-ner).
diff --git a/experiments/refresh_evals_0924/project.yml b/experiments/refresh_evals_0924/project.yml
new file mode 100644
index 0000000..0accd19
--- /dev/null
+++ b/experiments/refresh_evals_0924/project.yml
@@ -0,0 +1 @@
+title: "Benchmarking new models on TLUnified-NER data"
diff --git a/experiments/refresh_evals_0924/requirements.txt b/experiments/refresh_evals_0924/requirements.txt
new file mode 100644
index 0000000..f7ddeff
--- /dev/null
+++ b/experiments/refresh_evals_0924/requirements.txt
@@ -0,0 +1,3 @@
+spacy
+spacy-llm==0.7.2
+datasets
\ No newline at end of file
diff --git a/models/v0.1.0-gliner/.gitignore b/models/v0.1.0-gliner/.gitignore
new file mode 100644
index 0000000..6f8aca1
--- /dev/null
+++ b/models/v0.1.0-gliner/.gitignore
@@ -0,0 +1 @@
+metrics
\ No newline at end of file
diff --git a/models/v0.1.0-gliner/README.md b/models/v0.1.0-gliner/README.md
new file mode 100644
index 0000000..13e12e6
--- /dev/null
+++ b/models/v0.1.0-gliner/README.md
@@ -0,0 +1,120 @@
+
+
+# 🪐 Weasel Project: Release v0.1.0-gliner
+
+This is a spaCy project that trains and evaluates the new v0.1.0-gliner models.
+[GLiNER](https://github.com/urchade/GLiNER) (Generalist and Lightweight Model for Named Entity Recognition) is a powerful model capable of identifying any entity type using a BERT-like encoder.
+In this project, we finetune the GLiNER model on the TLUnified-NER dataset.
+
+To replicate training, first install the required dependencies:
+
+```sh
+pip install -r requirements.txt
+```
+
+## Training
+
+To train a GLiNER model, run the `finetune-gliner` workflow and pass the model size:
+
+```sh
+# Available options: 'small', 'medium', 'large'
+python -m spacy project run finetune-gliner . --vars.size small
+```
+
+The models are currently based on the [v2.5 version of GLiNER](https://huggingface.co/collections/urchade/gliner-v25-66743e64ab975c859119d1eb).
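+
+Under the hood, the `finetune-gliner` workflow calls `train.py`, which converts TLUnified-NER's BIO-tagged examples into GLiNER's span-based training format. As a rough illustration (the sentence below is a made-up example), a single converted record looks like this:
+
+```python
+# One record in GLiNER's training format, as produced by format_to_gliner()
+# in train.py; "ner" spans are [start, end, label] with inclusive token indices.
+record = {
+    "tokenized_text": ["Nagpunta", "si", "Juan", "sa", "Maynila", "."],
+    "ner": [
+        [2, 2, "person"],    # "Juan"
+        [4, 4, "location"],  # "Maynila"
+    ],
+}
+```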
+
+## Evaluation
+
+To perform evals, run the `eval-gliner` workflow and pass the model size:
+
+```sh
+# Available options: 'small', 'medium', 'large'
+python -m spacy project run eval-gliner . --vars.size small
+```
+
+This will evaluate on TLUnified-NER's test set ([Miranda, 2023](https://aclanthology.org/2023.sealp-1.2.pdf)) and the Tagalog subsets of
+Universal NER ([Mayhew et al., 2024](https://aclanthology.org/2024.naacl-long.243/)).
+
+The evaluation results for TLUnified-NER are shown in the table below (reported numbers are F1-scores):
+
+| | PER | ORG | LOC | Overall |
+|------------------|-------|-------|-------|---------|
+| [tl_gliner_small](https://huggingface.co/ljvmiranda921/tl_gliner_small) | 86.76 | 78.72 | 86.78 | 84.83 |
+| [tl_gliner_medium](https://huggingface.co/ljvmiranda921/tl_gliner_medium) | 87.46 | 79.71 | 86.75 | 85.40 |
+| [tl_gliner_large](https://huggingface.co/ljvmiranda921/tl_gliner_large) | 86.75 | 80.20 | 86.76 | 85.72 |
+| [tl_calamancy_trf](https://huggingface.co/ljvmiranda921/tl_calamancy_trf) | 91.95 | **84.84** | 88.92 | 88.03 |
+| [span-marker](https://huggingface.co/tomaarsen/span-marker-roberta-tagalog-base-tlunified) | **92.57** | 82.04 | **90.56** | **89.62** |
+
+In general, GLiNER gets decent scores, but nothing beats regular finetuning on BERT-based models, as seen in [tl_calamancy_trf](https://huggingface.co/ljvmiranda921/tl_calamancy_trf) and [span-marker](https://huggingface.co/tomaarsen/span-marker-roberta-tagalog-base-tlunified).
+The performance on Universal NER is generally worse (the best F1-score is around 50%) compared to the results reported in the Universal NER paper, which likewise finetuned RoBERTa-based models.
+One possible reason is that the annotation guidelines for TLUnified-NER are looser: we include some entity mentions that Universal NER ignores.
+At the same time, the text distributions of the two datasets are widely different.
+
+Nevertheless, I'm still releasing these GLiNER models as they are very extensible to other entity types (and it's also nice to have a finetuned version of GLiNER for Tagalog!).
+I haven't done any extensive hyperparameter tuning here, so it would be nice if someone could contribute better config parameters to bump up these scores.
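+
+To illustrate the extensibility mentioned above, here is a minimal sketch that queries the model for entity types it never saw during finetuning, via the same `gliner_spacy` component used in `evaluate.py` (the label set and sentence below are illustrative):
+
+```python
+import spacy
+
+# Zero-shot NER: "date" was never part of the finetuning label set.
+nlp = spacy.blank("tl")
+nlp.add_pipe(
+    "gliner_spacy",
+    config={
+        "gliner_model": "ljvmiranda921/tl_gliner_small",
+        "labels": ["person", "organization", "location", "date"],
+        "threshold": 0.5,
+        "style": "ent",
+    },
+)
+doc = nlp("Pumunta si Juan sa Maynila noong Enero 2024.")
+print([(ent.text, ent.label_) for ent in doc.ents])
+```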
+ +## Citation + +Please cite the following papers when using these models: + +``` +@misc{zaratiana2023gliner, + title={GLiNER: Generalist Model for Named Entity Recognition using Bidirectional Transformer}, + author={Urchade Zaratiana and Nadi Tomeh and Pierre Holat and Thierry Charnois}, + year={2023}, + eprint={2311.08526}, + archivePrefix={arXiv}, + primaryClass={cs.CL} +} +``` + +``` +@inproceedings{miranda-2023-calamancy, + title = "calaman{C}y: A {T}agalog Natural Language Processing Toolkit", + author = "Miranda, Lester James", + booktitle = "Proceedings of the 3rd Workshop for Natural Language Processing Open Source Software (NLP-OSS 2023)", + month = dec, + year = "2023", + address = "Singapore, Singapore", + publisher = "Empirical Methods in Natural Language Processing", + url = "https://aclanthology.org/2023.nlposs-1.1", + pages = "1--7", +} +``` + +If you're using the NER dataset: + +``` +@inproceedings{miranda-2023-developing, + title = "Developing a Named Entity Recognition Dataset for {T}agalog", + author = "Miranda, Lester James", + booktitle = "Proceedings of the First Workshop in South East Asian Language Processing", + month = nov, + year = "2023", + address = "Nusa Dua, Bali, Indonesia", + publisher = "Association for Computational Linguistics", + url = "https://aclanthology.org/2023.sealp-1.2", + doi = "10.18653/v1/2023.sealp-1.2", + pages = "13--20", +} +``` + + +## 📋 project.yml + +The [`project.yml`](project.yml) defines the data assets required by the +project, as well as the available commands and workflows. For details, see the +[Weasel documentation](https://github.com/explosion/weasel). + +### ⏯ Commands + +The following commands are defined by the project. They +can be executed using [`weasel run [name]`](https://github.com/explosion/weasel/tree/main/docs/cli.md#rocket-run). +Commands are only re-run if their inputs have changed. 
+
+| Command | Description |
+| --- | --- |
+| `finetune-gliner` | Finetune the GLiNER model using TLUnified-NER |
+| `eval-gliner` | Evaluate trained GLiNER models on the TLUnified-NER and Universal NER test sets |
+
\ No newline at end of file
diff --git a/models/v0.1.0-gliner/evaluate.py b/models/v0.1.0-gliner/evaluate.py
new file mode 100644
index 0000000..40ec7ed
--- /dev/null
+++ b/models/v0.1.0-gliner/evaluate.py
@@ -0,0 +1,127 @@
+from copy import deepcopy
+from pathlib import Path
+from typing import Dict, Iterable, Optional
+
+import spacy
+import srsly
+import torch
+import typer
+from datasets import Dataset, load_dataset
+from spacy.scorer import Scorer
+from spacy.tokens import Doc, Span
+from spacy.training import Example
+from wasabi import msg
+
+
+def main(
+    # fmt: off
+    output_path: Path = typer.Argument(..., help="Path to store the metrics in JSON format."),
+    model_name: str = typer.Option("ljvmiranda921/tl_gliner_small", show_default=True, help="GLiNER model to use for evaluation."),
+    dataset: str = typer.Option("ljvmiranda921/tlunified-ner", help="Dataset to evaluate upon."),
+    threshold: float = typer.Option(0.5, help="The threshold of the GLiNER model (controls the degree to which a hit is considered an entity)."),
+    dataset_config: Optional[str] = typer.Option(None, help="Configuration for loading the dataset."),
+    chunk_size: int = typer.Option(250, help="Size of the text chunk to be processed at once."),
+    label_map: str = typer.Option("person::PER,organization::ORG,location::LOC", help="Mapping between GLiNER labels and the dataset's actual labels (separated by a double-colon '::')."),
+    # fmt: on
+):
+    label_map: Dict[str, str] = process_labels(label_map)
+    msg.text(f"Using label map: {label_map}")
+
+    msg.info("Processing test dataset")
+    ds = load_dataset(dataset, dataset_config, split="test", trust_remote_code=True)
+    ref_docs = convert_hf_to_spacy_docs(ds)
+
+    msg.info("Loading GLiNER model")
+    nlp = spacy.blank("tl")
+    nlp.add_pipe(
+        "gliner_spacy",
+        config={
+            "gliner_model": model_name,
+            "chunk_size": chunk_size,
+            "labels": list(label_map.keys()),
+            "threshold": threshold,
+            "style": "ent",
+            "map_location": "cuda" if torch.cuda.is_available() else "cpu",
+        },
+    )
+    msg.text("Getting predictions")
+    # Run the GLiNER pipeline on copies of the reference docs so the
+    # gold-standard entities stay intact for scoring.
+    docs = deepcopy(ref_docs)
+    pred_docs = list(nlp.pipe(docs))
+    pred_docs = [update_entity_labels(doc, label_map) for doc in pred_docs]
+
+    # Get the scores
+    examples = [
+        Example(reference=ref, predicted=pred) for ref, pred in zip(ref_docs, pred_docs)
+    ]
+    scores = Scorer.score_spans(examples, "ents")
+
+    msg.info(f"Results for {dataset} ({model_name})")
+    msg.text(scores)
+    srsly.write_json(output_path, data=scores, indent=2)
+    msg.good(f"Saved outputs to {output_path}")
+
+
+def process_labels(label_map: str) -> Dict[str, str]:
+    return {m.split("::")[0]: m.split("::")[1] for m in label_map.split(",")}
+
+
+def convert_hf_to_spacy_docs(dataset: "Dataset") -> Iterable[Doc]:
+    nlp = spacy.blank("tl")
+    examples = dataset.to_list()
+    entity_types = {
+        idx: feature.split("-")[1]
+        for idx, feature in enumerate(dataset.features["ner_tags"].feature.names)
+        if feature != "O"  # skip the "O" (outside) tag
+    }
+    msg.text(f"Using entity types: {entity_types}")
+
+    docs = []
+    for example in examples:
+        tokens = example["tokens"]
+        ner_tags = example["ner_tags"]
+        doc = Doc(nlp.vocab, words=tokens)
+
+        entities = []
+        start_idx = None
+        entity_type = None
+
+        for idx, tag in enumerate(ner_tags):
+            if tag in entity_types:
+                if start_idx is None:
+                    start_idx = idx
+                    entity_type = entity_types[tag]
+                # Close the current entity and start a new one whenever the
+                # type changes or a B- tag opens a new entity of the same
+                # type (otherwise adjacent entities would get merged).
+                elif (
+                    entity_type != entity_types[tag]
+                    or dataset.features["ner_tags"].feature.names[tag].startswith("B-")
+                ):
+                    entities.append(Span(doc, start_idx, idx, label=entity_type))
+                    start_idx = idx
+                    entity_type = entity_types[tag]
+            else:
+                if start_idx is not None:
+                    entities.append(Span(doc, start_idx, idx, label=entity_type))
+                    start_idx = None
+
+        if start_idx is not None:
+            entities.append(Span(doc, start_idx, len(tokens), label=entity_type))
+        doc.ents = entities
+        docs.append(doc)
+
+    return docs
+
+
+def update_entity_labels(doc: Doc, label_mapping: Dict[str, str]) -> Doc:
+    # Span labels are immutable, so rebuild the Doc and re-attach the ents
+    # with the mapped (dataset-style) labels.
+    new_doc = Doc(
+        doc.vocab,
+        words=[token.text for token in doc],
+        spaces=[token.whitespace_ for token in doc],
+    )
+    updated_ents = []
+    for ent in doc.ents:
+        new_label = label_mapping.get(ent.label_.lower(), ent.label_)
+        updated_ents.append(Span(new_doc, ent.start, ent.end, label=new_label))
+    new_doc.ents = updated_ents
+    return new_doc
+
+
+if __name__ == "__main__":
+    typer.run(main)
diff --git a/models/v0.1.0-gliner/project.yml b/models/v0.1.0-gliner/project.yml
new file mode 100644
index 0000000..a7b60e1
--- /dev/null
+++ b/models/v0.1.0-gliner/project.yml
@@ -0,0 +1,160 @@
+title: "Release v0.1.0-gliner"
+description: |
+  This is a spaCy project that trains and evaluates the new v0.1.0-gliner models.
+  [GLiNER](https://github.com/urchade/GLiNER) (Generalist and Lightweight Model for Named Entity Recognition) is a powerful model capable of identifying any entity type using a BERT-like encoder.
+  In this project, we finetune the GLiNER model on the TLUnified-NER dataset.
+
+  To replicate training, first install the required dependencies:
+
+  ```sh
+  pip install -r requirements.txt
+  ```
+
+  ## Training
+
+  To train a GLiNER model, run the `finetune-gliner` workflow and pass the model size:
+
+  ```sh
+  # Available options: 'small', 'medium', 'large'
+  python -m spacy project run finetune-gliner . --vars.size small
+  ```
+
+  The models are currently based on the [v2.5 version of GLiNER](https://huggingface.co/collections/urchade/gliner-v25-66743e64ab975c859119d1eb).
+
+  ## Evaluation
+
+  To perform evals, run the `eval-gliner` workflow and pass the model size:
+
+  ```sh
+  # Available options: 'small', 'medium', 'large'
+  python -m spacy project run eval-gliner . --vars.size small
+  ```
+
+  This will evaluate on TLUnified-NER's test set ([Miranda, 2023](https://aclanthology.org/2023.sealp-1.2.pdf)) and the Tagalog subsets of
+  Universal NER ([Mayhew et al., 2024](https://aclanthology.org/2024.naacl-long.243/)).
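+
+  The `eval-gliner` workflow is a thin wrapper around `evaluate.py`, so you can also invoke the script directly, e.g., to score one model on a single Universal NER subset (the output path below is just an example):
+
+  ```sh
+  python evaluate.py metrics/tl_gliner_small_uner_tl_trg.json \
+    --model-name ljvmiranda921/tl_gliner_small \
+    --dataset universalner/universal_ner \
+    --dataset-config tl_trg \
+    --label-map person::PER,location::LOC,organization::ORG
+  ```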
+
+  The evaluation results for TLUnified-NER are shown in the table below (reported numbers are F1-scores):
+
+  | | PER | ORG | LOC | Overall |
+  |------------------|-------|-------|-------|---------|
+  | [tl_gliner_small](https://huggingface.co/ljvmiranda921/tl_gliner_small) | 86.76 | 78.72 | 86.78 | 84.83 |
+  | [tl_gliner_medium](https://huggingface.co/ljvmiranda921/tl_gliner_medium) | 87.46 | 79.71 | 86.75 | 85.40 |
+  | [tl_gliner_large](https://huggingface.co/ljvmiranda921/tl_gliner_large) | 86.75 | 80.20 | 86.76 | 85.72 |
+  | [tl_calamancy_trf](https://huggingface.co/ljvmiranda921/tl_calamancy_trf) | 91.95 | **84.84** | 88.92 | 88.03 |
+  | [span-marker](https://huggingface.co/tomaarsen/span-marker-roberta-tagalog-base-tlunified) | **92.57** | 82.04 | **90.56** | **89.62** |
+
+  In general, GLiNER gets decent scores, but nothing beats regular finetuning on BERT-based models, as seen in [tl_calamancy_trf](https://huggingface.co/ljvmiranda921/tl_calamancy_trf) and [span-marker](https://huggingface.co/tomaarsen/span-marker-roberta-tagalog-base-tlunified).
+  The performance on Universal NER is generally worse (the best F1-score is around 50%) compared to the results reported in the Universal NER paper, which likewise finetuned RoBERTa-based models.
+  One possible reason is that the annotation guidelines for TLUnified-NER are looser: we include some entity mentions that Universal NER ignores.
+  At the same time, the text distributions of the two datasets are widely different.
+
+  Nevertheless, I'm still releasing these GLiNER models as they are very extensible to other entity types (and it's also nice to have a finetuned version of GLiNER for Tagalog!).
+  I haven't done any extensive hyperparameter tuning here, so it would be nice if someone could contribute better config parameters to bump up these scores.
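+
+  To illustrate the extensibility mentioned above, here is a minimal sketch that queries the model for entity types it never saw during finetuning, via the same `gliner_spacy` component used in `evaluate.py` (the label set and sentence below are illustrative):
+
+  ```python
+  import spacy
+
+  # Zero-shot NER: "date" was never part of the finetuning label set.
+  nlp = spacy.blank("tl")
+  nlp.add_pipe(
+      "gliner_spacy",
+      config={
+          "gliner_model": "ljvmiranda921/tl_gliner_small",
+          "labels": ["person", "organization", "location", "date"],
+          "threshold": 0.5,
+          "style": "ent",
+      },
+  )
+  doc = nlp("Pumunta si Juan sa Maynila noong Enero 2024.")
+  print([(ent.text, ent.label_) for ent in doc.ents])
+  ```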
+
+  ## Citation
+
+  Please cite the following papers when using these models:
+
+  ```
+  @misc{zaratiana2023gliner,
+      title={GLiNER: Generalist Model for Named Entity Recognition using Bidirectional Transformer},
+      author={Urchade Zaratiana and Nadi Tomeh and Pierre Holat and Thierry Charnois},
+      year={2023},
+      eprint={2311.08526},
+      archivePrefix={arXiv},
+      primaryClass={cs.CL}
+  }
+  ```
+
+  ```
+  @inproceedings{miranda-2023-calamancy,
+      title = "calaman{C}y: A {T}agalog Natural Language Processing Toolkit",
+      author = "Miranda, Lester James",
+      booktitle = "Proceedings of the 3rd Workshop for Natural Language Processing Open Source Software (NLP-OSS 2023)",
+      month = dec,
+      year = "2023",
+      address = "Singapore, Singapore",
+      publisher = "Empirical Methods in Natural Language Processing",
+      url = "https://aclanthology.org/2023.nlposs-1.1",
+      pages = "1--7",
+  }
+  ```
+
+  If you're using the NER dataset:
+
+  ```
+  @inproceedings{miranda-2023-developing,
+      title = "Developing a Named Entity Recognition Dataset for {T}agalog",
+      author = "Miranda, Lester James",
+      booktitle = "Proceedings of the First Workshop in South East Asian Language Processing",
+      month = nov,
+      year = "2023",
+      address = "Nusa Dua, Bali, Indonesia",
+      publisher = "Association for Computational Linguistics",
+      url = "https://aclanthology.org/2023.sealp-1.2",
+      doi = "10.18653/v1/2023.sealp-1.2",
+      pages = "13--20",
+  }
+  ```
+
+vars:
+  version: 0.1.0
+  # Training
+  size: small
+  num_steps: 10000
+  batch_size: 8
+
+directories:
+  - "checkpoints"
+  - "models"
+  - "metrics"
+
+env:
+  HF_TOKEN: HF_TOKEN
+  TOKENIZERS_PARALLELISM: TOKENIZERS_PARALLELISM
+
+commands:
+  - name: "finetune-gliner"
+    help: "Finetune the GLiNER model using TLUnified-NER"
+    script:
+      - mkdir -p models/gliner_${vars.size}
+      - mkdir -p checkpoints/ckpt_gliner_${vars.size}
+      - >-
+        python train.py
+        gliner-community/gliner_${vars.size}-v2.5
+        models/gliner_${vars.size}
+        --checkpoint-dir checkpoints/ckpt_gliner_${vars.size}
+        --push-to-hub ljvmiranda921/tl_gliner_${vars.size}
+        --num-steps ${vars.num_steps}
+        --batch-size ${vars.batch_size}
+        --dataset ljvmiranda921/tlunified-ner
+    outputs:
+      - models/gliner_${vars.size}
+      - checkpoints/ckpt_gliner_${vars.size}
+
+  - name: "eval-gliner"
+    help: "Evaluate trained GLiNER models on the TLUnified-NER and Universal NER test sets"
+    script:
+      # Each run writes to a distinct metrics file; otherwise the two
+      # Universal NER configs would overwrite each other's results.
+      # TLUnified-NER
+      - >-
+        python evaluate.py
+        metrics/model___tl_gliner_${vars.size}_dataset___ljvmiranda921-tlunified-ner.json
+        --model-name ljvmiranda921/tl_gliner_${vars.size}
+        --dataset ljvmiranda921/tlunified-ner
+        --label-map person::PER,location::LOC,organization::ORG
+      # Universal NER (tl_trg)
+      - >-
+        python evaluate.py
+        metrics/model___tl_gliner_${vars.size}_dataset___universalner-universal_ner-tl_trg.json
+        --model-name ljvmiranda921/tl_gliner_${vars.size}
+        --dataset universalner/universal_ner
+        --dataset-config tl_trg
+        --label-map person::PER,location::LOC,organization::ORG
+      # Universal NER (tl_ugnayan)
+      - >-
+        python evaluate.py
+        metrics/model___tl_gliner_${vars.size}_dataset___universalner-universal_ner-tl_ugnayan.json
+        --model-name ljvmiranda921/tl_gliner_${vars.size}
+        --dataset universalner/universal_ner
+        --dataset-config tl_ugnayan
+        --label-map person::PER,location::LOC,organization::ORG
diff --git a/models/v0.1.0-gliner/requirements.txt b/models/v0.1.0-gliner/requirements.txt
new file mode 100644
index 0000000..603c273
--- /dev/null
+++ b/models/v0.1.0-gliner/requirements.txt
@@ -0,0 +1,7 @@
+gliner==0.2.8
+accelerate
+spacy
+gliner-spacy
+huggingface_hub
+datasets
+conllu
\ No newline at end of file
diff --git a/models/v0.1.0-gliner/train.py b/models/v0.1.0-gliner/train.py
new file mode 100644
index 0000000..8ae46a7
--- /dev/null
+++ b/models/v0.1.0-gliner/train.py
@@ -0,0 +1,134 @@
+import os
+from pathlib import Path
+from typing import Optional
+
+import torch
+import typer
+from datasets import load_dataset
+from gliner import GLiNER
+from gliner.data_processing.collator import DataCollator
+from gliner.training import Trainer, TrainingArguments
+from wasabi import msg
+
+
+def main(
+    # fmt: off
+    base_model: str = typer.Argument(..., help="Base model used for training."),
+    output_dir: Path = typer.Argument(..., help="Path to store the output model."),
+    checkpoint_dir: Path = typer.Option(Path("checkpoints"), help="Path for storing checkpoints."),
+    push_to_hub: Optional[str] = typer.Option(None, help="If set, will upload the trained model to the provided HuggingFace model namespace."),
+    num_steps: int = typer.Option(500, help="Number of steps to run training."),
+    batch_size: int = typer.Option(8, help="Batch size used for training."),
+    dataset: str = typer.Option("ljvmiranda921/tlunified-ner", help="Path to the TLUnified-NER dataset."),
+    # fmt: on
+):
+
+    if push_to_hub:
+        api_token = os.getenv("HF_TOKEN")
+        if not api_token:
+            msg.fail("HF_TOKEN is missing! Won't be able to --push-to-hub", exits=1)
+
+    # Load and format the dataset
+    msg.info(f"Formatting the {dataset} dataset")
+    ds = load_dataset(dataset)
+
+    def format_to_gliner(example):
+        # TLUnified-NER's tag ids: odd ids are B- tags, even ids are I- tags.
+        id2label = {
+            1: "person",
+            2: "person",
+            3: "organization",
+            4: "organization",
+            5: "location",
+            6: "location",
+        }
+
+        tokens = example["tokens"]
+        ner_tags = example["ner_tags"]
+
+        ner = []
+        current_entity = None
+        for idx, tag in enumerate(ner_tags):
+            if tag in id2label:
+                if current_entity is None:
+                    current_entity = [idx, idx, id2label[tag]]
+                # Extend the current entity only on its I- tag (even ids in
+                # this scheme); a B- tag starts a new entity even when the
+                # type stays the same.
+                elif id2label[tag] == current_entity[2] and tag % 2 == 0:
+                    current_entity[1] = idx
+                else:
+                    ner.append(current_entity)
+                    current_entity = [idx, idx, id2label[tag]]
+            else:
+                if current_entity is not None:
+                    ner.append(current_entity)
+                    current_entity = None
+
+        if current_entity is not None:
+            ner.append(current_entity)
+
+        return {"tokenized_text": tokens, "ner": ner}
+
+    train_dataset = [format_to_gliner(eg) for eg in ds["train"].to_list()]
+    eval_dataset = [format_to_gliner(eg) for eg in ds["validation"].to_list()]
+
+    # Perform training
+    device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")
+    model = GLiNER.from_pretrained(base_model)
+
+    data_collator = DataCollator(
+        model.config,
+        data_processor=model.data_processor,
+        prepare_labels=True,
+    )
+    model.to(device)
+
+    data_size = len(train_dataset)
+    num_batches = data_size // batch_size
+    num_epochs = max(1, num_steps // num_batches)
+
+    msg.info(
+        f"Finetuning the {base_model} model, saving checkpoints to {checkpoint_dir}"
+    )
+
+    training_args = TrainingArguments(
+        output_dir=str(checkpoint_dir),
+        learning_rate=5e-6,
+        weight_decay=0.01,
+        others_lr=1e-5,
+        others_weight_decay=0.01,
+        lr_scheduler_type="linear",  # alternative: "cosine"
+        warmup_ratio=0.1,
+        per_device_train_batch_size=batch_size,
+        per_device_eval_batch_size=batch_size,
+        num_train_epochs=num_epochs,
+        evaluation_strategy="steps",
+        save_steps=num_steps * 2,
+        save_total_limit=10,
+        dataloader_num_workers=0,
+        use_cpu=False,
+        report_to="none",
+        load_best_model_at_end=True,
+    )
+
+    trainer = Trainer(
+        model=model,
+        args=training_args,
+        train_dataset=train_dataset,
+        eval_dataset=eval_dataset,
+        tokenizer=model.data_processor.transformer_tokenizer,
+        data_collator=data_collator,
+    )
+
+    trainer.train()
+    trainer.save_model(str(output_dir))
+    msg.good(f"Best model saved to {output_dir}")
+
+    if push_to_hub:
+        msg.info("Pushing model to the HuggingFace Hub")
+        model = GLiNER.from_pretrained(str(output_dir))
+        model.push_to_hub(push_to_hub, token=api_token)
+
+
+if __name__ == "__main__":
+    typer.run(main)
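+
+# Example invocation, mirroring the `finetune-gliner` command in project.yml
+# (the project defaults there are --num-steps 10000 and --batch-size 8):
+#
+#   python train.py gliner-community/gliner_small-v2.5 models/gliner_small \
+#       --checkpoint-dir checkpoints/ckpt_gliner_small \
+#       --num-steps 10000 --batch-size 8 \
+#       --dataset ljvmiranda921/tlunified-ner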