
# bert-large-cased

## Overview

BERT is a transformers model introduced by Google and pretrained on a large corpus of English data using a masked language modeling (MLM) objective. Its primary uses are fine-tuning on downstream tasks such as sequence classification and question answering; it is not intended for text generation. Note that this particular BERT model is cased: it makes a distinction between 'english' and 'English'. It has 24 layers, a hidden size of 1024, 16 attention heads, and 336M parameters. It is most effective on tasks that use the entire sentence (potentially masked) to make decisions, whereas for tasks such as text generation you should instead use a model like GPT-2.

Please note: this model accepts masks in `[MASK]` format. See the sample input below for reference.

The above summary was generated using ChatGPT. Review the original model card to understand the data used to train the model, evaluation metrics, license, intended uses, limitations and bias before using the model.
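Before moving to the Azure ML samples below, the fill-mask task itself can be sketched locally. The following is a minimal illustration using the Hugging Face `transformers` pipeline; it is not one of the linked samples.

```python
# Minimal local fill-mask sketch with the Hugging Face transformers pipeline.
# Illustrative only; the linked notebooks are the supported samples.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-large-cased")

# Each prediction carries the predicted token and its score;
# the top prediction here is typically "capital".
for prediction in fill_mask("Paris is the [MASK] of France."):
    print(prediction["token_str"], round(prediction["score"], 4))
```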

## Inference samples

| Inference type | Python sample (Notebook) | CLI with YAML |
| --- | --- | --- |
| Real time | fill-mask-online-endpoint.ipynb | fill-mask-online-endpoint.sh |
| Batch | fill-mask-batch-endpoint.ipynb | coming soon |

## Finetuning samples

A rough sketch of what the text-classification fine-tuning involves follows the table.

| Task | Use case | Dataset | Python sample (Notebook) | CLI with YAML |
| --- | --- | --- | --- | --- |
| Text Classification | Emotion Detection | Emotion | emotion-detection.ipynb | emotion-detection.sh |
| Token Classification | Named Entity Recognition | CoNLL-2003 | named-entity-recognition.ipynb | named-entity-recognition.sh |
| Question Answering | Extractive Q&A | SQuAD (Wikipedia) | extractive-qa.ipynb | extractive-qa.sh |
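The sketch below shows a standard `transformers` fine-tuning loop for the emotion-detection use case. Hyperparameters, the `emotion` dataset id, and column names are illustrative assumptions; the linked emotion-detection.ipynb notebook is the authoritative sample.

```python
# Rough fine-tuning sketch for text classification (emotion detection).
# Settings here are assumptions, not the notebook's actual configuration.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

dataset = load_dataset("emotion")  # six emotion labels
tokenizer = AutoTokenizer.from_pretrained("bert-large-cased")

def tokenize(batch):
    # Fixed-length padding keeps the default collator happy in this sketch.
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=128)

dataset = dataset.map(tokenize, batched=True)
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-large-cased", num_labels=6)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="bert-large-cased-emotion",
                           per_device_train_batch_size=16,
                           learning_rate=2e-5,
                           num_train_epochs=3),
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
)
trainer.train()
```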

## Model Evaluation

| Task | Use case | Dataset | Python sample (Notebook) | CLI with YAML |
| --- | --- | --- | --- | --- |
| Fill Mask | Fill Mask | rcds/wikipedia-for-mask-filling | evaluate-model-fill-mask.ipynb | |
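As a rough illustration of what such an evaluation measures, the sketch below computes top-1 accuracy over a toy set of hand-picked masked sentences. The linked notebook runs against rcds/wikipedia-for-mask-filling; that dataset's loading details are omitted here.

```python
# Toy fill-mask evaluation: top-1 accuracy on hand-picked examples.
# The real sample evaluates on rcds/wikipedia-for-mask-filling.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-large-cased")

examples = [
    ("Paris is the [MASK] of France.", "capital"),
    ("The sun rises in the [MASK].", "east"),
]

# fill_mask returns predictions sorted by score; index 0 is the top guess.
correct = sum(fill_mask(text)[0]["token_str"] == answer
              for text, answer in examples)
print(f"top-1 accuracy: {correct / len(examples):.2f}")
```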

## Sample inputs and outputs (for real-time inference)

### Sample input

```json
{
    "inputs": {
        "input_string": ["Paris is the [MASK] of France.", "Today is a [MASK] day!"]
    }
}
```

### Sample output

```json
[
    {
        "0": "capital"
    },
    {
        "0": "beautiful"
    }
]
```
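A deployed real-time endpoint can be invoked with a plain HTTPS POST of the sample payload. This is a sketch, not the linked sample: the scoring URI and key below are placeholders you would copy from your own deployment.

```python
# Hypothetical invocation of a deployed online endpoint. The URI and key
# are placeholders; take the real values from your endpoint's details.
import requests

scoring_uri = "https://<your-endpoint>.<region>.inference.ml.azure.com/score"
api_key = "<your-endpoint-key>"

payload = {
    "inputs": {
        "input_string": ["Paris is the [MASK] of France.", "Today is a [MASK] day!"]
    }
}
headers = {"Content-Type": "application/json",
           "Authorization": f"Bearer {api_key}"}

response = requests.post(scoring_uri, json=payload, headers=headers)
print(response.json())  # expected shape: [{"0": "capital"}, {"0": "beautiful"}]
```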

Version: 12

## Tags

Preview

computes_allow_list: ['Standard_NV12s_v3', 'Standard_NV24s_v3', 'Standard_NV48s_v3', 'Standard_NC6s_v3', 'Standard_NC12s_v3', 'Standard_NC24s_v3', 'Standard_NC24rs_v3', 'Standard_NC6s_v2', 'Standard_NC12s_v2', 'Standard_NC24s_v2', 'Standard_NC24rs_v2', 'Standard_NC4as_T4_v3', 'Standard_NC8as_T4_v3', 'Standard_NC16as_T4_v3', 'Standard_NC64as_T4_v3', 'Standard_ND6s', 'Standard_ND12s', 'Standard_ND24s', 'Standard_ND24rs', 'Standard_ND40rs_v2', 'Standard_ND96asr_v4']

license: apache-2.0

model_specific_defaults: ordereddict({'apply_deepspeed': 'true', 'apply_lora': 'true', 'apply_ort': 'true'})

task: fill-mask

View in Studio: https://ml.azure.com/registries/azureml/models/bert-large-cased/version/12

License: apache-2.0

## Properties

SHA: d9238236d8326ce4bc117132bb3b7e62e95f3a9a

datasets: bookcorpus, wikipedia

evaluation-min-sku-spec: 8|0|28|56 (vCPUs | GPUs | RAM in GiB | storage in GiB; the same format applies to the other min-sku-spec properties below)

evaluation-recommended-sku: Standard_DS4_v2

finetune-min-sku-spec: 4|1|28|176

finetune-recommended-sku: Standard_NC24rs_v3

finetuning-tasks: text-classification, token-classification, question-answering

inference-min-sku-spec: 4|0|14|28

inference-recommended-sku: Standard_DS3_v2, Standard_D4a_v4, Standard_D4as_v4, Standard_DS4_v2, Standard_D8a_v4, Standard_D8as_v4, Standard_DS5_v2, Standard_D16a_v4, Standard_D16as_v4, Standard_D32a_v4, Standard_D32as_v4, Standard_D48a_v4, Standard_D48as_v4, Standard_D64a_v4, Standard_D64as_v4, Standard_D96a_v4, Standard_D96as_v4, Standard_FX4mds, Standard_F8s_v2, Standard_FX12mds, Standard_F16s_v2, Standard_F32s_v2, Standard_F48s_v2, Standard_F64s_v2, Standard_F72s_v2, Standard_FX24mds, Standard_FX36mds, Standard_FX48mds, Standard_E4s_v3, Standard_E8s_v3, Standard_E16s_v3, Standard_E32s_v3, Standard_E48s_v3, Standard_E64s_v3, Standard_NC4as_T4_v3, Standard_NC6s_v3, Standard_NC8as_T4_v3, Standard_NC12s_v3, Standard_NC16as_T4_v3, Standard_NC24s_v3, Standard_NC64as_T4_v3, Standard_NC24ads_A100_v4, Standard_NC48ads_A100_v4, Standard_NC96ads_A100_v4, Standard_ND96asr_v4, Standard_ND96amsr_A100_v4, Standard_ND40rs_v2

languages: en
