# microsoft-beit-base-patch16-224-pt22k-ft22k
Description: BEiT is a vision transformer that is pre-trained in a BERT-like, self-supervised manner on a large collection of images. It splits each image into fixed-size patches and processes them as a sequence. Unlike the original ViT models, BEiT uses relative position embeddings, and it classifies images by mean-pooling the final hidden states of the patches rather than relying on the [CLS] token. The pre-trained encoder can also be used to extract image features for downstream tasks: place a linear layer on top of the [CLS] token, or mean-pool the final hidden states of the patch embeddings, depending on the specifics of your task.

> The above summary was generated using ChatGPT. Review the original model card to understand the data used to train the model, evaluation metrics, license, intended uses, limitations and bias before using the model.

### Inference samples

|Inference type|Python sample (Notebook)|CLI with YAML|
|--|--|--|
|Real time|image-classification-online-endpoint.ipynb|image-classification-online-endpoint.sh|
|Batch|image-classification-batch-endpoint.ipynb|image-classification-batch-endpoint.sh|

### Finetuning samples

|Task|Use case|Dataset|Python sample (Notebook)|CLI with YAML|
|---|--|--|--|--|
|Image Multi-class classification|Image Multi-class classification|fridgeObjects|fridgeobjects-multiclass-classification.ipynb|fridgeobjects-multiclass-classification.sh|
|Image Multi-label classification|Image Multi-label classification|multilabel fridgeObjects|fridgeobjects-multilabel-classification.ipynb|fridgeobjects-multilabel-classification.sh|

### Model Evaluation

|Task|Use case|Dataset|Python sample (Notebook)|
|---|--|--|--|
|Image Multi-class classification|Image Multi-class classification|fridgeObjects|image-multiclass-classification.ipynb|
|Image Multi-label classification|Image Multi-label classification|multilabel fridgeObjects|image-multilabel-classification.ipynb|

### Sample inputs and outputs (for real-time inference)

#### Sample input

    {
      "input_data": {
        "columns": ["image"],
        "index": [0, 1],
        "data": ["image1", "image2"]
      }
    }

Note: the "image1" and "image2" strings should be base64-encoded images or publicly accessible URLs.
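The request body above can be assembled programmatically. The following is a minimal sketch, assuming the images are available either as raw bytes or as URL strings; the helper name `build_request`, the placeholder bytes, and the `example.com` URL are hypothetical and only serve the illustration.

```python
import base64
import json


def build_request(images):
    """Build a request body in the sample-input shape.

    `images` is a list whose items are either raw image bytes (sent as
    base64 text) or strings (assumed to already be a URL or base64 text).
    """
    data = []
    for item in images:
        if isinstance(item, bytes):
            # Encode raw bytes as base64 text so they fit in JSON.
            data.append(base64.b64encode(item).decode("utf-8"))
        else:
            data.append(item)
    return {
        "input_data": {
            "columns": ["image"],
            "index": list(range(len(data))),
            "data": data,
        }
    }


# Placeholder bytes stand in for the contents of a real image file.
payload = build_request([b"fake-image-bytes", "https://example.com/carton.jpg"])
body = json.dumps(payload)
```

The resulting `body` string is what would be sent to the endpoint; how it is sent (SDK call, CLI, or plain HTTP) is covered by the inference samples listed earlier.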
#### Sample output

    [
      {
        "probs": [0.91, 0.09],
        "labels": ["can", "carton"]
      },
      {
        "probs": [0.1, 0.9],
        "labels": ["can", "carton"]
      }
    ]
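Each entry in the output pairs a list of probabilities with a list of labels in the same order. A short sketch of reading the most likely label per image, assuming the response arrives as JSON text in exactly the shape shown above (the helper name `top_predictions` is hypothetical):

```python
import json

# The sample output from above, as it might arrive in a response body.
response_text = """
[
  {"probs": [0.91, 0.09], "labels": ["can", "carton"]},
  {"probs": [0.1, 0.9], "labels": ["can", "carton"]}
]
"""


def top_predictions(text):
    """Return one (label, probability) pair per image, highest probability first."""
    results = []
    for entry in json.loads(text):
        # Pair each probability with its label and keep the largest.
        prob, label = max(zip(entry["probs"], entry["labels"]))
        results.append((label, prob))
    return results


predictions = top_predictions(response_text)
for i, (label, prob) in enumerate(predictions):
    print(f"image {i}: {label} ({prob:.2f})")
```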
#### Model inference - visualization for a sample image
Version: 6
Preview
license : apache-2.0
model_specific_defaults : {apply_deepspeed: false, apply_ort: false}
task : image-classification
View in Studio: https://ml.azure.com/registries/azureml/models/microsoft-beit-base-patch16-224-pt22k-ft22k/version/6
License: apache-2.0
SHA: 9da301148150e37e533abef672062fa49f6bda4f
datasets: imagenet, imagenet-21k
evaluation-min-sku-spec: 4|1|28|176
evaluation-recommended-sku: Standard_NC6s_v3
finetune-min-sku-spec: 4|1|28|176
finetune-recommended-sku: Standard_NC6s_v3
finetuning-tasks: image-classification
inference-min-sku-spec: 2|0|14|28
inference-recommended-sku: Standard_DS3_v2
model_id: microsoft/beit-base-patch16-224-pt22k-ft22k
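The model identifier also hints at how an image is split up. As a rough sketch, reading `patch16-224` as a 16-pixel patch size and a 224x224 input resolution (an interpretation of the naming, not something this page states explicitly), the number of patches the encoder sees can be worked out as follows:

```python
import re

model_id = "microsoft/beit-base-patch16-224-pt22k-ft22k"

# Assumption: "patch<P>-<R>" encodes patch size P and input resolution R.
match = re.search(r"patch(\d+)-(\d+)", model_id)
patch_size = int(match.group(1))
image_size = int(match.group(2))

# A square image of side R is cut into (R / P) x (R / P) patches.
patches_per_side = image_size // patch_size
num_patches = patches_per_side ** 2

print(f"{patches_per_side} x {patches_per_side} = {num_patches} patches")
```

Under that reading, mean-pooling for classification averages over these patch positions, while the [CLS] token is one extra position in the sequence.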