OSS Chat Completion Pipeline

FTaaS Pipeline component for chat completion

Version: 0.0.26

View in Studio: https://ml.azure.com/registries/azureml/components/oss_chat_completion_pipeline/version/0.0.26

Compute parameters

Name	Description	Type	Default	Optional
instance_type_data_import	Instance type to use for the data_import component when running on virtual cluster compute, e.g. Singularity.D8_v3. Takes effect only when compute_data_import is set to 'virtual cluster'.	string	Singularity.D8_v3	True
instance_type_finetune	Instance type to use for the finetune component when running on virtual cluster compute, e.g. Singularity.ND40_v2. Takes effect only when compute_finetune is set to 'virtual cluster'.	string	Singularity.ND40_v2	True
number_of_gpu_to_use_finetuning	Number of GPUs to use per node for finetuning. Should equal the number of GPUs per node in the compute SKU used for finetune.	integer	1	True

Continual-Finetuning model path

Name	Description	Type	Default	Optional	Enum
mlflow_model_path	MLflow model asset path. Special characters like \ and ' are invalid in the parameter value.	mlflow_model		True
pytorch_model_path	PyTorch model asset path. Special characters like \ and ' are invalid in the parameter value.	custom_model		True

Dataset path parameters

Name	Description	Type	Default	Optional	Enum
train_file_path	Path to the registered training data asset. The supported data formats are `jsonl`, `json`, `csv`, `tsv` and `parquet`. Special characters like \ and ' are invalid in the parameter value.	uri_file		False
validation_file_path	Path to the registered validation data asset. The supported data formats are `jsonl`, `json`, `csv`, `tsv` and `parquet`. Special characters like \ and ' are invalid in the parameter value.	uri_file		True
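
The two constraints stated for the dataset paths (no \ or ' in the value, and one of five supported data formats) can be checked locally before a job is submitted. The following is a minimal sketch, not part of the component: the helper names `has_invalid_chars` and `has_supported_format` are hypothetical, and reading the format from the file extension is an assumption that only holds when the value ends in a file name.

```python
from pathlib import PurePosixPath

# Characters the parameter descriptions call out as invalid.
INVALID_CHARS = ("\\", "'")
# Data formats listed for train_file_path and validation_file_path.
SUPPORTED_FORMATS = {"jsonl", "json", "csv", "tsv", "parquet"}


def has_invalid_chars(value: str) -> bool:
    """Return True if the value contains a backslash or a single quote."""
    return any(ch in value for ch in INVALID_CHARS)


def has_supported_format(path: str) -> bool:
    """Return True if the file extension is one of the supported formats.

    Assumption: the format can be read from the extension of the last
    path segment; the component itself may detect the format differently.
    """
    suffix = PurePosixPath(path).suffix.lower().lstrip(".")
    return suffix in SUPPORTED_FORMATS


print(has_invalid_chars("data/train.jsonl"))      # False
print(has_invalid_chars("data\\train.jsonl"))     # True
print(has_supported_format("data/train.parquet")) # True
print(has_supported_format("data/train.txt"))     # False
```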

Finetuning parameters

Training parameters

Name	Description	Type	Default	Optional
max_seq_length	Maximum sequence length. Default is 8192.	integer	8192	True
num_train_epochs	Number of training epochs.	integer	1	True
per_device_train_batch_size	Training batch size per device.	integer	1	True
learning_rate	Initial learning rate.	number	0.0003	True
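
Because every training parameter is optional, a caller only needs to pass the values that differ from the defaults. As an illustration, the sketch below keeps the defaults from the table in a dictionary and merges overrides onto them. The helper `training_params` is hypothetical and is not provided by the component.

```python
# Defaults copied from the training parameters table.
TRAINING_DEFAULTS = {
    "max_seq_length": 8192,
    "num_train_epochs": 1,
    "per_device_train_batch_size": 1,
    "learning_rate": 0.0003,
}


def training_params(**overrides):
    """Merge caller overrides onto the documented defaults.

    Raises ValueError for names that are not in the table, which
    catches typos such as 'num_epochs'.
    """
    unknown = set(overrides) - set(TRAINING_DEFAULTS)
    if unknown:
        raise ValueError(f"Unknown training parameters: {sorted(unknown)}")
    return {**TRAINING_DEFAULTS, **overrides}


# Override two values; the other two keep their defaults.
params = training_params(num_train_epochs=3, learning_rate=1e-4)
print(params)
```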

Validation parameters

Name	Description	Type	Default	Optional	Enum
system_properties	Validation parameters propagated from pipeline.	string		True

Compute parameters

Name	Description	Type	Default	Optional	Enum
compute_data_import	Compute to use for data_import, e.g. provide 'FT-Cluster' if your compute is named 'FT-Cluster'. Special characters like \ and ' are invalid in the parameter value. If a compute cluster name is provided, the instance_type field is ignored and the named cluster is used.	string	virtual cluster	True
compute_finetune	Compute to use for finetune, e.g. provide 'FT-Cluster' if your compute is named 'FT-Cluster'. Special characters like \ and ' are invalid in the parameter value. If a compute cluster name is provided, the instance_type field is ignored and the named cluster is used.	string	virtual cluster	True
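
The interaction between the compute_* and instance_type_* parameters can be summarized as a small decision rule: the instance type applies only when the compute value is 'virtual cluster', and it is ignored when a cluster name is given. The sketch below restates that rule under the behavior described in the tables; `resolve_compute` is a hypothetical helper and does not reflect how the component is implemented internally.

```python
VIRTUAL_CLUSTER = "virtual cluster"  # default value of compute_data_import and compute_finetune


def resolve_compute(compute: str, instance_type: str) -> dict:
    """Describe which settings take effect for one pipeline step."""
    if compute == VIRTUAL_CLUSTER:
        # Virtual cluster compute: the instance type selects the hardware.
        return {"compute": VIRTUAL_CLUSTER, "instance_type": instance_type}
    # Named compute cluster: the instance type is ignored.
    return {"compute": compute, "instance_type": None}


print(resolve_compute("virtual cluster", "Singularity.ND40_v2"))
# {'compute': 'virtual cluster', 'instance_type': 'Singularity.ND40_v2'}
print(resolve_compute("FT-Cluster", "Singularity.ND40_v2"))
# {'compute': 'FT-Cluster', 'instance_type': None}
```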

Model parameters

Name	Description	Type	Default	Optional	Enum
model_asset_id	Asset ID of the model.	string		False

Model registration

Name	Description	Type	Default	Optional	Enum
registered_model_name	Name of the registered model	string		True
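
Across the input tables, only two parameters are marked as not optional: train_file_path and model_asset_id. A quick pre-submission check for them might look like the sketch below. The helper `missing_required` and the placeholder values are hypothetical and stand in for real asset references.

```python
# Inputs whose Optional column is False in the tables.
REQUIRED_INPUTS = ("train_file_path", "model_asset_id")


def missing_required(inputs: dict) -> list:
    """Return the required input names that are absent or empty."""
    return [name for name in REQUIRED_INPUTS if not inputs.get(name)]


# Placeholder values for illustration only.
inputs = {
    "train_file_path": "<path-to-training-data-asset>",
    "registered_model_name": "my-finetuned-model",
}
print(missing_required(inputs))  # ['model_asset_id']
```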

Outputs

Name	Description	Type
output_model	Output directory to save the finetuned LoRA weights.	uri_folder