From 40aa09a1a9724eed323848f52ccda542a332f1aa Mon Sep 17 00:00:00 2001
From: Martin Gleize
Date: Tue, 21 Jan 2025 14:04:40 +0100
Subject: [PATCH 1/2] Update doc on vLLM support

---
 .../tutorials/end_to_end_fine_tuning.rst      | 26 ++++++++-----------
 1 file changed, 11 insertions(+), 15 deletions(-)

diff --git a/doc/source/tutorials/end_to_end_fine_tuning.rst b/doc/source/tutorials/end_to_end_fine_tuning.rst
index c4d2d3abc..74e2141bc 100644
--- a/doc/source/tutorials/end_to_end_fine_tuning.rst
+++ b/doc/source/tutorials/end_to_end_fine_tuning.rst
@@ -281,41 +281,37 @@ VLLM Support
 ^^^^^^^^^^^^
 
-To accelerate the inference process, we can convert fairseq2 checkpoints to HuggingFace checkpoints, which can be deployed with VLLM. This takes 2 steps:
+To accelerate the inference process, we can deploy fairseq2 checkpoints with VLLM. This takes 2 steps:
 
-**Step 1: Convert fairseq2 checkpoint to XLFormer checkpoint**
+**Step 1: Generate the Hugging Face config.json file**
 
-The first step is to use the fairseq2 command-line (:ref:`basics-cli`) tool to convert the fairseq2 checkpoint to an XLF checkpoint. The command structure is as follows:
+The first step is to use the fairseq2 command-line (:ref:`basics-cli`) tool to generate the ``config.json`` file that is part of the Hugging Face model format, which vLLM expects. The command structure is as follows:
 
 .. code-block:: bash
 
-    fairseq2 llama convert_checkpoint --model <architecture> <fairseq2_checkpoint_dir> <output_dir>
+    fairseq2 llama write_hf_config --model <architecture> <fairseq2_checkpoint_dir>
 
 * ``<architecture>``: Specify the architecture of the model -- `e.g.`, ``llama3`` (see :mod:`fairseq2.models.llama`)
 
-* ``<fairseq2_checkpoint_dir>``: Path to the directory containing the Fairseq2 checkpoint
-
-* ``<output_dir>``: Path where the XLF checkpoint will be saved
+* ``<fairseq2_checkpoint_dir>``: Path to the directory containing your Fairseq2 checkpoint, where ``config.json`` will be added.
 
 .. note::
 
-    Architecture ``--arch`` must exist and be defined in `e.g.` :meth:`fairseq2.models.llama.archs.register_archs`.
-
-
-**Step 2: Convert XLFormer checkpoint to HF checkpoint**
-
-After obtaining the XLFormer checkpoint, the next step is to convert it to the Hugging Face format. Please refer to the official `HF script`_.
+    Architecture ``--model`` must exist and be defined in `e.g.` :meth:`fairseq2.models.llama.archs.register_archs`.
 
-**Step 3: Deploy with VLLM**
+**Step 2: Deploy with VLLM**
 
 .. code-block:: python
 
     from vllm import LLM
 
-    llm = LLM(model=<path_to_your_model>)  # path of your model
+    llm = LLM(
+        model=<path_to_your_model>,  # path of your model
+        tokenizer=<path_to_your_tokenizer>,  # path of your tokenizer files
+    )
     output = llm.generate("Hello, my name is")
     print(output)

From 7b54e97ac08ceb40af23ad08d306d556108a63b8 Mon Sep 17 00:00:00 2001
From: Martin Gleize
Date: Tue, 21 Jan 2025 14:51:56 +0100
Subject: [PATCH 2/2] Fix path

---
 doc/source/tutorials/end_to_end_fine_tuning.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/source/tutorials/end_to_end_fine_tuning.rst b/doc/source/tutorials/end_to_end_fine_tuning.rst
index 74e2141bc..f48e14373 100644
--- a/doc/source/tutorials/end_to_end_fine_tuning.rst
+++ b/doc/source/tutorials/end_to_end_fine_tuning.rst
@@ -299,7 +299,7 @@ The first step is to use the fairseq2 command-line (:ref:`basics-cli`) tool to g
 
 .. note::
 
-    Architecture ``--model`` must exist and be defined in `e.g.` :meth:`fairseq2.models.llama.archs.register_archs`.
+    Architecture ``--model`` must exist and be defined in `e.g.` :meth:`fairseq2.models.llama._config.register_llama_configs`.
 
 **Step 2: Deploy with VLLM**
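
For readers who want to try the updated flow end to end, here is a minimal sketch of the two steps described in the patched section. It is not part of the patch itself: the checkpoint directory, tokenizer path, and the example architecture name ``llama3`` are placeholder assumptions; only the ``fairseq2 llama write_hf_config`` command and the ``vllm.LLM`` usage are taken from the diff above.

.. code-block:: python

    # Illustrative sketch only -- paths and the architecture name are hypothetical.
    import subprocess

    from vllm import LLM, SamplingParams

    checkpoint_dir = "/path/to/fairseq2_checkpoint_dir"  # hypothetical checkpoint directory
    tokenizer_path = "/path/to/tokenizer"                # hypothetical tokenizer files

    # Step 1: write the Hugging Face config.json into the checkpoint directory,
    # mirroring `fairseq2 llama write_hf_config --model <architecture> <fairseq2_checkpoint_dir>`.
    subprocess.run(
        ["fairseq2", "llama", "write_hf_config", "--model", "llama3", checkpoint_dir],
        check=True,
    )

    # Step 2: load the checkpoint with vLLM and run a quick generation.
    llm = LLM(model=checkpoint_dir, tokenizer=tokenizer_path)
    sampling = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

    for output in llm.generate(["Hello, my name is"], sampling):
        print(output.prompt, output.outputs[0].text)

The separate ``tokenizer=`` argument mirrors the updated example in the diff, which passes the tokenizer files explicitly instead of relying on the model directory alone.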