
Fix/snippets/dynamism/loop emitters #227

Closed
Changes from all commits
18 commits
d84c2b5
Update GenAI NPU doc (#26060)
TolyaTalamanov Aug 21, 2024
fdde9f1
Temporarily remove TF layer tests from required on ARM (#26158)
akladiev Aug 21, 2024
54f58b8
[Snippets][CPU] Enabled dynamic MHA FP32 tokenization on x64 (#25500)
a-sidorova Aug 21, 2024
f791099
[CPU] Fix debug assertion in oneDNN Matmul Brgemm kernel (#26087)
dmitry-gorokhov Aug 21, 2024
a07e2bc
[OV JS] Remove extra assets from tests (#25712)
almilosz Aug 21, 2024
3afad8d
[CPU][ARM] Upgrade ACL to 24.08 (#26137)
alvoron Aug 21, 2024
6f899d2
Move INFERENCE_PRECISION_HINT test to optional conformance for meta-p…
Aug 21, 2024
407f0bc
[CI] [GHA] Do not checkout latest oneDNN on U22 in nightly (#26150)
akashchi Aug 21, 2024
9d09d0a
[GPU] Fix segfault in layer tests for onnx_tests.test_lstm.TestLSTM (…
andrew-k-park Aug 21, 2024
9ef7e23
[NPU] Fix ze_loader dependency (#26157)
ge0rdi Aug 21, 2024
50ffcbc
Fixed pattern for patching TBB config files (#26167)
ilya-lavrenov Aug 21, 2024
28950f6
[Common FE] Document get_input_by_reference better (#26165)
rkazants Aug 21, 2024
711f060
Revert "Temporarily remove TF layer tests from required on ARM" (#26169)
rkazants Aug 21, 2024
1335be0
[GPU] Enable fc 4d for MatMul (#24642)
steve-y Aug 21, 2024
abbf944
[TF FE] Stabilize L2Loss layer tests on all platforms (#26151)
rkazants Aug 21, 2024
9472f7b
[Snippets] Fixed uniqie_buffer_reg_group_count in MHATokenization pass
a-sidorova Aug 21, 2024
2ac7c04
[Snippets][CPU] Introduced jit_aux_gpr_holder for jit_loop_emitters
a-sidorova Aug 22, 2024
e66bbb6
[Snippets] Added new test smoke_Snippets_DynMHA_4D_WithMul
a-sidorova Aug 22, 2024
2 changes: 1 addition & 1 deletion .github/workflows/job_build_linux.yml
Original file line number Diff line number Diff line change
@@ -69,7 +69,7 @@ jobs:

# Ticket: 139627
- name: Checkout the latest OneDNN for GPU in nightly
if: ${{ inputs.event-name == 'schedule' }}
if: ${{ inputs.event-name == 'schedule' && inputs.os == 'ubuntu_20_04' }} # GPU tests are enabled only on U20
working-directory: ${{ env.OPENVINO_REPO }}/src/plugins/intel_gpu/thirdparty/onednn_gpu
run: |
git fetch origin
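The hunk above only shows the changed line in context; assembled, the guarded step reads roughly as follows (reconstructed from the visible context lines — the commands after `git fetch origin` fall outside the hunk and are omitted):

```yaml
# Ticket: 139627
- name: Checkout the latest OneDNN for GPU in nightly
  # Run only for scheduled (nightly) builds and only on Ubuntu 20.04,
  # since GPU tests are enabled only on U20.
  if: ${{ inputs.event-name == 'schedule' && inputs.os == 'ubuntu_20_04' }}
  working-directory: ${{ env.OPENVINO_REPO }}/src/plugins/intel_gpu/thirdparty/onednn_gpu
  run: |
    git fetch origin
```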
@@ -8,60 +8,56 @@ This guide will give you extra details on how to utilize NPU with the GenAI flavor
:doc:`See the installation guide <../../get-started/install-openvino/install-openvino-genai>`
for information on how to start.

Export an LLM model via Hugging Face Optimum-Intel
##################################################
Prerequisites
#############

1. Create a python virtual environment and install the correct components for exporting a model:
Install required dependencies:

.. code-block:: console
.. code-block:: console

python -m venv export-npu-env
export-npu-env\Scripts\activate
pip install transformers>=4.42.4 openvino==2024.2.0 openvino-tokenizers==2024.2.0 nncf==2.11.0 onnx==1.16.1 optimum-intel@git+https://github.com/huggingface/optimum-intel.git
python -m venv npu-env
npu-env\Scripts\activate
pip install optimum-intel nncf==2.11 onnx==1.16.1
pip install --pre openvino openvino-tokenizers openvino-genai --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/nightly

Export an LLM model via Hugging Face Optimum-Intel
##################################################

2. A chat-tuned TinyLlama model is used in this example. The following conversion & optimization settings are recommended when using the NPU:
A chat-tuned TinyLlama model is used in this example. The following conversion & optimization settings are recommended when using the NPU:

.. code-block:: python
.. code-block:: python

optimum-cli export openvino -m TinyLlama/TinyLlama-1.1B-Chat-v1.0 --weight-format int4 --sym --group-size 128 --ratio 1.0 TinyLlama
optimum-cli export openvino -m TinyLlama/TinyLlama-1.1B-Chat-v1.0 --weight-format int4 --sym --group-size 128 --ratio 1.0 TinyLlama

Run generation using OpenVINO GenAI
##########################################

1. Create a python virtual environment and install the correct components for running the model on the NPU via OpenVINO GenAI:

.. code-block:: console

python -m venv run-npu-env
run-npu-env\Scripts\activate
pip install openvino>=2024.3.1 openvino-tokenizers>=2024.3.1 openvino-genai>=2024.3.1
###################################

2. Perform generation using the new GenAI API
Use the following code snippet to perform generation with OpenVINO GenAI API:

.. tab-set::
.. tab-set::

.. tab-item:: Python
:sync: py
.. tab-item:: Python
:sync: py

.. code-block:: python
.. code-block:: python

import openvino_genai as ov_genai
pipe = ov_genai.LLMPipeline(model_path, "NPU")
print(pipe.generate("The Sun is yellow because", max_new_tokens=100))
import openvino_genai as ov_genai
pipe = ov_genai.LLMPipeline(model_path, "NPU")
print(pipe.generate("The Sun is yellow because", max_new_tokens=100))

.. tab-item:: C++
:sync: cpp
.. tab-item:: C++
:sync: cpp

.. code-block:: cpp
.. code-block:: cpp

#include "openvino/genai/llm_pipeline.hpp"
#include <iostream>
#include "openvino/genai/llm_pipeline.hpp"
#include <iostream>

int main(int argc, char* argv[]) {
std::string model_path = argv[1];
ov::genai::LLMPipeline pipe(model_path, "NPU");
std::cout << pipe.generate("The Sun is yellow because", ov::genai::max_new_tokens(100));
}
int main(int argc, char* argv[]) {
std::string model_path = argv[1];
ov::genai::LLMPipeline pipe(model_path, "NPU");
std::cout << pipe.generate("The Sun is yellow because", ov::genai::max_new_tokens(100));
}

Additional configuration options
################################
1 change: 1 addition & 0 deletions src/bindings/js/node/.gitignore
@@ -3,6 +3,7 @@ dist
build
types
ov_runtime
tests/unit/test_models


*.exp
3 changes: 2 additions & 1 deletion src/bindings/js/node/package.json
@@ -18,7 +18,8 @@
"build": "npm run tsc",
"prepare": "npm run build",
"lint": "eslint .",
"test": "node --test ./tests/unit/*.test.js",
"test_setup": "node ./tests/unit/setup.js",
"test": "npm run test_setup && node --test ./tests/unit/*.test.js",
"test:e2e": "mocha ./tests/e2e/electron-app.test.js",
"tsc": "tsc",
"postinstall": "npm run install_runtime",
18 changes: 9 additions & 9 deletions src/bindings/js/node/scripts/download_runtime.js
@@ -61,7 +61,7 @@ class RuntimeExistsError extends Error {
async function downloadRuntime(destinationPath, config = { force: false, ignoreIfExists: true, proxy: null }) {
const { version } = packageJson;
const osInfo = await getOsInfo();
const isRuntimeDirectoryExists = await checkIfDirectoryExists(destinationPath);
const isRuntimeDirectoryExists = await checkIfPathExists(destinationPath);

if (isRuntimeDirectoryExists && !config.force) {
if (config.ignoreIfExists) {
@@ -87,7 +87,7 @@ async function downloadRuntime(destinationPath, config = { force: false, ignoreIfExists: true, proxy: null }) {
await fs.mkdir(tempDirectoryPath);

console.log('Downloading OpenVINO runtime archive...');
await downloadFile(runtimeArchiveUrl, filename, tempDirectoryPath, config.proxy);
await downloadFile(runtimeArchiveUrl, tempDirectoryPath, filename, config.proxy);
console.log('OpenVINO runtime archive downloaded.');

await removeDirectory(destinationPath);
@@ -139,16 +139,16 @@ async function getOsInfo() {
}

/**
* Check if directory exists.
* Check if path exists.
*
* @async
* @function checkIfDirectoryExists
* @param {string} directoryPath - The directory path.
* @function checkIfPathExists
* @param {string} path - The path to directory or file.
* @returns {Promise<boolean>}
*/
async function checkIfDirectoryExists(directoryPath) {
async function checkIfPathExists(path) {
try {
await fs.access(directoryPath);
await fs.access(path);
return true;
} catch (error) {
if (error.code === codeENOENT) {
@@ -210,7 +210,7 @@ async function removeDirectory(path) {
* @param {string} [proxy=null] - (Optional) The proxy URL.
* @returns {Promise<void>}
*/
function downloadFile(url, filename, destination, proxy = null) {
function downloadFile(url, destination, filename, proxy = null) {
const timeout = 5000;
const fullPath = path.resolve(destination, filename);
const file = createWriteStream(fullPath);
@@ -281,4 +281,4 @@ function unarchive(tarFilePath, dest) {
});
}

module.exports = { downloadRuntime };
module.exports = { downloadRuntime, downloadFile, checkIfPathExists };