diff --git a/README.md b/README.md index 6a045cdaaf..21322a8cb9 100755 --- a/README.md +++ b/README.md @@ -45,8 +45,8 @@ The software acceleration is achieved with vector instructions, AI hardware-spec With Intel(R) Extension for Scikit-learn, you can: -* Speed up training and inference by up to 100x with the equivalent mathematical accuracy -* Benefit from performance improvements across different Intel(R) hardware configurations +* Speed up training and inference by up to 100x with equivalent mathematical accuracy +* Benefit from performance improvements across different Intel(R) hardware configurations, including GPUs and multi-GPU configurations * Integrate the extension into your existing Scikit-learn applications without code modifications * Continue to use the open-source scikit-learn API * Enable and disable the extension with a couple of lines of code or at the command line @@ -71,12 +71,14 @@ Intel(R) Extension for Scikit-learn is also a part of [Intel(R) AI Tools](https: from sklearn.cluster import DBSCAN X = np.array([[1., 2.], [2., 2.], [2., 3.], - [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32) + [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32) clustering = DBSCAN(eps=3, min_samples=2).fit(X) ``` - **Enable Intel(R) GPU optimizations** + _Note: executing on GPU has [additional system software requirements](https://www.intel.com/content/www/us/en/developer/articles/system-requirements/intel-oneapi-dpcpp-system-requirements.html) - see [details](https://uxlfoundation.github.io/scikit-learn-intelex/latest/oneapi-gpu.html)._ + ```py import numpy as np import dpctl @@ -86,7 +88,7 @@ Intel(R) Extension for Scikit-learn is also a part of [Intel(R) AI Tools](https: from sklearn.cluster import DBSCAN X = np.array([[1., 2.], [2., 2.], [2., 3.], - [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32) + [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32) with config_context(target_offload="gpu:0"): clustering = DBSCAN(eps=3, min_samples=2).fit(X) ``` diff --git a/doc/sources/algorithms.rst b/doc/sources/algorithms.rst index 611e157156..8a5173de30 100755 --- a/doc/sources/algorithms.rst +++ b/doc/sources/algorithms.rst @@ -12,13 +12,14 @@ .. See the License for the specific language governing permissions and .. limitations under the License. +.. include:: substitutions.rst .. _sklearn_algorithms: #################### Supported Algorithms #################### -Applying |intelex| impacts the following scikit-learn algorithms: +Applying |intelex| impacts the following |sklearn| estimators: on CPU ------ @@ -380,6 +381,8 @@ Other Tasks - All parameters are supported - Only dense data is supported +.. _spmd-support: + SPMD Support ------------ diff --git a/doc/sources/conf.py b/doc/sources/conf.py index 4e8101d03d..c2f5ddb7ef 100755 --- a/doc/sources/conf.py +++ b/doc/sources/conf.py @@ -73,6 +73,7 @@ intersphinx_mapping = { "sklearn": ("https://scikit-learn.org/stable/", None), + "dpctl": ("https://intelpython.github.io/dpctl/latest", None), # from scikit-learn, in case some object in sklearnex points to them: # https://github.com/scikit-learn/scikit-learn/blob/main/doc/conf.py "python": ("https://docs.python.org/{.major}".format(sys.version_info), None), diff --git a/doc/sources/distributed-mode.rst b/doc/sources/distributed-mode.rst index c78a50d9e0..166cc0f876 100644 --- a/doc/sources/distributed-mode.rst +++ b/doc/sources/distributed-mode.rst @@ -12,39 +12,85 @@ .. See the License for the specific language governing permissions and .. limitations under the License. +.. 
include:: substitutions.rst + .. _distributed: -Distributed Mode -================ +Distributed Mode (SPMD) +======================= |intelex| offers Single Program, Multiple Data (SPMD) supported interfaces for distributed computing. -Several `GPU-supported algorithms `_ -also provide distributed, multi-GPU computing capabilities via integration with ``mpi4py``. The prerequisites +Several :doc:`GPU-supported algorithms ` +also provide distributed, multi-GPU computing capabilities via integration with |mpi4py|. The prerequisites match those of GPU computing, along with an MPI backend of your choice (`Intel MPI recommended `_, available -via ``impi-devel`` python package) and the ``mpi4py`` python package. If using |intelex| +via ``impi_rt`` python package) and the |mpi4py| python package. If using |intelex| `installed from sources `_, ensure that the spmd_backend is built. -Note that |intelex| now supports GPU offloading to speed up MPI operations. This is supported automatically with -some MPI backends, but in order to use GPU offloading with Intel MPI, set the following environment variable (providing +.. important:: + SPMD mode requires the |mpi4py| package used at runtime to be compiled with the same MPI backend as the |intelex|. The PyPI and Conda distributions of |intelex| both use Intel's MPI as backend, and hence require an |mpi4py| also built with Intel's MPI - it can be easily installed from Intel's conda channel as follows:: + + conda install -c https://software.repos.intel.com/python/conda/ mpi4py + + It also requires the MPI runtime executable (``mpiexec`` / ``mpirun``) to be from the same library that was used to compile the |intelex| - Intel's MPI runtime library is offered as a Python package ``impi_rt`` and will be installed together with the ``mpi4py`` package if executing the command above, but otherwise, it can be installed separately from different distribution channels: + + - Intel's conda channel (recommended):: + + conda install -c https://software.repos.intel.com/python/conda/ impi_rt + + - Conda-Forge:: + + conda install -c conda-forge impi_rt + + - PyPI (not recommended, might require setting additional environment variables):: + + pip install impi_rt + + Using other MPI backends (e.g. OpenMPI) requires building |intelex| from source with that backend. + +Note that |intelex| supports GPU offloading to speed up MPI operations. This is supported automatically with +some MPI backends, but in order to use GPU offloading with Intel MPI, it is required to set the environment variable ``I_MPI_OFFLOAD`` to ``1`` (providing data on device without this may lead to a runtime error): -:: +- On Linux*:: + + export I_MPI_OFFLOAD=1 + +- On Windows*:: + + set I_MPI_OFFLOAD=1 + +SPMD-aware versions of estimators can be imported from the ``sklearnex.spmd`` module. Data should be distributed across multiple nodes as +desired, and should be transferred to a |dpctl| or `dpnp `__ array before being passed to the estimator. + +Note that SPMD estimators allow an additional argument ``queue`` in their ``.fit`` / ``.predict`` methods, which accepts :obj:`dpctl.SyclQueue` objects. For example, while the signature for :obj:`sklearn.linear_model.LinearRegression.predict` would be + +.. code-block:: python + + def predict(self, X): ... + +the signature for the corresponding method in ``sklearnex.spmd.linear_model.LinearRegression`` is: + +.. code-block:: python + + def predict(self, X, queue=None): ...
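+
+As a quick illustration, below is a minimal sketch of what such an SPMD script can look like (the data generation is made up for the example; it assumes a machine with a supported GPU and the prerequisites above - a complete, runnable version is linked right after):
+
+.. code-block:: python
+
+    import dpnp
+    import numpy as np
+    from mpi4py import MPI
+    from sklearnex.spmd.linear_model import LinearRegression
+
+    # Each MPI rank generates (in practice: loads) its own shard of the data
+    rank = MPI.COMM_WORLD.Get_rank()
+    rng = np.random.default_rng(seed=rank)
+    X = rng.uniform(-1, 1, size=(100, 3))
+    y = X @ np.array([1.0, 2.0, 3.0]) + rng.normal(0, 0.1, size=100)
+
+    # Transfer the local shard to GPU memory before passing it to the estimator
+    X_gpu = dpnp.asarray(X, device="gpu")
+    y_gpu = dpnp.asarray(y, device="gpu")
+
+    # fit/predict are collective operations involving all ranks
+    model = LinearRegression().fit(X_gpu, y_gpu)
+    predictions = model.predict(X_gpu)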
+ +Examples of SPMD usage can be found in the GitHub repository for the |intelex| under `examples/sklearnex `__. - export I_MPI_OFFLOAD=1 +To run in SPMD mode, first create a Python file using SPMD estimators from ``sklearnex.spmd``, such as `linear_regression_spmd.py `__. -Estimators can be imported from the ``sklearnex.spmd`` module. Data should be distributed across multiple nodes as -desired, and should be transfered to a dpctl or dpnp array before being passed to the estimator. View a full -example of this process in the |intelex| repository, where many examples of our SPMD-supported estimators are -available: https://github.com/uxlfoundation/scikit-learn-intelex/blob/main/examples/sklearnex/. To run: +Then, execute the file through MPI under multiple ranks - for example: -:: +- On Linux*:: + + mpirun -n 4 python linear_regression_spmd.py - mpirun -n 4 python linear_regression_spmd.py +- On Windows*:: + + mpiexec -n 4 python linear_regression_spmd.py -Note that additional mpirun arguments can be added as desired. SPMD-supported estimators are listed in the -`algorithms support documentation `_. +Note that additional ``mpirun`` arguments can be added as desired. SPMD-supported estimators are listed in the :ref:`spmd-support` section. -Additionally, daal4py offers some distributed functionality, see +Additionally, ``daal4py`` (previously a separate package, now an importable module within ``scikit-learn-intelex``) offers some distributed functionality, see `documentation `_ for further details. diff --git a/doc/sources/index.rst b/doc/sources/index.rst index 627692b118..e2f624478e 100755 --- a/doc/sources/index.rst +++ b/doc/sources/index.rst @@ -12,8 +12,7 @@ .. See the License for the specific language governing permissions and .. limitations under the License. -.. |intelex_repo| replace:: |intelex| repository -.. _intelex_repo: https://github.com/uxlfoundation/scikit-learn-intelex +.. include:: substitutions.rst .. _index: @@ -21,20 +20,20 @@ |intelex| ######### -Intel(R) Extension for Scikit-learn is a **free software AI accelerator** designed to deliver up to **100X** faster performance for your existing scikit-learn code. +|intelex| is a **free software AI accelerator** designed to deliver up to **100X** faster performance for your existing |sklearn| code. The software acceleration is achieved with vector instructions, AI hardware-specific memory optimizations, threading, and optimizations for all upcoming Intel(R) platforms at launch time. .. rubric:: Designed for Data Scientists and Framework Designers -Use Intel(R) Extension for Scikit-learn, to: +Use |intelex| to: -* Speed up training and inference by up to 100x with the equivalent mathematical accuracy -* Benefit from performance improvements across different x86-compatible CPUs or Intel(R) GPUs -* Integrate the extension into your existing Scikit-learn applications without code modifications +* Speed up training and inference by up to 100x with equivalent mathematical accuracy +* Benefit from performance improvements across different x86-64 CPUs and Intel(R) GPUs +* Integrate the extension into your existing |sklearn| applications without code modifications * Enable and disable the extension with a couple of lines of code or at the command line -Intel(R) Extension for Scikit-learn is also a part of `Intel(R) AI Tools `_. +|intelex| is also a part of `Intel(R) AI Tools `_. ..
image:: _static/scikit-learn-acceleration.PNG @@ -65,11 +64,14 @@ Enable Intel(R) CPU Optimizations from sklearn.cluster import DBSCAN X = np.array([[1., 2.], [2., 2.], [2., 3.], - [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32) + [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32) clustering = DBSCAN(eps=3, min_samples=2).fit(X) Enable Intel(R) GPU optimizations ********************************* + +Note: executing on GPU has `additional system software requirements `__ - see :doc:`oneapi-gpu`. + :: import numpy as np import dpctl @@ -80,7 +82,7 @@ Enable Intel(R) GPU optimizations from sklearn.cluster import DBSCAN X = np.array([[1., 2.], [2., 2.], [2., 3.], - [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32) + [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32) with config_context(target_offload="gpu:0"): clustering = DBSCAN(eps=3, min_samples=2).fit(X) @@ -101,7 +103,7 @@ Enable Intel(R) GPU optimizations :maxdepth: 2 algorithms.rst - oneAPI and GPU support + oneapi-gpu.rst distributed-mode.rst non-scikit-algorithms.rst input-types.rst diff --git a/doc/sources/oneapi-gpu.rst b/doc/sources/oneapi-gpu.rst index f9808f97e4..3993b3d93c 100644 --- a/doc/sources/oneapi-gpu.rst +++ b/doc/sources/oneapi-gpu.rst @@ -12,65 +12,76 @@ .. See the License for the specific language governing permissions and .. limitations under the License. +.. include:: substitutions.rst .. _oneapi_gpu: ############################################################## oneAPI and GPU support in |intelex| ############################################################## -|intelex| supports oneAPI concepts, which -means that algorithms can be executed on different devices: CPUs and GPUs. -This is done via integration with -`dpctl `_ package that -implements core oneAPI concepts like queues and devices. +|intelex| can execute computations on different devices (CPUs, GPUs) through the SYCL framework in oneAPI. + +The device used for computations can be easily controlled through the target offloading functionality (e.g. through ``sklearnex.config_context(target_offload="gpu")`` - see the rest of this page for more details), but for finer-grained control (e.g. operating on arrays that are already in a given device's memory), it can also interact with objects from the |dpctl| package, which offers a Python interface over SYCL concepts such as devices, queues, and USM (unified shared memory) arrays. + +While not strictly required, the |dpctl| package is recommended for a better experience on GPUs. + +.. important:: Be aware that GPU usage requires non-Python dependencies on your system, such as the `Intel(R) GPGPU Drivers `_. Prerequisites ------------- -For execution on GPU, DPC++ compiler runtime and driver are required. Refer to `DPC++ system -requirements `_ for details. +For execution on GPUs, DPC++ runtime and GPGPU drivers are required. -DPC++ compiler runtime can be installed either from PyPI or Anaconda: +DPC++ compiler runtime can be installed either from PyPI or Conda: - Install from PyPI:: pip install dpcpp-cpp-rt -- Install using Conda via the Intel repository:: +- Install using Conda from Intel's repository:: - conda install dpcpp_cpp_rt -c https://software.repos.intel.com/python/conda/ + conda install -c https://software.repos.intel.com/python/conda/ dpcpp_cpp_rt -Device offloading ------------------ +- Install using Conda from the conda-forge channel:: -|intelex| offers two options for running an algorithm on a -specific device with the help of dpctl: + conda install -c conda-forge dpcpp_cpp_rt -- Pass input data as `dpctl.tensor.usm_ndarray `_ to the algorithm. +For GPGPU driver installation instructions, see the general `DPC++ system requirements `_ sections corresponding to your operating system.
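+
+Once the prerequisites are in place, a quick way to verify that a GPU is visible to the SYCL runtime is to list the available devices through |dpctl| (a short check, assuming |dpctl| is installed):
+
+.. code-block:: python
+
+    import dpctl
+
+    # Print every SYCL device visible to the runtime; GPUs should appear here
+    for device in dpctl.get_devices():
+        print(device)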
- The computation will run on the device where the input data is - located, and the result will be returned as :code:`usm_ndarray` to the same - device. +Device offloading +----------------- - .. note:: - All the input data for an algorithm must reside on the same device. +|intelex| offers two options for running an algorithm on a specified device: - .. warning:: - The :code:`usm_ndarray` can only be consumed by the base methods - like :code:`fit`, :code:`predict`, and :code:`transform`. - Note that only the algorithms in |intelex| support - :code:`usm_ndarray`. The algorithms from the stock version of scikit-learn - do not support this feature. - Use global configurations of |intelex|\*: - 1. The :code:`target_offload` option can be used to set the device primarily - used to perform computations. Accepted data types are :code:`str` and - :code:`dpctl.SyclQueue`. If you pass a string to :code:`target_offload`, - it should either be ``"auto"``, which means that the execution - context is deduced from the location of input data, or a string - with SYCL* filter selector. The default value is ``"auto"``. - - 2. The :code:`allow_fallback_to_host` option + 1. The :code:`target_offload` argument (in ``config_context`` and in ``set_config`` / ``get_config``) + can be used to set the device primarily used to perform computations. Accepted data types are + :code:`str` and :obj:`dpctl.SyclQueue`. Strings must match device names recognized by + the SYCL* device filter selector - for example, ``"gpu"``. If passing ``"auto"``, + the device will be deduced from the location of the input data. Examples: + + .. code-block:: python + + from sklearnex import config_context + from sklearnex.linear_model import LinearRegression + + with config_context(target_offload="gpu"): + model = LinearRegression().fit(X, y) + + .. code-block:: python + + from sklearnex import set_config + from sklearnex.linear_model import LinearRegression + + set_config(target_offload="gpu") + model = LinearRegression().fit(X, y) + + 2. The :code:`allow_fallback_to_host` argument in those same configuration functions is a Boolean flag. If set to :code:`True`, the computation is allowed to fallback to the host device when a particular estimator does not support the selected device. The default value is :code:`False`. @@ -83,16 +94,27 @@ call :code:`sklearnex.get_config()`. Functions :code:`set_config`, :code:`get_config` and :code:`config_context` are always patched after the :code:`sklearnex.patch_sklearn()` call. -.. rubric:: Compatibility considerations +- Pass input data as :obj:`dpctl.tensor.usm_ndarray` to the algorithm. + + The computation will run on the device where the input data is + located, and the result will be returned as :code:`usm_ndarray` to the same + device. + + .. note:: + All the input data for an algorithm must reside on the same device. + + .. warning:: + The :code:`usm_ndarray` can only be consumed by the base methods + like :code:`fit`, :code:`predict`, and :code:`transform`. + Note that only the algorithms in |intelex| support + :code:`usm_ndarray`. The algorithms from the stock version of |sklearn| + do not support this feature.
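+
+  As an illustration of this option, below is a minimal sketch of passing a USM array to a patched estimator (it assumes a machine with a supported GPU):
+
+  .. code-block:: python
+
+      import numpy as np
+      import dpctl.tensor as dpt
+      from sklearnex import patch_sklearn
+      patch_sklearn()
+
+      from sklearn.cluster import DBSCAN
+
+      X = np.array([[1., 2.], [2., 2.], [2., 3.],
+                    [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32)
+
+      # Create a USM array in GPU memory; the computation then runs on that
+      # device, and the fitted results reside on the same device
+      X_device = dpt.asarray(X, usm_type="device", device="gpu")
+
+      clustering = DBSCAN(eps=3, min_samples=2).fit(X_device)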
-For compatibility reasons, algorithms in |intelex| may be offloaded to the device using -:code:`daal4py.oneapi.sycl_context`. However, it is recommended to use one of the options -described above for device offloading instead of using :code:`sycl_context`. Example ------- -An example on how to patch your code with Intel CPU/GPU optimizations: +A full example of how to patch your code with Intel CPU/GPU optimizations: .. code-block:: python @@ -102,11 +124,11 @@ An example on how to patch your code with Intel CPU/GPU optimizations: from sklearn.cluster import DBSCAN X = np.array([[1., 2.], [2., 2.], [2., 3.], - [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32) + [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32) with config_context(target_offload="gpu:0"): clustering = DBSCAN(eps=3, min_samples=2).fit(X) -.. note:: Current offloading behavior restricts fitting and inference of any models to be - in the same context or absence of context. For example, a model trained in the GPU context with - target_offload="gpu:0" throws an error if the inference is made outside the same GPU context. +.. note:: Current offloading behavior restricts fitting and prediction (a.k.a. inference) of any model to + happen in the same context, or in the absence of any context. For example, a model whose ``.fit()`` method was called in a GPU context with + ``target_offload="gpu:0"`` will throw an error if a ``.predict()`` call is then made outside the same GPU context. diff --git a/doc/sources/quick-start.rst b/doc/sources/quick-start.rst index 084039f321..2ec236424f 100644 --- a/doc/sources/quick-start.rst +++ b/doc/sources/quick-start.rst @@ -12,19 +12,18 @@ .. See the License for the specific language governing permissions and .. limitations under the License. -.. |intelex_repo| replace:: |intelex| repository -.. _intelex_repo: https://github.com/uxlfoundation/scikit-learn-intelex +.. include:: substitutions.rst #################### Quick Start #################### -Get ready to elevate your scikit-learn code with |intelex| and experience the benefits of accelerated performance in just a few simple steps. +Get ready to elevate your |sklearn| code with |intelex| and experience the benefits of accelerated performance in just a few simple steps. Compatibility with Scikit-learn* --------------------------------- -Intel(R) Extension for Scikit-learn is compatible with the last four versions of scikit-learn. +|intelex| is compatible with the latest stable releases of |sklearn| - see :ref:`software-requirements` for more details. Integrate |intelex| -------------------- Patching ********************** -Once you install Intel*(R) Extension for Scikit-learn*, you replace algorithms that exist in the scikit-learn package with their optimized versions from the extension. -This action is called ``patching``. 
This is not a permanent change so you can always undo the patching if necessary. +Once you install the |intelex|, you can replace estimator classes (algorithms) that exist in the ``sklearn`` module with their optimized versions from the extension. +This action is called `patching`. This is not a permanent change so you can always undo the patching if necessary. -To patch Intel® Extension for Scikit-learn, use one of these methods: +To patch |sklearn| with the |intelex|, the following methods can be used: .. list-table:: :header-rows: 1 @@ -71,7 +70,7 @@ They support different enabling scenarios while producing the same result. **Example** -This example shows how to patch Intel(R) extension for Scikit-Learn by modifing your script. To make sure that patching is registered by the scikit-learn estimators, always import scikit-learn after these lines. +This example shows how to patch |sklearn| by modifying your script. To make sure that patching is registered by the scikit-learn estimators, always import the ``sklearn`` module after these lines. .. code-block:: python :caption: Example: Drop-In Patching import numpy as np from sklearnex import patch_sklearn patch_sklearn() # The use of the original Scikit-learn is not changed X = np.array([[1, 2], [1, 4], [1, 0], - [10, 2], [10, 4], [10, 0]]) + [10, 2], [10, 4], [10, 0]]) kmeans = KMeans(n_clusters=2, random_state=0).fit(X) print(f"kmeans.labels_ = {kmeans.labels_}") @@ -93,7 +92,7 @@ Global Patching ********************** -You can also use global patching to patch all your scikit-learn applications without any additional actions. +You can also use global patching to patch all your |sklearn| applications without any additional actions. Before you begin, make sure that you have read and write permissions for Scikit-learn files. @@ -160,10 +159,10 @@ With global patching, you can: Unpatching ********************** -To undo the patch (also called `unpatching`) is to return scikit-learn to original implementation and -replace patched algorithms with the stock scikit-learn algorithms. +To undo the patch (also called `unpatching`) is to return the ``sklearn`` module to the original implementation and +replace patched estimators with the stock |sklearn| estimators. -To unpatch successfully, you must reimport the scikit-learn package:: +To unpatch successfully, you must reimport the ``sklearn`` module(s):: sklearnex.unpatch_sklearn() # Re-import scikit-learn algorithms after the unpatch @@ -191,62 +190,29 @@ To install |intelex|, run: **Supported Configurations** .. list-table:: - :header-rows: 1 :align: left - * - OS / Python version - - Python 3.9 - - Python 3.10 - - Python 3.11 - - Python 3.12 - * - Linux* OS - - [CPU, GPU] - - [CPU, GPU] - - [CPU, GPU] - - [CPU, GPU] - * - Windows* OS - - [CPU, GPU] - - [CPU, GPU] - - [CPU, GPU] - - [CPU, GPU] + * - Operating systems + - Windows*, Linux* + * - Python versions + - 3.9, 3.10, 3.11, 3.12, 3.13 + * - Devices + - CPU, GPU + * - Modes + - Single, SPMD +.. tip:: Running on GPU involves additional dependencies, see :doc:`oneapi-gpu`. SPMD mode has additional requirements on top of GPU ones, see :doc:`distributed-mode` for details. +.. note:: Wheels are only available for the x86-64 architecture. Install from Anaconda* Cloud ******************************************** To prevent version conflicts, we recommend installing `scikit-learn-intelex` into a new conda environment. -.. tabs:: - - ..
tab:: Conda-Forge channel - - Recommended by default. - - To install, run:: - - conda install scikit-learn-intelex -c conda-forge - - .. list-table:: **Supported Configurations** - :header-rows: 1 - :align: left - - * - OS / Python version - - Python 3.9 - - Python 3.10 - - Python 3.11 - - Python 3.12 - * - Linux* OS - - [CPU] - - [CPU] - - [CPU] - - [CPU] - * - Windows* OS - - [CPU] - - [CPU] - - [CPU] - - [CPU] +*Note: the main Anaconda channel also provides distributions of scikit-learn-intelex, but it does not provide the latest versions, nor does it provide GPU-enabled builds. It is highly recommended to install it from either Intel's channel or conda-forge instead.* +.. tabs:: .. tab:: Intel channel Recommended by default. To install, run:: - conda install scikit-learn-intelex -c https://software.repos.intel.com/python/conda/ + conda install -c https://software.repos.intel.com/python/conda/ scikit-learn-intelex .. list-table:: **Supported Configurations** - :header-rows: 1 :align: left - * - OS / Python version - - Python 3.9 - - Python 3.10 - - Python 3.11 - - Python 3.12 - * - Linux* OS - - [CPU, GPU] - - [CPU, GPU] - - [CPU, GPU] - - [CPU, GPU] - * - Windows* OS - - [CPU, GPU] - - [CPU, GPU] - - [CPU, GPU] - - [CPU, GPU] + * - Operating systems + - Windows*, Linux* + * - Python versions + - 3.9, 3.10, 3.11, 3.12, 3.13 + * - Devices + - CPU, GPU + * - Modes + - Single, SPMD - - .. tab:: Main channel + .. tab:: Conda-Forge channel To install, run:: - conda install scikit-learn-intelex + conda install -c conda-forge scikit-learn-intelex .. list-table:: **Supported Configurations** - :header-rows: 1 :align: left - * - OS / Python version - - Python 3.9 - - Python 3.10 - - Python 3.11 - - Python 3.12 - * - Linux* OS - - [CPU] - - [CPU] - - [CPU] - - [CPU] - * - Windows* OS - - [CPU] - - [CPU] - - [CPU] - - [CPU] + * - Operating systems + - Windows*, Linux* + * - Python versions + - 3.9, 3.10, 3.11, 3.12, 3.13 + * - Devices + - CPU, GPU + * - Modes + - Single, SPMD +.. tip:: Running on GPU involves additional dependencies, see :doc:`oneapi-gpu`. SPMD mode has additional requirements on top of GPU ones, see :doc:`distributed-mode` for details. +.. note:: Packages are only available for the x86-64 architecture. + +.. _build-from-sources: Build from Sources ********************** -See `Installation instructions `_ to build |intelex| from the sources. +See `Installation instructions `_ to build |intelex| from the sources. Install Intel*(R) AI Tools **************************** Download the Intel AI Tools `here `_ Release Notes ============== -See the `Release Notes `_ for each version of Intel® Extension for Scikit-learn*. +See the `Release Notes `_ for each version of |intelex|. System Requirements -------------------- @@ -331,23 +284,24 @@ Hardware Requirements ********************** .. tab:: CPU - All processors with ``x86`` architecture with at least one of the following instruction sets: + Any processor with ``x86-64`` architecture with at least one of the following instruction sets: - SSE2 - SSE4.2 - AVX2 - AVX512 - .. note:: ARM* architecture is not supported. + .. note:: + Pre-built packages are not provided for other CPU architectures. See :ref:`build-from-sources` for ARM. .. tab:: GPU - - All Intel® integrated and discrete GPUs - - Intel® GPU drivers + - Any Intel® GPU supported by both `DPC++ `_ and `oneMKL `_ .. tip:: Intel(R) processors provide better performance than other CPUs. Read more about hardware comparison in our :ref:`blogs `. +..
_software-requirements: Software Requirements ********************** @@ -362,25 +316,28 @@ Software Requirements .. tab:: GPU - - Linux* OS: Ubuntu* 18.04 or newer - - Windows* OS 10 or newer - - Windows* Server 2019 or newer + - A Linux* or Windows* version supported by DPC++ and oneMKL + - Intel® GPGPU drivers + - DPC++ runtime libraries .. important:: - If you use accelerators, refer to `oneAPI DPC++/C++ Compiler System Requirements `_. + If you use accelerators (e.g. GPUs), refer to `oneAPI DPC++/C++ Compiler System Requirements `_. -Intel(R) Extension for Scikit-learn is compatible with the last four versions of scikit-learn: +|intelex| is compatible with the following releases of |sklearn|: * 1.0.X * 1.1.X * 1.2.X * 1.3.X +* 1.4.X +* 1.5.X +* 1.6.X Memory Requirements ********************** By default, algorithms in |intelex| run in the multi-thread mode. This mode uses all available threads. -Optimized scikit-learn algorithms can consume more RAM than their corresponding unoptimized versions. +Optimized scikit-learn estimators can consume more RAM than their corresponding unoptimized versions. .. list-table:: :header-rows: 1 @@ -390,7 +347,7 @@ Optimized scikit-learn algorithms can consume more RAM than their corresponding - Single-thread mode - Multi-thread mode * - SVM - - Both Scikit-learn and |intelex| consume approximately the same amount of RAM. + - Both |sklearn| and |intelex| consume approximately the same amount of RAM. - In |intelex|, an algorithm with ``N`` threads consumes ``N`` times more RAM. In all |intelex| algorithms with GPU support, computations run on device memory. diff --git a/doc/sources/substitutions.rst b/doc/sources/substitutions.rst new file mode 100644 index 0000000000..e7cdef0832 --- /dev/null +++ b/doc/sources/substitutions.rst @@ -0,0 +1,19 @@ +.. Copyright contributors to the oneDAL project +.. +.. Licensed under the Apache License, Version 2.0 (the "License"); +.. you may not use this file except in compliance with the License. +.. You may obtain a copy of the License at +.. +.. http://www.apache.org/licenses/LICENSE-2.0 +.. +.. Unless required by applicable law or agreed to in writing, software +.. distributed under the License is distributed on an "AS IS" BASIS, +.. WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +.. See the License for the specific language governing permissions and +.. limitations under the License. + +.. |dpctl| replace:: :external+dpctl:doc:`dpctl ` +.. |sklearn| replace:: :external+sklearn:doc:`scikit-learn ` +.. |intelex_repo| replace:: |intelex| repository +.. _intelex_repo: https://github.com/uxlfoundation/scikit-learn-intelex +.. |mpi4py| replace:: `mpi4py `__