Make SVM practical #452

Merged
merged 25 commits into from
Sep 8, 2022
246e521
__init__: import warn once at the top
inducer Jul 1, 2022
5fc2fb7
Add is_queue_in_order
inducer Jul 1, 2022
ce7a76c
Add fold markers in wrap_mempool
inducer Jun 27, 2022
05e0380
Jostle some section headers in wrap_cl.hpp
inducer Jul 1, 2022
84de917
Add command_queue_ref
inducer Aug 18, 2022
3e05c06
Rework SVM for efficient use in arrays, implement SVM mempool
inducer Aug 18, 2022
af06c2f
Add, use enqueue_fill
inducer Aug 18, 2022
15e3ab9
set_arg: Try cl_mem/svm based on what was used last
inducer Aug 5, 2022
ec92241
Add an example for arrays with SVM
inducer Aug 18, 2022
b8348bc
Mark some destructors virtual, partially based on clang warnings
matthiasdiener Aug 9, 2022
10bfaad
Add missing result initialization in buffer_allocator_call
matthiasdiener Aug 18, 2022
04b6372
Move tools higher in the docs TOC
inducer Aug 28, 2022
00d2987
Acknowledge DOE funding
inducer Aug 28, 2022
d08b313
Bump version to 2022.2
inducer Aug 30, 2022
ef1c00b
Use pocl dev label for Conda CI
inducer Aug 31, 2022
34951a4
Add section headers in test_wrapper
inducer Sep 5, 2022
0bc7264
Fix rendering in doc/runtime_const
inducer Sep 6, 2022
5eafd63
array: Import warn once at the top
inducer Sep 6, 2022
d6a2f9a
Expose {SVMAllocation,PooledSVM}._queue
inducer Sep 6, 2022
02daca5
Array constructor: warn about SVM queue inconsistencies
inducer Sep 6, 2022
2a25c49
Make enqueue_copy offsets interface more uniform
inducer Sep 6, 2022
a2c2f21
Add test_svm_mem_pool_with_arrays
matthiasdiener Aug 18, 2022
6d46d62
Generalize test_coarse_grain_svm to test opqaue-style SVM
inducer Sep 6, 2022
2c2eaa7
Add changelog entry for 2022.2
inducer Sep 7, 2022
e2cb9d7
Retry clSVMAlloc in SVMAllocation after running GC
inducer Sep 8, 2022
3 changes: 3 additions & 0 deletions .test-conda-env-py3.yml
@@ -1,5 +1,8 @@
name: test-conda-env
channels:
# For https://github.com/pocl/pocl/pull/1069
# See https://github.com/conda-forge/pocl-feedstock/pull/80
- conda-forge/label/pocl_dev
- conda-forge
- nodefaults

2 changes: 1 addition & 1 deletion doc/index.rst
@@ -110,11 +110,11 @@ Contents
runtime_memory
runtime_program
runtime_gl
tools
array
types
algorithm
howto
tools
misc
🚀 Github <https://github.com/inducer/pyopencl>
💾 Download Releases <https://pypi.org/project/pyopencl>
12 changes: 10 additions & 2 deletions doc/misc.rst
@@ -29,7 +29,6 @@ Then run::
You can install these pieces of software in your user account and
do not need root/administrator privileges.


Enabling access to CPUs and GPUs via (Py)OpenCL
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

@@ -283,6 +282,13 @@ other software to be turned into the corresponding :mod:`pyopencl` objects.
User-visible Changes
====================

Version 2022.2
--------------

- Added :ref:`opaque-style SVM <opaque-svm>` and :class:`pyopencl.SVMPointer`.
- Added :class:`pyopencl.tools.SVMPool`.
- Added automatic queue-synchronized deallocation of SVM.

Version 2020.3
--------------
.. note::
@@ -728,7 +734,9 @@ Funding
Work on pytential was supported in part by

* the US National Science Foundation under grant numbers DMS-1418961,
DMS-1654756, SHF-1911019, and OAC-1931577.
DMS-1654756, SHF-1911019, and OAC-1931577, and
* the Department of Energy, National Nuclear Security Administration,
under Award Number DE-NA0003963.

AK also gratefully acknowledges a hardware gift from Nvidia Corporation.

2 changes: 2 additions & 0 deletions doc/runtime_const.rst
@@ -6,6 +6,7 @@ OpenCL Runtime: Constants
.. include:: constants.inc

.. class:: NameVersion

Describes the version of a specific feature.

.. note::
@@ -19,6 +20,7 @@ OpenCL Runtime: Constants
.. attribute:: name

.. class:: DeviceTopologyAmd

.. method:: __init__(bus, device, function)
.. attribute:: type
.. attribute:: bus
114 changes: 107 additions & 7 deletions doc/runtime_memory.rst
@@ -116,14 +116,109 @@ by both the host and the device. *Coarse-grain* SVM requires that
buffers be mapped before being accessed on the host, *fine-grain* SVM
does away with that requirement.

.. warning::

Compared to :class:`Buffer`\ s, SVM brings with it a new concern: the
synchronization of memory deallocation. Unlike other objects in OpenCL,
SVM is represented by a plain (C-language) pointer and thus cannot be
reference-counted.

A :class:`Buffer`, being reference-counted, may legally be allocated, used in
an enqueued operation, and released without regard to whether the
operation has completed: the OpenCL implementation keeps the buffer alive
until then. This is *not* the case with SVM: Unless
otherwise specified, memory deallocation is performed immediately when
requested, and so SVM will be deallocated whenever the Python
garbage collector sees fit, even if the operation has not completed,
immediately leading to undefined behavior (typically memory corruption and,
before too long, a crash).

Version 2022.2 of PyOpenCL offers substantially improved tools
for dealing with this. In particular, all means for allocating SVM
allow specifying a :class:`CommandQueue`, so that deallocation
is enqueued and performed after previously-enqueued operations
have completed.
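
The difference between immediate and queue-synchronized deallocation can be
illustrated with a toy model in plain Python. This is only a conceptual sketch
and does not use :mod:`pyopencl`: ``ToyQueue``, ``ToyAllocation``, ``fill``, and
``run`` are hypothetical names invented for illustration, and the "queue" merely
runs Python callables in order when ``finish`` is called.

```python
from collections import deque


class ToyQueue:
    """Toy in-order queue: commands run, in order, only when finish() is called."""

    def __init__(self):
        self._pending = deque()

    def enqueue(self, command):
        self._pending.append(command)

    def finish(self):
        while self._pending:
            self._pending.popleft()()


class ToyAllocation:
    """Stands in for an SVM allocation: a raw block with no reference count."""

    def __init__(self, size):
        self.data = bytearray(size)
        self.freed = False

    def free_now(self):
        # Immediate deallocation, regardless of pending work.
        self.freed = True
        self.data = None

    def free_on(self, queue):
        # Queue-synchronized deallocation: the free is itself a queued command.
        queue.enqueue(self.free_now)


def fill(alloc, value):
    """Return a command that writes `value` into every byte of `alloc`."""

    def command():
        if alloc.freed:
            raise RuntimeError("use after free")
        alloc.data[:] = bytes([value]) * len(alloc.data)

    return command


def run(deferred):
    queue = ToyQueue()
    alloc = ToyAllocation(4)
    queue.enqueue(fill(alloc, 7))
    if deferred:
        alloc.free_on(queue)   # freed only after the fill has run
    else:
        alloc.free_now()       # freed while the fill is still pending
    try:
        queue.finish()
    except RuntimeError:
        return "use after free"
    return "ok"
```

In this model, ``run(deferred=True)`` completes normally, while
``run(deferred=False)`` reports a use after free. On a real device, the latter
case would be undefined behavior rather than a tidy exception.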

SVM requires OpenCL 2.0.

.. _opaque-svm:

Opaque and "Wrapped-:mod:`numpy`" Styles of Referencing SVM
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

When trying to pass SVM pointers to functionality in :mod:`pyopencl`,
two styles are supported:

- First, the opaque style. This style most closely resembles
:class:`Buffer`-based allocation available in OpenCL 1.x.
SVM pointers are held in opaque "handle" objects such as :class:`SVMAllocation`.

- Second, the wrapped-:mod:`numpy` style. In this case, a :class:`numpy.ndarray`
(or another object implementing the :c:func:`Python buffer protocol
<PyObject_GetBuffer>`) serves as the reference to an area of SVM.
This style permits using memory areas with :mod:`pyopencl`'s SVM
interfaces even if they were allocated outside of :mod:`pyopencl`.

Since passing a :class:`numpy.ndarray` (or another type of object obeying the
buffer interface) already has existing semantics in most settings in
:mod:`pyopencl` (such as when passing arguments to a kernel or calling
:func:`enqueue_copy`), there exists a wrapper object, :class:`SVM`, that may
be "wrapped around" these objects to mark them as SVM.

The commonality between the two styles is that both ultimately implement
the :class:`SVMPointer` interface, which :mod:`pyopencl` uses to obtain
the actual SVM pointer.

A :class:`numpy.ndarray` view of SVM areas held in the opaque style is
easy to obtain via :attr:`SVMPointer.buf`, which permits
transitions from opaque to wrapped-:mod:`numpy` style. The opposite transition
(from wrapped-:mod:`numpy` to opaque) is not necessarily straightforward,
as it would require "fishing" the opaque SVM handle out of a chain of
:attr:`numpy.ndarray.base` attributes (or similar, depending on
the actual object serving as the main SVM reference).

See :ref:`numpy-svm-helpers` for helper functions that ease setting up the
wrapped-:mod:`numpy` structure.

Wrapped-:mod:`numpy` SVM tends to be a good fit for fine-grain SVM because of
the ease of direct host-side access, but creating the nested structure
that makes this possible incurs some setup cost.

By comparison, opaque SVM access tends to be a good fit for coarse-grain
SVM, because direct host access is not possible without mapping the array
anyway, and it has lower setup cost. It is of course entirely possible to use
opaque SVM access with fine-grain SVM.

.. versionchanged:: 2022.2

This version adds the opaque style of SVM access.

Using SVM with Arrays
^^^^^^^^^^^^^^^^^^^^^

While all types of SVM can be used as the memory backing
:class:`pyopencl.array.Array` objects, ensuring that new arrays returned
by array operations (e.g. arithmetic) also use SVM is easiest to accomplish
by passing an :class:`~pyopencl.tools.SVMAllocator` (or
:class:`~pyopencl.tools.SVMPool`) as the *allocator* parameter in functions
returning new arrays.
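
A minimal sketch of this pattern follows. It assumes an OpenCL 2.0 device with
SVM support is available; the function name ``add_with_svm_pool`` is
hypothetical, and the imports are placed inside the function only so that
defining it does not require an OpenCL installation.

```python
def add_with_svm_pool(n=50_000):
    """Add two SVM-backed arrays; return True if the result matches numpy."""
    # Imported lazily (an assumption of this sketch, not a requirement).
    import numpy as np
    import pyopencl as cl
    import pyopencl.array as cl_array
    from pyopencl.tools import SVMAllocator, SVMPool

    ctx = cl.create_some_context()
    queue = cl.CommandQueue(ctx)

    # Passing the queue lets deallocation be enqueued rather than immediate.
    alloc = SVMPool(SVMAllocator(ctx, queue=queue))

    a = np.random.rand(n).astype(np.float32)
    b = np.random.rand(n).astype(np.float32)

    # The allocator is given once, when the arrays are created.
    a_dev = cl_array.to_device(queue, a, allocator=alloc)
    b_dev = cl_array.to_device(queue, b, allocator=alloc)

    # The result of arithmetic inherits the allocator of its operands,
    # so c_dev is expected to be backed by SVM from the pool as well.
    c_dev = a_dev + b_dev

    return bool(np.allclose(c_dev.get(), a + b))
```

Wrapping the :class:`~pyopencl.tools.SVMAllocator` in a
:class:`~pyopencl.tools.SVMPool` is optional; the pool is intended to avoid
repeated allocation calls when many temporaries are created.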

SVM Pointers, Allocations, and Maps
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. autoclass:: SVMPointer

.. autoclass:: SVMAllocation

.. autoclass:: SVM

.. autoclass:: SVMMap

Allocating SVM
^^^^^^^^^^^^^^

.. _numpy-svm-helpers:

Helper functions for :mod:`numpy`-based SVM allocation
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. autofunction:: svm_empty
.. autofunction:: svm_empty_like
@@ -140,11 +235,6 @@ Operations on SVM
.. autofunction:: enqueue_svm_memfill
.. autofunction:: enqueue_svm_migratemem

SVM Allocation Holder
^^^^^^^^^^^^^^^^^^^^^

.. autoclass:: SVMAllocation

Image
-----

@@ -281,6 +371,8 @@ Transfers

.. autofunction:: enqueue_copy(queue, dest, src, **kwargs)

.. autofunction:: enqueue_fill(queue, dest, src, **kwargs)

Mapping Memory into Host Address Space
--------------------------------------

@@ -406,3 +498,11 @@ Pipes

See :class:`pipe_info` for values of *param*.

Type aliases
------------

.. currentmodule:: pyopencl._cl

.. class:: Buffer

See :class:`pyopencl.Buffer`.