Improve Tensors docs (#2915)
* Expose the CUDA and NumPy Array Interface
* Add links to the above
* Add a paragraph about the memory being wrapped
* Mention that the memory is invalidated in subsequent iteration
* Add better cross-links
* Turn on plugin for section labels (easy links in one .rst)
* Download the latest dali.png from repo.

Signed-off-by: Krzysztof Lecki <[email protected]>
klecki authored May 10, 2021
1 parent 6939669 commit b00f2f8
Showing 5 changed files with 47 additions and 19 deletions.
8 changes: 4 additions & 4 deletions README.rst
Original file line number Diff line number Diff line change
@@ -12,7 +12,7 @@ for built in data loaders and data iterators in popular deep learning frameworks

Deep learning applications require complex, multi-stage data processing pipelines
that include loading, decoding, cropping, resizing, and many other augmentations.
These data processing pipelines, which are currently executed on the CPU, have become a
bottleneck, limiting the performance and scalability of training and inference.

DALI addresses the problem of the CPU bottleneck by offloading data preprocessing to the
@@ -23,11 +23,11 @@ are handled transparently for the user.
In addition, the deep learning frameworks have multiple data pre-processing implementations,
resulting in challenges such as portability of training and inference workflows, and code
maintainability. Data processing pipelines implemented using DALI are portable because they
can easily be retargeted to TensorFlow, PyTorch, MXNet and PaddlePaddle.

.. image:: /dali.png
:width: 800
:align: center
:alt: DALI Diagram

Highlights
@@ -64,7 +64,7 @@ To install the latest DALI release for the latest CUDA version (11.x)::

pip install --extra-index-url https://developer.download.nvidia.com/compute/redist --upgrade nvidia-dali-cuda110

DALI comes preinstalled in the TensorFlow, PyTorch, and MXNet containers on `NVIDIA GPU Cloud <https://ngc.nvidia.com>`_
(versions 18.07 and later).

For other installation paths (TensorFlow plugin, older CUDA version, nightly and weekly builds, etc),
31 changes: 22 additions & 9 deletions dali/python/backend_impl.cc
@@ -301,7 +301,7 @@ void ExposeTensor(py::module &m) {
Python object to be checked
)code");

-py::class_<Tensor<CPUBackend>>(m, "TensorCPU", py::buffer_protocol())
+auto tensor_cpu_binding = py::class_<Tensor<CPUBackend>>(m, "TensorCPU", py::buffer_protocol())
.def(py::init([](py::capsule &capsule, string layout = "") {
auto t = std::make_unique<Tensor<CPUBackend>>();
FillTensorFromDlPack(capsule, t.get(), layout);
@@ -310,7 +310,7 @@ void ExposeTensor(py::module &m) {
"object"_a,
"layout"_a = "",
R"code(
-DLPack of Tensor residing in the CPU memory.
+Wrap a DLPack Tensor residing in the CPU memory.
object : DLPack object
Python DLPack object
@@ -366,7 +366,7 @@ void ExposeTensor(py::module &m) {
"layout"_a = "",
"is_pinned"_a = false,
R"code(
-Tensor residing in the CPU memory.
+Wrap a Tensor residing in the CPU memory.
b : object
the buffer to wrap into the TensorListCPU object
@@ -424,10 +424,17 @@ void ExposeTensor(py::module &m) {
)code")
.def_property("__array_interface__", &ArrayInterfaceRepr<CPUBackend>, nullptr,
R"code(
-Returns array interface representation of TensorCPU.
+Returns Array Interface representation of TensorCPU.
)code");
tensor_cpu_binding.doc() = R"code(
Class representing a Tensor residing in host memory. It can be used to access individual
samples of a :class:`TensorListCPU` or used to wrap CPU memory that is intended
to be passed as an input to DALI.

It is compatible with `Python Buffer Protocol <https://docs.python.org/3/c-api/buffer.html>`_
and `NumPy Array Interface <https://numpy.org/doc/stable/reference/arrays.interface.html>`_.)code";

-py::class_<Tensor<GPUBackend>>(m, "TensorGPU")
+auto tensor_gpu_binding = py::class_<Tensor<GPUBackend>>(m, "TensorGPU")
.def(py::init([](py::capsule &capsule, string layout = "") {
auto t = std::make_unique<Tensor<GPUBackend>>();
FillTensorFromDlPack(capsule, t.get(), layout);
@@ -436,7 +443,7 @@ void ExposeTensor(py::module &m) {
"object"_a,
"layout"_a = "",
R"code(
-DLPack of Tensor residing in the GPU memory.
+Wrap a DLPack Tensor residing in the GPU memory.
object : DLPack object
Python DLPack object
@@ -452,10 +459,10 @@ void ExposeTensor(py::module &m) {
"layout"_a = "",
"device_id"_a = -1,
R"code(
-Tensor residing in the GPU memory.
+Wrap a Tensor residing in the GPU memory that implements CUDA Array Interface.
object : object
-Python object that implement CUDA Array Interface
+Python object that implements CUDA Array Interface
layout : str
Layout of the data
device_id: int
@@ -540,8 +547,14 @@ void ExposeTensor(py::module &m) {
)code")
.def_property("__cuda_array_interface__", &ArrayInterfaceRepr<GPUBackend>, nullptr,
R"code(
-Returns cuda array interface representation of TensorGPU.
+Returns CUDA Array Interface (Version 2) representation of TensorGPU.
)code");
tensor_gpu_binding.doc() = R"code(
Class representing a Tensor residing in GPU memory. It can be used to access individual
samples of a :class:`TensorListGPU` or used to wrap GPU memory that is intended
to be passed as an input to DALI.

It is compatible with `CUDA Array Interface <https://numba.pydata.org/numba-doc/dev/cuda/cuda_array_interface.html>`_.)code";
}

template <typename Backend>
3 changes: 2 additions & 1 deletion docs/conf.py
@@ -102,6 +102,7 @@
'IPython.sphinxext.ipython_console_highlighting',
'nbsphinx',
'sphinx.ext.intersphinx',
'sphinx.ext.autosectionlabel',
]

# Add any paths that contain templates here, relative to this directory.
@@ -171,7 +172,7 @@
subprocess.call(["wget", "-O", favicon_rel_path, "https://docs.nvidia.com/images/nvidia.ico"])
html_favicon = favicon_rel_path

-subprocess.call(["wget", "-O", "dali.png", "https://developer.nvidia.com/sites/default/files/akamai/dali.png"])
+subprocess.call(["wget", "-O", "dali.png", "https://raw.githubusercontent.com/NVIDIA/DALI/master/dali.png"])

# Custom sidebar templates, must be a dictionary that maps document names
# to template names.
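The `sphinx.ext.autosectionlabel` extension enabled in conf.py above makes every section title a cross-reference target, so sections can be linked without hand-written labels. A minimal illustration (hypothetical section name, not from this change):

```rst
Types
=====

Any other paragraph in the same document can now write :ref:`Types`
to link to this section, without an explicit ``.. _Types:`` target
above the heading.
```

Note that by default the label is just the title text, so identical titles in different documents will produce duplicate-label warnings unless `autosectionlabel_prefix_document = True` is set.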
21 changes: 16 additions & 5 deletions docs/data_types.rst
@@ -5,10 +5,21 @@ Types

TensorList
----------
-.. currentmodule:: nvidia.dali.pipeline
+.. currentmodule:: nvidia.dali

-TensorList represents a batch of tensors. TensorLists are the return values of `Pipeline.run`
-or `Pipeline.share_outputs`
+TensorList represents a batch of tensors. TensorLists are the return values of :meth:`Pipeline.run`,
+:meth:`Pipeline.outputs` or :meth:`Pipeline.share_outputs`.

Subsequent invocations of the mentioned functions (or :meth:`Pipeline.release_outputs`) invalidate
the TensorList (as well as any DALI :ref:`Tensors<Tensor>` obtained from it) and indicate to DALI
that the memory can be used for something else.

TensorList wraps the outputs of the current iteration and is valid only for the duration of that
iteration. Using the TensorList after moving to the next iteration is not allowed.
If you wish to retain the data, copy it before indicating to DALI that you released it.

For typical use cases, for example when DALI is used through :ref:`DL Framework Plugins`,
no additional memory bookkeeping is necessary.
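The lifetime rule above can be illustrated with a stdlib-only sketch: `memoryview` stands in for a TensorList (a zero-copy view), and a reused `bytearray` stands in for DALI recycling its output buffers. This is an analogy, not DALI's actual implementation:

```python
# A producer that reuses one buffer across "iterations", the way a
# pipeline recycles output memory once the outputs are released.
buf = bytearray(b"iteration-0")

view = memoryview(buf)   # zero-copy view into the buffer (the "TensorList")
snapshot = bytes(view)   # explicit copy, made while the view is still valid

# The producer moves to the next iteration and overwrites the memory.
buf[len(b"iteration-")] = ord("1")

# The zero-copy view now shows the new data; only the copy kept the old one.
assert bytes(view) == b"iteration-1"
assert snapshot == b"iteration-0"
```

The same pattern applies to DALI outputs: anything obtained zero-copy from a TensorList must be deep-copied before the next iteration if it is to be kept.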

.. currentmodule:: nvidia.dali.backend

Expand All @@ -33,14 +44,14 @@ TensorCPU
.. autoclass:: TensorCPU
:members:
:undoc-members:
-:special-members: __init__
+:special-members: __init__, __array_interface__
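The `__array_interface__` property documented here follows NumPy's generic producer protocol, which can be sketched without DALI. `FakeCpuTensor` below is a hypothetical stand-in for a CPU tensor, and the sketch assumes NumPy is installed:

```python
import ctypes
import numpy as np

class FakeCpuTensor:
    """Minimal producer of the NumPy Array Interface (version 3)."""

    def __init__(self):
        # Keep a reference to the buffer so the memory stays alive
        # for as long as this object does.
        self._data = (ctypes.c_float * 4)(1.0, 2.0, 3.0, 4.0)

    @property
    def __array_interface__(self):
        return {
            "shape": (4,),
            "typestr": "<f4",  # little-endian float32
            "data": (ctypes.addressof(self._data), False),  # (ptr, read_only)
            "version": 3,
        }

t = FakeCpuTensor()
a = np.asarray(t)  # NumPy wraps the memory via __array_interface__
assert a.dtype == np.float32
assert a.tolist() == [1.0, 2.0, 3.0, 4.0]
```

Because the wrap is zero-copy, such an array is only valid while the producer's memory is — the same invalidation caveat described for TensorList above.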

TensorGPU
^^^^^^^^^
.. autoclass:: TensorGPU
:members:
:undoc-members:
-:special-members: __init__
+:special-members: __init__, __cuda_array_interface__
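For reference, the CUDA Array Interface that `__cuda_array_interface__` returns is likewise a plain dictionary. The sketch below only illustrates the required fields of version 2 of the protocol; the null `data` pointer is a placeholder, not a usable device allocation:

```python
# Illustrative fields of CUDA Array Interface version 2. A real producer
# (such as TensorGPU) would put a valid device pointer in "data".
cuda_iface = {
    "shape": (2, 3),
    "typestr": "<f4",    # little-endian float32
    "data": (0, False),  # (device pointer, read_only) -- placeholder here
    "version": 2,
    "strides": None,     # None means C-contiguous
}

required = {"shape", "typestr", "data", "version"}
assert required <= set(cuda_iface.keys())
assert cuda_iface["version"] == 2
```

Consumers such as CuPy, Numba, and PyTorch read this attribute to wrap the device memory without a copy, which is what makes zero-copy exchange with DALI's TensorGPU possible.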


.. _layout_str_doc:
3 changes: 3 additions & 0 deletions docs/framework_plugins.rst
@@ -1,3 +1,6 @@

.. _DL Framework Plugins:

DL Framework Plugins
====================

