C API 2.0 Tensor and TensorList #5799

Merged
merged 11 commits into from
Mar 3, 2025

Conversation

Contributor

@mzient mzient commented Jan 31, 2025

Category:

New feature (non-breaking change which adds functionality)

Description:

This PR adds wrappers for Tensor and TensorList that are later used to implement the C interface for those objects.
This PR also adds the missing API functions for setting the tensor/sample source info.

Additional information:

The implementation follows the general scheme:

  • C functions translate arguments, return values and errors to/from C++
  • C++ classes implement C API functions in a 1-to-1 fashion, e.g.:
    • daliTensorGetDesc <-> ITensor::GetDesc
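The scheme above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: every name in it (`SketchResult`, `SketchTensorHandle`, `ISketchTensor`, `SketchTensor`, `Translate`, `sketchTensorGetNDim`) is hypothetical, and the mapping from exception types to result codes is an assumption. The real handle types, error codes and functions are declared in dali.h.

```cpp
#include <cassert>
#include <new>
#include <stdexcept>

// Hypothetical result codes; the real ones are declared in dali.h.
enum SketchResult {
  SKETCH_SUCCESS = 0,
  SKETCH_ERROR_INVALID_ARGUMENT,
  SKETCH_ERROR_OUT_OF_RANGE,
  SKETCH_ERROR_OUT_OF_MEMORY,
  SKETCH_ERROR_INTERNAL
};

// Opaque handle type seen by C callers (plays the role of _DALITensor).
struct SketchTensorHandle {};

// C++ interface with one method per C function (plays the role of ITensor).
class ISketchTensor : public SketchTensorHandle {
 public:
  virtual ~ISketchTensor() = default;
  virtual int GetNDim() const = 0;
};

// Trivial implementation used only to exercise the wrapper.
class SketchTensor : public ISketchTensor {
 public:
  explicit SketchTensor(int ndim) : ndim_(ndim) { assert(ndim >= 0); }
  int GetNDim() const override { return ndim_; }

 private:
  int ndim_;
};

// Runs `body` and maps any C++ exception to a C result code,
// so that no exception crosses the C boundary.
template <typename F>
SketchResult Translate(F &&body) noexcept {
  try {
    body();
    return SKETCH_SUCCESS;
  } catch (const std::invalid_argument &) {
    return SKETCH_ERROR_INVALID_ARGUMENT;
  } catch (const std::out_of_range &) {
    return SKETCH_ERROR_OUT_OF_RANGE;
  } catch (const std::bad_alloc &) {
    return SKETCH_ERROR_OUT_OF_MEMORY;
  } catch (...) {
    return SKETCH_ERROR_INTERNAL;
  }
}

// C entry point: validates and converts arguments, then forwards 1:1
// to the matching method of the C++ interface.
extern "C" SketchResult sketchTensorGetNDim(SketchTensorHandle *tensor, int *out_ndim) {
  return Translate([&] {
    if (!tensor)
      throw std::invalid_argument("The tensor handle must not be NULL.");
    if (!out_ndim)
      throw std::invalid_argument("The output pointer must not be NULL.");
    *out_ndim = static_cast<ISketchTensor *>(tensor)->GetNDim();
  });
}
```

Because the interface class derives from the opaque handle struct, a handle received from C can be cast back to the interface with a plain `static_cast`, and the C function body reduces to argument checks plus one forwarded call.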

Affected modules and functionalities:

Key points relevant for the review:

Tests:

  • Existing tests apply
  • New tests added
    • Python tests
    • GTests
    • Benchmark
    • Other
  • N/A

Checklist

Documentation

The relevant API functions are documented in dali.h. The C++ classes correspond to those 1:1. Copying the documentation would be counterproductive and would create an opportunity for the C header and the implementation to drift apart.

  • Existing documentation applies
  • Documentation updated
    • Docstring
    • Doxygen
    • RST
    • Jupyter
    • Other
  • N/A

DALI team only

Requirements

  • Implements new requirements
  • Affects existing requirements
  • N/A

REQ IDs: N/A

JIRA TASK: N/A

@mzient mzient mentioned this pull request Jan 31, 2025
18 tasks
@mzient mzient force-pushed the C_API2_data_objects branch from 04ef2a5 to e27aa11 Compare February 1, 2025 15:50
Collaborator

@szkarpinski szkarpinski left a comment

Leaving some initial comments

// Interfaces
//////////////////////////////////////////////////////////////////////////////

class ITensor : public _DALITensor, public RefCountedObject {
Collaborator

Do we use this ISomething convention in the code elsewhere?

@@ -0,0 +1,414 @@
// Copyright (c) 2025, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
Collaborator

Are you testing non-contiguous tensor lists here?

Contributor Author

I'm testing daliTensorListAttachBuffer with custom offsets.
The test for daliTensorListAttachSamples is missing. I'll add.
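To illustrate what "custom offsets" means for contiguity, here is a small sketch. It does not call the DALI API; `SamplePointers` and `IsDenselyPacked` are hypothetical helpers, and the assumption is that each sample lives at a given byte offset from the start of one shared buffer. With arbitrary offsets the samples may be reordered or separated by gaps, which is the non-contiguous case asked about.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the start address of each sample inside one shared buffer,
// given the byte offset of every sample from the buffer's base.
std::vector<const char *> SamplePointers(const void *base,
                                         const std::vector<std::ptrdiff_t> &byte_offsets) {
  std::vector<const char *> ptrs;
  ptrs.reserve(byte_offsets.size());
  for (std::ptrdiff_t off : byte_offsets)
    ptrs.push_back(static_cast<const char *>(base) + off);
  return ptrs;
}

// True if the first sample starts at offset 0 and every other sample
// starts exactly where the previous one ends (no gaps, no reordering).
bool IsDenselyPacked(const std::vector<std::ptrdiff_t> &byte_offsets,
                     const std::vector<std::size_t> &byte_sizes) {
  assert(byte_offsets.size() == byte_sizes.size());
  std::ptrdiff_t expected = 0;
  for (std::size_t i = 0; i < byte_offsets.size(); i++) {
    if (byte_offsets[i] != expected)
      return false;
    expected += static_cast<std::ptrdiff_t>(byte_sizes[i]);
  }
  return true;
}
```

A test that attaches a buffer with offsets such as `{32, 0, 16}` for three 8-byte samples exercises the non-dense path, since `IsDenselyPacked` would return false for that layout.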

Contributor Author

Done.

@mzient mzient force-pushed the C_API2_data_objects branch from e27aa11 to 361bf44 Compare February 10, 2025 15:00
@mzient mzient force-pushed the C_API2_data_objects branch 3 times, most recently from b0511b2 to 539de4e Compare February 13, 2025 12:01
}

//////////////////////////////////////////////////////////////////////////////
// Tensor
Collaborator

It's missing impl for daliTensorGetShape, right?

Contributor Author

It was missing; added.

Contributor Author

Done.

Collaborator

@banasraf banasraf Feb 24, 2025

@mzient also missing:
daliTensorGetBufferPlacement
daliTensorListGetBufferPlacement
and source methods
daliTensorGetSourceInfo
daliTensorListGetSourceInfo

Contributor Author

Done. Also added setters.

@mzient mzient force-pushed the C_API2_data_objects branch 5 times, most recently from bbbca35 to 28a3201 Compare February 20, 2025 11:49
Collaborator

@szkarpinski szkarpinski left a comment

Looks good overall.

My only major concern is the lack of documentation. Some of the functions perform complex operations, the details of which are not clear from their signatures. Even minimal Doxygen comments would help a lot.

virtual const TensorShape<> &GetShape() const & = 0;

template <typename Backend>
const std::shared_ptr<Tensor<Backend>> &Unwrap() const &;
Collaborator

Some function names are self-explanatory, but some are not obvious. How about adding some minimal docstrings?

Contributor Author

Done.

throw std::invalid_argument(
"The number of dimensions must not be negative when num_samples is 0.");
else
ndim = samples[0].ndim;
Collaborator

This is a good feature, but definitely should be documented

Contributor Author

@mzient mzient Feb 24, 2025

It is. See daliTensorListAttachSamples:

 * @param ndim            the number of dimensions in each sample;
 *                        if num_samples > 0, this value can be set to -1 and the number of
 *                        dimensions will be taken from samples[0].ndim
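The documented rule for `ndim` can be sketched as a standalone helper. This is an illustration under assumptions, not the code from this PR: `SketchTensorDesc` and `ResolveNDim` are hypothetical names, only the `ndim` field is modelled, and the final consistency check across samples is an added assumption about how mismatched samples might be rejected.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for a sample descriptor; only `ndim` matters here.
struct SketchTensorDesc {
  int ndim;
};

// Resolves the dimensionality of a list of samples.
// A negative `ndim` means "infer it from samples[0].ndim",
// which is only possible when there is at least one sample.
int ResolveNDim(int ndim, const SketchTensorDesc *samples, int num_samples) {
  if (num_samples < 0)
    throw std::invalid_argument("The number of samples must not be negative.");
  if (num_samples > 0 && !samples)
    throw std::invalid_argument("The samples pointer must not be NULL.");
  if (ndim < 0) {
    if (num_samples == 0)
      throw std::invalid_argument(
          "The number of dimensions must not be negative when num_samples is 0.");
    ndim = samples[0].ndim;
  }
  // Assumption: all samples must agree with the resolved dimensionality.
  for (int i = 0; i < num_samples; i++) {
    if (samples[i].ndim != ndim)
      throw std::invalid_argument("All samples must have the same number of dimensions.");
  }
  return ndim;
}
```

With two 3-D samples, `ResolveNDim(-1, samples, 2)` yields 3, while `ResolveNDim(-1, nullptr, 0)` throws because there is no sample to infer from.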

Comment on lines 588 to 589
throw std::out_of_range(make_string("The sample index ", sample, " is out of range. "
"Valid indices are [0..", shape.num_samples() - 1, "]."));
Collaborator

Nitpick: This will look strange for 0 samples

Contributor Author

True.

Contributor Author

Fixed.
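The conversation does not show the actual fix, so the following is only one possible way to avoid printing `[0..-1]` for an empty list. `SampleIndexError` is a hypothetical helper, and it uses `std::ostringstream` in place of the `make_string` utility seen in the snippet above.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Builds the out-of-range message, with a separate wording for an
// empty tensor list so that the range "[0..-1]" is never printed.
std::string SampleIndexError(int sample, int num_samples) {
  std::ostringstream msg;
  msg << "The sample index " << sample << " is out of range. ";
  if (num_samples > 0)
    msg << "Valid indices are [0.." << num_samples - 1 << "].";
  else
    msg << "The tensor list is empty.";
  return msg.str();
}
```

The resulting string would then be passed to `std::out_of_range` at the point where the index check fails.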

@mzient mzient force-pushed the C_API2_data_objects branch 2 times, most recently from 18f285c to 71d2513 Compare February 24, 2025 17:05
Contributor Author

mzient commented Feb 24, 2025

> Looks good overall.
>
> My only major concern is the lack of documentation. Some of the functions perform complex operations, the details of which are not clear from their signatures. Even minimal Doxygen comments would help a lot.

Would it be enough to refer to the relevant C API function? Those are quite thoroughly documented.

@dali-automaton
Collaborator

CI MESSAGE: [24487279]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [24487279]: BUILD PASSED

mzient and others added 4 commits March 3, 2025 10:25
Signed-off-by: Michał Zientkiewicz <[email protected]>
* Add generic validation.
* Move ToOptional to a new utils.h header.

Signed-off-by: Michal Zientkiewicz <[email protected]>
Signed-off-by: Michal Zientkiewicz <[email protected]>
mzient and others added 6 commits March 3, 2025 10:25
Signed-off-by: Michal Zientkiewicz <[email protected]>
Signed-off-by: Michal Zientkiewicz <[email protected]>
Signed-off-by: Michał Zientkiewicz <[email protected]>
Signed-off-by: Michał Zientkiewicz <[email protected]>
@mzient mzient force-pushed the C_API2_data_objects branch from bdc1621 to 2cd34ef Compare March 3, 2025 09:28
@dali-automaton
Collaborator

CI MESSAGE: [24811190]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [24811409]: BUILD STARTED

Signed-off-by: Michał Zientkiewicz <[email protected]>
@mzient mzient force-pushed the C_API2_data_objects branch from 475a3ac to 205c558 Compare March 3, 2025 12:25
@dali-automaton
Collaborator

CI MESSAGE: [24811504]: BUILD STARTED

@NVIDIA NVIDIA deleted a comment from dali-automaton Mar 3, 2025
@dali-automaton
Collaborator

CI MESSAGE: [24811504]: BUILD PASSED

@mzient mzient merged commit 7da1a0c into NVIDIA:main Mar 3, 2025
6 checks passed
6 participants