Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Layouts now store a reference to array_data instead of a value #51

Merged
merged 5 commits into from
Apr 8, 2024

Conversation

JohanMabille
Copy link
Collaborator

Fixes #50

include/sparrow/fixed_size_layout.hpp Outdated Show resolved Hide resolved
@@ -121,7 +121,7 @@ namespace sparrow
// using iterator = layout_iterator<self_type, false>;
// using const_iterator = layout_iterator<self_type, true>;

explicit fixed_size_layout(array_data p);
explicit fixed_size_layout(array_data& data);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, can layouts be moved or copied? My assumption is yes but if so we need to add ways for these operations to be safe, in particular in the context of moving array_data objects around (if that would happen). Because now that it's a pointer being stored, moved-from layouts might lead to UB?

Copy link
Collaborator Author

@JohanMabille JohanMabille Mar 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah good catch, the overall structure would be something like:

tempate < .... >
class typed_array
{

private:
    array_data m_data;
    layout_type m_layout;
];

With m_layout referencing m_data, and everything being movable / copiable ...

Copy link
Collaborator Author

@JohanMabille JohanMabille Mar 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One solution would be to move m_data into a std::unique_ptr to guarantee in-memory stability.
Another solution is to have the layout classes provide a void reset(array_data&) method to update its pointer after the array_data has been moved.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A third option is to delete copy and move semantics of layouts and reinstantiate them each time we copy / move a typed_array. This is actually my favorite option.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So that leads to questions like:

  • if we put a unique_ptr in typed_array , is the cost of allocation a problem? (assuming both members would be contained in the allocated object, like a pimpl)
  • is it ok to implement typed_array's copy operations manually? what happens to m_layout if it's not copyable but we need to copy the typed_array?

I feel like keeping layouts copy and move enabled might be the simplest solution, not sure...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The copy / move constructor of typed_array would be something like:

typed_array(const typed_array& rhs)
    : m_data(rhs.m_data)
    , m_layout(m_data)
{
}

typed_array(typed_array&& rhs)
    : m_data(std::move(rhs.m_data))
    , m_layout(m_data)
{
}

The issue is for assignment operators (copy and move). We probably need an additional "reset_array_data" method to be able to implement these operators.

The issue with keeping the copy and move semantic is that it will be sliently wrong the layout ogf a copied array will still refer to the array_data of the initial array.

if we put a unique_ptr in typed_array , is the cost of allocation a problem? (assuming both members would be contained in the allocated object, like a pimpl)

I don't really know; I would say that as much as we can avoid dynamic allocation, we should do it. Especially if the copy / move semantic of typed_array is "easy enough" to implement without the unique_ptr (even if I admit that "easy enough" is totally suggestive)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say that as much as we can avoid dynamic allocation, we should do it.

I agree.

The issue is for assignment operators (copy and move). We probably need an additional "reset_array_data" method to be able to implement these operators.

In those cases, only the m_data should be moved and copied, as m_layout always refers to the local m_data. No need to change m_layout... Or am I missing something? Does m_layout have an actual state? Depends on the layout type?

Copy link
Collaborator Author

@JohanMabille JohanMabille Mar 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In those cases, only the m_data should be moved and copied, as m_layout always refers to the local m_data. No need to change m_layout... Or am I missing something?

Oh indeed, the layout will still point to the same object, so no need to do anything!

Does m_layout have an actual state? Depends on the layout type?

For me a layout class is just a "way" to read the data, it should be stateless.

{
// We only require the presence of the bitmap and the first buffer.
assert(m_data.buffers.size() > 0);
assert(m_data.length == m_data.bitmap.size());
assert(p_data->buffers.size() > 0);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might need a free function checking if an array_data seems well-formed, and use that in asserts too.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whether an array_data is well-formed is actually specific to the layout (even if some layouts share some invariants).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case it might be necessary to make a concept for layouts that requests, among other thing, that these layout types provide each a way to check if the provided array_data matches the expected structure.
Then, because of the nature of the construction of the layout, the check must be done whatever the build context and then we must throw an exception (or abort, that depends on what we decide to support as error handling) or keep assertions and document that providing an array_data to constructor layout that doesnt match the expected structure is undefined (and maybe provide a way to activate the check).
Lots of tradeoff choices here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should do this in a dedicated PR. I'll open an issue to track it.

@JohanMabille JohanMabille changed the title Layouts now store a pointer to array_data instead of a value Layouts now store a reference to array_data instead of a value Mar 28, 2024
Copy link
Collaborator

@Klaim Klaim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, dont forget to note the things to do in future prs

@JohanMabille JohanMabille merged commit 9887596 into man-group:main Apr 8, 2024
20 checks passed
@JohanMabille JohanMabille deleted the array_data_ref branch April 8, 2024 12:59
@JohanMabille JohanMabille mentioned this pull request Apr 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Layout should not own array_data
4 participants