Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unique vectors #550

Open
wants to merge 9 commits into
base: dev
Choose a base branch
from
Open

Unique vectors #550

wants to merge 9 commits into from

Conversation

TimWhiting
Copy link
Collaborator

@TimWhiting TimWhiting commented Jun 9, 2024

Addresses #544

@anfelor can you take a look at this?

I added a few public apis and a few private helpers.
Public:

  • set
  • update
  • update for local variable vectors

The most important function is the copy function, which ensures that the vector is unique prior to applying any mutation operations. For many cases we would expect it to be unique.

@TimWhiting
Copy link
Collaborator Author

We might want to add a few more apis (cutting an array short and dropping the the rest of the array), or extending an array.

Copy link
Collaborator

@anfelor anfelor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @TimWhiting, this looks a lot friendlier than my draft! I made a few minor comments regarding naming of variables and documentation.

I worry a bit that the unsafe surface area is quite large still. I think we can probably pay for two is-unique checks per iteration of map (given that the second can be well-predicted). Then we can make this API more clean by incorporating the copy calls into unsafe-assign and unsafe-idx-own.

@anfelor
Copy link
Collaborator

anfelor commented Jun 21, 2024

We might want to add a few more apis (cutting an array short and dropping the the rest of the array), or extending an array.

Yes, that would be fun! I think we could also support vector.filter, which would require us to write box_null into the tail of the vector and keep a separate length field. In OCaml, this is unsupported since box_null might not be an inhabitant of the type of the elements though, and I wonder if there is a problem that I don't see currently.

@TimWhiting
Copy link
Collaborator Author

We might want to add a few more apis (cutting an array short and dropping the the rest of the array), or extending an array.

Yes, that would be fun! I think we could also support vector.filter, which would require us to write box_null into the tail of the vector and keep a separate length field. In OCaml, this is unsupported since box_null might not be an inhabitant of the type of the elements though, and I wonder if there is a problem that I don't see currently.

I think these sorts of APIs are maybe best left to a resizeable vector type, which could hold on to the length like you mention and have a resize policy for add / extending the vector as well. What do you think?

Copy link
Collaborator

@anfelor anfelor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@TimWhiting TimWhiting requested a review from daanx June 23, 2024 00:51
@TimWhiting TimWhiting added this to the Small Release milestone Dec 21, 2024
@TimWhiting TimWhiting marked this pull request as draft January 19, 2025 16:57
@TimWhiting TimWhiting mentioned this pull request Jan 19, 2025
@TimWhiting
Copy link
Collaborator Author

TimWhiting commented Jan 20, 2025

I've found a bug with this. Specifically, it behaves poorly with exceptions. Note that this doesn't affect all of the other vector functions, just the mutation of vector elements in local variables (i.e. assign/update). Added a few tests that show the problem.

I think we can still do efficient update of mutable variables, but it will require some runtime changes.
Proposed runtime adjustment: have another special refcount (like STICKY), for REUSE_NO_DROP. For updates to variables, the ref that contains the datastructure needs to be kept alive until the ref is deallocated, but the datastructure itself should have a unique refcount and be reused if it can. However, it shouldn't deallocate itself, if for example an exception is thrown, since the old value could still be obtained via the ref, even if the original value was consumed in the update prior to or after the exception was thrown.

Turning this to draft for now, until we can discuss this, though I think we should probably move ahead with the other changes which are independent from the local variable change.

@TimWhiting TimWhiting marked this pull request as ready for review January 31, 2025 02:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants