Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improved representation for Set and Map #1073

Open
meooow25 opened this issue Nov 28, 2024 · 3 comments
Open

Improved representation for Set and Map #1073

meooow25 opened this issue Nov 28, 2024 · 3 comments

Comments

@meooow25
Copy link
Contributor

meooow25 commented Nov 28, 2024

Consider changing the representation to avoid Tips.

Suggested by wrengr in #1069 (comment)


I'm thinking

data Set a
  = Bin !Int !a !(Set a) !(Set a)
  | Two !a !a
  | One !a
  | Nil -- Invariant: Never a child of Bin

This hopefully improves performance but certainly improves memory usage. Compared to today, for n elements we can avoid storing ~n Tips and ~n/2-2n/3 Ints.

Alternately, as midway between current and the above,

data Set a
  = Bin !Int !a !(Set a) !(Set a)
  | One !a
  | Tip
-- Invariant: Bin _ x Tip Tip is always replaced with One x

This puts the match-on-singleton idea of #1069 into the type. Avoids storing ~2n/3-n Tips and ~n/3-n/2 Ints.

@meooow25
Copy link
Contributor Author

My suggestion to evaluate these:

  • Adopt the most important functions (say fromList, insert, delete, member/lookup, union, intersection) to work on these alternatives in a separate repo. This is to avoid having to change everything if you try to edit Data.Set.Internal or Data.Map.Internal instead.
  • Adopt property tests, to be safe. No need to write tests separately for each structure, CPP can toggle the import.
  • Adopt benchmarks for the implemented functions.
  • Run the benchmarks and compare the results to see which representation is better.

I can test this, but it is a decent amount of work and it will be a few weeks before I can start.

@HuwCampbell might you be interested in trying this out, as an extension of #1069?

@HuwCampbell
Copy link

Sounds good, but I'm a bit under the pump with work.
I should have some more time in the new year.

@meooow25
Copy link
Contributor Author

Thanks. I might be able to get to it sooner than that, will update here if I do.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants