Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve fromList for Data.Map and Data.Set #330

Closed
treeowl opened this issue Sep 4, 2016 · 1 comment
Closed

Improve fromList for Data.Map and Data.Set #330

treeowl opened this issue Sep 4, 2016 · 1 comment

Comments

@treeowl
Copy link
Contributor

treeowl commented Sep 4, 2016

Currently, we try to be very efficient about lists that ascend from the start for most of their length. This seems a tad specific. I suspect what we'd really like to use is an algorithm more like that of Data.List.sort. That is, break up the input into ascending and descending segments, convert each segment, and use union to join the segments together. Unfortunately, the current code is not structured in a way that makes this terribly easy.

Another (less serious) problem is that fromAscListWithKey and fromDescListWithKey preprocess their input lists to squash duplicates. When there aren't any duplicates, this allocates a bunch of short-lived conses. It would be nice to find a way to avoid this, but it's probably not terribly important. I made a brief attempt, but it got horrible pretty quickly.

Another (more serious?) problem is that Data.Map.fromAscList is defined in terms of fromAscListWithKey, which in the lazy case seems rather likely to produce an avoidable space leak. (Fixed in #332)

@meooow25
Copy link
Contributor

meooow25 commented Jan 5, 2025

Considering this resolved after #1042.

That is, break up the input into ascending and descending segments, convert each segment, and use union to join the segments together.

I did experiment with unioning multiple ascending segments, but it made the worst case slower.

-- More complicated implementations are possible, such as repeatedly
-- accumulating runs of increasing elements in Stacks (not just once) and
-- union-ing with an accumulated Set, but this makes the worst case somewhat
-- slower (~10%).

Another (less serious) problem is that fromAscListWithKey and fromDescListWithKey preprocess their input lists to squash duplicates. When there aren't any duplicates, this allocates a bunch of short-lived conses.

This is fixed by #1083.

@meooow25 meooow25 closed this as completed Jan 5, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants