Make size run in O(1) #170

Draft
wants to merge 2 commits into base: master


Contributor

@rockbmb rockbmb commented Oct 11, 2017

To fix #138, this PR modifies the HashMap datatype and its associated functions so that size :: (Eq k, Hashable k) => HashMap k v -> Int runs in constant rather than linear time.

To summarize what was done (not in this order):

  • the current HashMap type (as it exists before this PR is merged) was renamed Tree and is no longer visible to the end user;
  • a wrapper, data HashMap k v = HashMap !Int !(Tree k v), was created and is now the user-facing HashMap datatype. Here, the !Int field holds the hashmap's size;
  • every function that may change a hashmap's size now wraps an internal variant of itself, which returns a tuple containing the change in the hashmap's size;
  • the wrapper function is responsible for updating the Int field in HashMap after calling its internal variant;
  • because of all this, size is now defined as size (HashMap sz _) = sz, which runs in O(1).
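
The steps above can be sketched roughly as follows. This is only an illustration, not the code in this PR: a plain binary search tree keyed by Ord stands in for the real hash-array-mapped Tree (so the (Eq k, Hashable k) constraints are replaced by Ord k), and the names empty, insertInternal and insert are chosen here for the example.

```haskell
-- Hypothetical stand-in for the internal tree. The real Tree is the
-- former HashMap type; a binary search tree keeps the sketch short.
data Tree k v = Leaf | Node (Tree k v) !k v (Tree k v)

-- User-facing wrapper: the cached size plus the underlying tree.
data HashMap k v = HashMap !Int !(Tree k v)

empty :: HashMap k v
empty = HashMap 0 Leaf

-- Internal variant: returns the change in size (0 or 1) together
-- with the updated tree.
insertInternal :: Ord k => k -> v -> Tree k v -> (Int, Tree k v)
insertInternal k v Leaf = (1, Node Leaf k v Leaf)
insertInternal k v (Node l k' v' r) =
  case compare k k' of
    LT -> let (d, l') = insertInternal k v l in (d, Node l' k' v' r)
    GT -> let (d, r') = insertInternal k v r in (d, Node l k' v' r')
    EQ -> (0, Node l k v r)  -- key already present: size is unchanged

-- Wrapper: calls the internal variant and applies the size delta.
insert :: Ord k => k -> v -> HashMap k v -> HashMap k v
insert k v (HashMap sz t) =
  let (d, t') = insertInternal k v t
  in HashMap (sz + d) t'

-- O(1): just reads the cached field.
size :: HashMap k v -> Int
size (HashMap sz _) = sz
```

Inserting a key that is already present yields a delta of 0, so the cached size only grows when a new key is added.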

The most important parts of the code are here, but I plan on updating the PR with some commits adding better comments throughout. I am submitting this PR now because it will take a while to be reviewed and merged, and the sooner that process begins the better.

As an added bonus, (==)/(/=) will now short-circuit whenever the two hashmaps being compared don't have the same size.
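
A minimal sketch of that short-circuit, assuming the wrapper layout described above. Data.Map.Strict is used here purely as a stand-in for the internal Tree (it happens to track its own size already, so the sketch only shows the shape of the comparison), and fromList is a helper written for this example.

```haskell
import qualified Data.Map.Strict as M

-- Stand-in wrapper: cached size plus a Data.Map in place of Tree.
data HashMap k v = HashMap !Int !(M.Map k v)

-- Helper for the example: build the wrapper and cache the size.
fromList :: Ord k => [(k, v)] -> HashMap k v
fromList xs = let t = M.fromList xs in HashMap (M.size t) t

-- (&&) is lazy in its second argument, so the trees are only
-- compared when the cached sizes already agree.
instance (Eq k, Eq v) => Eq (HashMap k v) where
  HashMap n1 t1 == HashMap n2 t2 = n1 == n2 && t1 == t2
```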

UPDATE (15/03/2018): another plus is that intersection{With,WithKey}, union{With,WithKey} now compare their argument hashmaps' sizes, and iterate over the smaller one.
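
The idea of iterating over the smaller argument could look roughly like this for intersectionWith. Again this is a sketch under assumptions, not the PR's implementation: Data.Map.Strict stands in for the internal Tree, and fromList and toList are helpers written for the example.

```haskell
import qualified Data.Map.Strict as M

-- Stand-in wrapper: cached size plus a Data.Map in place of Tree.
data HashMap k v = HashMap !Int !(M.Map k v)

fromList :: Ord k => [(k, v)] -> HashMap k v
fromList xs = let t = M.fromList xs in HashMap (M.size t) t

toList :: HashMap k v -> [(k, v)]
toList (HashMap _ t) = M.toList t

-- Walk whichever map is smaller and look each key up in the larger
-- one. The combining function always receives the value from the
-- first map first, regardless of which map is traversed.
intersectionWith :: Ord k
                 => (a -> b -> c) -> HashMap k a -> HashMap k b -> HashMap k c
intersectionWith f (HashMap n1 t1) (HashMap n2 t2)
  | n1 <= n2  = build (M.mapMaybeWithKey (\k a -> f a <$> M.lookup k t2) t1)
  | otherwise = build (M.mapMaybeWithKey (\k b -> (\a -> f a b) <$> M.lookup k t1) t2)
  where
    -- Re-wrap the result with its size.
    build t = HashMap (M.size t) t
```

Because the cached sizes are available in O(1), choosing the traversal side costs a single integer comparison.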

cc @tibbe @treeowl

Collaborator

treeowl commented Oct 11, 2017

Aside from my inline comments, I also have a general concern about this. Making HashMap a wrapper is probably fine when it can be unboxed, although of course there might be some effects from register pressure, instruction cache, etc. But when it doesn't get unboxed, there's an extra indirection at the top of the tree. This will happen, for example, if someone makes a HashMap, Array, etc., full of HashMaps. We'll need benchmarks specifically designed to look at this issue.

rockbmb force-pushed the make-size-const branch 3 times, most recently from 067c891 to de6b771 (October 12, 2017 06:02)
Contributor Author

rockbmb commented Oct 12, 2017

The fromListWith strictness test in CI is failing, and I'm not yet sure why. After this is fixed, I'll work on those benchmarks @treeowl.

UPDATE: It was just a bug with 'unsafeInsertWithInternal', fixed.

rockbmb force-pushed the make-size-const branch 7 times, most recently from 7d190bd to a4f9bc6 (October 12, 2017 22:14)
Contributor Author

rockbmb commented Oct 13, 2017

@treeowl regarding benchmarks, if someone has an Array/Vector/HashMap/Map/.. of HashMaps, in what kind of operations may the indirection you mentioned cause problems?

Contributor Author

rockbmb commented Oct 14, 2017

Regarding the existing benchmarks: the data they use is generated with the same seed, so the maps are identical from run to run. I cloned this repository elsewhere and ran the benchmarks a few times for both this branch and master.

Current implementation is on the left, this PR's code on the right:

benchmarking HashMap/lookup/String                           benchmarking HashMap/lookup/String
time                 825.1 μs   (818.0 μs .. 830.8 μs)       time                 812.1 μs   (808.4 μs .. 817.1 μs)
                     0.999 R²   (0.999 R² .. 1.000 R²)                            1.000 R²   (1.000 R² .. 1.000 R²)
mean                 776.9 μs   (765.6 μs .. 786.6 μs)       mean                 757.9 μs   (747.5 μs .. 766.9 μs)
std dev              31.40 μs   (24.58 μs .. 37.85 μs)       std dev              29.49 μs   (24.90 μs .. 35.37 μs)
variance introduced by outliers: 30% (moderately inflated)   variance introduced by outliers: 29% (moderately inflated)
                                                             
benchmarking HashMap/lookup/ByteString                       benchmarking HashMap/lookup/ByteString
time                 307.4 μs   (304.3 μs .. 312.7 μs)       time                 287.1 μs   (285.0 μs .. 288.8 μs)
                     0.994 R²   (0.991 R² .. 0.997 R²)                            1.000 R²   (0.999 R² .. 1.000 R²)
mean                 306.9 μs   (300.7 μs .. 312.9 μs)       mean                 265.7 μs   (261.8 μs .. 269.0 μs)
std dev              18.31 μs   (15.95 μs .. 21.64 μs)       std dev              10.39 μs   (8.187 μs .. 13.52 μs)
variance introduced by outliers: 54% (severely inflated)     variance introduced by outliers: 33% (moderately inflated)
                                                             
benchmarking HashMap/lookup/Int                              benchmarking HashMap/lookup/Int
time                 181.1 μs   (178.6 μs .. 185.3 μs)       time                 170.0 μs   (166.4 μs .. 176.3 μs)
                     0.995 R²   (0.992 R² .. 0.998 R²)                            0.992 R²   (0.988 R² .. 0.995 R²)
mean                 172.9 μs   (170.5 μs .. 177.2 μs)       mean                 168.4 μs   (164.4 μs .. 171.7 μs)
std dev              8.822 μs   (6.386 μs .. 12.12 μs)       std dev              10.32 μs   (8.075 μs .. 13.87 μs)
variance introduced by outliers: 48% (moderately inflated)   variance introduced by outliers: 58% (severely inflated)
                                                             
benchmarking HashMap/lookup-miss/String                      benchmarking HashMap/lookup-miss/String
time                 485.8 μs   (470.0 μs .. 498.8 μs)       time                 279.3 μs   (276.0 μs .. 281.7 μs)
                     0.997 R²   (0.996 R² .. 1.000 R²)                            0.998 R²   (0.997 R² .. 0.999 R²)
mean                 437.7 μs   (430.1 μs .. 446.6 μs)       mean                 248.0 μs   (240.5 μs .. 254.5 μs)
std dev              25.10 μs   (19.19 μs .. 34.43 μs)       std dev              20.53 μs   (18.99 μs .. 23.45 μs)
variance introduced by outliers: 50% (moderately inflated)   variance introduced by outliers: 70% (severely inflated)
                                                             
benchmarking HashMap/lookup-miss/ByteString                  benchmarking HashMap/lookup-miss/ByteString
time                 217.6 μs   (214.5 μs .. 222.4 μs)       time                 157.6 μs   (154.8 μs .. 161.3 μs)
                     0.997 R²   (0.995 R² .. 0.998 R²)                            0.997 R²   (0.994 R² .. 1.000 R²)
mean                 209.0 μs   (206.2 μs .. 211.7 μs)       mean                 144.6 μs   (141.8 μs .. 148.1 μs)
std dev              7.665 μs   (6.626 μs .. 9.098 μs)       std dev              8.944 μs   (6.593 μs .. 12.77 μs)
variance introduced by outliers: 32% (moderately inflated)   variance introduced by outliers: 59% (severely inflated)
                                                             
benchmarking HashMap/lookup-miss/Int                         benchmarking HashMap/lookup-miss/Int
time                 163.6 μs   (156.2 μs .. 170.7 μs)       time                 145.0 μs   (138.2 μs .. 152.1 μs)
                     0.991 R²   (0.988 R² .. 0.996 R²)                            0.991 R²   (0.989 R² .. 0.996 R²)
mean                 149.2 μs   (145.0 μs .. 153.8 μs)       mean                 131.0 μs   (128.1 μs .. 135.0 μs)
std dev              12.69 μs   (10.66 μs .. 15.62 μs)       std dev              10.25 μs   (7.675 μs .. 12.99 μs)
variance introduced by outliers: 74% (severely inflated)     variance introduced by outliers: 71% (severely inflated)
                                                             
benchmarking HashMap/insert/String                           benchmarking HashMap/insert/String
time                 1.708 ms   (1.676 ms .. 1.734 ms)       time                 1.518 ms   (1.472 ms .. 1.564 ms)
                     0.997 R²   (0.996 R² .. 0.999 R²)                            0.995 R²   (0.994 R² .. 0.997 R²)
mean                 1.534 ms   (1.497 ms .. 1.567 ms)       mean                 1.352 ms   (1.312 ms .. 1.385 ms)
std dev              107.9 μs   (88.46 μs .. 128.3 μs)       std dev              111.0 μs   (84.43 μs .. 143.3 μs)
variance introduced by outliers: 52% (severely inflated)     variance introduced by outliers: 60% (severely inflated)
                                                             
benchmarking HashMap/insert/ByteString                       benchmarking HashMap/insert/ByteString
time                 1.728 ms   (1.690 ms .. 1.783 ms)       time                 1.736 ms   (1.717 ms .. 1.754 ms)
                     0.996 R²   (0.994 R² .. 0.997 R²)                            0.998 R²   (0.996 R² .. 0.999 R²)
mean                 1.596 ms   (1.567 ms .. 1.622 ms)       mean                 1.470 ms   (1.416 ms .. 1.520 ms)
std dev              85.97 μs   (76.16 μs .. 96.84 μs)       std dev              158.9 μs   (143.8 μs .. 174.4 μs)
variance introduced by outliers: 39% (moderately inflated)   variance introduced by outliers: 73% (severely inflated)
                                                             
benchmarking HashMap/insert/Int                              benchmarking HashMap/insert/Int
time                 1.210 ms   (1.185 ms .. 1.227 ms)       time                 1.069 ms   (1.047 ms .. 1.092 ms)
                     0.995 R²   (0.993 R² .. 0.997 R²)                            0.997 R²   (0.996 R² .. 0.999 R²)
mean                 1.026 ms   (994.3 μs .. 1.061 ms)       mean                 958.8 μs   (943.4 μs .. 976.4 μs)
std dev              99.72 μs   (87.32 μs .. 111.2 μs)       std dev              51.47 μs   (42.41 μs .. 71.74 μs)
variance introduced by outliers: 69% (severely inflated)     variance introduced by outliers: 41% (moderately inflated)
                                                             
benchmarking HashMap/insert-dup/String                       benchmarking HashMap/insert-dup/String
time                 1.961 ms   (1.879 ms .. 2.027 ms)       time                 845.1 μs   (829.5 μs .. 870.5 μs)
                     0.968 R²   (0.943 R² .. 0.982 R²)                            0.994 R²   (0.992 R² .. 0.996 R²)
mean                 1.459 ms   (1.287 ms .. 1.584 ms)       mean                 849.5 μs   (837.7 μs .. 861.3 μs)
std dev              422.8 μs   (363.2 μs .. 464.9 μs)       std dev              37.04 μs   (32.20 μs .. 43.18 μs)
variance introduced by outliers: 96% (severely inflated)     variance introduced by outliers: 33% (moderately inflated)
                                                             
benchmarking HashMap/insert-dup/ByteString                   benchmarking HashMap/insert-dup/ByteString
time                 427.8 μs   (417.1 μs .. 442.5 μs)       time                 386.5 μs   (383.6 μs .. 388.4 μs)
                     0.995 R²   (0.994 R² .. 0.997 R²)                            0.999 R²   (0.998 R² .. 0.999 R²)
mean                 421.6 μs   (416.2 μs .. 427.2 μs)       mean                 345.9 μs   (335.2 μs .. 354.3 μs)
std dev              16.38 μs   (13.51 μs .. 21.15 μs)       std dev              28.10 μs   (24.83 μs .. 31.41 μs)
variance introduced by outliers: 31% (moderately inflated)   variance introduced by outliers: 68% (severely inflated)
                                                             
benchmarking HashMap/insert-dup/Int                          benchmarking HashMap/insert-dup/Int
time                 546.0 μs   (538.2 μs .. 556.4 μs)       time                 522.2 μs   (511.3 μs .. 531.6 μs)
                     0.998 R²   (0.996 R² .. 0.999 R²)                            0.998 R²   (0.997 R² .. 0.999 R²)
mean                 504.4 μs   (495.7 μs .. 514.5 μs)       mean                 468.1 μs   (457.0 μs .. 479.6 μs)
std dev              27.59 μs   (21.43 μs .. 35.39 μs)       std dev              34.74 μs   (30.22 μs .. 40.10 μs)
variance introduced by outliers: 46% (moderately inflated)   variance introduced by outliers: 62% (severely inflated)
                                                             
benchmarking HashMap/delete/String                           benchmarking HashMap/delete/String
time                 1.578 ms   (1.566 ms .. 1.589 ms)       time                 1.339 ms   (1.303 ms .. 1.400 ms)
                     0.999 R²   (0.998 R² .. 0.999 R²)                            0.993 R²   (0.990 R² .. 0.996 R²)
mean                 1.394 ms   (1.356 ms .. 1.429 ms)       mean                 1.383 ms   (1.364 ms .. 1.400 ms)
std dev              114.8 μs   (104.2 μs .. 128.8 μs)       std dev              56.30 μs   (44.23 μs .. 71.89 μs)
variance introduced by outliers: 60% (severely inflated)     variance introduced by outliers: 27% (moderately inflated)
                                                             
benchmarking HashMap/delete/ByteString                       benchmarking HashMap/delete/ByteString
time                 879.5 μs   (874.2 μs .. 884.3 μs)       time                 877.5 μs   (862.8 μs .. 884.9 μs)
                     0.999 R²   (0.999 R² .. 1.000 R²)                            0.998 R²   (0.997 R² .. 0.999 R²)
mean                 830.0 μs   (822.9 μs .. 837.5 μs)       mean                 772.7 μs   (755.0 μs .. 793.8 μs)
std dev              23.00 μs   (20.15 μs .. 27.19 μs)       std dev              60.44 μs   (53.78 μs .. 66.84 μs)
variance introduced by outliers: 17% (moderately inflated)   variance introduced by outliers: 61% (severely inflated)
                                                             
benchmarking HashMap/delete/Int                              benchmarking HashMap/delete/Int
time                 606.3 μs   (590.1 μs .. 617.2 μs)       time                 553.6 μs   (546.1 μs .. 561.4 μs)
                     0.997 R²   (0.997 R² .. 0.998 R²)                            0.999 R²   (0.998 R² .. 0.999 R²)
mean                 534.4 μs   (521.6 μs .. 548.3 μs)       mean                 526.9 μs   (521.4 μs .. 533.2 μs)
std dev              41.11 μs   (33.19 μs .. 49.58 μs)       std dev              18.26 μs   (15.05 μs .. 22.51 μs)
variance introduced by outliers: 63% (severely inflated)     variance introduced by outliers: 25% (moderately inflated)
                                                             
benchmarking HashMap/delete-miss/String                      benchmarking HashMap/delete-miss/String
time                 452.6 μs   (448.5 μs .. 458.2 μs)       time                 278.3 μs   (271.0 μs .. 287.3 μs)
                     0.998 R²   (0.996 R² .. 0.999 R²)                            0.997 R²   (0.995 R² .. 1.000 R²)
mean                 435.5 μs   (431.0 μs .. 440.3 μs)       mean                 261.6 μs   (257.7 μs .. 264.8 μs)
std dev              13.54 μs   (11.44 μs .. 16.20 μs)       std dev              10.59 μs   (9.044 μs .. 12.37 μs)
variance introduced by outliers: 23% (moderately inflated)   variance introduced by outliers: 35% (moderately inflated)
                                                             
benchmarking HashMap/delete-miss/ByteString                  benchmarking HashMap/delete-miss/ByteString
time                 243.9 μs   (243.3 μs .. 244.6 μs)       time                 226.3 μs   (216.5 μs .. 233.9 μs)
                     1.000 R²   (0.999 R² .. 1.000 R²)                            0.993 R²   (0.990 R² .. 0.997 R²)
mean                 221.6 μs   (216.9 μs .. 224.5 μs)       mean                 203.6 μs   (199.0 μs .. 209.6 μs)
std dev              10.20 μs   (7.591 μs .. 16.48 μs)       std dev              15.99 μs   (13.14 μs .. 18.34 μs)
variance introduced by outliers: 43% (moderately inflated)   variance introduced by outliers: 69% (severely inflated)
                                                             
benchmarking HashMap/delete-miss/Int                         benchmarking HashMap/delete-miss/Int
time                 348.0 μs   (332.4 μs .. 361.6 μs)       time                 321.3 μs   (319.0 μs .. 324.3 μs)
                     0.994 R²   (0.991 R² .. 0.999 R²)                            0.999 R²   (0.998 R² .. 0.999 R²)
mean                 305.8 μs   (298.5 μs .. 316.1 μs)       mean                 307.8 μs   (304.9 μs .. 310.6 μs)
std dev              24.40 μs   (18.02 μs .. 35.36 μs)       std dev              8.846 μs   (6.875 μs .. 11.49 μs)
variance introduced by outliers: 67% (severely inflated)     variance introduced by outliers: 21% (moderately inflated)
                                                             
benchmarking HashMap/union                                   benchmarking HashMap/union
time                 117.3 μs   (115.7 μs .. 119.3 μs)       time                 124.4 μs   (122.6 μs .. 125.7 μs)
                     0.998 R²   (0.997 R² .. 0.999 R²)                            0.999 R²   (0.999 R² .. 1.000 R²)
mean                 109.3 μs   (107.8 μs .. 110.7 μs)       mean                 114.5 μs   (112.9 μs .. 116.1 μs)
std dev              4.333 μs   (3.613 μs .. 5.356 μs)       std dev              4.738 μs   (3.876 μs .. 5.755 μs)
variance introduced by outliers: 38% (moderately inflated)   variance introduced by outliers: 40% (moderately inflated)
                                                             
benchmarking HashMap/map                                     benchmarking HashMap/map
time                 99.16 μs   (97.83 μs .. 100.7 μs)       time                 91.76 μs   (91.14 μs .. 92.48 μs)
                     0.999 R²   (0.997 R² .. 0.999 R²)                            1.000 R²   (0.999 R² .. 1.000 R²)
mean                 89.76 μs   (87.41 μs .. 91.72 μs)       mean                 84.07 μs   (82.49 μs .. 85.35 μs)
std dev              5.898 μs   (4.361 μs .. 7.675 μs)       std dev              4.031 μs   (3.316 μs .. 4.891 μs)
variance introduced by outliers: 64% (severely inflated)     variance introduced by outliers: 48% (moderately inflated)
                                                             
benchmarking HashMap/difference                              benchmarking HashMap/difference
time                 344.2 μs   (342.8 μs .. 345.6 μs)       time                 328.0 μs   (322.9 μs .. 330.9 μs)
                     1.000 R²   (0.999 R² .. 1.000 R²)                            0.997 R²   (0.996 R² .. 0.998 R²)
mean                 318.4 μs   (314.4 μs .. 322.3 μs)       mean                 287.1 μs   (278.9 μs .. 296.1 μs)
std dev              11.76 μs   (10.09 μs .. 13.82 μs)       std dev              26.59 μs   (24.72 μs .. 28.61 μs)
variance introduced by outliers: 30% (moderately inflated)   variance introduced by outliers: 74% (severely inflated)
                                                             
benchmarking HashMap/intersection                            benchmarking HashMap/intersection
time                 354.5 μs   (348.0 μs .. 365.6 μs)       time                 300.8 μs   (298.7 μs .. 302.7 μs)
                     0.997 R²   (0.996 R² .. 0.998 R²)                            0.999 R²   (0.998 R² .. 0.999 R²)
mean                 335.8 μs   (331.0 μs .. 340.4 μs)       mean                 279.6 μs   (275.4 μs .. 283.7 μs)
std dev              14.44 μs   (11.88 μs .. 17.58 μs)       std dev              12.81 μs   (10.67 μs .. 15.23 μs)
variance introduced by outliers: 37% (moderately inflated)   variance introduced by outliers: 41% (moderately inflated)
                                                             
benchmarking HashMap/foldl'                                  benchmarking HashMap/foldl'
time                 31.85 μs   (31.60 μs .. 31.98 μs)       time                 31.11 μs   (30.07 μs .. 32.02 μs)
                     0.999 R²   (0.999 R² .. 1.000 R²)                            0.996 R²   (0.995 R² .. 0.998 R²)
mean                 28.38 μs   (27.51 μs .. 29.12 μs)       mean                 27.71 μs   (27.10 μs .. 28.50 μs)
std dev              2.180 μs   (1.844 μs .. 2.438 μs)       std dev              2.004 μs   (1.365 μs .. 2.577 μs)
variance introduced by outliers: 75% (severely inflated)     variance introduced by outliers: 72% (severely inflated)
                                                             
benchmarking HashMap/foldr                                   benchmarking HashMap/foldr
time                 107.6 μs   (105.1 μs .. 110.3 μs)       time                 47.71 μs   (47.19 μs .. 48.67 μs)
                     0.987 R²   (0.979 R² .. 0.992 R²)                            0.996 R²   (0.994 R² .. 0.998 R²)
mean                 94.54 μs   (90.91 μs .. 97.41 μs)       mean                 45.65 μs   (45.00 μs .. 46.57 μs)
std dev              9.167 μs   (7.074 μs .. 11.45 μs)       std dev              2.217 μs   (1.582 μs .. 3.240 μs)
variance introduced by outliers: 80% (severely inflated)     variance introduced by outliers: 52% (severely inflated)
                                                             
benchmarking HashMap/filter                                  benchmarking HashMap/filter
time                 105.9 μs   (103.2 μs .. 109.7 μs)       time                 129.7 μs   (128.4 μs .. 130.6 μs)
                     0.994 R²   (0.992 R² .. 0.996 R²)                            0.999 R²   (0.998 R² .. 0.999 R²)
mean                 100.8 μs   (98.74 μs .. 103.0 μs)       mean                 117.6 μs   (113.8 μs .. 120.3 μs)
std dev              5.980 μs   (4.971 μs .. 7.160 μs)       std dev              8.720 μs   (6.054 μs .. 11.06 μs)
variance introduced by outliers: 58% (severely inflated)     variance introduced by outliers: 68% (severely inflated)
                                                             
benchmarking HashMap/filterWithKey                           benchmarking HashMap/filterWithKey
time                 68.36 μs   (67.44 μs .. 70.04 μs)       time                 222.0 μs   (218.1 μs .. 224.6 μs)
                     0.995 R²   (0.992 R² .. 0.997 R²)                            0.999 R²   (0.998 R² .. 0.999 R²)
mean                 65.82 μs   (64.71 μs .. 67.48 μs)       mean                 200.4 μs   (195.8 μs .. 204.2 μs)
std dev              3.731 μs   (2.842 μs .. 4.642 μs)       std dev              12.48 μs   (10.12 μs .. 15.62 μs)
variance introduced by outliers: 58% (severely inflated)     variance introduced by outliers: 58% (severely inflated)
                                                             
benchmarking HashMap/size/String                             benchmarking HashMap/size/String
time                 45.83 μs   (45.27 μs .. 46.46 μs)       time                 6.549 ns   (6.508 ns .. 6.600 ns)
                     0.998 R²   (0.996 R² .. 0.999 R²)                            0.999 R²   (0.998 R² .. 0.999 R²)
mean                 41.73 μs   (40.93 μs .. 42.58 μs)       mean                 5.738 ns   (5.630 ns .. 5.826 ns)
std dev              2.458 μs   (2.064 μs .. 3.086 μs)       std dev              192.5 ps   (155.6 ps .. 266.5 ps)
variance introduced by outliers: 61% (severely inflated)     variance introduced by outliers: 55% (severely inflated)
                                                             
benchmarking HashMap/size/ByteString                         benchmarking HashMap/size/ByteString
time                 41.24 μs   (41.05 μs .. 41.44 μs)       time                 7.556 ns   (7.552 ns .. 7.561 ns)
                     1.000 R²   (0.999 R² .. 1.000 R²)                            1.000 R²   (1.000 R² .. 1.000 R²)
mean                 37.68 μs   (37.10 μs .. 38.14 μs)       mean                 6.649 ns   (6.558 ns .. 6.740 ns)
std dev              1.450 μs   (1.174 μs .. 1.883 μs)       std dev              172.7 ps   (131.5 ps .. 228.5 ps)
variance introduced by outliers: 41% (moderately inflated)   variance introduced by outliers: 42% (moderately inflated)
                                                             
benchmarking HashMap/size/Int                                benchmarking HashMap/size/Int
time                 27.31 μs   (26.56 μs .. 28.48 μs)       time                 7.018 ns   (6.977 ns .. 7.070 ns)
                     0.991 R²   (0.989 R² .. 0.994 R²)                            1.000 R²   (1.000 R² .. 1.000 R²)
mean                 25.97 μs   (25.33 μs .. 26.58 μs)       mean                 6.153 ns   (6.071 ns .. 6.256 ns)
std dev              1.820 μs   (1.570 μs .. 2.220 μs)       std dev              176.1 ps   (140.0 ps .. 229.3 ps)
variance introduced by outliers: 71% (severely inflated)     variance introduced by outliers: 46% (moderately inflated)
                                                             
benchmarking HashMap/fromList/long/String                    benchmarking HashMap/fromList/long/String
time                 1.497 ms   (1.465 ms .. 1.522 ms)       time                 1.049 ms   (1.019 ms .. 1.091 ms)
                     0.996 R²   (0.994 R² .. 0.998 R²)                            0.993 R²   (0.992 R² .. 0.996 R²)
mean                 1.272 ms   (1.235 ms .. 1.311 ms)       mean                 1.010 ms   (997.3 μs .. 1.023 ms)
std dev              118.4 μs   (103.7 μs .. 134.5 μs)       std dev              39.87 μs   (33.71 μs .. 46.49 μs)
variance introduced by outliers: 67% (severely inflated)     variance introduced by outliers: 28% (moderately inflated)
                                                             
benchmarking HashMap/fromList/long/ByteString                benchmarking HashMap/fromList/long/ByteString
time                 1.143 ms   (1.129 ms .. 1.154 ms)       time                 1.296 ms   (1.264 ms .. 1.316 ms)
                     0.998 R²   (0.997 R² .. 0.999 R²)                            0.995 R²   (0.993 R² .. 0.997 R²)
mean                 1.060 ms   (1.047 ms .. 1.073 ms)       mean                 1.110 ms   (1.077 ms .. 1.150 ms)
std dev              39.46 μs   (33.47 μs .. 50.57 μs)       std dev              108.5 μs   (92.00 μs .. 130.0 μs)
variance introduced by outliers: 25% (moderately inflated)   variance introduced by outliers: 71% (severely inflated)
                                                             
benchmarking HashMap/fromList/long/Int                       benchmarking HashMap/fromList/long/Int
time                 496.8 μs   (484.6 μs .. 514.3 μs)       time                 663.7 μs   (658.4 μs .. 670.9 μs)
                     0.996 R²   (0.996 R² .. 0.999 R²)                            0.997 R²   (0.996 R² .. 0.999 R²)
mean                 479.7 μs   (475.3 μs .. 484.8 μs)       mean                 641.7 μs   (632.7 μs .. 650.7 μs)
std dev              14.71 μs   (13.04 μs .. 16.67 μs)       std dev              28.51 μs   (22.23 μs .. 35.29 μs)
variance introduced by outliers: 22% (moderately inflated)   variance introduced by outliers: 35% (moderately inflated)
                                                             
benchmarking HashMap/fromList/short/String                   benchmarking HashMap/fromList/short/String
time                 530.3 μs   (516.0 μs .. 540.9 μs)       time                 512.0 μs   (496.1 μs .. 530.5 μs)
                     0.996 R²   (0.994 R² .. 0.998 R²)                            0.995 R²   (0.992 R² .. 0.998 R²)
mean                 465.1 μs   (451.9 μs .. 478.9 μs)       mean                 486.2 μs   (478.2 μs .. 495.7 μs)
std dev              40.05 μs   (34.48 μs .. 48.54 μs)       std dev              25.92 μs   (20.53 μs .. 32.37 μs)
variance introduced by outliers: 70% (severely inflated)     variance introduced by outliers: 45% (moderately inflated)
                                                             
benchmarking HashMap/fromList/short/ByteString               benchmarking HashMap/fromList/short/ByteString
time                 477.6 μs   (473.9 μs .. 481.7 μs)       time                 559.7 μs   (546.7 μs .. 568.0 μs)
                     0.999 R²   (0.998 R² .. 0.999 R²)                            0.997 R²   (0.995 R² .. 0.998 R²)
mean                 453.2 μs   (448.9 μs .. 456.9 μs)       mean                 493.7 μs   (481.2 μs .. 507.5 μs)
std dev              12.28 μs   (9.790 μs .. 16.12 μs)       std dev              38.66 μs   (32.36 μs .. 44.11 μs)
variance introduced by outliers: 18% (moderately inflated)   variance introduced by outliers: 64% (severely inflated)
                                                             
benchmarking HashMap/fromList/short/Int                      benchmarking HashMap/fromList/short/Int
time                 279.2 μs   (278.1 μs .. 280.4 μs)       time                 341.7 μs   (337.0 μs .. 345.2 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)                            0.998 R²   (0.998 R² .. 0.999 R²)
mean                 254.4 μs   (249.9 μs .. 257.9 μs)       mean                 320.7 μs   (317.1 μs .. 324.8 μs)
std dev              11.42 μs   (9.232 μs .. 13.74 μs)       std dev              11.29 μs   (9.763 μs .. 13.39 μs)
variance introduced by outliers: 40% (moderately inflated)   variance introduced by outliers: 28% (moderately inflated)
                                                             
benchmarking HashMap/fromListWith/long/String                benchmarking HashMap/fromListWith/long/String
time                 1.482 ms   (1.440 ms .. 1.513 ms)       time                 1.266 ms   (1.249 ms .. 1.282 ms)
                     0.997 R²   (0.996 R² .. 0.998 R²)                            0.998 R²   (0.998 R² .. 0.999 R²)
mean                 1.291 ms   (1.262 ms .. 1.318 ms)       mean                 1.121 ms   (1.096 ms .. 1.148 ms)
std dev              91.14 μs   (75.72 μs .. 111.6 μs)       std dev              80.92 μs   (67.78 μs .. 95.87 μs)
variance introduced by outliers: 54% (severely inflated)     variance introduced by outliers: 56% (severely inflated)
                                                             
benchmarking HashMap/fromListWith/long/ByteString            benchmarking HashMap/fromListWith/long/ByteString
time                 1.308 ms   (1.289 ms .. 1.332 ms)       time                 1.296 ms   (1.247 ms .. 1.330 ms)
                     0.996 R²   (0.995 R² .. 0.998 R²)                            0.995 R²   (0.993 R² .. 0.997 R²)
mean                 1.240 ms   (1.221 ms .. 1.257 ms)       mean                 1.129 ms   (1.102 ms .. 1.160 ms)
std dev              57.98 μs   (46.47 μs .. 70.24 μs)       std dev              93.02 μs   (73.60 μs .. 120.2 μs)
variance introduced by outliers: 33% (moderately inflated)   variance introduced by outliers: 62% (severely inflated)
                                                             
benchmarking HashMap/fromListWith/long/Int                   benchmarking HashMap/fromListWith/long/Int
time                 739.8 μs   (732.4 μs .. 747.0 μs)       time                 1.433 ms   (1.239 ms .. 1.564 ms)
                     0.998 R²   (0.997 R² .. 0.999 R²)                            0.933 R²   (0.907 R² .. 0.957 R²)
mean                 653.2 μs   (635.7 μs .. 669.1 μs)       mean                 980.0 μs   (891.0 μs .. 1.098 ms)
std dev              46.49 μs   (38.84 μs .. 54.39 μs)       std dev              295.7 μs   (237.5 μs .. 346.7 μs)
variance introduced by outliers: 58% (severely inflated)     variance introduced by outliers: 97% (severely inflated)
                                                             
benchmarking HashMap/fromListWith/short/String               benchmarking HashMap/fromListWith/short/String
time                 566.5 μs   (549.5 μs .. 578.8 μs)       time                 601.2 μs   (590.3 μs .. 611.2 μs)
                     0.997 R²   (0.996 R² .. 0.999 R²)                            0.993 R²   (0.988 R² .. 0.996 R²)
mean                 495.5 μs   (482.9 μs .. 510.1 μs)       mean                 612.9 μs   (598.6 μs .. 644.7 μs)
std dev              39.96 μs   (33.15 μs .. 50.02 μs)       std dev              58.88 μs   (21.24 μs .. 99.47 μs)
variance introduced by outliers: 65% (severely inflated)     variance introduced by outliers: 72% (severely inflated)
                                                             
benchmarking HashMap/fromListWith/short/ByteString           benchmarking HashMap/fromListWith/short/ByteString
time                 565.4 μs   (559.2 μs .. 576.4 μs)       time                 636.7 μs   (619.7 μs .. 657.9 μs)
                     0.997 R²   (0.996 R² .. 0.998 R²)                            0.995 R²   (0.994 R² .. 0.997 R²)
mean                 526.5 μs   (514.3 μs .. 538.2 μs)       mean                 586.7 μs   (567.8 μs .. 599.4 μs)
std dev              36.02 μs   (30.28 μs .. 41.60 μs)       std dev              45.28 μs   (34.64 μs .. 57.82 μs)
variance introduced by outliers: 57% (severely inflated)     variance introduced by outliers: 63% (severely inflated)
                                                             
benchmarking HashMap/fromListWith/short/Int                  benchmarking HashMap/fromListWith/short/Int
time                 348.4 μs   (345.2 μs .. 352.2 μs)       time                 491.9 μs   (479.3 μs .. 500.7 μs)
                     0.999 R²   (0.998 R² .. 0.999 R²)                            0.995 R²   (0.993 R² .. 0.997 R²)
mean                 308.4 μs   (298.6 μs .. 317.2 μs)       mean                 423.5 μs   (410.5 μs .. 438.1 μs)
std dev              25.91 μs   (20.67 μs .. 30.26 μs)       std dev              39.79 μs   (33.92 μs .. 44.75 μs)
variance introduced by outliers: 70% (severely inflated)     variance introduced by outliers: 73% (severely inflated)

@rockbmb
Contributor Author

rockbmb commented Oct 15, 2017

  • As expected, size is much faster, taking a few nanoseconds regardless of the map's size (it is supposed to be constant time, after all, so this is good).
  • filterWithKey has become noticeably slower. I don't think anything can be done to eliminate this slowdown, so it will be one of this PR's tradeoffs.
  • Every other function stays more or less the same, with the current implementation at times about 5-10% faster. This is expected because of the size-tracking overhead.
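
To make the tradeoff concrete, here is a minimal sketch of the wrapper idea on a toy binary search tree. It is not this PR's code: the names Tree, SizedMap, insertInternal and merge are stand-ins chosen for illustration, and the real structure is a HAMT. It shows why size becomes a field read, and why a function like filterWithKey has to count survivors while it rebuilds the tree.

```haskell
-- Toy stand-in for the internal tree (assumption: the real one is a HAMT).
data Tree k v = Tip | Bin (Tree k v) k v (Tree k v)

-- User-facing wrapper: cached element count plus the tree.
data SizedMap k v = SizedMap !Int !(Tree k v)

empty :: SizedMap k v
empty = SizedMap 0 Tip

-- O(1): read the cached field, no traversal.
size :: SizedMap k v -> Int
size (SizedMap n _) = n

-- Internal variant: returns the size delta (1 for a new key, 0 for an
-- overwrite) together with the new tree.
insertInternal :: Ord k => k -> v -> Tree k v -> (Int, Tree k v)
insertInternal k v Tip = (1, Bin Tip k v Tip)
insertInternal k v (Bin l k' v' r) = case compare k k' of
  LT -> let (d, l') = insertInternal k v l in (d, Bin l' k' v' r)
  GT -> let (d, r') = insertInternal k v r in (d, Bin l k' v' r')
  EQ -> (0, Bin l k v r)

-- Wrapper: applies the delta to the cached size.
insert :: Ord k => k -> v -> SizedMap k v -> SizedMap k v
insert k v (SizedMap n t) =
  let (d, t') = insertInternal k v t in SizedMap (n + d) t'

-- Joins two trees where every key on the left is smaller than every key
-- on the right.
merge :: Tree k v -> Tree k v -> Tree k v
merge Tip r = r
merge (Bin l k v r0) r = Bin l k v (merge r0 r)

-- The filter has to count surviving entries while rebuilding, which is
-- extra work compared to a filter that only rebuilds.
filterWithKey :: (k -> v -> Bool) -> SizedMap k v -> SizedMap k v
filterWithKey p (SizedMap _ t) = let (n, t') = go t in SizedMap n t'
  where
    go Tip = (0, Tip)
    go (Bin l k v r) =
      let (nl, l') = go l
          (nr, r') = go r
      in if p k v
           then (nl + nr + 1, Bin l' k v r')
           else (nl + nr, merge l' r')
```

Loaded in GHCi, size (insert 1 "a" empty) only reads the Int field; the cost has moved into every operation that can change the element count.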

@rockbmb
Contributor Author

rockbmb commented Oct 16, 2017

@treeowl I added some benchmarks for the situation you describe.

I'll run them with care later (and add a few more cases, e.g. intersection, difference, size), but in the meantime let me know if this is what you meant.

@treeowl
Collaborator

treeowl commented Oct 16, 2017

@treeowl regarding benchmarks, if someone has an Array/Vector/HashMap/Map/.. of HashMaps, in what kind of operations may the indirection you mentioned cause problems?

Well, imagine someone has a generalized trie with HashMaps in the nodes representing certain sums. Indirection is generally a problem for performance! Frankly, I'd be more comfortable about this whole enterprise if there were some nice way to "opt out", so people who don't need size don't have to pay for it.

@rockbmb
Contributor Author

rockbmb commented Oct 16, 2017

@treeowl that's a point I cannot argue against. Unfortunately, speeding up size was never going to be free: at the very least, a constant-factor slowdown is necessary for the size housekeeping. There may be use cases where size is not required at all, so people who have no need for it would end up paying extra for nothing. For what it's worth, though, if their codebase changes (which happens frequently) and sizes are suddenly required often and on large maps, that would probably offset the price previously paid.

Opting out does sound like an alternative, but aside from creating two different hashmap types, I don't think it can be neatly done.

I'll do some benchmarks before speaking more definitely about this.

@rockbmb rockbmb force-pushed the make-size-const branch 2 times, most recently from c104c5d to f344e35 Compare October 17, 2017 02:09
@rockbmb
Contributor Author

rockbmb commented Oct 17, 2017

I added some benchmarks for operations on collections of HashMaps stored inside []/Vector/HashSet/Set, and ran them on both this PR's most recent commit (147b05f) and the latest master with commit 43b9ad0 (which introduces these benchmarks) cherry-picked on top.

The current code is on the left, this PR's code on the right.

benchmarking containerized/lookup/List                       benchmarking containerized/lookup/List
time                 11.49 ms   (11.32 ms .. 11.64 ms)       time                 11.42 ms   (11.34 ms .. 11.47 ms)
                     0.999 R²   (0.999 R² .. 1.000 R²)                            1.000 R²   (0.999 R² .. 1.000 R²)
mean                 11.04 ms   (10.88 ms .. 11.15 ms)       mean                 10.75 ms   (10.53 ms .. 10.88 ms)
std dev              336.5 μs   (233.2 μs .. 545.9 μs)       std dev              449.4 μs   (339.3 μs .. 594.3 μs)
variance introduced by outliers: 10% (moderately inflated)   variance introduced by outliers: 17% (moderately inflated)
                                                             
benchmarking containerized/lookup/Vector                     benchmarking containerized/lookup/Vector
time                 11.43 ms   (11.32 ms .. 11.55 ms)       time                 11.54 ms   (11.46 ms .. 11.62 ms)
                     1.000 R²   (0.999 R² .. 1.000 R²)                            1.000 R²   (0.999 R² .. 1.000 R²)
mean                 10.83 ms   (10.63 ms .. 10.99 ms)       mean                 10.67 ms   (10.42 ms .. 10.86 ms)
std dev              454.1 μs   (317.4 μs .. 677.1 μs)       std dev              537.2 μs   (397.7 μs .. 833.9 μs)
variance introduced by outliers: 17% (moderately inflated)   variance introduced by outliers: 21% (moderately inflated)
                                                             
benchmarking containerized/lookup/HashSet                    benchmarking containerized/lookup/HashSet
time                 11.59 ms   (11.46 ms .. 11.72 ms)       time                 11.23 ms   (11.11 ms .. 11.37 ms)
                     0.999 R²   (0.998 R² .. 1.000 R²)                            0.999 R²   (0.999 R² .. 1.000 R²)
mean                 11.02 ms   (10.79 ms .. 11.16 ms)       mean                 10.80 ms   (10.62 ms .. 10.92 ms)
std dev              448.1 μs   (325.6 μs .. 660.5 μs)       std dev              359.6 μs   (222.9 μs .. 533.9 μs)
variance introduced by outliers: 14% (moderately inflated)   variance introduced by outliers: 10% (moderately inflated)
                                                             
benchmarking containerized/lookup/Set                        benchmarking containerized/lookup/Set
time                 11.47 ms   (11.36 ms .. 11.57 ms)       time                 11.34 ms   (11.23 ms .. 11.46 ms)
                     1.000 R²   (0.999 R² .. 1.000 R²)                            0.999 R²   (0.999 R² .. 1.000 R²)
mean                 10.84 ms   (10.60 ms .. 10.98 ms)       mean                 10.59 ms   (10.37 ms .. 10.77 ms)
std dev              454.6 μs   (276.9 μs .. 750.5 μs)       std dev              501.2 μs   (403.0 μs .. 652.7 μs)
variance introduced by outliers: 17% (moderately inflated)   variance introduced by outliers: 21% (moderately inflated)
                                                             
benchmarking containerized/insert/List                       benchmarking containerized/insert/List
time                 143.6 ms   (142.0 ms .. 145.5 ms)       time                 143.6 ms   (139.2 ms .. 146.1 ms)
                     1.000 R²   (0.999 R² .. 1.000 R²)                            0.999 R²   (0.997 R² .. 1.000 R²)
mean                 139.1 ms   (136.3 ms .. 141.0 ms)       mean                 139.9 ms   (138.5 ms .. 141.6 ms)
std dev              3.166 ms   (1.738 ms .. 4.476 ms)       std dev              2.230 ms   (1.340 ms .. 3.065 ms)
variance introduced by outliers: 12% (moderately inflated)   variance introduced by outliers: 12% (moderately inflated)
                                                             
benchmarking containerized/insert/Vector                     benchmarking containerized/insert/Vector
time                 159.5 ms   (151.1 ms .. 165.7 ms)       time                 146.2 ms   (141.4 ms .. 148.2 ms)
                     0.998 R²   (0.996 R² .. 1.000 R²)                            0.999 R²   (0.998 R² .. 1.000 R²)
mean                 153.0 ms   (144.7 ms .. 156.2 ms)       mean                 142.4 ms   (139.6 ms .. 144.2 ms)
std dev              6.817 ms   (1.496 ms .. 9.849 ms)       std dev              3.058 ms   (1.198 ms .. 4.634 ms)
variance introduced by outliers: 12% (moderately inflated)   variance introduced by outliers: 12% (moderately inflated)
                                                             
benchmarking containerized/insert/HashSet                    benchmarking containerized/insert/HashSet
time                 211.2 ms   (198.9 ms .. 224.9 ms)       time                 193.5 ms   (191.9 ms .. 195.7 ms)
                     0.998 R²   (0.997 R² .. 1.000 R²)                            1.000 R²   (1.000 R² .. 1.000 R²)
mean                 195.4 ms   (189.5 ms .. 201.7 ms)       mean                 190.4 ms   (188.7 ms .. 191.4 ms)
std dev              8.081 ms   (4.938 ms .. 11.70 ms)       std dev              1.616 ms   (500.7 μs .. 2.229 ms)
variance introduced by outliers: 14% (moderately inflated)   variance introduced by outliers: 14% (moderately inflated)
                                                             
benchmarking containerized/insert/Set                        benchmarking containerized/insert/Set
time                 158.1 ms   (152.2 ms .. 162.3 ms)       time                 159.3 ms   (157.6 ms .. 160.1 ms)
                     0.999 R²   (0.998 R² .. 1.000 R²)                            1.000 R²   (1.000 R² .. 1.000 R²)
mean                 158.7 ms   (156.9 ms .. 163.2 ms)       mean                 155.7 ms   (154.1 ms .. 156.6 ms)
std dev              3.708 ms   (1.134 ms .. 5.314 ms)       std dev              1.672 ms   (673.1 μs .. 2.556 ms)
variance introduced by outliers: 12% (moderately inflated)   variance introduced by outliers: 12% (moderately inflated)
                                                             
benchmarking containerized/delete/List                       benchmarking containerized/delete/List
time                 5.357 ms   (5.257 ms .. 5.470 ms)       time                 5.063 ms   (5.007 ms .. 5.108 ms)
                     0.997 R²   (0.994 R² .. 0.998 R²)                            0.999 R²   (0.999 R² .. 1.000 R²)
mean                 5.418 ms   (5.355 ms .. 5.481 ms)       mean                 4.683 ms   (4.593 ms .. 4.749 ms)
std dev              181.1 μs   (154.1 μs .. 222.0 μs)       std dev              222.6 μs   (167.1 μs .. 283.9 μs)
variance introduced by outliers: 15% (moderately inflated)   variance introduced by outliers: 25% (moderately inflated)
                                                             
benchmarking containerized/delete/Vector                     benchmarking containerized/delete/Vector
time                 5.918 ms   (5.869 ms .. 5.983 ms)       time                 5.361 ms   (5.170 ms .. 5.543 ms)
                     0.999 R²   (0.998 R² .. 1.000 R²)                            0.996 R²   (0.993 R² .. 0.999 R²)
mean                 5.263 ms   (5.102 ms .. 5.399 ms)       mean                 5.092 ms   (4.969 ms .. 5.169 ms)
std dev              403.9 μs   (342.0 μs .. 476.6 μs)       std dev              274.0 μs   (157.9 μs .. 402.7 μs)
variance introduced by outliers: 46% (moderately inflated)   variance introduced by outliers: 28% (moderately inflated)
                                                             
benchmarking containerized/delete/HashSet                    benchmarking containerized/delete/HashSet
time                 5.555 ms   (5.379 ms .. 5.722 ms)       time                 5.015 ms   (4.983 ms .. 5.052 ms)
                     0.996 R²   (0.994 R² .. 0.999 R²)                            1.000 R²   (1.000 R² .. 1.000 R²)
mean                 5.096 ms   (5.005 ms .. 5.188 ms)       mean                 4.675 ms   (4.603 ms .. 4.733 ms)
std dev              267.0 μs   (214.6 μs .. 343.9 μs)       std dev              193.1 μs   (147.4 μs .. 254.9 μs)
variance introduced by outliers: 28% (moderately inflated)   variance introduced by outliers: 20% (moderately inflated)
                                                             
benchmarking containerized/delete/Set                        benchmarking containerized/delete/Set
time                 5.377 ms   (5.259 ms .. 5.536 ms)       time                 5.784 ms   (5.614 ms .. 5.956 ms)
                     0.995 R²   (0.992 R² .. 0.998 R²)                            0.994 R²   (0.990 R² .. 0.997 R²)
mean                 5.369 ms   (5.306 ms .. 5.429 ms)       mean                 4.965 ms   (4.822 ms .. 5.129 ms)
std dev              188.5 μs   (160.2 μs .. 231.6 μs)       std dev              443.9 μs   (387.6 μs .. 542.1 μs)
variance introduced by outliers: 15% (moderately inflated)   variance introduced by outliers: 53% (severely inflated)
                                    
benchmarking containerized/map/List                          benchmarking containerized/map/List
time                 2.310 ms   (2.244 ms .. 2.379 ms)       time                 1.021 ms   (1.014 ms .. 1.027 ms)
                     0.987 R²   (0.979 R² .. 0.992 R²)                            1.000 R²   (0.999 R² .. 1.000 R²)
mean                 1.886 ms   (1.799 ms .. 1.964 ms)       mean                 908.3 μs   (889.1 μs .. 926.3 μs)
std dev              218.2 μs   (195.5 μs .. 233.2 μs)       std dev              57.62 μs   (50.91 μs .. 66.36 μs)
variance introduced by outliers: 74% (severely inflated)     variance introduced by outliers: 49% (moderately inflated)
                                                             
benchmarking containerized/map/Vector                        benchmarking containerized/map/Vector
time                 5.055 ms   (4.907 ms .. 5.235 ms)       time                 4.538 ms   (4.473 ms .. 4.596 ms)
                     0.994 R²   (0.993 R² .. 0.996 R²)                            0.998 R²   (0.997 R² .. 0.999 R²)
mean                 4.742 ms   (4.635 ms .. 4.830 ms)       mean                 3.985 ms   (3.855 ms .. 4.078 ms)
std dev              287.1 μs   (230.9 μs .. 379.2 μs)       std dev              310.2 μs   (240.3 μs .. 415.8 μs)
variance introduced by outliers: 35% (moderately inflated)   variance introduced by outliers: 50% (moderately inflated)
                                                             
benchmarking containerized/map/HashSet                       benchmarking containerized/map/HashSet
time                 8.761 ms   (8.430 ms .. 9.016 ms)       time                 7.445 ms   (7.332 ms .. 7.546 ms)
                     0.992 R²   (0.988 R² .. 0.996 R²)                            0.997 R²   (0.995 R² .. 0.999 R²)
mean                 7.406 ms   (7.166 ms .. 7.693 ms)       mean                 6.678 ms   (6.480 ms .. 6.829 ms)
std dev              688.9 μs   (571.3 μs .. 826.0 μs)       std dev              466.4 μs   (362.9 μs .. 576.1 μs)
variance introduced by outliers: 50% (severely inflated)     variance introduced by outliers: 38% (moderately inflated)
                                                             
benchmarking containerized/map/Set                           benchmarking containerized/map/Set
time                 5.798 ms   (5.692 ms .. 5.878 ms)       time                 5.202 ms   (5.091 ms .. 5.325 ms)
                     0.997 R²   (0.994 R² .. 0.998 R²)                            0.997 R²   (0.995 R² .. 0.999 R²)
mean                 5.416 ms   (5.287 ms .. 5.507 ms)       mean                 4.689 ms   (4.554 ms .. 4.795 ms)
std dev              313.6 μs   (237.1 μs .. 417.4 μs)       std dev              357.0 μs   (261.5 μs .. 472.5 μs)
variance introduced by outliers: 32% (moderately inflated)   variance introduced by outliers: 46% (moderately inflated)
                                                             
benchmarking containerized/union/List                        benchmarking containerized/union/List
time                 885.6 μs   (852.9 μs .. 911.9 μs)       time                 1.003 ms   (982.2 μs .. 1.033 ms)
                     0.996 R²   (0.995 R² .. 0.999 R²)                            0.996 R²   (0.994 R² .. 0.998 R²)
mean                 780.4 μs   (765.2 μs .. 800.3 μs)       mean                 941.7 μs   (926.1 μs .. 956.3 μs)
std dev              53.71 μs   (42.91 μs .. 69.48 μs)       std dev              45.66 μs   (38.82 μs .. 54.44 μs)
variance introduced by outliers: 54% (severely inflated)     variance introduced by outliers: 37% (moderately inflated)
                                                             
benchmarking containerized/union/Vector                      benchmarking containerized/union/Vector
time                 1.443 ms   (1.140 ms .. 1.620 ms)       time                 1.070 ms   (1.055 ms .. 1.081 ms)
                     0.887 R²   (0.861 R² .. 0.929 R²)                            0.998 R²   (0.997 R² .. 0.999 R²)
mean                 993.0 μs   (902.0 μs .. 1.141 ms)       mean                 921.1 μs   (889.8 μs .. 948.0 μs)
std dev              320.2 μs   (225.5 μs .. 401.8 μs)       std dev              84.67 μs   (74.83 μs .. 97.45 μs)
variance introduced by outliers: 97% (severely inflated)     variance introduced by outliers: 68% (severely inflated)
                                                             
benchmarking containerized/union/HashSet                     benchmarking containerized/union/HashSet
time                 957.9 μs   (922.2 μs .. 988.9 μs)       time                 963.5 μs   (952.8 μs .. 972.0 μs)
                     0.964 R²   (0.937 R² .. 0.981 R²)                            0.999 R²   (0.998 R² .. 0.999 R²)
mean                 1.178 ms   (1.108 ms .. 1.268 ms)       mean                 880.4 μs   (865.1 μs .. 893.3 μs)
std dev              254.8 μs   (216.8 μs .. 281.7 μs)       std dev              44.27 μs   (36.94 μs .. 53.96 μs)
variance introduced by outliers: 93% (severely inflated)     variance introduced by outliers: 38% (moderately inflated)
                                                             
benchmarking containerized/union/Set                         benchmarking containerized/union/Set
time                 936.7 μs   (932.9 μs .. 940.2 μs)       time                 1.013 ms   (980.0 μs .. 1.053 ms)
                     1.000 R²   (1.000 R² .. 1.000 R²)                            0.995 R²   (0.993 R² .. 0.999 R²)
mean                 855.6 μs   (840.4 μs .. 867.7 μs)       mean                 947.3 μs   (930.9 μs .. 963.3 μs)
std dev              42.09 μs   (35.43 μs .. 50.38 μs)       std dev              48.05 μs   (40.49 μs .. 58.29 μs)
variance introduced by outliers: 38% (moderately inflated)   variance introduced by outliers: 39% (moderately inflated)
                                                             
benchmarking containerized/intersection/List                 benchmarking containerized/intersection/List
time                 1.892 μs   (1.856 μs .. 1.912 μs)       time                 1.679 μs   (1.668 μs .. 1.689 μs)
                     0.998 R²   (0.997 R² .. 0.999 R²)                            1.000 R²   (0.999 R² .. 1.000 R²)
mean                 1.644 μs   (1.586 μs .. 1.692 μs)       mean                 1.449 μs   (1.409 μs .. 1.484 μs)
std dev              128.6 ns   (102.3 ns .. 146.4 ns)       std dev              82.39 ns   (61.64 ns .. 103.4 ns)
variance introduced by outliers: 81% (severely inflated)     variance introduced by outliers: 69% (severely inflated)
                                                             
benchmarking containerized/intersection/Vector               benchmarking containerized/intersection/Vector
time                 1.304 μs   (1.296 μs .. 1.312 μs)       time                 1.124 μs   (1.124 μs .. 1.125 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)                            1.000 R²   (1.000 R² .. 1.000 R²)
mean                 1.121 μs   (1.101 μs .. 1.139 μs)       mean                 962.3 ns   (943.8 ns .. 980.0 ns)
std dev              38.66 ns   (31.82 ns .. 49.79 ns)       std dev              37.40 ns   (29.82 ns .. 47.03 ns)
variance introduced by outliers: 44% (moderately inflated)   variance introduced by outliers: 51% (severely inflated)
                                                             
benchmarking containerized/intersection/HashSet              benchmarking containerized/intersection/HashSet
time                 3.685 μs   (3.682 μs .. 3.688 μs)       time                 3.525 μs   (3.487 μs .. 3.563 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)                            0.999 R²   (0.999 R² .. 1.000 R²)
mean                 3.210 μs   (3.150 μs .. 3.268 μs)       mean                 3.070 μs   (2.993 μs .. 3.128 μs)
std dev              132.2 ns   (108.1 ns .. 165.1 ns)       std dev              163.1 ns   (119.2 ns .. 217.7 ns)
variance introduced by outliers: 51% (severely inflated)     variance introduced by outliers: 64% (severely inflated)
                                                             
benchmarking containerized/intersection/Set                  benchmarking containerized/intersection/Set
time                 1.604 μs   (1.549 μs .. 1.642 μs)       time                 1.453 μs   (1.430 μs .. 1.478 μs)
                     0.996 R²   (0.996 R² .. 0.997 R²)                            0.997 R²   (0.995 R² .. 0.999 R²)
mean                 1.371 μs   (1.311 μs .. 1.435 μs)       mean                 1.275 μs   (1.250 μs .. 1.311 μs)
std dev              136.6 ns   (116.5 ns .. 158.9 ns)       std dev              69.45 ns   (47.61 ns .. 102.2 ns)
variance introduced by outliers: 87% (severely inflated)     variance introduced by outliers: 67% (severely inflated)
                                                             
benchmarking containerized/size/List                         benchmarking containerized/size/List
time                 252.9 μs   (249.3 μs .. 258.7 μs)       time                 1.864 μs   (1.848 μs .. 1.879 μs)
                     0.996 R²   (0.995 R² .. 0.997 R²)                            1.000 R²   (0.999 R² .. 1.000 R²)
mean                 239.3 μs   (235.9 μs .. 242.3 μs)       mean                 1.611 μs   (1.578 μs .. 1.638 μs)
std dev              9.146 μs   (7.719 μs .. 11.00 μs)       std dev              65.52 ns   (53.32 ns .. 85.26 ns)
variance introduced by outliers: 32% (moderately inflated)   variance introduced by outliers: 52% (severely inflated)
                                                             
benchmarking containerized/size/Vector                       benchmarking containerized/size/Vector
time                 286.8 μs   (285.3 μs .. 287.9 μs)       time                 1.241 μs   (1.235 μs .. 1.248 μs)
                     0.999 R²   (0.998 R² .. 0.999 R²)                            1.000 R²   (1.000 R² .. 1.000 R²)
mean                 253.7 μs   (245.9 μs .. 260.1 μs)       mean                 1.071 μs   (1.051 μs .. 1.092 μs)
std dev              20.25 μs   (14.92 μs .. 24.38 μs)       std dev              46.55 ns   (39.33 ns .. 54.91 ns)
variance introduced by outliers: 68% (severely inflated)     variance introduced by outliers: 57% (severely inflated)
                                                             
benchmarking containerized/size/HashSet                      benchmarking containerized/size/HashSet
time                 255.8 μs   (253.5 μs .. 258.0 μs)       time                 5.615 μs   (5.520 μs .. 5.768 μs)
                     1.000 R²   (0.999 R² .. 1.000 R²)                            0.997 R²   (0.996 R² .. 0.998 R²)
mean                 230.6 μs   (226.1 μs .. 234.7 μs)       mean                 5.007 μs   (4.934 μs .. 5.083 μs)
std dev              12.18 μs   (9.907 μs .. 15.30 μs)       std dev              181.5 ns   (146.3 ns .. 227.7 ns)
variance introduced by outliers: 48% (moderately inflated)   variance introduced by outliers: 44% (moderately inflated)
                                                             
benchmarking containerized/size/Set                          benchmarking containerized/size/Set
time                 261.9 μs   (256.3 μs .. 270.5 μs)       time                 3.032 μs   (3.002 μs .. 3.062 μs)
                     0.995 R²   (0.994 R² .. 0.997 R²)                            0.999 R²   (0.999 R² .. 1.000 R²)
mean                 247.8 μs   (244.6 μs .. 251.3 μs)       mean                 2.630 μs   (2.565 μs .. 2.688 μs)
std dev              10.24 μs   (8.850 μs .. 11.99 μs)       std dev              152.8 ns   (123.4 ns .. 189.7 ns)
variance introduced by outliers: 36% (moderately inflated)   variance introduced by outliers: 68% (severely inflated)

Doesn't look too bad @treeowl

@rockbmb rockbmb force-pushed the make-size-const branch 2 times, most recently from 826a32f to f8463ca Compare October 17, 2017 19:20
@rockbmb
Contributor Author

rockbmb commented Oct 10, 2018

Plan for this PR:

  • Fix conflicts.
  • Implement the approach mentioned here. (See here for branch in my fork)
  • Rerun benchmarks to see if anything significant has changed.
  • Compare the two benchmarks: for the PR's current approach, and for the new one.

I know the maintainer said here that storing intermediate structure sizes inside the HashMap datatype is probably not worth it, but as said here, I'll try benchmarking before definitively deciding against it.

@chshersh

@rockbmb @treeowl Any updates on this issue? It would be really nice to have a faster size function for HashMap 🙂 Maybe this issue needs some additional pair of hands?

@rockbmb
Contributor Author

rockbmb commented Jun 5, 2020

@chshersh Hey! I might have time to come back to this in the next few months, but I can't guarantee it: if anyone else wants to take over, feel free to!

@sjakobi
Member

sjakobi commented Jun 24, 2020

What's the current status here? Can we already say whether the added overhead is prohibitive or not?

I'll tentatively mark this as a draft for now.

@sjakobi sjakobi marked this pull request as draft June 24, 2020 11:18
@rockbmb
Contributor Author

rockbmb commented Jul 20, 2020

@sjakobi ok with this being moved to draft.

Tentative status is that there is overhead in some places, and improvements in others, but overall the PR can be improved.
I'm not sure how to correct the performance problem this is stuck on, so I'll reach out on IRC.

@sjakobi
Member

sjakobi commented Jul 20, 2020

I'm not sure how to correct the performance problem this is stuck on, so I'll reach out on IRC.

What is this performance problem? I'm not sure I can help but I'd gladly take a look. And there are other folks here who might be able to help too.

@treeowl
Collaborator

treeowl commented Jul 20, 2020

Unions are going to get rather seriously pricier with this scheme. The alternative approach is to slow down everything a little by bloating the structure with an extra size field per internal node.
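
As a rough illustration of that alternative, here is a sketch on a toy binary search tree in which every internal node caches the number of entries beneath it. The names Tree, bin, insert and union are stand-ins for illustration only, not code from either branch. The extra Int per node is the "bloat"; in exchange, a rebuilt node gets its size from its children in O(1), so union needs no separate counting step.

```haskell
-- Toy tree: every Bin node carries the number of entries in its subtree.
data Tree k v = Tip | Bin !Int (Tree k v) k v (Tree k v)

-- O(1): read the cache at the root.
size :: Tree k v -> Int
size Tip             = 0
size (Bin n _ _ _ _) = n

-- Smart constructor: recomputes the cache from the children's cached sizes.
bin :: Tree k v -> k -> v -> Tree k v -> Tree k v
bin l k v r = Bin (size l + size r + 1) l k v r

insert :: Ord k => k -> v -> Tree k v -> Tree k v
insert k v Tip = bin Tip k v Tip
insert k v (Bin _ l k' v' r) = case compare k k' of
  LT -> bin (insert k v l) k' v' r
  GT -> bin l k' v' (insert k v r)
  EQ -> bin l k v r

toList :: Tree k v -> [(k, v)]
toList Tip             = []
toList (Bin _ l k v r) = toList l ++ [(k, v)] ++ toList r

-- Left-biased union (deliberately naive): the sizes stay correct because
-- every rebuilt node goes through `bin`.
union :: Ord k => Tree k v -> Tree k v -> Tree k v
union a b = foldr (uncurry insert) b (toList a)
```

The design choice is the one described above: every node pays one more word, and every operation that rebuilds a node pays a small addition, but no operation has to report a size delta upward.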

@rockbmb rockbmb force-pushed the make-size-const branch 6 times, most recently from 9a40921 to 8c7b2aa Compare September 20, 2020 23:50
@rockbmb
Contributor Author

rockbmb commented Sep 21, 2020

I finally had some time to resume work on this.

@treeowl

The alternative approach is to slow down everything a little by bloating the structure with an extra size field per internal node.

In my fork of this repo a while back I started a branch doing just that, https://github.com/rockbmb/unordered-containers/tree/make-size-const-2.
It had bitrotted since, but I've brought it back. The following benchmarks are for master, this PR, and the above approach (see columns).

@sjakobi

What is this performance problem?

Here are the benchmarks: https://gist.github.com/rockbmb/c3d7e1d1a252028802db7a8dac63655a
There is too much information there, so I'll focus on a specific example.

Let's consider Data.HashMap.foldl' (internally it uses foldlWithKey').
In master, it is currently

foldlWithKey' :: (a -> k -> v -> a) -> a -> HashMap k v -> a
foldlWithKey' f = go
  where
    go !z Empty                = z
    go z (Leaf _ (L k v))      = f z k v
    go z (BitmapIndexed _ ary) = A.foldl' go z ary
    go z (Full ary)            = A.foldl' go z ary
    go z (Collision _ ary)     = A.foldl' (\ z' (L k v) -> f z' k v) z ary

In this PR it changes to

foldlWithKey' :: (a -> k -> v -> a) -> a -> HashMap k v -> a
foldlWithKey' f acc (HashMap _ m) = go acc m
  where
    go !z Empty                = z
    go z (Leaf _ (L k v))      = f z k v
    go z (BitmapIndexed _ ary) = A.foldl' go z ary
    go z (Full ary)            = A.foldl' go z ary
    go z (Collision _ ary)     = A.foldl' (\ z' (L k v) -> f z' k v) z ary

The only real change, aside from HashMap now being a wrapper, is going from

foldlWithKey' f = go

to

foldlWithKey' f acc (HashMap _ m) = go acc m

and this alone causes a roughly 2.5x slowdown in benchmarks, see

benchmarked HashMap/foldl'                                  | benchmarked HashMap/foldl'                            
time                 28.59 μs   (28.05 μs .. 29.22 μs)      | time                 72.35 μs   (72.04 μs .. 72.65 μs)
                     0.999 R²   (0.998 R² .. 0.999 R²)      |                      1.000 R²   (1.000 R² .. 1.000 R²)
mean                 29.18 μs   (29.03 μs .. 29.32 μs)      | mean                 73.46 μs   (73.25 μs .. 73.84 μs)
std dev              386.8 ns   (301.1 ns .. 527.5 ns)      | std dev              827.2 ns   (519.0 ns .. 1.451 μs)

This is also the case for other relatively simple functions like filter and fromList. I have no idea why simply unwrapping HashMap does this. @treeowl suggested taking a look at the generated Core a few years back, but I could not extract much useful information from it.
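
For anyone who wants to compare the generated Core on something smaller than the library, the two shapes can be reproduced on a toy type. This is only a sketch: Tree, HashMap, foldTree and foldWrapped below are simplified stand-ins, and there is no guarantee that the toy shows the same slowdown as the real code.

```haskell
{-# LANGUAGE BangPatterns #-}
import Data.List (foldl')

-- Toy stand-ins (assumptions for illustration, not the library's types).
data Tree k v = Empty | Leaf k v | Branch [Tree k v]
data HashMap k v = HashMap !Int !(Tree k v)

-- Shape on master: the accumulator and map arguments are left implicit.
foldTree :: (a -> k -> v -> a) -> a -> Tree k v -> a
foldTree f = go
  where
    go !z Empty      = z
    go z (Leaf k v)  = f z k v
    go z (Branch ts) = foldl' go z ts

-- Shape in this PR: both arguments are bound so the wrapper can be unpacked.
foldWrapped :: (a -> k -> v -> a) -> a -> HashMap k v -> a
foldWrapped f acc (HashMap _ t) = go acc t
  where
    go !z Empty      = z
    go z (Leaf k v)  = f z k v
    go z (Branch ts) = foldl' go z ts
```

Both functions compute the same result, so any difference has to come from how GHC compiles them. Compiling this file with optimization and -ddump-simpl prints the Core for both definitions side by side.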

Do you believe this situation can be improved, or is the above alternative of expanding the HashMap constructors with size information more worthwhile?

@sjakobi
Member

sjakobi commented Sep 21, 2020

@rockbmb It's great to see some progress on this! :)

The apparent slowdown of foldl' etc. is very surprising to me. Are you sure that we aren't measuring the wrong thing here? Could the benchmarks be affected by the more expensive map construction?!

Also, to get a better overview of the perf differences, could you produce criterion-compare reports comparing both this PR and your other branch against master?

@rockbmb
Contributor Author

rockbmb commented Sep 21, 2020

@sjakobi

Could the benchmarks be affected by the more expensive map construction?!

Reasonable assumption. It's been a few years since I've worked with these benchmarks, so I went and took a look at the relevant parts:

data Env = Env {
    ...
    hmi         :: !(HM.HashMap Int Int),
    ...
}

(https://github.com/haskell-unordered-containers/unordered-containers/blob/master/benchmarks/Benchmarks.hs#L62)

setupEnv = do
    ...
    hmi         = HM.fromList elemsI
    ...
    return Env{..}

(https://github.com/haskell-unordered-containers/unordered-containers/blob/master/benchmarks/Benchmarks.hs#L104)

and the benchmark

          , bench "foldl'" $ whnf (HM.foldl' (+) 0) hmi

(https://github.com/haskell-unordered-containers/unordered-containers/blob/master/benchmarks/Benchmarks.hs#L324)

so I do not believe this is the problem.

I've attached a .zip with the two criterion-compare reports.
b8e614741b412e4d88753788eb5c7d3e-9dda9ccd450f144efc532bae950035fbe136f1d9.zip

It is not possible to share .html files over GitHub comments sadly, and I cannot host these files myself.

@rockbmb
Contributor Author

rockbmb commented Sep 22, 2020

Also I just now realized it is meaningless to compare benchmark runs of different code that don't use the same seed.
I'll correct this and reupload the data.

EDIT: Never mind. gauge does not allow seeding, and the hashmaps used in tests are always the same anyway, so I played myself here.

@sjakobi
Member

sjakobi commented Sep 26, 2020

I've attached a .zip with the two criterion-compare reports.
b8e614741b412e4d88753788eb5c7d3e-9dda9ccd450f144efc532bae950035fbe136f1d9.zip

Thanks! So the most extreme slowdowns with this branch are

  • HashMap/containerized/union/List (283.3%)
  • HashMap/filterWithKey (184.2%)
  • HashMap/filter (152.3%)
  • HashMap/foldl' (149.6%)

Before we decide to scrap this branch and focus on the other one, it would be very good to understand why these slowdowns are so extreme. I don't have a better idea for doing that than comparing the generated Core.

For the other branch, I think the slowdowns in fromList[With] are the most concerning ones.

@treeowl
Collaborator

treeowl commented Sep 26, 2020

I'm really not optimistic about this approach from the union and intersection standpoint.

@rockbmb
Contributor Author

rockbmb commented Sep 27, 2020

@sjakobi

it would be very good to understand why these slowdowns are so extreme.

@treeowl

I'm really not optimistic about this approach from the union and intersection standpoint.

True, the numbers for either branch don't look great at the moment, especially this one.

I don't have a better idea for doing that than comparing the generated Core.

I don't mind doing this, but I don't know how to proceed - I'm not quite sure what to be looking out for.
Is there some prior art, some related example or a similar effort (e.g. an improvement to containers) that I can take a look at so I can have a better picture of what is needed?

@sjakobi
Member

sjakobi commented Oct 3, 2020

I don't have a better idea for doing that than comparing the generated Core.

I don't mind doing this, but I don't know how to proceed - I'm not quite sure what to be looking out for.
Is there some prior art, some related example or a similar effort (e.g. an improvement to containers) that I can take a look at so I can have a better picture of what is needed?

Unfortunately, I don't have any references. Maybe just produce a diff of the Core for foldl' or filterWithKey, and paste it here.
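One way to get a small, diffable Core sample is to reduce the two fold shapes to a standalone file. The sketch below is an assumption-laden reduction, not code from either branch: `Tree`, `Wrapped`, and all function names are hypothetical, and the `INLINE` pragmas are assumed rather than taken from the source. The GHC flags named in the comment (`-ddump-simpl`, `-dsuppress-all`, `-dsuppress-uniques`, `-ddump-to-file`) are standard dump options. One thing that may be worth checking in the output, offered as a hypothesis only: GHC unfolds an `INLINE` function only when it is applied to as many arguments as appear on the left-hand side of its definition, so a fold written with three parameters behaves differently at a two-argument call site than one written as `f = go`.

```haskell
{-# LANGUAGE BangPatterns #-}
-- To dump Core, compile with:
--   ghc -O2 -ddump-simpl -dsuppress-all -dsuppress-uniques -ddump-to-file <file>

-- Hypothetical stand-ins for the real tree and its size-caching wrapper.
data Tree v = Empty | Leaf v | Node !(Tree v) !(Tree v)
data Wrapped v = Wrapped !Int !(Tree v)

-- Variant 1: one syntactic argument, the shape of "foldlWithKey' f = go".
foldlDirect :: (a -> v -> a) -> a -> Tree v -> a
foldlDirect f = go
  where
    go !z Empty      = z
    go !z (Leaf v)   = f z v
    go !z (Node l r) = go (go z l) r
{-# INLINE foldlDirect #-}

-- Variant 2: three syntactic arguments, the shape used with the wrapper.
foldlWrapped :: (a -> v -> a) -> a -> Wrapped v -> a
foldlWrapped f acc (Wrapped _ t) = go acc t
  where
    go !z Empty      = z
    go !z (Leaf v)   = f z v
    go !z (Node l r) = go (go z l) r
{-# INLINE foldlWrapped #-}

-- Call sites that pass two arguments, like the benchmark's "foldl' (+) 0".
sumDirect :: Tree Int -> Int
sumDirect = foldlDirect (+) 0

sumWrapped :: Wrapped Int -> Int
sumWrapped = foldlWrapped (+) 0
```

The file has no `main`, so it needs either a small `main` that calls both functions or a module header and `ghc -c`. In the resulting `.dump-simpl` file, comparing the bindings for `sumDirect` and `sumWrapped` shows whether `(+)` was specialised into the loop or is still called through an unknown function argument.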

    * because 'HashMap' is now a wrapper and may or may not get unboxed
      during a program's execution, benchmarks for operations on sets of
      hashmaps inside different kinds of containers were added;
@rockbmb
Contributor Author

rockbmb commented Oct 18, 2022

It's been 5 years since I started this, and 2 since I last worked on it.
I finally have some spare time to look over this!

Over the last 2 years this PR has bitrotted, so for now I have rebased it onto master.
I also have an alternative branch in my fork of this that adds an Int to each of the HashMap variants tracking the subtrees' sizes instead of wrapping over the type.
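As a rough illustration of that alternative (a sketch only, not the code in the fork): the type below is a hypothetical simplified tree, not the real set of HashMap constructors, and only the branching node carries a cached count, since a leaf's size is always 1. The smart constructor `node` and the helper `fromDistinctList` are made-up names, and the helper assumes distinct keys.

```haskell
-- Hypothetical simplified tree; the real type has more constructors.
data Sized k v
  = Empty
  | Leaf k v
  | Node !Int !(Sized k v) !(Sized k v)  -- cached number of leaves below

-- O(1): every constructor either implies its size or stores it.
size :: Sized k v -> Int
size Empty        = 0
size (Leaf _ _)   = 1
size (Node n _ _) = n

-- Smart constructor that keeps the cached count consistent.
node :: Sized k v -> Sized k v -> Sized k v
node l r = Node (size l + size r) l r

-- Illustrative builder (assumption: keys are distinct).
fromDistinctList :: [(k, v)] -> Sized k v
fromDistinctList = foldr (\(k, v) t -> node (Leaf k v) t) Empty
```

Compared with the wrapper, this keeps a single user-facing type, at the cost of one extra field per branching node and a size update on every path that rebuilds a node.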

I'll rebase that branch next. Afterwards I'll resume benchmarking, which will probably involve looking at the Core.

EDIT: also, for a reason I cannot understand, the job for GHC 8.10.7 fails, and it doesn't seem to be related to tests or compilation. It hangs while fetching ghcup.

@treeowl
Collaborator

treeowl commented Oct 18, 2022

Good luck!

Development

Successfully merging this pull request may close these issues.

Can size be made O(1)?
6 participants