Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add lookupMin and lookupMax for IntSet #976

Merged
merged 1 commit into from
Aug 17, 2024

Conversation

meooow25
Copy link
Contributor

@meooow25 meooow25 commented Nov 15, 2023

I wanted to use IntSet.lookupMin today and realized it's not defined. lookupMin and lookupMax already exist for Set, Map, and IntMap.

Edit: Also

  • Update docs for lookupMin, lookupMax, findMin, findMax to be
    consistent for all the structures.
  • Modify IntMap's lookupMin and lookupMax impls so that Just is
    returned immediately and mark them INLINE. This allows GHC to inline
    and possibly eliminate the Maybe. Set and Map's lookupMin and
    lookupMax are already implemented in this style.

@meooow25 meooow25 force-pushed the intset-lookupminmax branch 2 times, most recently from 7960124 to 1a2cf00 Compare November 16, 2023 17:05
@meooow25 meooow25 force-pushed the intset-lookupminmax branch from 1a2cf00 to 0b1ffd7 Compare January 13, 2024 22:35
@meooow25
Copy link
Contributor Author

Hello, could you please review this PR?

Copy link
Contributor

@treeowl treeowl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quick question: does wrapping the Just on outside still work/help now that GHC does a lot more fancy join point stuff? It would be good to verify that the Core is still what we want.

containers/src/Data/IntMap/Internal.hs Outdated Show resolved Hide resolved
@meooow25 meooow25 force-pushed the intset-lookupminmax branch from 0b1ffd7 to 787516a Compare January 15, 2024 05:40
@meooow25
Copy link
Contributor Author

meooow25 commented Jan 15, 2024

Updated Map's lookupMin and lookupMax as discussed. Sharing the core below.

Core
Rec {
-- RHS size: {terms: 19, types: 26, coercions: 0, joins: 0/0}
Data.Map.Internal.$wlookupMinSure [InlPrag=[2], Occ=LoopBreaker]
  :: forall {k} {a}. k -> a -> Map k a -> (# k, a #)
[GblId[StrictWorker([~, ~, !])],
 Arity=3,
 Str=<ML><L><1L>,
 Unf=OtherCon []]
Data.Map.Internal.$wlookupMinSure
  = \ (@k_sIhw)
      (@a_sIhx)
      (k1_sIhy :: k_sIhw)
      (a1_sIhz :: a_sIhx)
      (ds_sIhA :: Map k_sIhw a_sIhx) ->
      case ds_sIhA of {
        Bin bx_dH9Z k2_asZd a2_asZe l_asZf ds1_dzm5 ->
          Data.Map.Internal.$wlookupMinSure
            @k_sIhw @a_sIhx k2_asZd a2_asZe l_asZf;
        Tip ->
          case k1_sIhy of conrep_atnO { __DEFAULT ->
          (# conrep_atnO, a1_sIhz #)
          }
      }
end Rec }

-- RHS size: {terms: 14, types: 18, coercions: 0, joins: 0/0}
lookupMinSure [InlPrag=[2]]
  :: forall k a. k -> a -> Map k a -> KeyValue k a
[GblId,
 Arity=3,
 Str=<ML><L><1L>,
 Cpr=1,
 Unf=Unf{Src=StableSystem, TopLvl=True,
         Value=True, ConLike=True, WorkFree=True, Expandable=True,
         Guidance=ALWAYS_IF(arity=3,unsat_ok=True,boring_ok=False)
         Tmpl= \ (@k_sIhw)
                 (@a_sIhx)
                 (k1_sIhy [Occ=Once1] :: k_sIhw)
                 (a1_sIhz [Occ=Once1] :: a_sIhx)
                 (ds_sIhA [Occ=Once1] :: Map k_sIhw a_sIhx) ->
                 case Data.Map.Internal.$wlookupMinSure
                        @k_sIhw @a_sIhx k1_sIhy a1_sIhz ds_sIhA
                 of
                 { (# ww_sINB [Occ=Once1], ww1_sINC [Occ=Once1] #) ->
                 Data.Map.Internal.KeyValue @k_sIhw @a_sIhx ww_sINB ww1_sINC
                 }}]
lookupMinSure
  = \ (@k_sIhw)
      (@a_sIhx)
      (k1_sIhy :: k_sIhw)
      (a1_sIhz :: a_sIhx)
      (ds_sIhA :: Map k_sIhw a_sIhx) ->
      case Data.Map.Internal.$wlookupMinSure
             @k_sIhw @a_sIhx k1_sIhy a1_sIhz ds_sIhA
      of
      { (# ww_sINB, ww1_sINC #) ->
      Data.Map.Internal.KeyValue @k_sIhw @a_sIhx ww_sINB ww1_sINC
      }

-- RHS size: {terms: 18, types: 34, coercions: 0, joins: 0/0}
lookupMin [InlPrag=INLINE (sat-args=1)]
  :: forall k a. Map k a -> Maybe (k, a)
[GblId,
 Arity=1,
 Str=<1L>,
 Unf=Unf{Src=StableUser, TopLvl=True,
         Value=True, ConLike=True, WorkFree=True, Expandable=True,
         Guidance=ALWAYS_IF(arity=1,unsat_ok=False,boring_ok=False)
         Tmpl= \ (@k_avaJ)
                 (@a_avaK)
                 (ds_dzmb [Occ=Once1!] :: Map k_avaJ a_avaK) ->
                 case ds_dzmb of {
                   Bin _ [Occ=Dead] k1_asZg [Occ=Once1] x_asZh [Occ=Once1]
                       l_asZi [Occ=Once1] _ [Occ=Dead] ->
                     case lookupMinSure @k_avaJ @a_avaK k1_asZg x_asZh l_asZi of
                     { KeyValue k2_asZ9 [Occ=Once1] a1_asZa [Occ=Once1] ->
                     GHC.Maybe.Just @(k_avaJ, a_avaK) (k2_asZ9, a1_asZa)
                     };
                   Tip -> GHC.Maybe.Nothing @(k_avaJ, a_avaK)
                 }}]
lookupMin
  = \ (@k_avaJ) (@a_avaK) (ds_dzmb :: Map k_avaJ a_avaK) ->
      case ds_dzmb of {
        Bin bx_dHa0 k1_asZg x_asZh l_asZi ds1_dzoB ->
          case Data.Map.Internal.$wlookupMinSure
                 @k_avaJ @a_avaK k1_asZg x_asZh l_asZi
          of
          { (# ww_sINB, ww1_sINC #) ->
          GHC.Maybe.Just @(k_avaJ, a_avaK) (ww_sINB, ww1_sINC)
          };
        Tip -> GHC.Maybe.Nothing @(k_avaJ, a_avaK)
      }

@meooow25 meooow25 force-pushed the intset-lookupminmax branch from 787516a to b753f77 Compare January 16, 2024 22:50
@meooow25
Copy link
Contributor Author

meooow25 commented Jan 16, 2024

Updated again to put a bang on k, as recommended in GHC#24340.
Though it looks like k is evaluated every time in core (case k1_somV of k2_X0 { __DEFAULT ->... in #976 (comment)), the core does not tell the full story. This is out of my depth but, as explained in the GHC issue above, code generation down the line gets rid of this eval.

@meooow25 meooow25 requested a review from treeowl January 16, 2024 23:00
@meooow25
Copy link
Contributor Author

@treeowl does this look good?

@treeowl
Copy link
Contributor

treeowl commented Jan 19, 2024

I'd like to take one more look at it after I finish waking up and mixing dough.

@meooow25
Copy link
Contributor Author

@treeowl?

@meooow25
Copy link
Contributor Author

meooow25 commented Feb 9, 2024

@treeowl, reminder to review.

@meooow25 meooow25 force-pushed the intset-lookupminmax branch from b753f77 to 99ac18f Compare March 31, 2024 16:40
@meooow25
Copy link
Contributor Author

Rebased.

@meooow25 meooow25 force-pushed the intset-lookupminmax branch 2 times, most recently from e045b5c to 7614da1 Compare August 17, 2024 05:38
@meooow25
Copy link
Contributor Author

@treeowl I am inclined to merge this soon if that's alright.
Your suggestion of using a half-strict pair for Map is implemented, and we are strict in the key in the functions as recommended in the GHC issue.

Copy link
Contributor

@treeowl treeowl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry this sat on my plate so long.

containers-tests/tests/intset-properties.hs Outdated Show resolved Hide resolved
containers-tests/tests/intset-properties.hs Outdated Show resolved Hide resolved
lookupMinSure (Tip k v) = (k,v)
lookupMinSure (Bin _ l _) = lookupMinSure l
lookupMinSure Nil = error "lookupMinSure Nil"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does worker-wrapper eliminate the I# wrappers here? I imagine so, with sufficiently recent GHC, but it might be worth checking and noting.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! The worker returns a Int#. Mentioned previously in #976 (comment), but I checked it again and it's as expected.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent. Could you add a comment to that effect? I believe that's likely only true in recentish GHC, with nested CPR. Along with the two accidental space characters, I think that's all that's left to do, as far as I'm concerned.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm you're right, I wasn't aware that this is a somewhat recent change. I checked on 9.2 and it's returning a boxed Int.
I feel like we should use a KeyValue !Int a for the benefit of older GHCs, after all we're already doing that for Map. What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to use KeyValue.

containers/src/Data/IntMap/Internal.hs Outdated Show resolved Hide resolved
@meooow25
Copy link
Contributor Author

Thanks for reviewing!

These already exist for Set, Map, and IntMap.

Additionally,

* Implement lookupMin, lookupMax, findMin, findMax in a consistent
  manner for all structures. The rationale for this implementation is
  provided in the note [Inline lookupMin] in Data.Set.Internal.
* Update docs for lookupMin, lookupMax, findMin, findMax to be
  consistent across all the structures.
@meooow25 meooow25 force-pushed the intset-lookupminmax branch from 89acdb9 to 7c6af0d Compare August 17, 2024 13:57
@meooow25 meooow25 merged commit 0209162 into haskell:master Aug 17, 2024
11 checks passed
@meooow25 meooow25 deleted the intset-lookupminmax branch August 17, 2024 14:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants