
remove instance NFData (a -> b) #16

Open
amigalemming opened this issue May 23, 2016 · 83 comments
@amigalemming

I suggested removing the instance NFData (a -> b) because it cannot be implemented properly. I think the proposal got broad support:
https://mail.haskell.org/libraries/2016-May/026961.html
We still have to decide whether to drop the instance or replace it with an unimplementable one. The latter would have the advantage that people can see the instance is omitted intentionally. We would need a major version bump. I can set up a pull request if you like.

@hvr
Member

hvr commented May 23, 2016

tbh, I'd like the @haskell/core-libraries-committee to sign off on this.

Personally, I'd prefer an unimplementable instance, as otherwise all it takes is one orphan instance hidden somewhere in a popular package to thwart this change. And even worse, if more than one orphan instance appears, we would risk ending up in an even worse situation than we're now in :-/

(/cc'ing @RyanGlScott as he wasn't yet part of the @haskell/core-libraries-committee github team at time of writing)

@ekmett
Member

ekmett commented May 23, 2016

I have no objection to removing the instance, but I'm not a fan of the idea
of making it unimplementable.

Yes, one popular package could bring it back from the dead, or bring into
scope a variant that enumerates the entire domain of a function and
memoizes all results, but if it can weather the storm of popular opinion
then perhaps it deserves to come back. This scenario is unlikely given the
complete absence of interest in the current instance.

We have an open world. Orphan instances are deeply unpopular and
anti-modular. If someone needs the instance, let them have it, it won't go
far enough upstream to matter.

-Edward


@amigalemming
Author

On Mon, 23 May 2016, Edward Kmett wrote:

We have an open world. Orphan instances are deeply unpopular and
anti-modular. If someone needs the instance, let them have it, it won't
go far enough upstream to matter.

It takes only one of the hundred packages I import defining this instance
as an orphan, and my type-safety check is disabled. I prefer the
non-implementable instance, and if someone needs the enumerating instance,
they can implement it for a newtype wrapper around the function type.

@hvr
Member

hvr commented May 23, 2016

@amigalemming afaik you suggested implementing some warning facility to annotate undesirable/questionable instances; can you think of a variant that would help here?

@ekmett
Member

ekmett commented May 23, 2016

And you can just as easily implement an orphan instance module that
provides the impossible instance with the same head.

class Don'tExportMe where boom :: a
instance Don'tExportMe => NFData (a -> b) where
  rnf = boom

and import that to know that the other isn't being used in your code,
because now you'll get an overlapping instance head.

I'd, however, treat whatever package exported this instance as just as
questionable as one that provided the other.

NFData provides no guarantees that it "really" does anything. There are
lots of such "mildly dangerous instances" that may not be as fully strict
as you'd like for NFData. You'll never rule them all out.

For the scenario you fear to come to pass you have to make two lapses in
judgment. First you have to depend on a package that provides the instance,
and then you have to write code that depends on the instance existing.

That seems well balanced against the fact that someone using criterion
might very well want to define the instance for NFData (a -> b) locally in
their benchmark suite, regardless of whether or not you believe the
instance should exist.

-Edward


@snoyberg

I agree with Edward here; my vote goes with his stance.


@dolio

dolio commented May 23, 2016

I'm also for no instance at this point. If you think the existing instance is bad, it's better for no instance to exist, so that you get an error if you accidentally try to use it. The other instances I can think of have roughly the same negatives as the existing one: you don't get a compile-time complaint, you just have to figure out what went wrong if your program ever exercises the instance (and either blows up or loops uselessly for a very long time). And the instance that doesn't just blow up would ideally require introducing a new class, and then getting people to implement it.

@amigalemming
Author

On Mon, 23 May 2016, dolio wrote:

I'm also for no instance at this point. If you think the existing
instance is bad, it's better for no instance to exist, so that you get
an error if you accidentally try to use it. The other instances I can
think of have roughly the same negatives as the existing one. You don't
get a compile-time complaint,

I definitely want a compile-time complaint, and I think this is better
done with an explicitly forbidden instance. I am thinking of something
similar to Edward's, only in Haskell 98, e.g.

class NotImplementable a where -- not exported
instance NotImplementable a => NFData (a -> b) where

This instance tells any programmer that the instance can no longer be
defined and that this is intentional.

@amigalemming
Author

On Mon, 23 May 2016, Herbert Valerio Riedel wrote:

@amigalemming afaik you suggested to implement some warning facility to
annotate undesirable/questionable instances; can you think of a variant
which would help here?

I proposed this one:
https://ghc.haskell.org/trac/ghc/ticket/11796

It would solve the issue here, too, but it is not implemented in GHC;
so far it has not even started a discussion.

@ekmett
Member

ekmett commented May 23, 2016

My point about that instance is that it is something that you are just as
free to implement as someone who wants to implement the instance you don't
like.

Just to be clear, I was not advocating that the instance I mentioned be
placed in base.

I was mentioning that if you wanted to rule out the instance in question,
you could write your own orphan instance in your own package somewhere, and
it'd infect the global context and conflict with any other attempted
instance.

So you can either simply check that your code still compiles with a
version of your dependencies that you know doesn't have the instance
around, or you can jump through hoops and create one of the orphan
'impossible' instances described so far.

Only in the latter case do you break users who just want an instance
like NFData (a -> b) locally in their criterion benchmarks, and they
asked for it by depending on your code.

-Edward


@amigalemming
Author

On Mon, 23 May 2016, Edward Kmett wrote:

Just to be clear, I was not advocating that the instance I mentioned be
placed in base.

That is how I understood it.

Only in the latter case do you break users who just want some instance
like instance NFData (a -> b) locally in their criterion benchmarks, and
they asked for it by depending on your code.

Is this a frequent use case? How about using a newtype wrapper there? The
deepseq package could even provide the wrapper itself.
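
For illustration, such a wrapper might look like this (the name is hypothetical; deepseq does not provide one at the time of writing):

import Control.DeepSeq

-- A function wrapped in FunctionWHNF is considered fully evaluated
-- once the function itself is in WHNF.
newtype FunctionWHNF a b = FunctionWHNF (a -> b)

instance NFData (FunctionWHNF a b) where
  rnf (FunctionWHNF f) = f `seq` ()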

@ekmett
Member

ekmett commented May 23, 2016

The criterion example was one use case. An instance that enumerates all of
the inputs and memoizes a function's results is another with the same head
shape, and it conflicts too. Sure, we could hide it behind a newtype. We
could hide every instance for (->) behind a newtype, but we don't.

But if you can build with any version of your dependencies that doesn't
have the instance, then you can't be relying on it.

You'll only run into this problem when you explicitly call rnf on a
function and when some package you depend upon brings an instance into
scope. That frankly strikes me as pretty far removed from an active
concern, and more a theoretical exercise.

-Edward


@amigalemming
Author

On Mon, 23 May 2016, Edward Kmett wrote:

The criterion example was one use case. An instance that enumerates all of
the inputs and memoizes a function's results is another with the same head
shape, and it conflicts too. Sure, we could hide it behind a newtype. We
could hide every instance for (->) behind a newtype, but we don't.

For me, the existence of different possible implementations is an argument
for dedicated newtype wrappers.

You'll only run into this problem when you explicitly call rnf on a
function and when some package you depend upon brings an instance into
scope. That frankly strikes me as pretty far removed from an active
concern, and more a theoretical exercise.

I want to be sure that neither I nor a library function accidentally calls
'rnf' on a structure with a function contained somewhere deep inside it,
which can happen if just one of a hundred transitively imported packages
defines an orphan NFData (a -> b) instance. Even if I check for the absence
of such an orphan instance today, that can easily change once one of the
transitively imported packages extends its imports.

@ekmett
Member

ekmett commented May 23, 2016

They can do that anyway if they just define their instance for any composite structure in the middle by hand. Even your newtype solution is evidence of this. You get no such transitive guarantee.

@hvr hvr added this to the 1.5.0.0 milestone May 24, 2016
@chessai
Member

chessai commented Jun 24, 2019

I think the main issue is the removal of the instance. It seems that this instance is not widely useful, and it can be implemented as an orphan (or perhaps via a newtype) by those who need it. Making it unimplementable seems to be a more controversial stretch, and tacking it onto the issue of the instance's removal makes it more likely that nothing will get done at all.

I think we should just remove the instance, make a changelog entry, and do a major version bump.

@amigalemming
Author

amigalemming commented Jun 24, 2019 via email

@andrewthad

I agree that this instance just needs to be removed.

@chessai
Member

chessai commented Jun 24, 2019

see #47

@Zemyla

Zemyla commented Nov 2, 2019

What it actually should do is use seq to bring the function to WHNF, then use unpackClosure# to make sure everything inside it is evaluated as well.
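
For illustration, the inspection half of that idea can be sketched with GHC primops (this only counts the closure's pointer fields; actually forcing each captured value is the part that would need RTS support, as discussed below):

{-# LANGUAGE MagicHash, UnboxedTuples #-}
import GHC.Exts (Int (I#), sizeofArray#, unpackClosure#)

-- Force a value to WHNF, then count the pointer fields of its closure.
-- Recursively forcing those fields would require NFData instances for
-- existentially hidden types, which is where the RTS would have to help.
countCapturedPointers :: a -> Int
countCapturedPointers f =
  f `seq` case unpackClosure# f of
    (# _infoTable, _nonPtrData, ptrs #) -> I# (sizeofArray# ptrs)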

@amigalemming
Author

amigalemming commented Nov 2, 2019 via email

@Zemyla

Zemyla commented Nov 2, 2019

unpackClosure# doesn't actually evaluate the function. It simply takes it apart into the component pieces where any data needed for actually calling the function are stored.

Basically, imagine a subset of Haskell where only named functions exist and can be passed to things. To represent a partially evaluated function of some sort, you'd need an existential datatype, and functions to manipulate it.

data Closure a = forall u. Closure u (u -> a) -- needs ExistentialQuantification; the (u -> a) function has to be nameable at compile time

closeFmap :: (u -> a, a -> b, u) -> b
closeFmap (g, f, u) = f (g u)

instance Functor Closure where
  fmap f (Closure u g) = Closure (g, f, u) closeFmap

closeLiftA2 :: (u -> a, v -> b, a -> b -> c, u, v) -> c
closeLiftA2 (gu, gv, f, u, v) = f (gu u) (gv v)

closeAp :: (u -> a -> b, v -> a, u, v) -> b
closeAp (gu, gv, u, v) = gu u (gv v)

instance Applicative Closure where
  pure a = Closure a id

  liftA2 f (Closure u gu) (Closure v gv) = Closure (gu, gv, f, u, v) closeLiftA2

  Closure u gu <*> Closure v gv = Closure (gu, gv, u, v) closeAp

closeBind :: (u -> a, u, a -> Closure b) -> b
closeBind (gu, u, f) = case f (gu u) of
  Closure v gv -> gv v

instance Monad Closure where
  Closure u gu >>= f = Closure (gu, u, f) closeBind

This may seem silly at first, but it's pretty much what the spineless, tagless G-machine behind GHC does when you write code with closed-over variables, and it's also what you need to do when you work with some forms of distributed Haskell.
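
To see that the sketch hangs together, a small usage example (assuming the definitions above are in scope):

runClosure :: Closure a -> a
runClosure (Closure u g) = g u

demo :: Int
demo = runClosure (fmap (* 2) (pure 21)) -- evaluates to 42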

@amigalemming
Author

amigalemming commented Nov 2, 2019 via email

@int-index
Contributor

I agree with @Zemyla that rnf for functions should force everything inside the function's closure.

I think I understand what you are suggesting, but I doubt that it is a useful definition.

I disagree, @amigalemming. This definition would be very much in the spirit of the goals of deepseq. The documentation on Hackage presents lazy IO as a use case for rnf:

import System.IO
import Control.DeepSeq
import Control.Exception (evaluate)

readFile' :: FilePath -> IO String
readFile' fn = do
    h <- openFile fn ReadMode
    s <- hGetContents h
    evaluate (rnf s)
    hClose h
    return s

Here, it is safe because rnf s forces the string fully. However, what if we applied some function to it first?

    s <- hGetContents h
    let x = f s
    evaluate (rnf x)
    hClose h

If, say, x :: [Bool], then this is still safe. For example, we can imagine f = map isDigit. However, if we had f = map (==), then we'd end up with x :: [Char -> Bool]. Then to make it safe we should force the closures of these functions.

Thus the definition offered by @Zemyla would make the pattern above safe in the general case.

@int-index
Contributor

Although I'm not sure how @Zemyla imagines running rnf for the data captured in the function closure. Where would the NFData instances for the captured data come from? This suggests that rnf should be a primitive implemented in the RTS.

@cartazio

yeah, as @int-index says, forcing the captured thunks in a closure requires RTS hooks to work, and closures are existentials, so the RTS would have to know how to NFData everything underneath the function itself. That's a pretty hard ask.

@cartazio

or maybe I'm misunderstanding this thread :)

@parsonsmatt

So, the proposal here isn't just "Remove NFData (a -> b)," but more powerfully: "Remove NFData for any type which contains functions." Is that right?

Removing the instance already removes the ability to do an rnf :: [Int -> Int] -> (), after all, so it makes sense that we'd want to also discourage the NFData instance for records containing functions.

Whether or not NFData (a -> b) is legal depends entirely on what normal form means. There are many sources online which seem to think that a function or lambda is in "normal form" already. Or, that a lambda is in normal form if its body is also in normal form (i.e. \a -> (1 + 2 + a) is not in NF because it can be reduced to \a -> 3 + a).

We can simply sidestep this whole "problem" by saying "a function a -> b's normal form is equivalent to its WHNF."
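
Under that reading, the existing instance already is the specification; a sketch of what it amounts to (this mirrors the current behaviour, not new code to add):

import Control.DeepSeq

-- The current instance behaves as if it were defined as:
--   instance NFData (a -> b) where
--     rnf f = f `seq` ()
-- which keeps rnf on structures containing functions expressible
-- (forceSpine is a hypothetical name):
forceSpine :: [Int -> Int] -> ()
forceSpine = rnf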

@Bodigrim
Contributor

If it’s dropped outright, hedgehog is busted. Which in turn means anything depending on hedgehog is broken. Cool. So until I can upgrade to a newer hedgehog, which comes with all the requirements of upgrading hedgehog's dependencies (which I do not know are compatible with my code), I can't.

Please make this a deprecation warning for at least a year before dropping the instance outright.

deepseq is a boot package. Even if the change is released overnight, users will not notice until they upgrade to a yet-unreleased major GHC version. Which is likely to take around a year or so anyway.

(I'm not advocating in favor or against the change or in favor or against any specific migration strategy. This is just to clarify that no one will be busted immediately.)

@angerman

deepseq is a boot package. Even if the change is released overnight, users will not notice until they upgrade to a yet-unreleased major GHC version. Which is likely to take around a year or so anyway.

(I'm not advocating in favor or against the change or in favor or against any specific migration strategy. This is just to clarify that no one will be busted immediately.)

True. It will come as a major surprise to everyone at the time they upgrade their compiler. Hence I'll advocate for this to be a deprecation warning for at least two major GHC releases (at the current cadence of two per year), so that users have enough time to start thinking about (and mitigating) it before the final break happens.

The current GHC release is 9.6, stable(?) is 9.4? Ecosystem-ready is 9.2? (It certainly was ~4 months ago.) There is the awkward 9.0 in there as well. And lots of code is still on 8.10.

People on 8.10 today are >2 years behind. They wouldn't even know this breakage is coming their way.

@erikd
Member

erikd commented Jan 25, 2023

People today on 8.10 are >2 years behind. They wouldn't even know this breakage is coming their way.

Full disclosure: I work with Moritz. We have a huge and complex code base that is still mostly on ghc-8.10 and moving to ghc-9.2. Partly this is our fault, but there are significant changes to the compiler that we have had to adjust to, from changes to errors and warnings all the way through to changes in the implementation of Integer and Natural.

@Lysxia

Lysxia commented Jan 25, 2023

Giving NFData a "best effort" status, so that it has the most potential instances, gives you freedom to refine it a posteriori. If you want a stronger guarantee like "no thunks", you can introduce a subclass of NFData, which lets you reuse its code, and then forbid the bad instances that you know have no correct implementation. This doesn't work the other way, starting from a more restricted class.
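
For concreteness, a minimal sketch of that refinement (the class name is hypothetical):

import Control.DeepSeq

-- NFData stays "best effort"; the subclass adds the stronger contract
-- and deliberately never receives an instance at function types.
class NFData a => DeepStrict a

instance DeepStrict Int
instance DeepStrict a => DeepStrict [a]
-- no instance DeepStrict (a -> b): known to have no correct implementation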

The root of the issue seems to be the contradiction between "NFData is for types that can be forced to normal form/fully evaluated" and "there is an instance (NFData (a -> b))". The proposal is to drop the instance. But the contradiction could just as well be resolved by changing the spec of NFData, with zero breakage. There seems to be a consensus that this spec is obtusely worded at the very least.

As @parsonsmatt pointed out, "apply seq as much as it makes sense" is an alternative specification which seems to fit this instance, that also reflects the existing practice of deriving NFData for data types that contain functions. It's a bit unusual because it's a very syntactic spec, but I believe it would still make a lot of sense to most users.

And as far as I can tell, whatever your interpretation, for every type there is at most one reasonable instance of NFData. I think there's something compelling about the lack of ambiguity in making NFData the broadest possible class.

@Bodigrim
Contributor

Bodigrim commented Jan 25, 2023

And as far as I can tell, whatever your interpretation, for every type there is at most one reasonable instance of NFData. I think there's something compelling about the lack of ambiguity in making NFData the broadest possible class.

I'm not sure about that. Imagine I'm writing instance NFData (Bool -> a). One way to implement it is indeed just seq. But a better effort would be to run the function on both False and True, forcing all internal thunks.
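
Spelled out, the two candidates look like this (orphan sketches; the Bool -> a head needs FlexibleInstances):

{-# LANGUAGE FlexibleInstances #-}
import Control.DeepSeq

-- The "just seq" candidate:
-- instance NFData (Bool -> a) where
--   rnf f = f `seq` ()

-- The "better effort" candidate, which additionally needs NFData a:
instance NFData a => NFData (Bool -> a) where
  rnf f = f False `deepseq` f True `deepseq` ()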

@mixphix
Collaborator

mixphix commented Jan 25, 2023

@Bodigrim this was also mentioned on the mailing list. Again, the issue becomes: do we want instance (Enum a, Bounded a, NFData b) => NFData (a -> b)? Because I certainly don't want to accidentally run rnf ((+ 1) @Int64). :)
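
That instance would read roughly as follows (a sketch; forcing (+ 1) at Int64 through it really would walk all 2^64 inputs, and as an orphan it would clash with the existing instance, which is exactly the point):

import Control.DeepSeq

instance (Enum a, Bounded a, NFData b) => NFData (a -> b) where
  rnf f = rnf (map f [minBound .. maxBound])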

This does not preclude manual instance declarations of NFData for records that contain function types, nor does it preclude defining one's own orphan instance NFData (a -> b) where rnf = (`seq` ()). It will only affect Generic deriving of the class (edit: for these data-codata types only).

I'm of the opinion that \a -> (1 + 2 + a) is not in normal form, and that this is an issue that is beyond the capabilities of this library to address. I would rather see effort directed toward expanding the capabilities of the RTS, eventually subsuming this library, than to depend on a typeclass with no guarantees as a way to cope with lazy records :)

@Bodigrim
Contributor

Can’t we start with deprecation warnings first? And have that deprecation warning point to the issue/pr that will implement the removal?

AFAIK there is no way to deprecate an instance, which is quite unfortunate.

@evincarofautumn

AFAIK there is no way to deprecate an instance, which is quite unfortunate.

That’s right. That makes 2 things for ghc-proposals, I guess.

@angerman

Let's start with a proposal for deprecating instances first. And then deprecate this one.

@angerman

Actually, we can work kinda haphazardly around this today, with an orphan instance:

-- Instance.hs
module Instance {-# DEPRECATED "The X Int instance will be deprecated" #-} where
import Class

f :: Int -> Int
f = (+1)

instance X Int where x = f
-- Class.hs
module Class where
class X a where x :: a -> a
-- Lib.hs
module Lib ( module Class, module Instance) where
import Class
import Instance 
-- Test.hs
module Test where
import Lib

main = print $ x (1 :: Int)

This will result in the following:

$ runghc Test.hs 

Lib.hs:3:1: warning: [-Wdeprecations]
    Module ‘Instance’ is deprecated:
      The X Int instance will be deprecated
  |
3 | import Instance 
  | ^^^^^^^^^^^^^^^
2

Maybe I've missed something which makes this inapplicable here. Having a proper instance-deprecation mechanism, however, seems like a good idea nonetheless.

@Kleidukos
Member

That's a neat hack :) I'll go to sleep less stupid this evening

@angerman

... I guess this won't work so well here. It would complain while building deepseq, but not propagate that to consumers of that library.

@mixphix
Collaborator

mixphix commented Jan 26, 2023

...unless the instance is kept in its own quarantined module. But as mentioned above, that only creates a different problem, wherein Control.DeepSeq.Function must be imported to retain this (questionable) deriving NFData for mixed datatypes.

The ideal migration path, in my view, is one of the following:

  • to simply define the orphan instance locally, where the user has full control over its implementation and downstream scope (see the sketch after this list);
  • or to forego deriving NFData for types containing functions, writing such instances manually.
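
A sketch of the first option, with the orphan quarantined in a module of the user's own project (module name hypothetical):

{-# OPTIONS_GHC -Wno-orphans #-}
module MyProject.NFDataOrphans () where

import Control.DeepSeq

-- Local policy: a function counts as fully evaluated once it is in WHNF.
instance NFData (a -> b) where
  rnf f = f `seq` ()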

@angerman

@mixphix to be clear here, I only care (independent of merits, motivation, or anything else), that this does not result in existing code breaking from one GHC to the next release. I hope this makes sense. I really want GHC N+2 (possibly with warnings) to accept the same code as GHC N (without warnings).

@mixphix
Collaborator

mixphix commented Jan 26, 2023

(DISCLAIMER: I'm aware that deepseq, as a boot library, enjoys a privileged position in the ecosystem, and that as its maintainer, I should be proportionally more cautious when proposing changes this deeply in the module hierarchy.)


Thank you, @angerman, for putting so concisely in this comment your only concern about this change. I think the Haskell community's nigh-religious devotion to backwards compatibility is admirable, and has allowed many extremely handy tools to thrive for many years with minimal upkeep. This spirit has also permitted reference material to remain up-to-date even while the compiler continually gains new features.

Devotion is fickle. The passion it fuels can bring great things to the world and unite large groups of people to work toward a common goal. Yet it can often lead genuinely well-meaning and concerned individuals to do and advocate for things that are harmful to themselves and their communities, while simultaneously reinforcing their faith in their cause and their hostility toward vacillators and rivals in those same communities. (I trust the reader to be able to think of many such examples in today's political climate.)

Of the many great things a devotion to backwards compatibility and long-term support has brought to Haskell, Stack and Stackage top the list. These are excellent at what they do: they provide sets of mutually-compatible libraries, with fixed versions, that are trusted to work well in tandem and whose existence is maintained over extended periods of time. The Stack team also does their best to keep up with new compiler releases, even going as far as to submit compatibility patches when particularly crucial libraries need fixing. The Haskell Stack is the number-one tool to date for ensuring long-term support of a Haskell program.

On the other hand, there are many exciting things brought to us by the GHC development team with each release. I personally am ecstatic about -XOverloadedRecordDot nullifying my worries about type errors when using -XDuplicateRecordFields, despite seeing some immediate consequences at work (shakespeare cannot yet parse dot-accessors during quasi-quote interpolation). Apart from language extensions, new GHC versions strive to improve compile times, provide new compilation targets, and make the type system more powerful.

Maintainers, when they choose to update their repositories, also strive to provide better tools for the community. As a recent example, aeson had a major version bump around the same time a hash collision vulnerability was patched, and I remember having a few weeks of minor frustration over getting aeson and text and their mutual dependencies behaving correctly together. This was an issue in my personal projects, where I wanted to play around with the new versions right away, but not at work, where we use a Stackage snapshot for version control. The issues arose when we finally decided to use a newer compiler version (and thus Stackage snapshot), and were fixed at that time.

This gets to what I think is the heart of what the backwards-compatibility camp are saying: bumping the compiler version on a project often leads to project-wide breakage.

...

No shit?

Call me inexperienced, but I haven't worked on a Haskell project large enough that the cost of such a change is measured in developer-months rather than developer-days. I have read anecdotes of some in the community who experience this when updating, and I will admit that it sounds much more tedious than enjoyable. But I ask you: do you want to have your cake, or eat it?

Because how many GHC versions are enough? This issue was opened in 2016, just after the release of GHC 8.0.1. The instance was added in 2011 (835cad35, "for convenience"), that is GHC 7.2.1. If N is not 9.6, what will it be? I would bet money that someone comes and says that we should not break it for GHC 9.8. What then? Do we wait for someone to complain about breakage before GHC 9.10? 10.0?

Do you, dear reader, whose time I have already wasted, dream of a world where all of the new language pragmas, parsing adjustments, and library functions are magically compatible with your code as-written? Do you imagine the code you build as part of an eternal creation, written once and lasting through the ages, immune to any nefarious influence? Allow me to show you the path.

Or do you submit to the crushing reality that maintenance requires effort? Do you agree that in attempting progress, there must necessarily be things that cease to exist or function as they once did? If so, then I implore you to accept the cost of rectifying the things that we now view as mistakes, lest they continue to eat away at the trust we imbue our software.


Wow, that turned out way longer than I thought. Sorry to make you the scapegoat, Moritz, but this is an attitude I see stalling easily-achievable goals with clear {performance, integrity, safety, ...} wins across the entire Haskell ecosystem. Long-term support via Stack has been around forever. New features of the compiler and libraries will continue to get released. They are mutually exclusive things to desire: you can't have both. I prefer semantics over convenience.

TL;DR: Haskell is dead; long live Haskell.

@parsonsmatt

Call me inexperienced, but I haven't worked on a Haskell project large enough that the cost of such a change is measured in developer-months rather than developer-days.

I have!

You want to know why it is usually somewhat easy to upgrade to a new GHC?

Other people have been putting in a lot of work to upgrading the ecosystem. I've personally spent months of work-time working on the ecosystem for GHC 9.2 and 9.4, and I'm gearing up for 9.6's ecosystem work. I try to keep my company as close to the most recent version of GHC as possible, and one of those reasons is so we can provide feedback to GHC as an industrial codebase spanning 500+kloc with a massive dependency footprint.

41% of all Haskell users are still on GHC 8.10.7 as of the '22 Haskell survey. Of folks that report that they're in industry in some way, I see that 254 out of 483 that volunteer a response say they're using GHC 8.10 - about 53%. Of folks that exclusively work in industry, 83/143 report using GHC 8 in some capacity - 58%.

I think the Haskell community's nigh-religious devotion to backwards compatibility is admirable,

I know I'm spoiled having only briefly worked in other ecosystems, but Haskell releases breaking changes far more often than what I saw in other languages.

Moreover, despite having fantastic facilities for providing smooth upgrade paths and deprecation warnings, GHC and many important libraries do not take advantage of them, instead just deleting functions and instances. Users are left with a message like Unknown name 'foobar' instead of DEPRECATED: please use barfoo instead of foobar. TypeError is unfortunately infrequent. Pattern synonyms are rare.


I'm not a "no breaking changes" evangelist. Breaking changes are an important and necessary aspect of life as a software developer. However, we need to weigh the costs and benefits, and figure out ways to solve the given problems in the best way possible.

The "problem" with the instance NFData (a -> b) is that there's confusion about "what it means for a function to be in normal form." Under one definition, the instance is illegal. Under another definition, the instance is totally fine. The instance (and the Generic derivation for NFData) heavily imply the latter definition. That instance has been in the library since February 2012. The mailing list conversation about the "problem" didn't happen until 2016. And now it's 2023, and we're talking about it again.

In practice, the ecosystem has been operating under the assumption that instance NFData (a -> b) is fine and reasonable for over ten years. And we're going to break that for what benefit exactly? That someone might want an (Enumerate a, NFData b) => instance NFData (a -> b)? Has anyone ever actually wanted that? Has anyone ever actually had a real bug or problem due to NFData (a -> b)?

Why is "breaking people's code" a better solution than providing a new class that has the implications some of y'all want?

@tomjaguarpaw
Member

fantastic facilities for providing smooth upgrade paths and deprecation warnings

In case anyone's interested in one way to achieve smooth upgrade paths I wrote an article about it.

@int-index
Contributor

Has anyone ever actually had a real bug or problem due to NFData (a -> b)?

I had. I assumed that deepseq would bring my data to normal form, and I relied on this to guarantee timely release of resources (with lazy IO). Before you start bashing lazy IO, I want to point out that my design would have worked if not for this instance, and it would have allowed me to make a large amount of business logic in the application non-monadic.

In any case, I'm going to back @angerman on this one. Breaking changes in GHC and its boot libraries should be preceded by a deprecation period when the problematic code triggers a warning. No warning – no breaking change, even though I'd personally prefer this instance gone.

If we don't have the ability to attach deprecation warnings to instances, it shouldn't be hard to add this feature to GHC. It's not rocket science, it's a warning pragma.

@angerman

angerman commented Jan 27, 2023

@mixphix thank you for taking the time to write out that response, and don't worry, I'm not taking any offence, or even having a problem with being a scapegoat. If that is what it takes to get this moving, I'm here for that.

I think the Haskell community's nigh-religious devotion to backwards compatibility is admirable,

You and I seem to have significantly different experience with backwards compatibility with respect to GHC. GHC is, for us, by far one of the least backwards-compatible compilers/ecosystems. I am however not focused on reference material; this applies to large and complex production codebases.

Call me inexperienced, but I haven't worked on a Haskell project large enough that the cost of such a change is measured in developer-months rather than developer-days.

Maybe you are, I won't know, and I won't judge. I can tell you that a live migration of a codebase from 8.10 -> 9.2, while ensuring compatibility with 8.10, is now a multi-month endeavour. Why didn't we go straight for 9.4? I wish we could have, but large parts of the dependency tree were not ready for 9.4; even 9.2 wasn't, but that was a more manageable chunk.

Coming back to your previous point:

bumping the compiler version on a project often leads to project-wide breakage.

Yes, this is the primary complaint, and I'd like to turn it on its head and ask: why does it have to be this way? What are the fundamental reasons that almost every GHC upgrade completely breaks existing code that a previous GHC release accepted perfectly fine?

But I ask you: do you want to have your cake, or eat it?

Well, you might not know who I am, and I don't expect you to. But, yes, I do want to have my cake. My team and I have contributed to GHC: ghc-bignum, the aarch64-ncg, and soon the JS backend, as well as various other fixes, including support for unboxed tuples and sums in the interpreter and significant work on the iOS and Android front. If I could, I'd use a bleeding-edge GHC all the time; it would make my team's and my life and work tremendously easier. Fun fact: patches we write against 8.10 have virtually no chance of getting integrated anywhere. There won't be an 8.10.8, and forward-porting those patches to master is a lot of work and can't be tested against the codebase (because GHC master is incompatible with that codebase). Breakage inhibits GHC contributions. So, I see myself porting our patches to 9.2, and maybe trying to find some time to see if 9.2 and master are close enough to make it worthwhile to try and port them into master. This all costs significant time, and I'd like to make sure we can have contributors who have full-time jobs, families and hobbies; but right now we are remarkably hostile to those.

Because how many GHC versions are enough? This issue was opened in 2016, just after the release of GHC 8.0.1.

I understand your frustration here fully. And as I outlined, I'd say two, but with the current cadence I'd prefer four; yet two would already be a world of improvement (it would mean code that compiled warning-free would compile with 9.2 as well, just with additional warnings). But it needs to have deprecation warnings. And migration strategies should be pointed out. Just randomly failing to compile, potentially non-obviously, in some remote dependency, busting your whole codebase, is just terrible.


Let me make this abundantly clear, because it appears as if I'm here to prevent innovation and moving us forward. I am absolutely not. As I've outlined in this discourse.haskell.org post:

I’d like the following two policies set forward:

  • GHC will not include any library that abruptly changes its interface, without a proper deprecation story (with warnings, …) over a specified timeframe. (e.g. 2 years).
  • GHC will not include language features that conflict with past behaviour unless they are behind a language pragma (we have language pragmas, so let’s use them).

I don't like writing patches, I don't like cutting compatibility releases, I don't like extra work. And I somehow doubt most library maintainers like any of these either. A breaking change in the surface language of GHC Haskell, or in the boot libraries, means that every one downstream of that now has to adapt, and cut releases. To that point:

The Stack team also does their best to keep up with new compiler releases, even going as far as to submit compatibility patches when particularly crucial libraries need fixing.

It's laudable that you do this work, but do you enjoy doing it: patching old releases just so that they work with a newer compiler? Why does this need to be done in the first place? The old code doesn't use any new language features, so the surface language shouldn't break it. Then the only breakage would be from changes to the boot libraries, and maybe the solution there is to just make them reinstallable instead.

Again, what I'm arguing for is that we get proper deprecation notices (in front of the people who actually work on the code). I do not believe that we can expect everyone to follow all issues (hell, I didn't know of this issue until ~1wk ago) on all core libraries (and dependencies, ...), follow the Steering Committee's proposals, ... just to figure out how to prepare for the next GHC release.

Adapting codebases to new compilers has near zero business value.

@angerman

If we don't have the ability to attach deprecation warnings to instances, it shouldn't be hard to add this feature to GHC. It's not rocket science, it's a warning pragma.

@int-index could you coordinate with @hsyl20 on this? I've asked him to look into this as well. I believe if you two team up this could be implemented fairly fast, and give us a much better way ahead with these kinds of issues.

@erikd
Member

erikd commented Jan 27, 2023

@mixphix wrote:

I think the Haskell community's nigh-religious devotion to backwards compatibility

Compared to what?

If you want to talk about backwards compatibility, look at C, where legal C code from the 1980s should still compile with the latest versions of GCC and Clang.

@int-index
Contributor

int-index commented Jan 27, 2023

@int-index could you coordinate with @hsyl20 on this? I've asked him to look into this as well. I believe if you two team up this could be implemented fairly fast, and give us a much better way ahead with these kinds of issues.

@angerman Good idea. I started by finding and reopening the relevant GHC ticket #17485 and will write a GHC Proposal as the next step. I'm happy to assist with the implementation, too. The ticket is probably a good place to discuss this further.

EDIT: The proposal is up for discussion at ghc-proposals/ghc-proposals#575.

@mixphix
Collaborator

mixphix commented Jan 27, 2023

Here's an alternative idea, that I think will address another pain point of this library, namely the confusion around what NFData and "normal form" actually mean.

The purpose of this library is to fully evaluate thunks in nested data structures. About that, there's no question. The issue is when data structures contain functions: we can't fully evaluate them, and often we don't want to. But that's not really the issue: we already know we can't "fully evaluate" an arbitrary function, but we still want to know that the function itself isn't a thunk (is this even possible?). The class isn't really about "normal form" at all: it's about removing thunks and strictifying data.

-- | appropriate haddocks, etc
class Unthunk x where
  unthunk :: x -> ()

type NFData x = Unthunk x
{-# DEPRECATED NFData "Please use `Unthunk` instead" #-}

rnf :: (Unthunk x) => x -> ()
rnf = unthunk
{-# DEPRECATED rnf "Please use `unthunk` instead" #-}

{- ... -}

-- | explaining in detail what this does & why it's here
instance Unthunk (a -> b) where
  unthunk = (`seq` ())

Someone using NFData presumably knows about thunks and strictness. They may or may not care about which definition of "normal form" is meant. All they want is to remove thunks from their data. The name NFData is a red herring.

@int-index
Contributor

int-index commented Jan 27, 2023

That doesn't fix anything, it just changes the name from NFData to Unthunk, even though the unthunk method would still fail to force the thunks captured in a function's closure, in contradiction to its name.

And type NFData x = Unthunk x would break every hand-written instance out there.

@TeofilC

TeofilC commented Jan 31, 2023

Here's an idea for a path forward.

In the next release of deepseq, we add a warning on Generic-derived instances that make use of NFData (a -> b), where the message suggests deriving via a new newtype called something like AssumeFunctionsDeepStrict. Concurrently, we try to replace problematic instances on Hackage with ones that don't use Generic deriving.

The idea is that deriving NFData via AssumeFunctionsDeepStrict uses Generic deriving but allows making use of an NFData (a -> b) instance.

Then in the release after that, the NFData (a -> b) instance is removed. So if users want, they can still get the old behaviour, and all it requires is adding via AssumeFunctionsDeepStrict.
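
As a rough sketch of the user-facing side (AssumeFunctionsDeepStrict and Handler are hypothetical; a real implementation would walk the Generic representation rather than stopping at WHNF as this placeholder does):

{-# LANGUAGE DerivingVia #-}
import Control.DeepSeq

newtype AssumeFunctionsDeepStrict a = AssumeFunctionsDeepStrict a

-- Placeholder for the sketch: the real instance would force all
-- non-function fields deeply and treat function fields as WHNF.
instance NFData (AssumeFunctionsDeepStrict a) where
  rnf (AssumeFunctionsDeepStrict a) = a `seq` ()

data Handler = Handler { onEvent :: Int -> IO (), label :: String }
  deriving NFData via (AssumeFunctionsDeepStrict Handler)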
