= Types and Typeclasses =
Types enable humans and computers to reason about our programs. The halting
problem is not a problem, because types can prove code must terminate. They can
do even more and prove a program terminates with the right answer: for example,
we can prove a given function correctly sorts a list.
Types can automate programming, namely,
https://www.youtube.com/watch?reload=9&v=mOtKD7ml0NU[a human supplies the type
of a desired function and the computer programs itself].
Types can be lightweight. Indeed, they can be invisible. A compiler can
use _type inference_ to type-check a program completely free of any type
annotations. However, it's good to throw a few type annotations in the source,
as they are a form of documentation that is especially reliable because the
compiler ensures they stay in sync with the code.
Therefore, we ought to add types to our language. We mimic Haskell, with at
least one deliberate difference:
https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/tldi10-vytiniotis.pdf[let is not generalized]. We only generalize top-level definitions.
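To see what the difference means in practice, here is a small program that standard Haskell accepts because `let`-bound definitions are generalized; a compiler that only generalizes at the top level would reject it, since `i` is used at two different types:

```haskell
-- Standard Haskell generalizes `i` to `forall a. a -> a`, so it may be
-- used at both `Int` and `Bool`. Without let generalization, `i` gets a
-- single monomorphic type and this fails to typecheck.
pair :: (Int, Bool)
pair = let i x = x in (i 1, i True)

main :: IO ()
main = print pair
```

Hoisting `i` to a top-level definition makes the program acceptable to both flavors of type checker.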
== Typically ==
We shamelessly lift code from https://web.cecs.pdx.edu/~mpj/thih/thih.pdf[Mark
P. Jones, _Typing Haskell in Haskell_]. Our version is simpler because we lack
support for mutual recursion and pattern matching.
Since we're using the Scott encoding, from a data type declaration:
------------------------------------------------------------------------
data Adt a b c = Foo a | Bar | Baz b c
------------------------------------------------------------------------
we generate types for the data constructors:
------------------------------------------------------------------------
("Foo", a -> Adt a b c)
("Bar", Adt a b c)
("Baz", b -> c -> Adt a b c)
------------------------------------------------------------------------
Along with:
------------------------------------------------------------------------
("|Foo|Bar|Baz", Adt a b c -> (a -> x) -> x -> (b -> c -> x) -> x)
------------------------------------------------------------------------
which represents the type of `case` in:
------------------------------------------------------------------------
case x of
Foo a -> f a
Bar -> g
Baz b c -> h b c
------------------------------------------------------------------------
The `case` keyword is replaced with the identity combinator
during compilation:
------------------------------------------------------------------------
I x (\a -> f a) g (\b c -> h b c)
------------------------------------------------------------------------
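The idea can be demonstrated directly in Haskell. Below is a minimal sketch of the Scott encoding of `Adt` (the names `foo`, `bar`, `baz`, `caseAdt` are ours for illustration): each value is a function expecting one continuation per constructor, so `case` really is just application, i.e. the identity combinator.

```haskell
-- Scott encoding: a value of `Adt` is its own case expression.
foo a   = \f g h -> f a    -- Foo a
bar     = \f g h -> g      -- Bar
baz b c = \f g h -> h b c  -- Baz b c

-- `case` merely applies the scrutinee to the branches.
caseAdt x f g h = x f g h

main :: IO ()
main = do
  print (caseAdt (foo 5)   (+ 1) 0 (*))  -- Foo branch: 5 + 1
  print (caseAdt bar       (+ 1) 0 (*))  -- Bar branch: the constant 0
  print (caseAdt (baz 2 3) (+ 1) 0 (*))  -- Baz branch: 2 * 3
```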
Our type checker is missing several features, such as kind checking and
rejecting duplicate definitions.
[#typically.toggleshow]
---------
include::typically.hs[]
---------
Compiling this code takes (very roughly) three times as much time and memory as
its predecessor, which is perhaps reasonable, as the source is substantially
longer and we chose several slow algorithms for the sake of simplicity.
However, this compiler takes about eight times more time and space to build
its successor.
The new type checking phase certainly deserves blame, as does the even larger
size of the next compiler. But there is a subtler source of drag. More syntax
sugar; more problems. While functions such as `($)` let us reduce clutter in
our code, our bracket abstraction routine sees them as opaque symbols,
hampering optimization. Laziness helps somewhat: `($)` compiles to the
I combinator, and the first time the VM reduces a term like `I f x` it
replaces all references to that term with `f x`.
This is too little too late. As bracket abstraction fails to recognize `($)` is
the identity, it adds superfluous combinators. For example:
----------------------------------------------------------------
\x y -> f x $ g y
----------------------------------------------------------------
becomes:
----------------------------------------------------------------
R g(B B(B ($) f))
----------------------------------------------------------------
when it should just be:
----------------------------------------------------------------
R g(B B f)
----------------------------------------------------------------
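We can check the two combinator terms agree by defining `B` and `R` as ordinary Haskell functions (lower-cased here, since Haskell reserves capitalized names for constructors) and comparing against the original lambda term:

```haskell
-- The combinators as Haskell functions:
--   B f g x = f (g x)   (composition)
--   R x f y = f y x     (swapped application)
b :: (t -> u) -> (s -> t) -> s -> u
b f g x = f (g x)

r :: s -> (t -> s -> u) -> t -> u
r x f y = f y x

f :: Int -> Int -> (Int, Int)
f = (,)

g :: Int -> Int
g = (+ 1)

-- The optimized term `R g (B B f)` versus the source `\x y -> f x $ g y`:
main :: IO ()
main = print (r g (b b f) 10 20, (\x y -> f x $ g y) 10 20)
```

Both components print the same pair, confirming that `R g(B B f)` computes `\x y -> f x (g y)` without the superfluous `B ($)` layer.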
== Classy ==
In the worst case, types are a burden, and force us to wrestle with the
compiler. We twist our code this way and that, and add eye-watering type
annotations until it finally compiles.
In contrast, well-designed types do more with less. Haskell's type system not
only enables easy type inference, but also enables _typeclasses_, a syntax
sugar for principled overloading. By bestowing Prolog-like powers on the type
checker, the compiler can predictably generate tedious code so humans can
ignore irrelevant details.
Again, _Typing Haskell in Haskell_ provides some background. Since we're
generating code as well as checking types, we also need techniques described in
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.53.3952&rep=rep1&type=pdf[John
Peterson and Mark Jones, _Implementing Type Classes_].
We choose the dictionary approach. A dictionary is a record of functions that
is implicitly passed around. For example, if we infer the function `foo` has
type:
------------------------------------------------------------------------
foo :: Eq a => Show a => a -> String
------------------------------------------------------------------------
then we may imagine our compiler turning fat arrows into thin arrows:
------------------------------------------------------------------------
foo :: Eq a -> Show a -> a -> String
------------------------------------------------------------------------
Our compiler then seeks dictionaries that fit the two new arguments of types
`Eq a` and `Show a`, and inserts them into the syntax tree.
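The dictionary-passing translation can be sketched in plain Haskell with explicit records (the names `EqD`, `ShowD`, and the body of `foo` are hypothetical; they only illustrate the shape of the translation):

```haskell
-- Dictionaries as ordinary records of methods.
data EqD a = EqD { eq :: a -> a -> Bool }
data ShowD a = ShowD { shw :: a -> String }

-- `foo :: Eq a => Show a => a -> String` after fat arrows become thin:
foo :: EqD a -> ShowD a -> a -> String
foo eqd showd x = if eq eqd x x then shw showd x else "impossible"

-- The caller supplies concrete dictionaries for `Int`.
main :: IO ()
main = putStrLn (foo (EqD (==)) (ShowD show) (42 :: Int))
```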
With this in mind, we modify the type inference functions so they return a
syntax tree along with its type. Most of the time, they just return the input
syntax tree unchanged, but if type constraints are inferred, then we create a
`Proof` node for each constraint, and apply the syntax tree to these new nodes.
In our example, if `t` is the syntax tree of `foo`, then our type inference
function would change it to `A (A t (Proof "Eq a")) (Proof "Show a")`.
Here, we're using strings to represent constraints for legibility; in reality,
we have a dedicated data type to hold constraints, though later on, we
do in fact turn them into strings when generating variable names.
We call such a node a `Proof` because it's a cute short word, and we think of a
dictionary as proof that a certain constraint is satisfied. Peterson and Jones
instead write "Placeholder".
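A toy version of this rewrite (with strings standing in for constraints, and a stripped-down syntax tree that only has the node shapes mentioned above) looks like:

```haskell
-- A miniature syntax tree: variables, applications, and proof placeholders.
data Expr = V String | A Expr Expr | Proof String deriving Show

-- Type inference applies the tree to one Proof node per inferred constraint.
addProofs :: Expr -> Expr
addProofs t = A (A t (Proof "Eq a")) (Proof "Show a")

main :: IO ()
main = print (addProofs (V "foo"))
```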
Typeclass methods are included in the above. For example, while processing
the expression:
------------------------------------------------------------------------
(==) (2+2) 5
------------------------------------------------------------------------
we infer that `(==)` has type `Eq a => a -> a -> Bool`, so we modify the
syntax tree to:
------------------------------------------------------------------------
(select-==) (Proof "Eq a") (2+2) 5
------------------------------------------------------------------------
After type unification, we learn `a` is `Int`:
------------------------------------------------------------------------
(select-==) (Proof "Eq Int") (2+2) 5
------------------------------------------------------------------------
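Once the constraint is concrete, the selector simply projects the right method out of the dictionary. A sketch, with hypothetical names standing in for `select-==` and the `Eq Int` dictionary:

```haskell
-- A dictionary is a record of the class's methods.
data EqD a = EqD { selEq :: a -> a -> Bool, selNe :: a -> a -> Bool }

-- The "Eq Int" dictionary: integer equality.
dictEqInt :: EqD Int
dictEqInt = EqD (==) (/=)

-- (select-==) (Proof "Eq Int") (2+2) 5 reduces to:
main :: IO ()
main = print (selEq dictEqInt (2 + 2) 5)
```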
The next phase constructs records of functions to be used as proofs. We
loosely follow _Typing Haskell in Haskell_ once more, and search for instances
that match a given constraint. A matching instance may create more constraints.
We walk through how our compiler finds a proof for:
------------------------------------------------------------------------
Proof "Eq [[Int]]"
------------------------------------------------------------------------
Our compiler finds an instance match: `Eq a => Eq [a]`, so it rewrites the
above as:
------------------------------------------------------------------------
(V "Eq [a]") (Proof "Eq [Int]")
------------------------------------------------------------------------
The "Eq [a]" string is taken verbatim from an instance declaration, while the
"Eq [Int]" is the result of a type substitution found during unification
on "Eq a".
Our compiler recursively seeks an instance match for the new `Proof`. Again it
finds `Eq a => Eq [a]`, so the next rewrite yields:
------------------------------------------------------------------------
(V "Eq [a]") ((V "Eq [a]") (Proof "Eq Int"))
------------------------------------------------------------------------
and again it recursively looks for an instance match. It finds the `Eq Int`
instance, and we have:
------------------------------------------------------------------------
(V "Eq [a]") ((V "Eq [a]") (V "Eq Int"))
------------------------------------------------------------------------
This works, because our compiler has previously processed all class and
instance declarations, and has prepared the symbol table to map "Eq Int"
to a record of functions for integer equality testing, and "Eq [a]" to a
function that takes an "Eq a" and returns a record of functions for equality
testing on lists of type `a`.
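The finished proof is just nested function application on dictionary values. A sketch, shrinking each dictionary to its lone `(==)` method for brevity (the names below are ours, not the compiler's):

```haskell
-- A one-method dictionary: just the equality test itself.
type EqD a = a -> a -> Bool

-- The "Eq Int" instance.
dictEqInt :: EqD Int
dictEqInt = (==)

-- The "Eq a => Eq [a]" instance: a function on dictionaries.
dictEqList :: EqD a -> EqD [a]
dictEqList eqA xs ys = length xs == length ys && and (zipWith eqA xs ys)

-- The proof found for "Eq [[Int]]": the list instance applied twice.
proofEqListListInt :: EqD [[Int]]
proofEqListListInt = dictEqList (dictEqList dictEqInt)

main :: IO ()
main = print (proofEqListListInt [[1, 2], [3]] [[1, 2], [3]])
```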
Overloading complicates our handling of recursion. For example,
each occurrence of `(==)` in:
------------------------------------------------------------------------
instance Eq (Int, String) where (x, y) == (z, w) = x == z && y == w
------------------------------------------------------------------------
refers to a distinct function, so introducing the `Y` combinator here would be
incorrect. We should only look for recursion after type inference expands
these occurrences to `select-== Dict-Eq-Int` and `select-== Dict-Eq-String`;
that is, we look for recursion at the level of dictionaries.
Among the many deficiencies in our compiler: we lack support for class
contexts; our code allows instances to stomp over one another; our algorithm
for finding proofs may not terminate.
Without garbage collection, this compiler requires unreasonable amounts of
memory.
[#classy.toggleshow]
---------
include::classy.hs[]
---------
== Barely ==
Our compilers are becoming noticeably slower. The main culprit is the type
unification algorithm we copied from the paper. It is elegant and simple, but
also grossly inefficient. The pain is exacerbated by long string constants,
which expand to a chain of cons calls before type checking.
Fortunately, we can easily defer expansion so it takes place after type
checking. This alleviates enough of the suffering that we'll leave improving
unification for another time.
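To make the expansion concrete: a string literal becomes a chain of applications of the cons constructor. A sketch, reusing a toy syntax tree (the `Ch`, `V`, and `A` names are illustrative):

```haskell
-- A miniature syntax tree: characters, variables, and applications.
data Expr = Ch Char | V String | A Expr Expr deriving Show

-- "ab" expands to (:) 'a' ((:) 'b' []), one cons cell per character.
expandString :: String -> Expr
expandString = foldr (\c e -> A (A (V ":") (Ch c)) e) (V "[]")

main :: IO ()
main = print (expandString "ab")
```

Deferring this expansion means the type checker sees one leaf per string literal instead of a cons chain proportional to its length.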
This change is a good opportunity to tidy up. In particular, we eliminate the
murky handling of the `R` data constructor: before type checking, it
represented integer constants, while after type checking, its field was to be
passed verbatim to the next phase. Now, data types clear up the picture.
We immediately take advantage of the neater code and add a few rewrite rules
to cut down the number of combinators.
To make this incarnation of our compiler more substantial, we knuckle down
and implement a `Map` based on
https://pdfs.semanticscholar.org/0284/0ad5ab57dd13e388800094e52a3069df23cd.pdf[Adams
trees]. Again, we mostly copy code from a paper. The running time improves
slightly after replacing the association list used in the last compilation
step with `Map`.
We choose the BB-2.5 variant, based on the benchmarks in the paper, though
it is troubling that `Data.Map` chose BB-3 trees.
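The flavor of the data structure can be conveyed with a minimal weight-balanced tree sketch. We use delta = 5/2 here (so the imbalance test is `2 * sr > 5 * sl`) and ratio 3/2 to pick between single and double rotations; the thresholds and names below are our simplification, not the compiler's actual code:

```haskell
-- Each node caches the size of its subtree.
data Map k v = Tip | Bin Int k v (Map k v) (Map k v)

size :: Map k v -> Int
size Tip = 0
size (Bin n _ _ _ _) = n

-- Smart constructor that maintains the cached size.
bin :: k -> v -> Map k v -> Map k v -> Map k v
bin k v l r = Bin (size l + size r + 1) k v l r

-- Restore balance after a single insertion on either side.
balance :: k -> v -> Map k v -> Map k v -> Map k v
balance k v l r
  | sl + sr <= 1    = bin k v l r
  | 2 * sr > 5 * sl = rotateL k v l r  -- right subtree too heavy
  | 2 * sl > 5 * sr = rotateR k v l r  -- left subtree too heavy
  | otherwise       = bin k v l r
  where sl = size l
        sr = size r

rotateL, rotateR :: k -> v -> Map k v -> Map k v -> Map k v
rotateL k v l (Bin _ rk rv rl rr)
  | 2 * size rl < 3 * size rr = bin rk rv (bin k v l rl) rr  -- single
  | otherwise = let Bin _ mk mv ml mr = rl                   -- double
                in bin mk mv (bin k v l ml) (bin rk rv mr rr)
rotateL _ _ _ Tip = error "unreachable"
rotateR k v (Bin _ lk lv ll lr) r
  | 2 * size lr < 3 * size ll = bin lk lv ll (bin k v lr r)  -- single
  | otherwise = let Bin _ mk mv ml mr = lr                   -- double
                in bin mk mv (bin lk lv ll ml) (bin k v mr r)
rotateR _ _ Tip _ = error "unreachable"

insert :: Ord k => k -> v -> Map k v -> Map k v
insert k v Tip = Bin 1 k v Tip Tip
insert k v (Bin _ tk tv l r) = case compare k tk of
  LT -> balance tk tv (insert k v l) r
  GT -> balance tk tv l (insert k v r)
  EQ -> bin k v l r

mlookup :: Ord k => k -> Map k v -> Maybe v
mlookup _ Tip = Nothing
mlookup k (Bin _ tk tv l r) = case compare k tk of
  LT -> mlookup k l
  GT -> mlookup k r
  EQ -> Just tv

main :: IO ()
main = do
  let m = foldr (\n acc -> insert n (show n) acc) Tip [1 .. 100 :: Int]
  print (size m, mlookup 42 m, mlookup 101 m)
```

The balance condition keeps each subtree's size within a constant factor of its sibling's, giving logarithmic insertion and lookup without the per-node height bookkeeping of AVL trees.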
We also move the assembler from C to Haskell; that is, our compiler now outputs
bare machine code rather than ION assembly.
[#barely.toggleshow]
---------
include::barely.hs[]
---------
++++++++++
include::toggleshow.js[]
++++++++++