Make normalization/projections explicit and lazy in the type system #205

SebastianMestre · 2020-11-30T14:47:45Z

SebastianMestre
Nov 30, 2020
Maintainer

We've recently had some issues around metacheck not having enough information to resolve certain metatypes at certain times. I've read up on some things, and I think I found a solution.

The Problem -- Concretely

Before looking at the solution, I would like to briefly explain what the problem we are trying to solve is, exactly.

Essentially, we have a function from metatypes to metatypes, and we want to store the result of calling that function as the metatype of an AST node. However, we often don't know what the argument is.

To clarify, the function is essentially this (pseudo python):

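from enum import Enum

Meta = Enum("Meta", "Typefunc Monotype Constructor Value")

def access_meta_from_record_meta(mt):
    # Metatype of a field access, given the metatype of the record.
    if mt == Meta.Value:
        return Meta.Value
    elif mt == Meta.Monotype:
        return Meta.Constructor
    else:
        raise ValueError(f"cannot access a field of {mt}")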

The Problem -- Abstractly

Let's pretend that the metatype system is a programming language.

Values

In this programming language, we bind every value to a name using the = operator. The values in
the programming language are either variables or atoms (aka builtin values). The atoms are:
typefunc, monotype, constructor, and value.

We instantiate an atom by writing its name, and create a variable by writing var, followed
by an identifier in parentheses.

Here are some examples:

a = var(a)
b = var(c)
c = typefunc
d = value

Unify operator

There is also the unify operator, which lets us state that two values are equal, and it updates
values accordingly.

Here is some example code:

a = var(a) # after evaluation, a points to monotype
b = monotype
unify(a,b)
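
A minimal Python sketch of the language so far, assuming atoms are plain strings and variables are link cells. The names Var, find and unify are picked for illustration; this is not how Unification::Core is actually written.

```python
# Hypothetical mini-model of the metatype language; not Jasper's Unification::Core.
ATOMS = {"typefunc", "monotype", "constructor", "value"}

class Var:
    def __init__(self, name):
        self.name = name
        self.target = None  # None while unresolved, otherwise another node

def find(node):
    # Follow variable links until reaching an atom or an unresolved variable.
    while isinstance(node, Var) and node.target is not None:
        node = node.target
    return node

def unify(a, b):
    # State that a and b are equal, binding variables as needed.
    a, b = find(a), find(b)
    if a is b:
        return
    if isinstance(a, Var):
        a.target = b
    elif isinstance(b, Var):
        b.target = a
    elif a != b:
        raise ValueError(f"cannot unify {a} with {b}")

a = Var("a")
b = "monotype"
unify(a, b)
print(find(a))  # monotype
```

Unifying two different atoms is the only failure case in this model.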

Projections

Now, for something new: 'projections'. A projection is a mapping from one value to another (essentially unifier-level functions).

For instance, a few useful projections could be:

construct(constructor) = SUCCESS
construct(monotype)    = SUCCESS
construct(_)           = ERROR

access_field(monotype) = constructor
access_field(value)    = value
access_field(_)        = ERROR

Note how the access_field projection is basically the access_meta_from_record_meta function
described in the previous section.

Here, I sneaked in the special values SUCCESS and ERROR. These just represent that a projection
may succeed without returning a value, or fail, stopping execution.
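
The two tables above could be written in Python roughly like this. It is only a sketch: atoms are assumed to be strings, SUCCESS is a sentinel object, and ERROR is modeled as an exception (ProjectionError is a made-up name).

```python
# Hypothetical encoding of the projections; all names are illustrative.
SUCCESS = object()  # "succeeded without returning a value"

class ProjectionError(Exception):
    """Models ERROR: the projection failed and execution stops."""

def construct(mt):
    if mt in ("constructor", "monotype"):
        return SUCCESS
    raise ProjectionError(f"construct({mt})")

ACCESS_FIELD = {"monotype": "constructor", "value": "value"}

def access_field(mt):
    try:
        return ACCESS_FIELD[mt]
    except KeyError:
        raise ProjectionError(f"access_field({mt})") from None

print(access_field("monotype"))  # constructor
```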

The Solution -- Implementation

Now that we have defined the language, the mapping to a concrete implementation is pretty direct.

vars map to a Var in the Unification::Core data structure
atoms map to a Term in the Unification::Core data structure
the unify operator maps to Unification::Core::unify

Well, pretty direct, until we get to projections.

To represent projections, we will use a new type of node in the Unification::Core data structure.

The Projection node -- Proj for short -- specifies an operation and a list of arguments to that
operation. A projection node cannot be meaningfully unified: it must be evaluated beforehand.
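
To make the shape of such a node concrete, here is a rough Python sketch. The field names, the OPS table and try_eval are assumptions made for illustration; the real Unification::Core node may look quite different.

```python
from dataclasses import dataclass

# Hypothetical shapes; the real Unification::Core nodes may look different.
@dataclass
class Var:
    name: str
    target: object = None  # None while unresolved

@dataclass
class Proj:
    op: str     # name of the operation, e.g. "access_field"
    args: list  # argument nodes: atoms (strings), Vars, or other Projs

OPS = {"access_field": {("monotype",): "constructor", ("value",): "value"}}

def resolve(node):
    # Follow variable links to an atom, a Proj, or an unresolved Var.
    while isinstance(node, Var) and node.target is not None:
        node = node.target
    return node

def try_eval(proj):
    # Evaluate only when every argument is a known atom; otherwise stay a Proj.
    args = tuple(resolve(a) for a in proj.args)
    if not all(isinstance(a, str) for a in args):
        return proj
    table = OPS[proj.op]
    if args not in table:
        raise ValueError(f"{proj.op}{args} is an error")
    return table[args]

record = Var("record")
p = Proj("access_field", [record])
print(try_eval(p) is p)  # True: the argument is still unknown
record.target = "monotype"
print(try_eval(p))       # constructor
```

A unifier built on this would call something like try_eval before comparing nodes, since a Proj that is still unevaluated has nothing meaningful to unify against.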

Normalization

The process of evaluating projections is a special case of what's called normalization.

Normalization is the conversion of an expression (or, in this case, a Unification::Core node)
to a canonical form. This lets us check if two expressions are equivalent by comparing their
canonical forms term by term.

We don't currently have an explicit instance of this in Jasper, but I hope a Haskell-themed example will make it clear:

We want to check if

["a","b"]++["c"] == ["a"]++["b","d"]

First, we take each expression to its canonical (or normal) form:

["a","b","c"] == ["a","b","d"]

Now, we compare term by term.

("a" == "a") and ("b" == "b") and ("c" == "d")

"c" != "d". Thus, ["a","b"]++["c"] != ["a"]++["b","d"].

In our case, the canonical form is one that has no projections in it.
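
The same list example can be replayed in Python. In this sketch, Concat is a made-up node that stands in for an unevaluated ++, and the canonical form is a plain list with no Concat nodes left, much like our canonical form has no projections left.

```python
from dataclasses import dataclass

@dataclass
class Concat:
    left: object   # a list or another Concat
    right: object

def normalize(expr):
    # Canonical form here: a plain list with no Concat nodes left.
    if isinstance(expr, Concat):
        return normalize(expr.left) + normalize(expr.right)
    return list(expr)

lhs = Concat(["a", "b"], ["c"])
rhs = Concat(["a"], ["b", "d"])
print(normalize(lhs))                    # ['a', 'b', 'c']
print(normalize(lhs) == normalize(rhs))  # False
```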

Lazy vs Eager

There are two ways to do normalization:

normalize all expressions whenever possible -- this is called eager normalization
normalize an expression only when it's read (e.g. when trying to unify, etc) -- this is called lazy normalization

Each strategy has tradeoffs. Eager normalization can be more performant in some cases (because you
can often avoid storing the projection in the data structure), has a slightly simpler core
algorithm, and often gives error reports that are easier to track (if you eval early, you fail
early, and probably have the relevant data at hand). The cost is more complicated user code.

On the other hand, lazy normalization has a slightly more complicated unify and find
implementation, but it's a lot easier to work with, and it deals with certain edge cases (like
cycles) somewhat better.

It's worth mentioning that some things are really hard to do with eager normalization, but
trivial with laziness.
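
Here is a small Python sketch of the difference, under the assumption that the only projection is access_field and that atoms are strings. All names are illustrative and none of this is taken from Jasper's code. The eager version has to give up when the argument is still an unresolved variable, while the lazy version stores the projection and evaluates it inside find, once the value is actually read.

```python
# Hypothetical sketch; names do not come from Jasper's code base.
ACCESS_FIELD = {"monotype": "constructor", "value": "value"}

class Var:
    def __init__(self, name):
        self.name = name
        self.target = None  # None while unresolved

class Proj:
    def __init__(self, arg):
        self.arg = arg  # the argument of access_field

def find(node):
    # Follow links; normalize a Proj only now, when its value is read.
    while True:
        if isinstance(node, Var) and node.target is not None:
            node = node.target
        elif isinstance(node, Proj):
            arg = find(node.arg)
            if isinstance(arg, (Var, Proj)):
                return node  # argument still unknown: stay suspended
            if arg not in ACCESS_FIELD:
                raise ValueError(f"access_field({arg}) is an error")
            node = ACCESS_FIELD[arg]
        else:
            return node

def eager_access_field(arg):
    # Eager: needs the argument right away and fails if it is unknown.
    arg = find(arg)
    if isinstance(arg, Var):
        raise ValueError("argument not known yet")
    return ACCESS_FIELD[arg]

record = Var("record")
field = Var("field")
field.target = Proj(record)  # lazy: store the projection, don't evaluate it
record.target = "monotype"   # the missing information arrives later
print(find(field))           # constructor
```

Calling eager_access_field(record) before record is bound raises, which is the "asking too early" situation in miniature.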

How does this fix our bug?

If you look at the code, what metacheck(AccessExpression) does is, essentially, eager
normalization. If you think about it, that's where the problem lies.

It's not that we don't have enough information, but that we are asking for it much too early.

Thus, delaying this as late as possible should be a very large improvement (and we hopefully don't
need to do Prolog-style backtracking or multiple passes).

SebastianMestre · 2022-02-11T02:24:08Z

SebastianMestre
Feb 11, 2022
Maintainer Author

So, turns out this is waaay overcomplicated. We can do this much more simply by splitting constraint finding and constraint solving into two passes. This was done in #286.
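
For readers who have not seen the two-pass idea before, here is a rough Python illustration. It is not the code from #286: the tuple-based AST, the constraint shapes, and the function names are all invented for this sketch. The first pass only records constraints, and the second pass solves them once all of them are known, so it no longer matters that an access appears before the declaration it depends on.

```python
# Hypothetical AST and constraint shapes; this is not the code from #286.
ACCESS_FIELD = {"monotype": "constructor", "value": "value"}

def find_constraints(ast):
    # Pass 1: walk the AST and only record constraints; nothing is resolved.
    constraints = []
    for kind, target, source in ast:
        if kind == "decl":      # ("decl", name, atom)
            constraints.append(("eq", target, source))
        elif kind == "access":  # ("access", result_name, record_name)
            constraints.append(("access_field", target, source))
    return constraints

def solve(constraints):
    # Pass 2: every constraint is available, so source order is irrelevant.
    known = {}
    pending = list(constraints)
    while pending:
        remaining = []
        for kind, target, source in pending:
            if kind == "eq":
                value = source
            elif source in known:
                value = ACCESS_FIELD.get(known[source])
                if value is None:
                    raise ValueError(f"access_field({known[source]}) is an error")
            else:
                remaining.append((kind, target, source))
                continue
            if known.setdefault(target, value) != value:
                raise ValueError(f"conflicting metatypes for {target}")
        if len(remaining) == len(pending):
            raise ValueError("some metatypes could not be resolved")
        pending = remaining
    return known

ast = [("access", "f", "r"), ("decl", "r", "monotype")]
print(solve(find_constraints(ast)))  # {'r': 'monotype', 'f': 'constructor'}
```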
