!!! Module System

This page will eventually explain the K module system and will go into the K manual, so please read carefully and provide feedback. The theoretical foundations are based on categorical pushouts and colimits, and are explained to some extent in these two papers:

K does not provide parametric modules, or support for hiding, or renaming yet, but it will eventually.

K Code Grouping

Before proceeding, we first explain how code can be grouped in K. Specifically, in K we use the well-established MarkDown convention for grouping, which only requires one reserved character, the backtick `, and some simple conventions. For example:

`Grouped together, using single backtick as brackets`
``Grouped but containing a backtick (`); use double backtick brackets``

The backtick delimiters surrounding a K code region may include spaces--one after the opening, one before the closing--allowing us to place literal backtick at the beginning or end of a K code region:

`` ` `` represents K code containaing a single character symbol, a backtick
`` `foo` `` represents K code `foo` inside K code

To place white-spaces at the beginning or end of a K code region, use the same space rule above, placing precisely one space after the opening and precisely one space before the closing backticks, and then whatever you wish in-between, including white-spaces:

` ` represents empty K code (there are two white spaces between the backticks; github's viewer only shows one in inlined-grouped text, like here, although it shows them properly on block-grouped text)
` ` represents one-space K code (there are three spaces between the backticks; github's MD viewer shows only one of them in inlined-grouped text)
the following represents a four white-space K code: one-character space, followed by one return, followed by one-character space followed by a return, and one-character space again; this time we use github's block-grouped text:

` 
 
`

Our current syntax specification formalism of K does not allow us to define grammars containing white spaces, so we do not discuss these any further here.

Let us discuss an example showing the importance and strength of grouping. Suppose that a user defines a language which contains a keyword rule, which happens to be the same as K's rule keyword. Then she needs a way to tell K how to differentiate between K's rule and the user's rule. Consider the following small and purposely artificial syntax

    syntax Entities ::= "people" | "things"
    syntax Rule ::= Entities "rule" Entities

and suppose that we want to add a K rewrite rule stating that things rule people rewrites to people rule things. Writing the rule naively like

    rule things rule people => people rule things

may or may not work, depending on how one sets the priorities of the various constructs, but even if it works it can be hard to read and confusing. With K grouping, one can help the parser and the human reading the rule as follows:

    rule `things rule people` => `people rule things`

Further, one may want to rewrite things to people and people to things locally, that is, to write `things => people` rule `people => things`, but now it would be really ambiguous to naively write

    rule `things => people` rule `people => things`

because K has no chance to guess that the second rule is the user-defined one and not K's. Thanks to nesting of grouping, we can write the rule non-ambiguously as follows:

    rule `` `things => people` rule `people => things` ``

How does K parse such backtick-surrounded code regions? As you know it by now, K's parsing is rather tricky and it is performed in stages. One simple way to think about parsing K code regions is that it happens as a separate step. That is, the K parser first uses a MarkDown parser to parse the top-level regions generically as "K code bubble of some sort which I don't care about yet", and then recursively parses the bubble using information inferred from the context and at the same time passing information to the context, to mutually help each other. As shown below, K allows you to change this default behavior by specifying what particular syntax to use for parsing each code bubble.

M1 + M2: The Pushout of M1 and M2

Like in many other languages that are serious about their module system, such as OBJ, Maude, ML, SpecWare, etc., the main categorical operation that models K's intended module semantics below is that of a pushout, or more generally that of a colimit. Intuitively, a pushout tells us how to merge two objects (in our case modules) that share some common parts in a consistent and minimal way: given two morphisms f : Z -> X and g : Z -> X, their pushout consists of an object P and two morphisms i1 : X -> P and i2 : Y -> P for which the following diagram commutes:

Moreover, the pushout (P, i1, i2) must be minimal, or universal, with respect to this diagram. That is, for any other such set (Q, j1, j2) for which the following diagram commutes, there must exist a unique u : P -> Q also making the diagram commute:

As with all universal constructions, the pushout, if it exists, is unique up to a unique isomorphism.

In our case of K, the pushout concept above gets materialized as follows. Consider modules Z, X, Y, such that Z is imported by both X and Y, that is,

module X
  import Z
...
endmodule

and

module Y
  import Z
...
endmodule

As shown in the pushout Wikipedia page, in general we can also rename symbols (i.e., sorts and operations (or K labels)) in Z as we import them in X and Y, but for simplicity assume ordinary inclusion of Z in X and Y. Now suppose that we want to aggregate X and Y in a module P, that is,

module P
  import X
  import Y
endmodule

The module expression X + Y can also be used when building the module P on the fly and its particular name is not needed; for example, when parsing (#parse(X+Y, "string")). The semantics of P above is then the pushout of the inclusions Z -> X and Z -> Y. Without going into category theory, this pushout can be constructed algorithmically as follows:

Each symbol (Sort or KLabel) z in Z will be referred to as z@Z (we have not fully agreed on using the @ notation for this, so this is still open) everywhere in Z, X, Y, and P
Each symbol x in X but not in Z will be referred to as x@X in both X and P
Each symbol y in Y but not in Z will be referred to as y@Y in both Y and P

As a consequence of the rules above, in case the same symbol s happens to be in both X and Y but not in Z, then in P we refer to the two as s@X and s@Y, respectively. In other words, all symbols are fully qualified with their defining modules, meaning that symbols introduced in distinct modules will be distinct symbols. Note that K provides no special syntax for "introducing" a symbol in a module. One simply uses the symbol in the module and, if the same symbol has not been used in any of the imported modules then the symbol is considered "introduced" by the module. This applies to both sorts and operations (or KLabels). Therefore, by design, it is impossible to redefine a symbol Sym in a module M2 that imports a module M1 which already defines Sym; i.e., in this case it is impossible to have both Sym@M1 and Sym@M2. Of course, this does not preclude M2 from defining new productions that refer to sorts or that have KLabels declared in M1. For example:

module INT
  syntax Int ::= ...
               | Int "+" Int
endmodule

module EXP
  import INT
  syntax Int ::= size(Exp)      // old Int sort of INT, new Exp sort
  syntax Exp ::= Int            // old Int sort of INT
               | Exp "+" Exp    // different KLabel from the one in INT
endmodule

Note that the _+_ in EXP has nothing to do with the _+_ in INT, in spite of the fact that Exp has Int as a subsort. In other words, unlike languages like OBJ and Maude, K does not adhere by default to the order-sorted algebra paradigm. Our reasons are mostly pragmatic, and not theoretical, being driven by effectiveness of translations of K definitions to various backends that do not provide support for order-sortedness. Also, order-sortedness comes with a performance degradation, because it requires to compute the least sort of a term at runtime, in addition to often confusing (pre-)regularity error reports, especially in combination with associative and commutative operations, because the existence of a least sort for a term require the overloaded operation symbols to obey a condition called regularity. For example, with the above grammar, 1 + 2 is ambiguous in K as either Int or Exp, while it is an Int in Obj and Maude. Indeed, the latter languages allow by default operations to be overloaded both at parse time and at runtime, each term being computed the least sort at runtime, so 1 + 2 is an integer and the + is inferred to be the variant on Int defined in module INT. The least sort calculations accumulate during the rewriting of a term, making matching slower; in our experiments, we found that computing the least-sorts of terms can take up to 5% of the total time required to rewrite a term.

Therefore, in K, apparently related productions like above are considered totally different by default. This is achieved at compile time, where the frontend of K by default assigns different KLabels to different productions, for example (but subject to change, because the exact naming conventions of KLabels are still under discussion) _+_:Int*Int->Int and, respectively, _+_:Exp*Exp->Exp. This has the apparent drawback that users have to write explicit rules to down terms to lower sorts; for example, in a module like EXP above, one may want to have a rule

    rule I1 + I2 => `I1 + I2`::Int

saying that the + on expressions is reduced to the "library" + on integers when its arguments become integers. Recall that K does sort inference and that Term::sort tells K to parse Term as sort Sort, so the right-hand side + can only be the one in INT, and the sorts of I1 and I2 can only be Int in that case. While writing such rules may look inconvenient at first sight, especially because in the order-sorted world you do not have to write them, we have found that such an inconvenience is acceptable when designing programming languages, because it is relatively well-accepted in this domain that there is an explicit transition from a language operation applied on evaluated arguments to an invocation of a library or mathematical function on those arguments. Additionally, we found it practically inconvenient when designing the syntax of a programming language to have its desired syntax be unexpectedly captured by library functions which just happened to have the same name (like the user-defined + above on Exp being captured by the library + in INT); considering that there are potentially hundreds or even thousands of library functions, to avoid such problems in a previous version of K where symbols were overloaded, we had to assign rather strange names to library operations with a low chance of being overloaded by a user-desired syntax, such as +Int, andBool, etc.

It is still possible to enforce K to assign the same KLabel to two distinct productions/operations, but one has to do this explicitly:

module INT
  syntax Int ::= Int "+" Int  [klabel(myplus)]
endmodule

module EXP
  import INT
  syntax Exp ::= Exp "+" Exp [klabel(myplus)]
endmodule

Now because of the "unique symbol" convention described above, the two productions will have the same KLabel, myplus, which is qualified to myplus@INT when translated to KORE. Note, however, that no least-sort calculations are performed by K at runtime, the user being fully in charge of adding the necessary sort-membership side conditions to their rules. We do not encourage beginners to the use overloaded symbols.

One may also criticize K's reluctance to overriding symbols from an object-oriented perspective. Note, however, that unlike in object-oriented programming where object instances are created dynamically and in that context dynamic dispatch makes full practical sense and at the same time works well with compositional reasoning, in K a language semantics is constructed statically, by putting together all the syntax and the semantic rules of the composing modules in a consistent way. When importing an existing module, its syntax and semantics become an intrinsic part of the importing module. At our knowledge, there is still no compositional parsing framework. Besides, K translates to various backend languages, some of which without support for module compositionality. Consequently, in the end, a K definition can be thought of as a large and flat module, putting together all the syntax and all the semantics of the imported modules. This operation is much cleaner to implement and less ambiguous for users if each symbol is unique throughout the entire K definition.

The above abstract three-module example was discussed in terms of pushouts for simplicity, but in practice we have complex graphs of module imports whose semantics then generalizes from pushouts to colimts, which can be algorithmically computed either as a sequence of pushouts, like OBJ and Maude do, or directly as SpecWare does.

KORE versus Frontend

All symbols will be fully qualified in KORE with the module in which they were introduced, as explained above. Here is another example, to consolidate the concept. If module M1 introduces symbol (sort or KLabel) Sym for the first time, and then modules M2 and M3 both import M1 and use Sym, and then module M4 imports M2 and M3 and also uses Sym, then in all these 4 modules, in KORE, Sym will actually be Sym@M1. On the other hand, if there is no M1 but only M2 and M3 each introducing their own symbol Sym and M4 imports both M2 and M3, then M2 will refer to its Sym as Sym@M2, M3 will refer to its Sym as Sym@M3, and M4 will refer to the two distinct Sym symbols as Sym@M2 and Sym@M3, respectively. This is, again, the case for KORE definitions. Therefore, in KORE each symbol name will already tell you where it was declared! So one less API method needed.

In the Full K frontend, things will be a bit more complicated, because we want users to just write the symbol name Sym and have the frontend infer the module that it is coming from. That is, the fronend will attempt to infer a unique module M so that Sym@M makes sense, and if not possible, it will issue an error or a warning. Of course, users will also be given the option to qualify symbols with module names in order to disambiguate. For example, if they write Sym@M then the frontend will start searching for Sym starting with module M, and if module M0 is found unambiguously to have defined symbol Sym, then Sym@M will be replaced with Sym@M0 in the generated KORE.

Due to its parsing capabilities, K allows us to go one step further. Specifically, we can tag any K code region with a module, for example KCode@Module, which tells K to parse KCode using Module. Then the above situations where symbols are tagged with their defining module fall as a special case of this general notation, where KCode is Sym. Before we can discuss this notation in its full generality, we first need to discuss how to extend a module syntax with K syntax in order to parse terms that use combined syntax.

M*: The K Closure, or Klosure, of Module M

A K module allows users to define syntax that can be used to parse not only programs to be analyzed, but also terms that combine the user-defined syntax with K syntax. These are needed to state a configuration pattern that one wants to reach in some krun commands, and also in K rules. Indeed, the K rules allow us to mix user-defined syntax with K syntax, such as in this rule giving the semantics of assignment in IMP:

    rule <k> X:Id = I:Int; => . ...</k>  <state>... X |-> `_ => I` ...</state>

The assignment in the <k/> cell _=_; and the binding in the <state/> cell _|->_ use concrete user-defined syntax, but everything else is K-specific syntax: the sorted variables X:Id and I:Int, the ellipses ..., the rewrite symbol =>, the ., etc. To parse such terms, the K frontend needs to extend any module M with K-specific syntax (and disambiguation). For example, special K-related sorts are added, such as KItem, KList, etc., and syntactic constructs for them, as well as subsort and priority relationships with user-defined sorts and operations, among others. This is a rather complex process that we do not describe here.

Here we only introduce the terminology of a module K closure, or more simply klosure, for the operation above, and write it M* when applied to module M. Note that, for performance reasons, the K frontend in fact implements several kinds of klosures, each involving only certain extensions of M as needed by different situations and command line options, but here we do not distinguish between them. Also, note that since K needs to internally distinguish its special syntax from user-defined syntax, it is expected that all the K special syntax is hooked through appropriate attributes when merged within M*, such as (note: these will likely change):

    syntax KTop [hook(kTopSort)]
    syntax KTop ::= UserSort [hook(kSubsortingToTop("UserSort"))]

These hooks not only help K properly build its module klosures, especially when they are built iteratively (e.g., (M1* + M2)*), but more importantly allow sophisticated users of K to change the syntax of K if the current syntax turns out to be in conflict with their language's syntax or if they simply don't like it.

Although it is clear that K needs the klosure operation internally in order to parse the various kinds of terms that it has to, it is less clear why such a klosure operation needs to be made available to users as an operation on modules. First, it gives us a nice separation of concerns: the module closure operation and the parser can be implemented and tested independently, without mixing any of the them with code needed for the other one; as side effects, this potentially will also allow us to easily change the parsing infrastructure to any off-the-shelf parser that can parse K grammars, and will also improve the grammar specification language of K (because that should be powerful enough to parse any language, including K itself). Second, it makes the development of K simpler by allowing desired functionality to be done in the user-space using K instead of hardwired in the K implementation, at the same time providing users with a powerful operation that they have at their fingertips in order to define their complex languages in K.

Indeed, module klosures are useful not only for parsing, using kualification like explained shortly or using operations of the form #parse(INT*, `1 ~> []+2`, Int), but also for rewriting. For example, most likely a user of K invoking manually a #rewrite command on a module M, will want to invoke it in the klosure of a module, that is, #rewrite(M*,...). Note that our current implementation automatically invokes rewriting in the klosure M* of the module M even if we write it #rewrite(M,...); so currently, writing #rewrite(M*,...) means that we rewrite with M**, which as discussed below is M*. This will have to eventually change, because with the current approach we will never be able to define certain languages in K, for example, K itself. For example, in K, a module suffers a series of transformations during kompilation, which can be implemented with a bare rewrite engine using modules which have not been yet applied any closure operations. Consider, for example, macro expansion; recall that users are allowed to use the attribute macro to a rule, with that meaning that that rule is applied "statically" everywhere, including in the module itself. This static, kompile-time operation can be easily achieved by invoking the bare rewrite engine on a module containing only the macros as rules and without any other operation applied to that module, and on a term which happens to be the module to be transformed.

Module Algebra

The two module operations above, sum (or pushout, or aggregation, or colimit) and klosure, give us a very simple language to construct modules from other modules. In the future, we should explore the mathematical laws of this module algebra, such as

    M included in M*    (extensive)
    M** = M*   (idempotent)
    M1 included in M2 implies M1* included in M2*   (increasing)
    (M1 + M2)* = (M1* + M2*)*
    ...

The first three properties above state that klosure is indeed a closure operator. For now, assume that this simple module algebra is implemented in the K frontend, that is, that one can write any such module expression over existing modules and the K frontend will provide the resulting module. In order for properties like the above to hold, K needs to know which special syntax has been added by itself as part of a klosure operation and which was defined by the user. This is where the hooks added to the special syntax help, because thanks to them K can distinguish between the two kinds of syntax, even if a user decides to change the syntax of K.

Note that for efficiency reasons, K will likely not flatten the module structure internally.

Kualification: Parse in Module

Before we start discussing the notion of kualification, or parsing in module, we want to remind the user that using the user-defined concrete language syntax when defining the K semantics of the language is nothing more than a convenience. Indeed, one can always use KLabels and the KAST prefix notation instead of concrete syntax, such as if_then_else_(B,S1,S2) instead if B then S1 else S2, and one will never have any parsing ambiguity problems with their semantics. With this in mind, qualifying only symbols (KLabels and sorts) with their modules as explained above, i.e., Sym@M, is sufficient. This is also exactly what happens in KORE. Nevertheless, K allows concrete user-defined syntax to be used in the semantics, and thus mix it with K syntax, in order to make semantics more human readable, understandable, and appealing even to non-experts (PL experts are used to working with abstract syntax). In the same spirit, we extend module qualification from symbols to any term.

The general notation for module qualification in K is KCode@ModuleExpression. If KCode is just a symbol (sort or KLabel) and ModuleExpression is just a module name, then we obtain the basic symbol qualification notation explained above, which will be fully applied when translating to KORE, so that KORE definitions will never need to search for a symbol. But the notation is a lot more general. For example, the following are all possible now (K brackets are not needed everywhere below, but we write them to avoid any ambiguity or distractions due to ambiguities):

1@INT                      = token 1 of INT
`1+2`@INT                  = expression 1+2 to be parsed in INT
`1 ~> []+2`@INT*           = the K term 1 ~> []+2 to be parsed in the klosure of INT, INT*
`X+1`@INT*                 = term X+1 parsed with INT* (we cannot parse with INT, because of X)
`_+_`@INT*                 = KLabel _+_ corresponding to INT*'s +  (we can now rename the ugly _+Int_ with _+_)
`_+_`@EXP*                 = KLabel _+_ corresponding to EXP*'s +
`_+_`@INT*(X,1)            = same like the above, but in KAST notation with the KLabel fully desugared
`_+_(X,1)`@INT*            = same like the above, but in KAST notation to be parsed within the INT module
`X+1`:Int                  = runtime cast to sort Int, possibly ambiguous if Int defined in multiple modules
`X+1`:`Int@INT*`           = same like the above, but with Int non-ambiguous
`` `X+1`:Int ``@INT*       = parsing `X+1`:Int  in module INT*
`_+_`@INT*(X:`Int@INT`,`1@INT`) = same like the X+1 above, but in fully desugared KAST notation

Besides the more general and compact notation that captures symbol qualification as special case, the parse-in-module operation allows for more efficient and controllable parsing and disambiguation than using normal sort casting. Consider, for example, `X+1`@INT* vs. `X+1`:Int. The module INT* has only one + operation, so parsing it is very fast and non-ambiguous. On the other hand, the module containing `X+1`:Int may have many overloaded + operations that could slow down parsing or even yield ambiguities that cannot be easily solved unless the user provides more sort information (e.g., X:Int).

Example: Language Translation

With a module system with appropriate kualification, it is possible to easily define language translators in K. For example, suppose that we want to define a translator from C to Java. Without kualification, we would have to define a module that imports both C and Java. Now if C and Java happen to each have a sort named Exp, which is quite likely, then we would be in trouble because two syntactic categories meant to be disjoint would get collapsed. Of course, we do not want to have to modify any of the existing semantics of C or Java in order to define a translator from one to the other. Thanks to kualification, we can now write something like this:

    syntax Exp@Java ::= translate(Exp@C)  [function]
    rule translate(E1+E2) => translate(E1)+translate(E2)
    ...

Note that K's sort inferencer allowed us to not write any sorts for the variables E1 and E2 above. The rule above is totally equivalent to the following more kualified one:

    rule translate(`E1+E2:Exp`@C*) => `translate(E1:Exp@C)+translate(E2)`:Exp@Java

Example: Using Different Syntax for Parsing and Semantics

We learned the hard way that for larger programming languages we need a different and typically less mathematically appealing syntax for parsing than for the semantics. That's because we want to generate efficient parsers for parsing programs, so that they can handle large programs, while the parsers needed to parse the semantics need not be that efficient, because the rules tend to be small, but need to be able to parse mathematically meaningful structures. For example, one may want to parse comma-separated lists Exps of expressions Exp using a grammar where an Exps is either empty "" or a sequence of zero or more Exp "," (i.e., an expression followed by comma) followed by an Exp. This syntax allows for an efficient LR(1) parser, but it is mathematically meaningless; indeed, we do not want to give semantics to the syntactic category corresponding to sequences of zero or more Exp ",". Instead, what we want for the semantics is to define Exps as an associative list that extends Exp and has a visible unit .Exps, because this will allow us to write rules performing associative matching on lists of expressions and not having to cover the one-expression list as a special case. Specifically, we would like to define something like the following:

module EXPS-SHARED
  import EXP
endmodule

module EXPS-PROGRAMS  // syntax good for parsing programs
  import EXPS-SHARED
  syntax Exps ::= ExpsAux  [klabel(wrapperExps)]  // we need an explicit inclusion label
  syntax ExpsAux ::= "" | ExpCommas Exp
  syntax ExpCommas ::= "" | Exp "," ExpCommas
endmodule

module EXPS-SYNTAX  // syntax good for semantics
  import EXPS-SHARED
  syntax Exps ::= Exp | ".Exps" | Exps "," Exps   [assoc, unit(.Exps)]
endmodule

module EXPS-SUGARING
  import EXPS-PROGRAMS
  import EXPS-SYNTAX
  syntax Exps ::= tr(ExpsAux)    [function]
                | tr(ExpCommas)  [function]
  rule wrapper(Ea:ExpsAux) => tr(Ea)  [macro]
  rule tr(`  `:ExpsAux) => .Exps   // recall that `  ` is the K empty string
  rule tr(ECs E) => tr(ECs), E
  rule tr(`  `:ExpCommas) => .Exps
  rule tr(E,ECs) => E,tr(ECs)
endmodule

module LANGUAGE-SHARED
  import EXPS-SHARED
  // shared syntax, say including sort Pgm for programs
  ...
endmodule

module LANGUAGE-PROGRAMS  // all language syntax used for parsing programs
  import LANGUAGE-SHARED + EXPS-PROGRAMS
  ...
endmodule

module LANGUAGE-SYNTAX  // all mathematical syntax used for parsing/giving semantics
  import LANGUAGE-SHARED + EXPS-SYNTAX
  ...
endmodule

module LANGUAGE-SUGARING  // all desired sugaring
  import EXPS-SUGARING + LANGUAGE-PROGRAMS + LANGUAGE-SYNTAX
  ...
endmodule

module LANGUAGE
  import LANGUAGE-SYNTAX  // so only the mathematical syntax is imported
  configuration
    <T>
      <k>
        #rewrite(#parse($PGM, LANGUAGE-PROGRAMS, PGM), LANGUAGE-SUGARING)
      </k>
      <state> .Map </state>
    </T>

  // now we can unambiguously refer to Exps as assoc lists, as we want them to
  // be in the semantics; we will never see Exps from EXPS-PROGRAMS here
endmodule

The methodology above may look complicated at first sight, but what is important here is that it is possible to do all the above, and that happens thanks to the kualification. We just showed how to define lists above, but the same methodology can be used for many other complex parsing problems. For example, parsing in stages, like in K (first parse a module using the fixed K syntax in order to extract the user-defined syntax, at the same time parsing the contents of rules as string bubbles, and then parse these bubbles using the K syntax extended with the user-defined syntax extracted from the module syntax). However, since defining lists is quite a common operation, the frontend provides a very convenient notational sugar for lists. Instead of doing all the above separation between the syntax for parsing versus the syntax for semantics, users can simply use the notation Exps ::= List{Exp,","} when defining their expressions:

module EXPS
  import EXP
  syntax Exps ::= List{Exp,","}
endmodule

module LANGUAGE-SYNTAX
  import EXPS
  ...
endmodule

or even simply define Exps directly in LANGUAGE-SYNTAX or in LANGUAGE as List{Exp,","}. The frontend will automatically generate all the modules above and perform all the necessary separation of syntax.

Note: Interestingly, it seems that this example should work just fine even without kualification. That is, there was no need to use @ anywhere in this example. All was needed was access to #rewrite and #parse with appropriate modules that were not necessarily imported in the current module.

Thoughts on Notation

As discussed above, we are currently thinking of using the notation KCode@ModuleExpression for deferring the parsing of KCode to module ModuleExpression (which may or may not be a klosure). This was Grigore's proposal, which is not meant to be already decided but rather some particular choice in order to move on with the design of the module system. We had several whiteboard discussions in the K team about this topic, but no concrete notation has been agreed upon. There seem to be all sorts of different notations for that in different languages (https://en.wikipedia.org/wiki/Namespace), for example:

namespace.localname
namespace::localname
namespace/localname
namespace localname
namespace:localname
namespace-localname
localname@namespace (email, for example)

So the choice is not obvious. Additionally, we have to also blend in smoothly with our other notations in K, for example Term:Sort, Term::Sort, X.Y which can parse as "X applied to .K applied to Y".

Dwight was OK with @: Now that you suggest it, value@namespace isn't actually that bad a choice. It's a character that isn't that commonly used in programming language syntaxes, is reasonably intuitive (people will recognize it as namespacing analogous to email addresses), and collides with no non-user K syntax. We can even go one step further and tell people to disambiguate between a programming language @ and the K @ using whitespace: ie, similarly to how there must be no whitespace between : and Int in X:Int, we can declare @INT as a single terminal, thus requiring that there be no space between @ and INT in X:Int@INT, which would allow for easy disambiguation in line with existing annotations.

Cosmin was not OK with it initially: My personal preference would be “.” if we can parse it unambiguously (Radu?). : can be tricky because it may introduce an ambiguity for a human reading the code: `_+_`:INT and `_+_`:Int. If we stick with the capitalized module convention (I’m not a particularly big fan, but it is here already) then at least this is not an issue. The choice of character will also influence the localname-namespace order. To follow current conventions, if we go with ., namespace.localname is better. If we go with : (or @), localname:namespace is probably better. Also, if we allow parsing each .k file using a different syntax of the K language (we could do this easily by some annotation at the beginning of the file (e.g., //parser=myFunckyVersionOfTheKSyntax.k), then users can easily choose a different syntax if they dislike the standard.

But there were some notational inconsistencies when lifting the notation to K code regions and module expressions, and then he did not continue the argument. Cosmin, feel free to continue to defend you notation if you want to, but we have to move on with this soon.

Radu and Brandon had some comments, but had no specific opinion or proposal. One interesting side observation came out of Radu's comments, namely that we do not really need module kualification in order to rename _+Int_ to _+_, because we can always do this:

module EXP
  import INT
  syntax EXP ::= Int | Exp + Exp  [strict]
  rule I1 + I2 => (I1 + I2)::Int
  // I1+I2 in the LHS parses as Exp+Exp, due to the max-sort convention
endmodule

Old Slides Explaining Parsing in K

These slides, parsers.pptx, were proposed back in 2013 as a first cut to the complex parsing philosophy of K.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly