Skip to content

CompilerOptimization

Felix S. Klock II edited this page Jul 28, 2013 · 2 revisions

Several aspects of the anticipated R6RS are expected to interact with compiler optimization.

  • whole-program compilation
  • immutability of pairs (as the default)
  • letrec* semantics for internal definitions
  • libraries
  • consistency checking for separately compiled libraries
  • no compile-time rejection except by macro expander

The R6RS is expected to require implementations to raise a violation exception in many circumstances where the R5RS has no such requirement. Since these exceptions are required only in Larceny's R6RS-conforming mode, whose only compiled code will be in specially crafted versions of the standard libraries, they do not interact with compiler optimization.

whole-program compilation

The R6RS notion of a program requires all of the libraries to be present when the top-level program is compiled. That makes it easy to perform simple whole-program optimizations. For example, it is easy to determine whether an R6RS program uses eval or mutable pairs. (Programs that use eval have potential to use anything, so whole-program optimizations won't apply to them.)

immutability of pairs

In programs that do not use mutable pairs, the list? procedure can be inlined, and many of the other procedures that operate on lists become simpler and faster.

The standard libraries can define an inlined pairs-are-mutable? predicate that returns false in programs that don't use mutable pairs, and returns true in programs that do. Then all of the standard procedures whose performance is sensitive to the presence of mutable pairs can ask this predicate whether they have to go through the slow case.

It might be better to have two different versions of the standard libraries whose performance is most sensitive to mutable pairs.

letrec* semantics for internal definitions

The compiler has to preserve the order of random definitions, and must guard references to variables that are defined by a random definition, but all bindings of constants and known procedures can be promoted to the head of the letrec* bindings, making it unnecessary to guard references to them.

(At worst, the transformed program will execute without raising an exception required by the R6RS. If the violation is deliberate, then the programmer's attempt to run it in one of Larceny's compiled modes, without using a compiler switch to request strict enforcement of the letrec* restriction, is like trying to execute an Emacs Lisp program in Common Lisp. If the violation is accidental, then the only casualty is portability, and Larceny's R6RS-conforming mode is the ideal solution for that.)

If the known procedures reference variables that are defined by random definitions, then the guards become unnecessary once control has reached the body following the definitions. The cleanest and most effective way to deal with this is for the compiler to generate two versions of each defined procedure, one with guards for use while the definitions are being executed, and one without guards for use after the definitions are complete. Whether this level of optimization (or a less effective solution, such as the flow analysis performed by Chez Scheme) becomes desirable will depend on what kind of code programmers write. Let's wait and see.

libraries

R6RS libraries have some nice properties:

  • all exported variables are immutable
  • in effect, only named values are exported (I think)
  • at run time, references to imported variables are always defined
  • the types of many imported variables are known at compile time
  • the values of many imported constants are known at compile time
  • the source code for many imported procedures are known at compile time

R6RS libraries also have a few not-so-nice properties:

  • the compiler may have to check for implicit exports
  • at meta times, imported variables might not be defined
  • the semantics is all balled up with low-level macros
  • so we're going to have to live with someone else's expander
  • which is likely to deliver code that's hard for us to optimize
  • and for us to extract relevant information for optimization

We need to:

  • make sure the expander delivers code that Twobit can optimize effectively within the library itself
  • extract type, constant, and source code information for exported variables
  • add that information to our FASL files
  • use that information to optimize libraries that import previously compiled libraries

consistency checking for separately compiled libraries

If libraries A and B import library C, then we need to make sure separate compilation of A and B is done with respect to the same version of C. This can be done by

  • fingerprinting each library when it is compiled
  • including the fingerprints of each imported library in the FASL file
  • comparing fingerprints at load time

no compile-time rejection except by macro expander

Either we rewrite the approximately 150 places where Twobit rejects programs at compile time, or we take advantage of the fact that Twobit is not used by Larceny's R6RS-conforming mode. The latter is easier and tends to be more useful, but we regard Twobit's propensity to reject programs as a minor bug that we should get around to fixing in due time.

Clone this wiki locally