These notes are intended for developers working on the internals of Joker itself. They are not comprehensive.
As with Clojure, Joker supports "libraries" of code organized into namespaces. It offers a number of namespaces that are built-in to the Joker executable itself, as well as the ability to dynamically (at run time and on-demand) extend these namespaces via external Joker source files typically organized into directory trees and deployed alongside the Joker executable. (Currently, Joker does not support dynamic extension via non-Joker code, such as Go plugins.)
Whether built-in (as described below) or separately deployed via source files written in Joker (as described in Organizing Libraries (Namespaces)), developers should be aware of the progression of any given namespace.
The states through which a given namespace transitions are:
- Available
- Mapped
- Loaded
A namespace is available if its source code is either:
- compiled (in some form) directly into the Joker executable
- deployed as Joker code such that a running Joker executable can locate and load it
The compiled namespaces are also the built-in namespaces, and are described below. These start out mapped, though not necessarily loaded; the core namespaces are loaded on-demand when first referenced. (joker.core is referenced immediately upon startup of the Joker executable; joker.repl is as well, when running Joker as a REPL.)
The set of core namespaces is hardcoded (in any given Joker executable) in joker.core/*core-namespaces*; this list is built automatically from information generated by core/gen_code/gen_code.go.
Other built-in namespaces, the so-called std namespaces, start out (like the core namespaces) as both available and mapped.
A namespace that is available, but not mapped, is not found via e.g. (the-ns 'megacorp.biz.logic). Only when first referenced (via :require or similar) is it searched for and then (if found) loaded.
There's (currently) no formal predicate for whether a namespace is available (due to being deployed). One could wrap a reference within a try, as in (try (require 'megacorp.biz.logic) (catch Error e ...)); however, this would have the side-effect (in the success case) of mapping and loading the namespace, so is not a pure predicate.
A namespace is mapped if it is present in (all-ns), which enumerates all the namespaces mapped into the current (global) environment.
In this state, the namespace is "registered" (to coin a synonym) with the canonical Clojure namespace mechanism as implemented by Joker.
But the namespace itself hasn't yet necessarily been initialized. Only when that happens (potentially "lazily") is the namespace said to be loaded.
When actually needed, via a :require clause in an (ns ...) specification, due to (require ...), or (for an already-mapped namespace) directly as a symbol qualifier via e.g. joker.some.namespace/somevar, a namespace is loaded, meaning its internal code and data structures are fully initialized.
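The mapped-versus-loaded distinction can be pictured with a small Go sketch. This is not Joker's implementation: the registry and namespace types and their method names are hypothetical, and the only idea borrowed from these notes is that a mapped namespace may carry a lazy initializer that runs once, on first reference.

```go
package main

import "fmt"

// namespace is a hypothetical stand-in for Joker's namespace type.
type namespace struct {
	name   string
	lazy   func() // pending initializer; nil once it has run
	loaded bool
}

// registry plays the role of the set returned by (all-ns):
// every entry in it is "mapped".
type registry map[string]*namespace

// mapNamespace registers a namespace without initializing it.
func (r registry) mapNamespace(name string, lazy func()) *namespace {
	ns := &namespace{name: name, lazy: lazy}
	r[name] = ns
	return ns
}

// load returns the namespace fully initialized, running its lazy hook at
// most once. It reports false if the namespace is not mapped.
func (r registry) load(name string) (*namespace, bool) {
	ns, ok := r[name]
	if !ok {
		return nil, false
	}
	if !ns.loaded {
		if ns.lazy != nil {
			ns.lazy()
			ns.lazy = nil
		}
		ns.loaded = true
	}
	return ns, true
}

func main() {
	initCount := 0
	r := registry{}
	r.mapNamespace("joker.hiccup", func() { initCount++ })

	fmt.Println(r["joker.hiccup"].loaded) // mapped, but not yet loaded
	r.load("joker.hiccup")
	r.load("joker.hiccup") // a second reference does not re-run the hook
	fmt.Println(r["joker.hiccup"].loaded, initCount)

	_, ok := r.load("a.b.c") // never mapped in this sketch
	fmt.Println(ok)
}
```

The sketch leaves out the "available" state entirely, since locating a deployed source file depends on details (such as the classpath) that these notes do not spell out.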
For example, running Joker with the --verbose option to observe some of the pertinent transitions (and with a two-line Joker script in a/b/c.joke that does (ns a.b.c) and (println "here i am!")):
$ joker --verbose
Lazily running fast version of string.InternsOrThunks().
NamespaceFor: Lazily initialized joker.string for joker.repl
NamespaceFor: Lazily initialized joker.repl for FindNameSpace
Welcome to joker v0.14.2. Use EOF (Ctrl-D) or SIGINT (Ctrl-C) to exit.
user=> (all-ns)
(joker.walk joker.template joker.io joker.json joker.base64 joker.csv joker.filepath joker.url joker.html user joker.core joker.hiccup joker.strconv joker.better-cond joker.bolt joker.crypto joker.math joker.os joker.uuid joker.yaml joker.string joker.test joker.pprint joker.hex joker.http joker.repl joker.set joker.tools.cli joker.time)
user=> joker.core/*core-namespaces*
#{joker.tools.cli joker.test user joker.template joker.core joker.walk joker.set joker.repl joker.hiccup joker.pprint joker.better-cond}
user=> (the-ns 'joker.hiccup)
Lazily running fast version of html.InternsOrThunks().
NamespaceFor: Lazily initialized joker.html for joker.hiccup
NamespaceFor: Lazily initialized joker.hiccup for FindNameSpace
#object[Namespace "joker.hiccup"]
user=> (use 'joker.hiccup)
nil
user=> (use 'joker.template)
NamespaceFor: Lazily initialized joker.walk for joker.template
NamespaceFor: Lazily initialized joker.template for FindNameSpace
nil
user=> (the-ns 'joker.hiccup)
#object[Namespace "joker.hiccup"]
user=> (the-ns 'a.b.c)
<repl>:7:10: Exception: No namespace: a.b.c found
Stacktrace:
global <repl>:7:1
core/the-ns <joker.core>:2316:18
user=> (use 'a.b.c)
here i am!
nil
user=> (all-ns)
(joker.string joker.test joker.pprint joker.hex joker.http joker.uuid joker.yaml joker.repl joker.set joker.tools.cli joker.time joker.walk joker.template joker.io joker.json joker.base64 joker.csv joker.filepath joker.url joker.html user joker.core joker.hiccup joker.better-cond joker.bolt joker.crypto joker.math joker.os joker.strconv a.b.c)
user=> (defn all-ns-as-set-of-strings [] (set (map str (all-ns))))
#'user/all-ns-as-set-of-strings
user=> (all-ns-as-set-of-strings)
#{"joker.crypto" "joker.strconv" "joker.pprint" "joker.csv" "joker.io" "joker.string" "joker.url" "joker.template" "joker.core" "joker.tools.cli" "joker.uuid" "joker.html" "joker.set" "joker.hex" "joker.time" "joker.json" "joker.bolt" "joker.hiccup" "user" "joker.yaml" "joker.filepath" "joker.repl" "joker.os" "joker.base64" "joker.better-cond" "joker.math" "joker.test" "joker.http" "joker.walk" "a.b.c"}
user=> ((all-ns-as-set-of-strings) "a.b.c")
"a.b.c"
user=> ((all-ns-as-set-of-strings) "joker.foo")
nil
user=> (joker.core/ns-initialized? 'joker.os)
false
user=> (joker.core/ns-initialized? 'joker.hiccup)
true
user=> (joker.core/ns-initialized? 'joker.html)
true
user=> (joker.core/ns-initialized? 'a.b.c)
true
user=>
First, note that joker.string is lazily initialized. This is due to running Joker as a REPL, because that automatically loads joker.repl, which in turn requires joker.string.
Then, (the-ns 'joker.hiccup) explicitly loads that namespace, meaning that it is initialized. (It needn't be initialized again during this same invocation of Joker.)
Further, because joker.hiccup requires joker.html (an already-mapped namespace), the latter is loaded (lazily initialized).
Similarly, referencing joker.template causes joker.walk to also be loaded (initialized). As a core namespace, joker.walk doesn't define an InternsOrThunks() function that needs to be called.
(the-ns 'a.b.c) then fails because a.b.c is not mapped. But because it is available (a/b/c.joke exists), it can be referenced, and thus mapped and loaded, via (use 'a.b.c), (require 'a.b.c), or similar. (all-ns) then includes a.b.c in its result.
A helpful (if longwindedly-named) function is then defined to return a set of all mapped namespaces, as strings, which can in turn be used to easily determine whether a namespace (as a string) is mapped. a.b.c and the nonexistent joker.foo are each tested this way.
The private function joker.core/ns-initialized? is then used to test whether various mapped namespaces have been initialized. Of these, only joker.os returns false, because it has not yet been referenced.
Note that, at present, there are no explicit tests for whether a namespace is available (in the general sense). One could attempt to load a namespace with a (try ...), but that would have the (potential) side effect of actually loading the namespace.
These distinctions should be of little, if any, importance to developers of Joker code, since these transitions are (largely) managed automatically on behalf of canonical Joker code. But such distinctions are potentially of interest to developers working on Joker internals, including core or std namespaces.
As explained in the README.md file, Joker provides several built-in namespaces, such as joker.core, joker.math, joker.string, and so on.
All necessary support for these namespaces is included in the Joker executable, so their source code needn't be distributed nor deployed along with Joker. This allows Joker to be deployed as a stand-alone executable.
The built-in namespaces are organized into two sets:
- Core namespaces, which provide functions, macros, and so on necessary for rudimentary functioning of Joker or expected to be of widespread interest
- Standard-library-wrapping ("std") namespaces, which provide Clojure-like interfaces to various Go standard libraries' public APIs
The mechanisms used to incorporate these namespaces into the Joker executable differ substantially, so it is important to understand them when considering adding (or changing) a namespace to the Joker executable.
Core namespaces, starting with joker.core, define the features (mostly macros and functions) that are necessary for even rudimentary Joker scripts to run.
Their source code resides in the core/data/ directory as *.joke files.
Not every such file corresponds to a single namespace; the linter_*.joke files modify the joker.core namespace, while the remaining files do correspond to namespaces, and are named by dropping the joker. prefix and changing all . characters to _. So, for example, the joker.tools.cli namespace is defined by core/data/tools_cli.joke.
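The naming rule just described is mechanical enough to sketch in Go. The helper name coreFileFor is hypothetical (nothing here claims Joker contains such a function); it only restates the rule of dropping the joker. prefix and replacing each . with _.

```go
package main

import (
	"fmt"
	"strings"
)

// coreFileFor returns the core/data/ path for a core namespace name:
// drop the "joker." prefix, then turn each "." into "_".
// Hypothetical helper, for illustration only.
func coreFileFor(ns string) string {
	base := strings.TrimPrefix(ns, "joker.")
	return "core/data/" + strings.ReplaceAll(base, ".", "_") + ".joke"
}

func main() {
	fmt.Println(coreFileFor("joker.tools.cli")) // core/data/tools_cli.joke
	fmt.Println(coreFileFor("joker.core"))      // core/data/core.joke
}
```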
When Joker is built (via the run.sh script), go generate ./... is first run. Among other things, this causes the following source line (as a Go comment) in core/object.go to be executed:
//go:generate go run -tags gen_code gen_code/gen_code.go
This line builds and runs core/gen_code/gen_code.go, which finds, in the CoreSourceFiles array defined near the top of core/gen_code/gen_code.go, a list of files (in core/data/) to be processed.
As explained in the block comment just above the var CoreSourceFiles []... definition, the files must be ordered so any given file depends solely on files (namespaces) defined above it (earlier in the array).
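The ordering constraint amounts to "every dependency is listed earlier". A minimal Go sketch of such a check follows; it assumes a dependency map that gen_code.go is not described as having (the notes say only that the ordering is documented in a comment), and checkOrder and the file lists are illustrative names and data.

```go
package main

import "fmt"

// checkOrder reports the first file that depends on something not listed
// before it. files is the ordered list; deps maps a file to the files it
// requires. Both are hypothetical inputs for this sketch.
func checkOrder(files []string, deps map[string][]string) error {
	seen := make(map[string]bool, len(files))
	for _, f := range files {
		for _, d := range deps[f] {
			if !seen[d] {
				return fmt.Errorf("%s depends on %s, which is not listed earlier", f, d)
			}
		}
		seen[f] = true
	}
	return nil
}

func main() {
	// joker.template requiring joker.walk is taken from the REPL session
	// earlier in these notes; the lists are not the real CoreSourceFiles.
	deps := map[string][]string{"template.joke": {"walk.joke"}}

	fmt.Println(checkOrder([]string{"core.joke", "walk.joke", "template.joke"}, deps))
	fmt.Println(checkOrder([]string{"core.joke", "template.joke", "walk.joke"}, deps))
}
```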
Processing a .joke file consists of reading and evaluating forms in the file via Joker's (Clojure-like) Reader. This is done for the core-library-defining (that is, not linter-specific) files, yielding fully populated data structures as if all core namespaces (and std namespaces upon which they depend) have been fully loaded in a Joker invocation. (Keep in mind that this is done before a proper Joker executable is actually built.)
Then, the data structures defining (among other things) the resulting namespaces are compiled into Go code that, when (in turn) compiled into a Joker executable, creates them in toto, mostly via static initialization of numerous package-scope variables.
Linter-specific files (named core/data/linter_*.joke) are treated differently. After all the core-library-defining files are compiled to Go code (as described above), these linter-specific files are read and evaluated, "packing" the resulting forms into a portable binary format, and encoding the resulting binary data as a []byte array in Go source files named core/a_*_data.go, where * is the same as in core/data/*.joke.
This approach does not involve the normal Read phase at Joker startup time (though the Evaluation phase remains largely the same). So, the overhead involved in parsing certain Clojure forms is avoided, in favor of (what one assumes would be) faster code paths that convert binary blobs directly to AST forms. But most of Joker's object types (corresponding generally to Clojure forms) are stringized into the binary-data stream, and parsed back out at load time; so not all parsing overhead is avoided.
A disadvantage of this approach is that it requires changes to core/pack.go when changes are made to certain aspects of the AST.
As native-Go-code compilation (for core namespaces and linter files) occurs before the go build step performed by run.sh, the result is that that step includes those core/a_*.go source files. The binary data contained in the core/a_*_data.go (linter-data) files is, when needed, unpacked and the results used to modify the environment as appropriate for the linter mode involved.
The resulting Joker executable thus starts up with all the core-namespace-related data structures already nearly-fully populated, with remaining work done via a combination of initialization functions (func init()), dynamic-variable initialization (of *out*, *command-line-args*, etc.), and lazy initialization (such as compiled regular expressions in joker.hiccup) when the respective namespaces are actually referenced for the first time during that invocation.
When in linter mode, the forms encoded (as a []byte array) in the pertinent core/a_linter_*_data.go files are unpacked and evaluated upon startup, after joker.core has been fully loaded.
IMPORTANT: Because the compiled structures are (mostly) statically initialized in the default, fast-startup, version of Joker, core libraries defining variables whose values depend on dynamic variables might not work properly even when the values are copied by namespaces other than joker.core (and thus the values are referenced, in the slow-startup version, only when the namespaces are actually loaded).
The one case that currently exists, joker.test/*test-out*, is defined as a copy of joker.core/*out*; but the latter is set at runtime (hence the adjective "dynamic"), so gen_code.go detects that and specially handles this case by copying the value of it into the value of *test-out* only during the lazy-initialization phase of joker.test, instead of leaving the assignment performed when gen_code.go evaluates the forms in core/data/*.joke (at which point in time *out* is nil).
But the general case of such a reference is neither handled nor detected (though either or both could be implemented if deemed necessary).
So while (def clargs *command-line-args*) might work, even though *command-line-args* is initialized at runtime, (def nargs (count *command-line-args*)) might silently always set nargs to 0 in the fast-startup version, since there are no Joker command-line arguments present when gen_code runs.
Now, this wouldn't work in joker.core anyway, because that namespace is always processed (in both variants of Joker) before dynamic variables (such as *command-line-args*, *classpath*, and so on) are set.
If nargs (in the example shown above) is defined in (say) joker.test, however, the slow-startup version of Joker will perform that assignment after dynamic variables have been initialized, because that's when it reads in and evaluates the blobs comprising that namespace (once it's referenced).
But the fast-startup version of Joker will have already "baked in" the value of nargs when gen_code.go ran; there's no runtime code currently generated to dynamically set such a dependent variable after that variable has been set.
This doesn't affect functions that merely reference dynamic variables. E.g. (defn nargs [] (count *command-line-args*)) would work fine (and nargs would be called as a function), since Joker does not compile such forms into Go code in optimized form.
Arguably, copying of dynamic variables is an unwise practice in any case: as highlighted above, the user of a namespace doesn't necessarily control when the namespace code is loaded and any such assignments performed. Providing initialization/reset functions for such namespaces, or simply promoting the desired assignees to functions that simply reference the dynamic variables, is probably better, as these approaches either give the namespace user control over when to perform the copying of values, or obviate the issue.
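The difference between copying a dynamic variable and reading it through a function has a direct analogue in Go, which may make the hazard easier to see. In this sketch the package-level initializer stands in for a value "baked in" at generation time; all names are illustrative and none are taken from Joker's source.

```go
package main

import "fmt"

// commandLineArgs stands in for a dynamic variable such as
// *command-line-args*: empty at first, and set only later, at run time.
var commandLineArgs []string

// nargsSnapshot mimics (def nargs (count *command-line-args*)): the value
// is computed once, when this initializer runs, and never updated.
var nargsSnapshot = len(commandLineArgs)

// nargs mimics (defn nargs [] (count *command-line-args*)): the variable
// is read each time the function is called.
func nargs() int {
	return len(commandLineArgs)
}

func main() {
	commandLineArgs = []string{"a", "b", "c"} // the runtime sets it late

	fmt.Println(nargsSnapshot) // 0: captured before the variable was set
	fmt.Println(nargs())       // 3: read at call time
}
```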
For a dynamic variable such as *command-line-args*, this might not seem important; but for something like *out* or *classpath*, which user code might change while running, it's important for said user code (or any namespaces it uses) to be able to predict when their values will actually be captured by core namespaces, just as they would be by user-defined (3rd-party) namespaces/libraries.
The list of such dynamic variables is kept in core/gen_code/gen_code.go, and is currently:
knownLateInits = map[string]struct{}{
"joker.core/*in*": struct{}{},
"joker.core/*out*": struct{}{},
"joker.core/*err*": struct{}{},
"joker.core/*command-line-args*": struct{}{},
"joker.core/*classpath*": struct{}{},
"joker.core/*core-namespaces*": struct{}{},
"joker.core/*verbose*": struct{}{},
"joker.core/*file*": struct{}{},
"joker.core/*main-file*": struct{}{},
}
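A map with empty-struct values is Go's usual way to spell a set, so testing whether a var is one of these late-initialized variables is a plain key lookup. The sketch below uses a trimmed copy of the list, and the helper isLateInit is a hypothetical name rather than something known to exist in gen_code.go.

```go
package main

import "fmt"

// A trimmed copy of the set shown above; struct{} values occupy no space,
// so only the keys matter.
var knownLateInits = map[string]struct{}{
	"joker.core/*out*":               {},
	"joker.core/*command-line-args*": {},
	"joker.core/*classpath*":         {},
}

// isLateInit reports whether a fully qualified var name is in the set.
// Hypothetical helper, for illustration only.
func isLateInit(qualifiedName string) bool {
	_, ok := knownLateInits[qualifiedName]
	return ok
}

func main() {
	fmt.Println(isLateInit("joker.core/*out*"))      // true
	fmt.Println(isLateInit("joker.test/*test-out*")) // false
}
```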
Assuming one has determined it appropriate to add a new core namespace to the Joker executable (versus deploying it as a separate *.joke file), one must code it up (presumably as Joker code, though some Go code can be added to support it as well).
Then, besides putting that source code in core/data/*.joke, one must:
- Add it to the core/gen_code/gen_code.go CoreSourceFiles array (after any core namespaces upon which it depends)
Further, if the new namespace depends on any standard-library-wrapping namespaces:
- Edit the core/gen_code/gen_code.go import statement to include each such library's Go code
- Ensure that code has already been generated (that library's std/*/a_*.go files have already been created), perhaps using an older version of Joker to run generate-std.joke from within the std subdirectory
Create suitable tests, e.g. in tests/eval/.
Finally, it's time to build as usual (e.g. via ./run.sh), then run ./eval-tests.sh or even ./all-tests.sh.
When Joker is run, the namespace is automatically added to *core-namespaces* as an "available" library; upon being loaded, it will be added to *loaded-libs*. (The fast-startup version of Joker will have already loaded all core libraries upon startup.)
Note that, in the slow_init version of Joker, core libraries (other than joker.core and, when running the Repl, joker.repl) do not show up in joker.core/*loaded-libs* (which is returned by the public function loaded-libs) until after they've been loaded via :require or similar.
These namespaces are also defined by Joker code, which resides in std/*.joke files.
These *.joke files, however, have code of a particular form that is processed by the std/generate-std.joke script (after an initial version of Joker is built). They cannot, as explained below, define arbitrary macros and functions for use by normal Joker code.
The std/generate-std.joke script, which is run after the Joker executable is first built (by run.sh), reads in the pertinent namespaces, currently defined via (def namespaces ...) at the top of the script. This definition dynamically discovers all the *.joke files in std/.
(apply require :reload namespaces) loads the target namespaces, then the script processes each namespace in namespaces by examining its public members and "compiling" them into Go code, which it stores in std/*/a_*.go, where * is the same name, std/*/a_*_slow_init.go, and possibly std/*/a_*_fast_init.go.
For example, std/math.joke is processed such that the resulting Go code is written to std/math/a_math*.go.
Note: This processing does not handle arbitrary Joker code! In particular, "logic" (such as (if ...)) in function bodies is neither recognized nor handled; it's actually discarded, in that it does not appear (in any form) in the final Joker executable. Similarly, no macros (public or otherwise) appear at all; so, as with logic in functions, they're useful only insofar as they might affect how other public members are defined during the running of std/generate-std.joke.
Instead, the processing consists primarily of examining the metadata for each (public) member and emitting Go code that, when built into (the soon-to-be-rebuilt) Joker executable, creates the namespace (joker.math in the above example), "interns" the public symbols, and includes (attached to those symbols) both suitable metadata and Go-code "stubs" that handle Joker code referencing a given symbol and the underlying Go implementation (typically a standard-library API, such as math.Sin for joker.math/sin).
Those stubs handle arity, types, and results.
Whether they call Go code directly, or call support code written in Go (typically included in a file named std/*/*_native.go, e.g. std/math/math_native.go) -- and the specific Go-code invocation used -- is determined via the :go metadata and return-type tags for the public member, as defined in the original std/*.joke file.
The a_*.go files generated for std namespaces cause the namespaces to be mapped by the time the Joker executable has finished starting up. That's why they appear in (all-ns), even when they haven't actually been loaded (lazily initialized).
As standard-library-wrapping namespaces are lazily loaded (i.e. on-demand), and needn't build up the ASTs that the core namespaces build up, they can be expected (in the standard, not fast-startup, build of Joker) to offer lower overhead at startup and/or first-use time. That is, only namespace generation, interning of symbols, and metadata is built up; other logic is "baked in" via compilation of the Go code accompanying these namespaces.
However, any logic (such as conditionals, loops, and so on) to be performed by them must be expressed in Go, rather than Joker, code; this mechanism is designed for easier creation of "thin" wrappers between Joker and Go code, not as a general mechanism for embedding Joker code in the Joker executable.
Another advantage (besides performance) of this approach is that the resulting code that builds up the target namespace has no dependencies on any other Joker namespaces -- not even on joker.core.
That means a core namespace may actually depend on one of these (standard-library-wrapping) namespaces, as long as std/generate-std.joke has been run and the resulting std/*/a_*.go file has been made available in the working directory (e.g. by being added to the Git repository).
NOTE: generate-std.joke generates two or three a_*.go files per namespace, depending on whether the namespace is required by any of the core namespaces. a_*_slow_init.go handles the runtime (including "lazy") initialization; if the namespace is required by a core namespace, it's generated for only the gen_code program to use, and a_*_fast_init.go is generated to handle the runtime/lazy initialization needed by Joker itself.
The run.sh script includes an optimization that avoids building Joker a second (final) time after it runs std/generate-std.joke to generate the std/*/a_*.go files.
That optimization starts by computing a hash of the contents of the std/ directory before running the script, and another one afterwards.
If the hashes are identical, run.sh assumes nothing has changed in the std/*.joke files with respect to the std/*/a_*.go files present prior to running the script, and thus there's no need to rebuild the Joker executable so the changed files are built in.
(Of course, even if a std/*.joke file hasn't changed, any changes to std/generate-std.joke or any of the std/*/*.go files, handwritten or autogenerated, will result in a different hash being computed and thus a rebuild.)
Besides creating std/foo.joke with appropriate metadata (such as :go) for each public member (in joker.foo), one must:
- mkdir -p std/foo
- (cd std; ../joker generate-std.joke) to create std/foo/a_foo*.go
  NOTE: If ../joker does not exist (due to a failed build while iterating through this process), any recent version (such as the installed, official, version you might have in $PATH) may be used
- If necessary, write supporting Go code, which goes in std/foo/foo_native.go and other Go files in std/foo/*.go
- Add the resulting set of Go files (in std/foo), as well as std/foo.joke, to the repository
- Add the appropriate line to the import block at the top of main.go
- Add tests to tests/eval/
- Rebuild the Joker executable (via run.sh or equivalent)
- Run the tests (via ./all-tests.sh or just ./eval-tests.sh)
While some might object to the inclusion of generated files (std/*/a_*.go) in the repository, Joker currently depends on their presence in order to build, due to circular dependencies (related to the bootstrapping of Joker) as described below.
This script generates foo/a_foo*.go files based on foo.joke files.
Given:
(defn <RTN-TYPE> FN
DOCSTRING
{:added VERSION
:go GOCODE}
[ARGSPEC...])
This results in the following code in a_foo.go:
var __GOFN__P ProcFn = __GOFN
var GOFN Proc = Proc{Fn: __GOFN__P, Name: "GOFN", Package: "std/foo"}
func __GOFN(_args []Object) Object { BODY }
That is, GOFN is a Proc that wraps a ProcFn var (__GOFN__P) to which the implementation itself, named __GOFN, is assigned.
GOFN is a slightly mangled form of FN (an underscore is appended, etc.; see the go-name function in the script) and BODY chooses an implementation based on the number of elements in _args. (So [ARGSPEC...] could actually be ([ARGSPEC1...]) ([ARGSPEC2...]...), each with a unique number of arguments, in which case GOCODE is not just a string, but a map of the number of arguments to the corresponding string.)
PanicArity() is called if the number of arguments does not match.
Each such implementation extracts the arguments based on their ARGSPEC-declared types (ARGSPEC typically being ^ARGTYPE ARGNAME), via ExtractARGTYPE(_args, N) (where N is the argument index), then calls the corresponding GOCODE, saving the result in _res, which is then returned.
If RTN-TYPE is omitted, GOCODE's result is returned as-is (which typically requires GOCODE to refer to a custom implementation in foo/foo_native.go, as in the case of a function that returns nil, aka NIL in Joker's Go code); otherwise, Make<RTN-TYPE> is called to wrap the result in the desired type.
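To make the template concrete, here is a self-contained Go sketch shaped like the generated code for a hypothetical two-argument function add whose :go code is a + b. The Object, ProcFn, and Proc definitions are minimal stand-ins so the example compiles on its own, and extractInt and panicArity are simplified substitutes for the ExtractARGTYPE and PanicArity helpers named above; the real definitions in Joker's core package differ.

```go
package main

import "fmt"

// Minimal stand-ins for types from Joker's core package.
type Object interface{}
type ProcFn func(args []Object) Object
type Proc struct {
	Fn      ProcFn
	Name    string
	Package string
}

// extractInt is a simplified substitute for an ExtractARGTYPE helper.
func extractInt(args []Object, i int) int {
	n, ok := args[i].(int)
	if !ok {
		panic(fmt.Sprintf("arg %d must be an int", i))
	}
	return n
}

// panicArity is a simplified substitute for PanicArity.
func panicArity(n int) {
	panic(fmt.Sprintf("wrong number of args (%d)", n))
}

// With FN = add, GOFN becomes add_ (an underscore is appended), so the
// template's __GOFN__P and __GOFN become __add___P and __add_.
var __add___P ProcFn = __add_
var add_ Proc = Proc{Fn: __add___P, Name: "add_", Package: "std/foo"}

func __add_(_args []Object) Object {
	// BODY: choose an implementation based on the number of arguments.
	switch len(_args) {
	case 2:
		a := extractInt(_args, 0)
		b := extractInt(_args, 1)
		_res := a + b // the GOCODE for this arity
		return _res   // real generated code would wrap this via Make<RTN-TYPE>
	default:
		panicArity(len(_args))
	}
	return nil
}

func main() {
	fmt.Println(add_.Name, add_.Fn([]Object{2, 3})) // add_ 5
}
```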
Non-functions (such as constants and variables) and functions (see above) follow.
Next, this follows all those vars (functions and non-functions):
func Init() {
{non-fn-inits}
InternsOrThunks()
}
Any non-function runtime initializations are performed in {non-fn-inits}.
<NSNAME>Namespace is then defined as a global variable initialized to a global Clojure namespace with NSFULLNAME (e.g. "joker.foo") as a symbol, said namespace being added to the set joker.core/*loaded-libs*:
var fooNamespace = GLOBAL_ENV.EnsureSymbolIsLib(MakeSymbol("joker.foo"))
a_foo.go finishes with:
func init() {
fooNamespace.Lazy = Init
}
That is, upon Joker startup, the namespace is first registered (mapped), then its lazy-initialization function (Init()) is registered for it.
a_foo_slow_init.go defines (the "slow" version of) InternsOrThunks():
func InternsOrThunks() {
<NSNAME>Namespace.ResetMeta(MakeMeta(nil, "{NSDOCSTRING}", "VERSION"))
{interns}
}
NSDOCSTRING comes from the :doc metadata in the ns invocation at the top of foo.joke; VERSION is currently hardcoded to "1.0". That's also where imports are specified; they're generated near the top of foo/a_foo_slow_init.go, just after the package specification.
Then the non-function and function names are interned in that same namespace (where {interns} appears, above), with each such intern looking like:
<NSNAME>Namespace.InternVar("FN", GOFN,
MakeMeta(NewListFrom(NewVectorFrom(MakeSymbol("ARG1"), ...)),
DOCSTRING)
ARGn is basically each ARGSPEC, including & where applicable, but without the tags (i.e. the type info is lost here).
This is where Joker looks up bar in (bar ...), using the applicable namespace in effect, and knows to call bar_ (the GOFN for bar) with the array of Object's comprising the arguments in ....
Once a joker executable has been built with the desired new and changed namespaces, online documentation is generated via:
$ (cd docs; ../joker generate-docs.joke)
Joker distributions currently include core and std libraries' documentation in their repositories, so new and changed .html files should be added to the changeset(s) along with the corresponding library code.
Joker currently has circular dependencies between the core and std namespaces, as well as within the std namespace itself.
There's actually a circular dependency between the two sets of namespaces:
- core/gen_code/gen_code.go imports std/string, so the initialization code that adds the namespace is run
- std/string/a_string.go is generated by std/generate-std.joke
- std/generate-std.joke is run by the first Joker executable built by run.sh
- That Joker executable cannot be built until after gen_code.go has been run
This circular dependency is avoided, in practice, by ensuring that any std/*/a_*.go files are already generated and present before any new dependencies upon them are added to gen_code.go.
However, a std/*.joke file therefore cannot depend on any core/data/*.joke-defined namespace that, in turn, requires gen_code.go to import its std/*/a_*.go file.
So, while joker.repl and joker.tools.cli currently depend on joker.string, std/string.joke does not depend on them, and preexisted their being added to the core namespaces.
One approach to avoid this problem without (any longer) including generated artifacts (a_*.go files) in the repository, nor requiring an old version of Joker for bootstrapping, would be for the build process (in run.sh) to start by building a Joker executable that includes only joker.core.
Then, that interim Joker executable could be used to run std/generate-std.joke to generate the a_*.go files in std/, after which a "complete" version of Joker would then be built.
However, as explained below, that wouldn't solve the problem entirely, since std/generate-std.joke currently requires more than just joker.core to work.
The std/*/a_*.go files are needed to build Joker, but are generated by std/generate-std.joke, which needs Joker to run.
Further, std/generate-std.joke requires both joker.os and joker.string.
Those dependencies mean that even if the Joker build process were changed to start by building a Joker executable supporting only joker.core, the resulting executable would be unable to run std/generate-std.joke.
Again, the presence of std/*/a_*.go (at least for joker.os and joker.string) in the repository avoids this being a problem. (Another solution would be to use an older version of Joker to run std/generate-std.joke and thus build a "fresh" one.)
Converting joker.os and joker.string into core libraries (so, the underlying support code would be in package core), and adding them to the list of libraries built into an "interim" Joker executable (as described above), is one approach to solving this issue.
run.sh builds (via the go generate ./...
step) an extra set of Go source files that, unless disabled via a build tag, statically initialize most of the core namespace info. (Some runtime initialization must still be performed, due mainly to limitations in the Go compiler.)
TBD, but something like this was done to search for Joker code that runs before main() and determine how best to handle it in a slow-vs-fast split build:
grep --color -nH --null -E -e '^(func init\(|var )' *.go ../*.go ../std/*/*.go | grep -v ' ProcFn = '
The fast-startup version necessitated (as of this writing) these changes:
- Regex is now *Regex (a reference type), mainly so runtime initialization (from a regexp.MustCompile() call) can be assigned into the .Value or equivalent member of a static structure.
- internalNamespaceInfo is a new struct type that wraps []byte for the core namespace, adding init func() (the slow version uses this for lazy-loading of core namespaces; might be replaceable via the .Lazy mechanism if we always map all core namespaces) and available bool (which aids detecting missing a_*.go files more elegantly).
- Many (larger/complicated) static vars' definitions and initializations have been separated out into _slow_init.go files (e.g. procs_slow_init.go, environment_slow_init.go, etc.), which are // +build slow_init, in that they aren't built into Joker itself.
- An additional source file, core/environment_fast_init.go, contains an empty receiver to parallel the one in core/environment_slow_init.go.
- Proc now wraps the former Proc (renamed ProcFn) and adds self-identifying info (the name of the procedure and its package), to help code generation when it encounters them.
- A new core/gen_code/gen_code.go program replaces the old gen_data.go program. It generates a_*code.go files that mostly define static variables representing the structures resulting when loading core/data/core.joke and the like; they are compiled only when building Joker itself. It also does the work that gen_data.go used to do, except only for core/data/linter_*.joke files and the generation of core/a_data.go.
- The new run.sh runs, via the go generate ./... step, gen_code.go, which takes only a few seconds on my Ryzen 3, and which generates these static-initializing files.
- run.sh continues on to building either the “original” (slower) Joker executable, hardlinked to joker.slow, or the fast-startup version, hardlinked to joker.fast. Whichever is built, it becomes the default for subsequent use (such as running std/generate-std.joke and then running the executable itself with whatever arguments were provided to run.sh).
- A new core/code.go module is a helper for gen_code.go, since the latter isn’t part of package core.
- A new core/gen_go package is used solely by gen_code and implements the details of compiling Go variables into (mostly) static Go code.
- The new private function joker.core/ns-initialized? tells whether a namespace has been initialized (fully, including potentially lazily, loaded). Useful as a debugging tool, it's also used by std/generate-std.joke to determine which std libraries are preloaded by loading all core libraries due to being required by them.
When built via e.g. go build -tags go_spew, the private joker.core/go-spew function is enabled. (Otherwise it does nothing and returns false.)
This function dumps, to stderr, the internal structure of the argument passed to it (i.e. a Joker object), and returns true.
Optionally, a second argument may be specified that is a map with configuration options as described in the go-spew documentation, though not all such options are yet supported by Joker's go-spew function.
For example, the internals of the keyword :hey can be output in this fashion:
user=> (joker.core/go-spew :hey {:MaxDepth 5 :Indent " " :UseOrdinals true})
(core.Keyword) {
InfoHolder: (core.InfoHolder) {
info: (*core.ObjectInfo)(#1)({
Position: (core.Position) {
endLine: (int) 1,
endColumn: (int) 24,
startLine: (int) 1,
startColumn: (int) 21,
filename: (*string)(#2)((len=6) "<repl>")
}
})
},
ns: (*string)(<nil>),
name: (*string)(#3)((len=3) "hey"),
hash: (uint32) 819820356
}
true
user=>
Note: The SpewState configuration option is not currently supported; each distinct call to go-spew thus starts with a "fresh" state.
The source code comprising Joker currently supports these custom build tags:
- gen_code: This enables building code needed by gen_code or disables code that gen_code itself generates.
- go_spew: This enables joker.core/go-spew and some internal code (typically depending on core.VerbosityLevel > 0) calling go-spew, rather than no-ops.