Update compiler to use latest LLVM version (#471)
* Start branch to rework LLVM generation
The plan is to generate a .ll file directly and compile and link that,
to move away from the unmaintained Haskell interface to LLVM.

* Remove all dependencies on llvm-hs and shim out all LLVM generation code.

* Compiler successfully builds, but doesn't generate LLVM

* A couple of minor code cleanups

* First version of prelude and set up for LLVM

* Add generated .ll file as a sample.

* Generate external declarations for resources.

* Make resource globals undef instead of extern

* Clean up resource global declaration generation

* Add code to write LPVM code into LPVM section

* Don't add 1 to specified lengths of strings

* Introduce LLVM state monad, currently just holding file handle

* Generate LLVM function signatures: start of code generation

* Take signature info from ProcImpln (PrimProto instead of Proto)

* pick lint

* First go at generating LLVM proc bodies; much is wrong.

* Generate foreign LLVM instructions and simple branches.  Will need to convert from single assignment to true SSA.

* Generate proper SSA; build and disassemble tuples for multi-argument return.

* Generate LLVM switches for LPVM forks with more than 2 branches.

* Generate C code to report sizes of C types.  Start work on generating foreign lpvm instructions.  Fix generation of llvm switches.

* Generate string constants and use them when generating code.

* Generate decimal integers as character constants in LLVM code.

* Introduce new CPointer TypeRepresentation, used to represent opaque pointers, represented as 'ptr' in LLVM.  Add  type representation in 'representation is' Wybe type declarations.  Use CPointer as the type of manifest string constants.  Generate an LLVM declaration for each string constant, and use that constant by name everywhere a manifest string constant appears in the code.

* Fix fencepost bug in switch generation.

* Implement LPVM cast, load, and store

* Now generates declarations of external called functions; generates good LLVM code for many library modules.

* Define the size of a C pointer
Also document assumptions made by c_config.c

* Generate correct form for Wybe strings; add code to generate constant structures, needed to generate manifest constant strings.

* Handle lpvm access and mutate; improve logging
Not handling OutByReference or TakeReference yet.

* Partially handle FlowOutByReference and FlowTakeReference

* Add separate partitionArgsWithRefs function to handle by-ref arguments

* Improve doc; start work on handling out-by-reference arguments.

* Reformat description of LLVM output parts

* regenerate src directory README

* Improve LLVM module doc

* Reorganise LLVM module

* More LLVM module reorg

* Now handle FlowTakeReference and FlowOutByReference; untested.

* Improve generated README.md

* fix duplicated types in type conversion exprs

* Fix bugs in deferring calls; add more logging

* Bug fixes, more logging

* Slight cleanup

* avoid use of bitcast for var = var assignment
use LLVMName for both local and global vars

* Use conversion instrs except for constants
include temp counter with proc dumps

* bug fix:  temp counter was messed up by expansion with fusion

* Properly set up LLVM monad for each proc

* Eliminate unwanted args/parameters
Exclude phantom and unneeded arguments and
parameters from calls and definitions.  Include
source type in constant type conversion
expressions.

* Be more systematic with LLVM arguments; fix bug
Generate \\ for backslash character in C strings;
was generating single \.
Avoid conversion instruction when marshalling
constants to pass in C function call.

* Generate HO calls

* Supply new 'volatile' argument to memcpy intrinsic

* Generate LLVM for proc specialisations, too
Scan specialisation bodies when scanning bodies
for strings and externs to generate.  Generate
separate LLVM definitions for specialisation
bodies.

* Systematically use LLVM names for instructions

* typo

* Fix type conversion during LLVM generation
Automatically convert between representations when
generating LLVM code.  Remove cast from smaller to
larger type in generating unboxed representations
of small constructors.  Make type checker aware of
automatic conversion during LLVM generation, so it
can be more forgiving.  Conversion includes zero
and sign extension, truncation, and bitcasting.

* Support switching on signed integers.

* Add XXX comments

* Fix generation of switch defaults

* Ensure SSA form by renaming assigned variables

* Fix generation of unboxed mutator

* Fix handling of load and store instructions

* Log LLVM code; generate .s files on request
Fix logging to include logging of LLVM code
Make a few changes to make LLVM logging closer to
old LLVM logging
Add support for generating .s (native assembler)
files.

* delete accidentally added file

* Improve compatibility with old LLVM generation

* Fix empty variable name for main LLVM section

* update version number to 0.2

* Fix problems with extern declarations
Generate extern declarations for imported
resources, but only if they're used.  Name the
main proc 'main', and use ccc instead of fastcc
for it.  Quote names if they contain characters
other than alphanumerics, underscore and period.

* Write LLVM code for submodules along with parent

* Reduce change to logged LLVM code;
Don't include submodules in generated LLVM when
logging, because logging is already recursive.
Omit git hash from header comment in .ll file,
because it gives spurious diffs when testing.
Explicitly specify C calling convention when
calling C code.  Generate extern decl for memcpy
intrinsic last instead of first.

* Get most tests to pass
... although many still fail.
Updated update-exp script to give tighter diffs
Understand void and intrinsic_bool (i1) C types
Show Pointer type as "pointer" instead of
"address".  Include "ccc" in calls to wybe_malloc.
Correctly consider string constants to be Pointers
instead of CPointers.  Eliminate many spurious
bitcasts of constants (but accidentally add one in
at least one place).

* Delete spurious directory

* Distinguish between wybe strings and C strings
Unfortunately, this change renumbers all the
string constants, but also, we don't generate a
wybe string constant when only a C string is used.

* filter out target triple when checking expected results.

* Bug fix:  non-destructive lpvm mutate mutated wrong structure

* bug fix:  declare wybe_malloc for non-destructive lpvm mutate

* Bug fix:  lpvm cast fields of unboxed constructors while constructing.

* Bug fix: support resources in submodules

* Track whether to prefix with 'tail' or 'musttail' based on alloca calls

* Finish omitted comment

* some code reorganisation

* Fix handling of HO calls; drop mustinline in defs

* Properly load outByRef value from the ref after the call, when it wasn't created by a takeRef

* Insert date at top of ERRS file

* Allocate closures on the heap

Ensure wybe_malloc is declared
Handle type conversion of closures correctly
Fix generation of alloca instruction

* Correct generation of HO trampoline

* Eliminate more unnecessary bitcasts

* more trivial bitcast removal test cases

* XXX turn HO call to known closure into FO call

* refactor LLVM prescanning to be monadic

* Rename fns to generalise from strings to consts

* Generate static constants for all-constant closures

* Don't turn all HO arguments to i64

* Fix duplicated externs, building dynamic closures
Generates externs in a different order now.

* Add a log message

* Store value through out-by-ref pointer as needed
Wherever a value is stored in a variable that was
an out-by-reference parameter, store the value
where the out-by-reference pointer points.
Use new function makeLLVMArg wherever we append an
llvm type and llvm value, separated by a space.

* Store floats in closures as floats, not ints

* Fix handling of externs for mutually recursive modules

* No conversion needed for equivalent types + add doc

* Handle conversions better
Just use bitcast for automatic integer <-> float
conversions, to preserve all information.

Track assigned LLVM variable types, so we can
convert when the value is subsequently used as a
different type.  This happens for generics, when a
typed value is supplied for a generic value.  This
really isn't ideal, but the type checker can give
the same variable different types in this case.

* Ignore changes to source_filename and target triple in complex tests

* Fix syntax error in last commit

* Fix bug in out-by-reference call argument
Handle the case where out-by-reference argument
is also an out-by-reference parameter.  In this
case, we can just use the pointer parameter,
rather than passing a pointer to alloca-ed memory
and fetching the contents after the call.  This
allows the call to be tail recursive.

* Delete no-longer-used source files

* Fix expected output for one test; fix test case to be more readable.

* Address most of James's review comments

* Install llvm package for CI testing

* Code cleanup, remove unused code, change some XXX comments to TODO

* Update github CI runners to recent OSes; don't specify llvm 9

* Try debugging CI workflow

* Another try to debug github CI

* Heuristic for final-dump tests to show actual output for error cases rather than diff

* Explicitly specify llvm version in ubuntu CI workflow

* Also install llvm-18-dev for CI

* Keep trying to get ubuntu workflow working

* attempt to fix ubuntu build; normalise spaced LLVM array type syntax (#472)

* attempt to fix ubuntu build; normalise spaced LLVM array type syntax

* clean CConfig

* one more lpvm section name

* normalise tmp dir in complex tests

* normalise more paths

* add path to complex-test call; fix type on signum

* derive path in python

* Do LLVM type conversions (bitcasts) on call and return

* Fix:  delete .ll file after generating a .s file.

* Fix generated trampolines
Update trampoline parameters to have type AnyType
and generated name, and generate code to convert
from AnyType to the actual type of the parameter
on entry, and similarly on exit for output
parameters.  Do similarly for closure arguments.

* Delete accidental file; fix up documentation

* Some cleanups, mostly removing commented-out code
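
The name-quoting rule mentioned in the message above ("Quote names if they contain characters other than alphanumerics, underscore and period") can be illustrated with a small shell sketch. This is not the compiler's code: `llvm_name` is a hypothetical helper, the sample names are invented, and the sketch assumes that wrapping the name in double quotes is enough (it does not escape quotes or backslashes embedded in the name).

```shell
# Hypothetical helper, for illustration only: print a name unchanged when it
# consists solely of alphanumerics, underscore and period; otherwise wrap it
# in double quotes. Embedded quotes or backslashes are NOT escaped here.
llvm_name() (
    LC_ALL=C    # keep the A-Z, a-z and 0-9 ranges ASCII-only
    case $1 in
        *[!A-Za-z0-9_.]*) printf '"%s"\n' "$1" ;;
        *)                printf '%s\n' "$1" ;;
    esac
)

llvm_name 'wybe.string'         # plain name, printed as is
llvm_name 'list(T).length<0>'   # contains ( ) < >, so it is quoted
```

The function body runs in a subshell, so the locale setting does not leak into the caller.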

---------

Co-authored-by: James <[email protected]>
pschachte and jimbxb authored Oct 18, 2024
1 parent 085c463 commit 17687d8
Showing 271 changed files with 33,312 additions and 41,463 deletions.
4 changes: 2 additions & 2 deletions .github/workflows/macos.yml
@@ -16,14 +16,14 @@ on:

jobs:
build:
runs-on: macos-10.15
runs-on: macos-14

steps:
- uses: actions/checkout@v2
- name: Install dependencies
run: |
brew install llvm
brew install bdw-gc
brew install llvm-hs/llvm/llvm-9
brew install dwdiff
brew install coreutils
15 changes: 12 additions & 3 deletions .github/workflows/ubuntu.yml
@@ -10,23 +10,32 @@ on:

jobs:
build:
runs-on: ubuntu-20.04
runs-on: ubuntu-24.04

steps:
- uses: actions/checkout@v2
- name: Install dependencies
run: |
sudo apt-get install llvm
sudo apt-get install clang
sudo apt-get install libgc-dev
sudo apt-get install llvm-9-dev
sudo apt-get install libtinfo-dev
sudo apt-get install dwdiff
# GitHub-hosted runners provide Haskell Stack
# https://github.com/actions/virtual-environments/blob/master/images/linux/Ubuntu1804-README.md
# - name: Install Haskell Stack
# run: wget -qO- https://get.haskellstack.org/ | sh

- name: Verify dependencies
run: |
dpkg -s llvm
dpkg -s clang
dpkg -s libgc-dev
dpkg -s dwdiff
llvm-config --version
llc --version
clang --version
# cache Haskell Stack stuff
- name: Cache stack global package db
uses: actions/cache@v1
2 changes: 2 additions & 0 deletions .gitignore
@@ -58,3 +58,5 @@ README.html
dist-newstyle
stack.yaml.lock
LOG*
src/c_config
src/CConfig.hs
40 changes: 27 additions & 13 deletions Makefile
@@ -7,9 +7,13 @@
INSTALLBIN=/usr/local/bin
INSTALLLIB=/usr/local/lib/wybe

# Configure any extra C library and include directories
EXTRALIBS=-L /usr/local/lib -L /opt/homebrew/lib
EXTRAINCLUDES=-I /usr/local/include -I /opt/homebrew/include


# You shouldn't need to edit anything below here
VERSION = 0.1
VERSION = 0.2
SRCDIR = src
LIBDIR = wybelibs
WYBELIBS = wybe.o command_line.o logging.o random.o benchmark.o
@@ -40,7 +44,7 @@ install: wybemk
"$(INSTALLBIN)/wybemk" --force-all $(addsuffix ", $(addprefix "$(INSTALLLIB)/,$(WYBELIBS)))


wybemk: $(SRCDIR)/*.hs $(SRCDIR)/Version.lhs
wybemk: $(SRCDIR)/*.hs $(SRCDIR)/CConfig.hs $(SRCDIR)/Version.lhs
stack -j3 build && cp "`stack path --local-install-root`/bin/$@" "$@"

libs: $(addprefix $(LIBDIR)/,$(LIBS))
@@ -53,37 +57,46 @@ $(LIBDIR)/wybe.o: wybemk $(LIBDIR)/wybe/*.wybe


$(LIBDIR)/wybe/cbits.o: $(LIBDIR)/wybe/cbits.c
clang $(ISSYSROOT) -I /usr/local/include -c "$<" -o "$@"
clang $(ISSYSROOT) $(EXTRAINCLUDES) -c "$<" -o "$@"


$(SRCDIR)/Version.lhs: $(addprefix $(SRCDIR)/,*.hs)
@echo -e "Generating Version.lhs for version $(VERSION)"
@rm -f "$@"
@printf "Version.lhs automatically generated: DO NOT EDIT\n" > "$@"
@printf "\n" >> "$@"
@printf "> module Version (version,gitHash,buildDate,libDir) where\n\n" >> "$@"
@printf "> module Version (version,gitHash,buildDate,libDir,defaultTriple) where\n\n" >> "$@"
@printf "> version :: String\n> version = \"%s\"\n\n" "$(VERSION)" >> "$@"
@printf "> gitHash :: String\n> gitHash = \"%s\"\n\n" "`git rev-parse --short HEAD`" >> "$@"
@printf "> buildDate :: String\n> buildDate = \"%s\"\n\n" "`date`" >> "$@"
@printf "> libDir :: String\n> libDir = \"%s\"\n\n" "$(INSTALLLIB)" >> "$@"
@printf "> defaultTriple :: String\n> defaultTriple = \"" >> "$@"
@clang --version | sed -n 's/Target: *\(.*\)/\1\"/p' >> "$@"
@printf "\n\n" >> "$@"

$(SRCDIR)/CConfig.hs: $(SRCDIR)/c_config
$< > $@

$(SRCDIR)/c_config: $(SRCDIR)/c_config.c
clang $(ISSYSROOT) $(EXTRAINCLUDES) -o $@ $<


.PHONY: doc
doc: src/README.md


# Assemble README markdown source file automatically
src/README.md: src/*.hs Makefile src/README.md.intro src/README.md.outro
src/README.md: src/*.hs Makefile src/README.md.intro src/README.md.outro \
src/Compiler.png src/Detail.png
cat src/README.md.intro > "$@"

printf "The source files in this directory and their purposes are:\n\n" >> "$@"
printf "| File " >> "$@"
printf "| Purpose |\n" >> "$@"
printf "| ---------------------------- " >> "$@"
printf "| -------------------------------------------------------- |\n" >> "$@"
printf "| File | Purpose |\n" >> "$@"
printf "| ---- | -------------------------------------------- |\n" >> "$@"
for f in src/*.hs ; do \
b=`basename $$f` ; \
m=`basename $$f .hs` ; \
printf "| `printf '%-29s' [$$b]\(#$$m\)`| " ; \
printf "| `printf '%-20s' [$$b]\(#$$m\)` | " ; \
sed -n "s/^-- *Purpose *: *\(.*\)/\1/p" $$f | tr -d '\n' ; \
printf " |\n" ; \
done >> "$@"
@@ -92,15 +105,16 @@ src/README.md: src/*.hs Makefile src/README.md.intro src/README.md.outro
for f in src/*.hs ; do \
m=`basename $$f .hs` ; \
echo -e ; \
sed -E -e '/^-- *Purpose *:/{s/^-- *Purpose *:/## '"$$m -- "'/; G; p;}' -e '/BEGIN MAJOR DOC/,/END MAJOR DOC/{//d ; s/^-- ? ?//p;}' -e 'd' <$$f ; \
echo -e "## $$m <a id="$$m"></a>" ; \
sed -E -e '/BEGIN MAJOR DOC/,/END MAJOR DOC/{//d ; s/^-- ? ?//p;}' -e 'd' <$$f ; \
done >> "$@"
printf "\n\n" >> "$@"
cat src/README.md.outro >> "$@"
test: wybemk
@rm -f ERRS ; touch ERRS
@rm -f ERRS ; printf "Testing run " > ERRS ; date >> ERRS
@rm -f $(LIBDIR)/*.o $(LIBDIR)/wybe/*.o
@echo -e "Building $(LIBDIR)/wybe/cbits.o"
@make $(LIBDIR)/wybe/cbits.o
@@ -111,4 +125,4 @@ test: wybemk
clean:
stack clean
rm -f $(SRCDIR)/*.o $(SRCDIR)/*.hi $(SRCDIR)/Version.lhs documentation/*.pdf publications/*.pdf $(LIBDIR)/*.o $(LIBDIR)/wybe/*.o test-cases/*.o
rm -f $(SRCDIR)/*.o $(SRCDIR)/*.hi $(SRCDIR)/Version.lhs $(SRCDIR)/CConfig.hs documentation/*.pdf publications/*.pdf $(LIBDIR)/*.o $(LIBDIR)/wybe/*.o test-cases/*.o
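
The new `defaultTriple` lines in the `Version.lhs` rule take the target triple from the `Target:` line of `clang --version`. A minimal sketch of that extraction, run on canned input so it does not depend on the local toolchain, might look like the following. The sample output and the function names are assumptions made for illustration; the sed expression is the Makefile's, minus the trailing `\"` that the rule appends to close the Haskell string literal.

```shell
# Canned stand-in for `clang --version` (hypothetical sample output).
sample_clang_version() {
    printf '%s\n' \
        'Homebrew clang version 18.1.8' \
        'Target: arm64-apple-darwin23.6.0' \
        'Thread model: posix'
}

# Print only the text after "Target:"; -n suppresses every other line.
extract_triple() {
    sed -n 's/Target: *\(.*\)/\1/p'
}

triple=$(sample_clang_version | extract_triple)
printf 'defaultTriple = "%s"\n' "$triple"
```

On a real machine, `clang --version | extract_triple` would report whatever triple the installed clang targets.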
10 changes: 3 additions & 7 deletions TODO.md
@@ -19,7 +19,7 @@


## Error checking:
* Error if foreign call has no outputs; suggest use !I/O.
* Error if foreign call has no outputs; suggest `use !io` or make it impure.
* Ensure no statement binds the same variable multiple times


@@ -38,11 +38,8 @@
* Fix the syntax!
* Support curley braces to specify sets and maps
* Interpolation (in strings, arrays, sets, and maps)
* "...@foo..." means "..." `,,` foo `,,` "..."
* "...@(foo(bar,baz))..." means "..." `,,` foo(bar,baz) `,,` "..."
* [foo,@bar(baz),zip] means [foo] `,,` bar(baz) `,,` [zip]
* if `,,` can run backwards, then [?foo,@?bar] and [@?foo,bar] can be patterns
* with this, do we need `[ ... | ...]` syntax?
* "...$(foo(bar,baz))..." means "..." `,,` foo(bar,baz) `,,` "..."
* [foo,$bar(baz),zip] means [foo] `,,` bar(baz) `,,` [zip]
* Support "commutative" resources, which don't need to be threaded everywhere
* Support unicode
* Investigate situation calculus
@@ -86,5 +83,4 @@


## Porting:
* to Windows
* Rewrite compiler in Wybe
65 changes: 45 additions & 20 deletions WYBE.md
@@ -95,7 +95,9 @@ Hello, World!

Note that `wybemk` is like `make` in that you give it the name of the
file you want it to build, and it figures out what files it needs
to compile.
to compile. Currently, the Wybe compiler can generate an executable file, an
object (.o) file, an LLVM assembler (.ll) file, an LLVM bitcode (.bc), and a
native assembly language (.s) file from a wybe source file.

### Compiler Options

@@ -108,8 +110,8 @@ can be found with the following:

#### Optimisation Options

The `--llvm-opt-level` (`-O`) options specifies the level of optimisation used
within the LLVM compiler during the compilations stage of a Wybe module. By default, this is set to 3, yet supports the values 0, 1, 2, or 3. More information
The `--llvm-opt-level` (-O) option specifies the level of optimisation used
within the LLVM compiler during the compilation stage of a Wybe module. By default, this is set to 3, yet supports the values 0, 1, 2, or 3. More information
can be found [here](https://llvm.org/docs/CommandGuide/llc.html#id1).


@@ -2243,18 +2245,38 @@ Floating point multiplication
Floating point division
- `foreign llvm frem(`arg1:float, arg2:float`)`:float
Floating point remainder
- `foreign llvm fcmp_eq(`arg1:float, arg2:float`)`:bool
- `foreign llvm fcmp_ord(`arg1:float, arg2:float`)`:bool
Floating point ordered (neither is a NaN)
- `foreign llvm fcmp_oeq(`arg1:float, arg2:float`)`:bool
Floating point equality
- `foreign llvm fcmp_ne(`arg1:float, arg2:float`)`:bool
- `foreign llvm fcmp_one(`arg1:float, arg2:float`)`:bool
Floating point disequality
- `foreign llvm fcmp_slt(`arg1:float, arg2:float`)`:bool
- `foreign llvm fcmp_olt(`arg1:float, arg2:float`)`:bool
Floating point (signed) strictly less
- `foreign llvm fcmp_sle(`arg1:float, arg2:float`)`:bool
- `foreign llvm fcmp_ole(`arg1:float, arg2:float`)`:bool
Floating point (signed) less or equal
- `foreign llvm fcmp_sgt(`arg1:float, arg2:float`)`:bool
- `foreign llvm fcmp_ogt(`arg1:float, arg2:float`)`:bool
Floating point (signed) strictly greater
- `foreign llvm fcmp_sge(`arg1:float, arg2:float`)`:bool
- `foreign llvm fcmp_oge(`arg1:float, arg2:float`)`:bool
Floating point (signed) greater or equal
- `foreign llvm fcmp_ord(`arg1:float, arg2:float`)`:bool
Floating point unordered (either is a NaN)
- `foreign llvm fcmp_ueq(`arg1:float, arg2:float`)`:bool
Floating point unordered or equal
- `foreign llvm fcmp_une(`arg1:float, arg2:float`)`:bool
Floating point unordered or not equal
- `foreign llvm fcmp_ult(`arg1:float, arg2:float`)`:bool
Floating point unordered or strictly less
- `foreign llvm fcmp_ule(`arg1:float, arg2:float`)`:bool
Floating point unordered or less or equal
- `foreign llvm fcmp_ugt(`arg1:float, arg2:float`)`:bool
Floating point unordered or strictly greater
- `foreign llvm fcmp_uge(`arg1:float, arg2:float`)`:bool
Floating point unordered or greater or equal
- `foreign llvm fcmp_true(`arg1:float, arg2:float`)`:bool
Always returns true with no comparison
- `foreign llvm fcmp_false(`arg1:float, arg2:float`)`:bool
Always returns false with no comparison
##### <a name="conversion"></a>Integer/floating point conversion
@@ -2281,17 +2303,20 @@ declaration has the form:
where *rep* has one of these forms:
- `address`
the type is a machine address, similar to the `void *` type in C.
- *n* `bit` *numbertype*
a primitive number type comprising *n* bits, where *n* is any non-negative
integer and *numbertype* is one of:
- `signed`
a signed integer type
- `unsigned`
an unsigned integer type
- `float`
a floating point number; *n* must be 16, 32, 64, or 128.
- `pointer`
the type is the address of a Wybe data structure. Foreign code should not
treat this as an ordinary pointer.
- `opaque`
the type is a machine address, similar to the `void *` type in C. Wybe treats such values as opaque.
- *n* `bit signed`
a signed primitive number type comprising *n* bits, where *n* is any non-negative
integer. Represents integers between -2<sup>*n*-1</sup> and 2<sup>*n*-1</sup>-1 inclusive.
- *n* `bit unsigned`
an unsigned primitive number type comprising *n* bits, where *n* is any non-negative
integer. Represents integers between 0 and 2<sup>*n*</sup>-1 inclusive.
- *n* `bit float`
a floating point number type comprising *n* bits, where *n* is one of 16, 32,
64, or 128.
Like a `constructor` declaration, a `representation` declaration makes the
enclosing module into type. Also like a `constructor` declaration, a submodule
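
The ranges quoted for *n* `bit signed` and *n* `bit unsigned` in the hunk above can be checked with a little shell arithmetic. This is only a sketch: the function names are hypothetical, and it assumes 64-bit shell arithmetic, so it is meaningful for *n* from 1 to 62.

```shell
# Print "MIN MAX" for an n-bit signed type: -2^(n-1) .. 2^(n-1)-1.
signed_range() {
    n=$1
    echo "$(( -(1 << (n - 1)) )) $(( (1 << (n - 1)) - 1 ))"
}

# Print "MIN MAX" for an n-bit unsigned type: 0 .. 2^n-1.
unsigned_range() {
    n=$1
    echo "0 $(( (1 << n) - 1 ))"
}

signed_range 8      # prints: -128 127
unsigned_range 8    # prints: 0 255
```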
6 changes: 0 additions & 6 deletions hie.yaml
@@ -9,9 +9,6 @@ cradle:
- path: "./src/BinaryFactory.hs"
component: "wybe:exe:wybemk"

- path: "./src/Blocks.hs"
component: "wybe:exe:wybemk"

- path: "./src/BodyBuilder.hs"
component: "wybe:exe:wybemk"

@@ -24,9 +21,6 @@ cradle:
- path: "./src/Clause.hs"
component: "wybe:exe:wybemk"

- path: "./src/Codegen.hs"
component: "wybe:exe:wybemk"

- path: "./src/Config.hs"
component: "wybe:exe:wybemk"
