Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incremental compilation progress #21063

Merged
merged 25 commits into from
Aug 18, 2024
Merged

Incremental compilation progress #21063

merged 25 commits into from
Aug 18, 2024

Conversation

mlugg
Copy link
Member

@mlugg mlugg commented Aug 13, 2024

Say you build quick? Baby I know,
That's that incremental

-Sabrina Carpenter 2024


We're getting close now, folks!

This PR makes progress towards incremental compilation. Various bugs and regressions have been fixed. The incr-check tool is expanded with some more options, such as passing through -fno-emit-bin to the compiler. With this option, the existing incremental compilation test, and the 3 added by this commit, all succeed. I don't know whether the right things are going to codegen, but some hacky CBE testing indicates... maybe?

The biggest blocker right now is linker work from @kubkon -- it's currently hard for me to know for sure whether the full pipeline is actually working correctly. Once one of the self-hosted linkers (presumably ELF) can perform basic incremental updates, we can ramp up the testing.

Trying to run the hello test without -fno-emit-bin currently results in the following linker crash:

error: update 'change the string' failed:
thread 16224 panic: reached unreachable code
/home/mlugg/zig/the-great-decl-split-mk2/lib/std/debug.zig:399:14: 0x18c9b9d in assert (zig)
    if (!ok) unreachable; // assertion failure
             ^
/home/mlugg/zig/the-great-decl-split-mk2/lib/std/hash_map.zig:1089:19: 0x1ce013e in putAssumeCapacityNoClobberContext (zig)
            assert(!self.containsContext(key, ctx));
                  ^
/home/mlugg/zig/the-great-decl-split-mk2/lib/std/hash_map.zig:1065:51: 0x1aeeb02 in putNoClobberContext (zig)
            self.putAssumeCapacityNoClobberContext(key, value, ctx);
                                                  ^
/home/mlugg/zig/the-great-decl-split-mk2/lib/std/hash_map.zig:1057:44: 0x193b813 in putNoClobber (zig)
            return self.putNoClobberContext(allocator, key, value, undefined);
                                           ^
/home/mlugg/zig/the-great-decl-split-mk2/src/link/Elf.zig:4175:53: 0x210e222 in allocateAllocSections (zig)
            try self.phdr_to_shdr_table.putNoClobber(gpa, shndx, phndx);
                                                    ^
/home/mlugg/zig/the-great-decl-split-mk2/src/link/Elf.zig:1309:35: 0x1e1b785 in flushModule (zig)
    try self.allocateAllocSections();
                                  ^
/home/mlugg/zig/the-great-decl-split-mk2/src/link/Elf.zig:1008:25: 0x1ba8d8d in flush (zig)
    try self.flushModule(arena, tid, prog_node);
                        ^
/home/mlugg/zig/the-great-decl-split-mk2/src/link.zig:610:77: 0x19c4382 in flush (zig)
                return @as(*tag.Type(), @fieldParentPtr("base", base)).flush(arena, tid, prog_node);
                                                                            ^
/home/mlugg/zig/the-great-decl-split-mk2/src/Compilation.zig:2444:17: 0x19c3bea in flush (zig)
        lf.flush(arena, tid, prog_node) catch |err| switch (err) {
                ^
/home/mlugg/zig/the-great-decl-split-mk2/src/Compilation.zig:2411:22: 0x19c7d96 in update (zig)
            try flush(comp, arena, .main, main_progress_node);
                     ^

Copy link
Member

@andrewrk andrewrk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can feel free to merge this when you feel it is ready. I have some requests mainly dealing with making the codebase more understandable. I think this logic has become quite complex, more than I am comfortable with, so let's start by at least documenting it more meticulously.

Let's make sure follow-up tasks are on the issue tracker rather than in TODO comments, please.

src/Compilation.zig Outdated Show resolved Hide resolved
src/Compilation.zig Outdated Show resolved Hide resolved
/// This contains a list of AnalUnit whose analysis or codegen failed, but the
/// failure was something like running out of disk space, and trying again may
/// succeed. On the next update, we will flush this list, marking all members of
/// it as outdated.
retryable_failures: std.ArrayListUnmanaged(AnalUnit) = .{},

/// These are the modules which we initially queue for analysis in `Compilation.update`.
/// `resolveReferences` will use these as the root of its reachability traversal.
analysis_roots: std.BoundedArray(*Package.Module, 3) = .{},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does a dependency on BoundedArray really need to be dragged in? This could just be 3 flags, or even 0 flags with the (arguably trivial) logic repeated.

/// Rename to find all the logic that determines analysis roots.
pub const analysis_roots_logic = {};

elsewhere:

_ = analysis_roots_logic;

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you don't want to use BoundedArray, that's fair -- I can just replicate the logic. (FWIW, I don't really understand your general aversion to BoundedArray -- it's a pretty convenient abstraction at times IME)

src/InternPool.zig Outdated Show resolved Hide resolved
src/InternPool.zig Show resolved Hide resolved
src/InternPool.zig Show resolved Hide resolved
src/InternPool.zig Show resolved Hide resolved
try type_queue.ensureTotalCapacity(gpa, zcu.analysis_roots.len);
for (zcu.analysis_roots.slice()) |mod| {
// Logic ripped from `Zcu.PerThread.importPkg`.
// TODO: this is silly, `Module` should just store a reference to its root `File`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion per #363:

  • don't put the string "TODO" in the comment
  • make a list of follow-up things to do in your PR writeup
  • after merging, file issues for the follow-up things

src/Zcu.zig Show resolved Hide resolved
src/Zcu/PerThread.zig Show resolved Hide resolved
@mlugg
Copy link
Member Author

mlugg commented Aug 16, 2024

Thanks for the review!

I'm currently going down a rabbit-hole of trying to make a more complex example work. Since there are no big conflicts coming in right now, I'll spend some time polishing this up -- implementing your review comments, and adding doc comments to explain things a little better. Sorry for the confusion -- this branch has had a lot of active experimentation, and my documentation efforts have fallen behind.

Something I might do is write one document which details the overall strategy for incremental compilation, since I have a suspicion I'm the only person with a solid understanding of it right now (my own fault!). If you're happy with that, would you prefer it go in the repo under doc/, or on the wiki?

@andrewrk
Copy link
Member

Something I might do is write one document which details the overall strategy for incremental compilation, since I have a suspicion I'm the only person with a solid understanding of it right now (my own fault!). If you're happy with that, would you prefer it go in the repo under doc/, or on the wiki?

That would be lovely. My suggestion is to put it somewhere other than the repo otherwise we'll get contributions to fix typos in it, and minor sentence rewordings that don't really matter.

@henrikolsen549
Copy link

@andrewrk, would love to see the docs in the repo:

  • High-level design docs are extremely valuable. When placed in the repo, they are directly accessible.
  • Historic versions of the docs are accessible in sync with the code via version control.
  • Typos are highly irritating while reading, at least for detail-oriented personality types like me.
    Zig zen states: Favor reading code over writing code. I think this should also apply to docs.

@mlugg
Copy link
Member Author

mlugg commented Aug 17, 2024

I've written up an informal high-level overview, mainly useful for existing compiler developers. @andrewrk, you may find this handy to read through at some point.

https://gist.github.com/mlugg/73b3e60c803006f3556d87c9ed3e8a0e

mlugg added 21 commits August 17, 2024 18:50
This commit makes more progress towards incremental compilation, fixing
some crashes in the frontend. Notably, it fixes the regressions introduced
by ziglang#20964. It also cleans up the "outdated file root" mechanism, by
virtue of deleting it: we now detect outdated file roots just after
updating ZIR refs, and re-scan their namespaces.
And add a new incremental test to match!
This commit updates `Zcu.resolveReferences` to traverse the graph of
`AnalUnit` references (starting from the 1-3 roots of analysis) in order
to determine which `AnalUnit`s are referenced in an update. Errors for
unreferenced entities are omitted from the error bundle. However, note
that unreferenced `Nav`s are not removed from the binary.
We were accidentally over-reporting most `namespace_name` deps and *not*
reporting some actually outdated ones!
Two fixes here.

* Prevent a crash when sorting the list of analysis errors when some
  errors refer to lost source locations. These errors can be sorted
  anywhere in the list, because they are (in theory) guaranteed to never
  be emitted by the `resolveReferences` logic. This case occurs, for
  instance, when a declaration has compile errors in the initial update
  and is deleted in the second update.

* Prevent a crash when resolving the source location for `entire_file`
  errors for a non-existent file. This is the bug underlying ziglang#20954.

Resolves: ziglang#20954.
An enum type is kind of like a struct or union type, in that field
errors are happening during type resolution. The only difference is that
type resolution happens at the time the type is created. So, errors in
fields should not cause the type to be deleted: we've already added a
reference entry, and incremenetal dependencies which must be invalidated
if the compile error is fixed. Once we call `WipEnumType.prepare`, we
should never call `WipEnumType.cancel`. This is analagous to logic for
enum declarations in `Sema.zirEnumDecl`.
This case is adapted from ziglang#11344, and passes with `-fno-emit-bin`.

Resolves: ziglang#11344
When a type becomes outdated, there will still be lingering references
to the old index -- for instance, any declaration whose value was that
type holds a reference to that index. These references may live for an
arbitrarily long time in some cases. So, we can't just remove the type
from the pool -- the old `Index` must remain valid!

Instead, we want to preserve the old `Index`, but avoid it from ever
appearing in lookups. (It's okay if analysis of something referencing
the old `Index` does weird stuff -- such analysis are guaranteed by the
incremental compilation model to always be unreferenced.) So, we use the
new `InternPool.putKeyReplace` to replace the shard entry for this index
with the newly-created index.
Another big commit, sorry! This commit makes all fixes necessary for
incremental updates of the compiler itself (specifically, adding a
breakpoint to `zirCompileLog`) to succeed, at least on the frontend.

The biggest change here is a reform to how types are handled. It works
like this:
* When a type is first created in `zirStructDecl` etc, its namespace is
  scanned. If the type requires resolution, an `interned` dependency is
  declared for the containing `AnalUnit`.
* `zirThis` also declared an `interned` dependency for its `AnalUnit` on
  the namespace's owner type.
* If the type's namespace changes, the surrounding source declaration
  changes hash, so `zirStructDecl` etc will be hit again. We check
  whether the namespace has been scanned this generation, and re-scan it
  if not.
* Namespace lookups also check whether the namespace in question
  requires a re-scan based on the generation. This is because there's no
  guarantee that the `zirStructDecl` is re-analyzed before the namespace
  lookup is re-analyzed.
* If a type's structure (essentially its fields) change, then the type's
  `Cau` is considered outdated. When the type is re-analyzed due to
  being outdated, or the `zirStructDecl` is re-analyzed by being
  transitively outdated, or a corresponding `zirThis` is re-analyzed by
  being transitively outdated, the struct type is recreated at a new
  `InternPool` index. The namespace's owner is updated (but not
  re-scanned, since that is handled by the mechanisms above), and the
  old type, while remaining a valid `Index`, is removed from the map
  metadata so it will never be found by lookups. `zirStructDecl` and
  `zirThis` store an `interned` dependency on the *new* type.
The old logic here had bitrotted, largely because there were some
incorrect `else` cases. This is now implemented correctly for all
current ZIR instructions. This prevents instructions being lost in
incremental updates, which is important for updates to be minimal.
This function is slow and should only be called in compile error cases.
This results in correct reporting to the build system of file system
inputs with `-fincremental --watch` on a `fno-emit-bin` build.
mlugg added 4 commits August 17, 2024 18:50
This function now has to allocate anyway to resolve references, so we
may as well just build the error bundle and check its length.

Also remove some unnecessary calls of this function for efficiency.
@mlugg
Copy link
Member Author

mlugg commented Aug 18, 2024

The CI failure here is a timeout which is also happening on master, so I'm going to disregard it and merge now to avoid any conflicts with the Dwarf+link changes that I anticipate will continue coming in.

@mlugg mlugg merged commit 4929cf2 into ziglang:master Aug 18, 2024
9 of 10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants