Incremental compilation progress #21063
Conversation
You can feel free to merge this when you feel it is ready. I have some requests mainly dealing with making the codebase more understandable. I think this logic has become quite complex, more than I am comfortable with, so let's start by at least documenting it more meticulously.
Let's make sure follow-up tasks are on the issue tracker rather than in TODO comments, please.
```zig
/// This contains a list of AnalUnit whose analysis or codegen failed, but the
/// failure was something like running out of disk space, and trying again may
/// succeed. On the next update, we will flush this list, marking all members of
/// it as outdated.
retryable_failures: std.ArrayListUnmanaged(AnalUnit) = .{},

/// These are the modules which we initially queue for analysis in `Compilation.update`.
/// `resolveReferences` will use these as the root of its reachability traversal.
analysis_roots: std.BoundedArray(*Package.Module, 3) = .{},
```
Does a dependency on `BoundedArray` really need to be dragged in? This could just be 3 flags, or even 0 flags with the (arguably trivial) logic repeated.
```zig
/// Rename to find all the logic that determines analysis roots.
pub const analysis_roots_logic = {};
```
elsewhere:
```zig
_ = analysis_roots_logic;
```
If you don't want to use `BoundedArray`, that's fair -- I can just replicate the logic. (FWIW, I don't really understand your general aversion to `BoundedArray` -- it's a pretty convenient abstraction at times IME)
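For concreteness, here is a minimal sketch of the two representations being weighed above. The module type and field names are hypothetical stand-ins, not the compiler's actual declarations:
```zig
const std = @import("std");

const Module = struct { name: []const u8 };

// Option A (what the branch currently does): at most 3 roots, stored inline
// together with a length.
const RootsBounded = std.BoundedArray(*Module, 3);

// Option B (the suggested alternative): each potential root as its own
// optional field, with the (arguably trivial) iteration logic repeated at
// each use site.
const RootsExplicit = struct {
    std_mod: ?*Module = null,
    root_mod: ?*Module = null,
    compiler_rt_mod: ?*Module = null,
};
```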
```zig
try type_queue.ensureTotalCapacity(gpa, zcu.analysis_roots.len);
for (zcu.analysis_roots.slice()) |mod| {
    // Logic ripped from `Zcu.PerThread.importPkg`.
    // TODO: this is silly, `Module` should just store a reference to its root `File`.
```
suggestion per #363:
- don't put the string "TODO" in the comment
- make a list of follow-up things to do in your PR writeup
- after merging, file issues for the follow-up things
Thanks for the review! I'm currently going down a rabbit-hole of trying to make a more complex example work. Since there are no big conflicts coming in right now, I'll spend some time polishing this up -- implementing your review comments, and adding doc comments to explain things a little better. Sorry for the confusion -- this branch has had a lot of active experimentation, and my documentation efforts have fallen behind. Something I might do is write one document which details the overall strategy for incremental compilation, since I have a suspicion I'm the only person with a solid understanding of it right now (my own fault!). If you're happy with that, would you prefer it go in the repo under
That would be lovely. My suggestion is to put it somewhere other than the repo, otherwise we'll get contributions to fix typos in it, and minor sentence rewordings that don't really matter.
@andrewrk, would love to see the docs in the repo:
I've written up an informal high-level overview, mainly useful for existing compiler developers. @andrewrk, you may find this handy to read through at some point. https://gist.github.com/mlugg/73b3e60c803006f3556d87c9ed3e8a0e
Force-pushed from ddb91ff to e69c8bd.
This commit makes more progress towards incremental compilation, fixing some crashes in the frontend. Notably, it fixes the regressions introduced by ziglang#20964. It also cleans up the "outdated file root" mechanism, by virtue of deleting it: we now detect outdated file roots just after updating ZIR refs, and re-scan their namespaces.
And add a new incremental test to match!
This commit updates `Zcu.resolveReferences` to traverse the graph of `AnalUnit` references (starting from the 1-3 roots of analysis) in order to determine which `AnalUnit`s are referenced in an update. Errors for unreferenced entities are omitted from the error bundle. However, note that unreferenced `Nav`s are not removed from the binary.
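As an illustration of the traversal this describes, here is a small, self-contained worklist sketch using plain integers in place of `AnalUnit`s. It is not the actual `Zcu.resolveReferences` implementation, just the general shape of the reachability walk:
```zig
const std = @import("std");

/// Starting from `roots`, follow the recorded references and return the set
/// of reachable units. Anything absent from the result would be treated as
/// unreferenced, and its errors omitted from the error bundle.
fn findReferenced(
    gpa: std.mem.Allocator,
    roots: []const u32,
    references: *const std.AutoHashMapUnmanaged(u32, []const u32),
) !std.AutoHashMapUnmanaged(u32, void) {
    var reachable: std.AutoHashMapUnmanaged(u32, void) = .{};
    errdefer reachable.deinit(gpa);
    var queue: std.ArrayListUnmanaged(u32) = .{};
    defer queue.deinit(gpa);

    for (roots) |root| {
        const gop = try reachable.getOrPut(gpa, root);
        if (!gop.found_existing) try queue.append(gpa, root);
    }
    while (queue.popOrNull()) |unit| {
        const refs = references.get(unit) orelse continue;
        for (refs) |ref| {
            const gop = try reachable.getOrPut(gpa, ref);
            if (!gop.found_existing) try queue.append(gpa, ref);
        }
    }
    return reachable;
}
```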
Two fixes here.
* Prevent a crash when sorting the list of analysis errors when some errors refer to lost source locations. These errors can be sorted anywhere in the list, because they are (in theory) guaranteed to never be emitted by the `resolveReferences` logic. This case occurs, for instance, when a declaration has compile errors in the initial update and is deleted in the second update.
* Prevent a crash when resolving the source location for `entire_file` errors for a non-existent file. This is the bug underlying ziglang#20954. Resolves: ziglang#20954.
An enum type is kind of like a struct or union type, in that errors in its fields occur during type resolution. The only difference is that type resolution happens at the time the type is created. So, errors in fields should not cause the type to be deleted: we've already added a reference entry, and incremental dependencies which must be invalidated if the compile error is fixed. Once we call `WipEnumType.prepare`, we should never call `WipEnumType.cancel`. This is analogous to the logic for enum declarations in `Sema.zirEnumDecl`.
This case is adapted from ziglang#11344, and passes with `-fno-emit-bin`. Resolves: ziglang#11344
When a type becomes outdated, there will still be lingering references to the old index -- for instance, any declaration whose value was that type holds a reference to that index. These references may live for an arbitrarily long time in some cases. So, we can't just remove the type from the pool -- the old `Index` must remain valid! Instead, we want to preserve the old `Index`, but prevent it from ever appearing in lookups. (It's okay if analysis of something referencing the old `Index` does weird stuff -- such analysis is guaranteed by the incremental compilation model to always be unreferenced.) So, we use the new `InternPool.putKeyReplace` to replace the shard entry for this index with the newly-created index.
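To make that concrete, here is a toy model of the replacement scheme, not the real `InternPool` (which keeps its lookup metadata in shards): old indices stay dereferenceable because entries are never removed from the item list, while the key-to-index map is overwritten so future lookups resolve to the new index.
```zig
const std = @import("std");

const ToyPool = struct {
    const Index = u32;

    /// Entries are append-only, so any previously handed-out Index stays valid.
    items: std.ArrayListUnmanaged(u64) = .{},
    /// Lookup table; replacing an entry here hides the old index from lookups.
    map: std.AutoHashMapUnmanaged(u64, Index) = .{},

    fn putReplace(pool: *ToyPool, gpa: std.mem.Allocator, key: u64) !Index {
        const new_index: Index = @intCast(pool.items.items.len);
        try pool.items.append(gpa, key);
        // Overwrites any previous mapping for `key`; the old index remains in
        // `items`, so stale references can still be read, but no lookup will
        // ever return it again.
        try pool.map.put(gpa, key, new_index);
        return new_index;
    }

    fn deinit(pool: *ToyPool, gpa: std.mem.Allocator) void {
        pool.items.deinit(gpa);
        pool.map.deinit(gpa);
    }
};
```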
Another big commit, sorry! This commit makes all fixes necessary for incremental updates of the compiler itself (specifically, adding a breakpoint to `zirCompileLog`) to succeed, at least on the frontend. The biggest change here is a reform to how types are handled. It works like this:
* When a type is first created in `zirStructDecl` etc, its namespace is scanned. If the type requires resolution, an `interned` dependency is declared for the containing `AnalUnit`.
* `zirThis` also declares an `interned` dependency for its `AnalUnit` on the namespace's owner type.
* If the type's namespace changes, the surrounding source declaration changes hash, so `zirStructDecl` etc will be hit again. We check whether the namespace has been scanned this generation, and re-scan it if not (a small sketch of this check follows below).
* Namespace lookups also check whether the namespace in question requires a re-scan based on the generation. This is because there's no guarantee that the `zirStructDecl` is re-analyzed before the namespace lookup is re-analyzed.
* If a type's structure (essentially its fields) changes, then the type's `Cau` is considered outdated. When the type is re-analyzed due to being outdated, or the `zirStructDecl` is re-analyzed by being transitively outdated, or a corresponding `zirThis` is re-analyzed by being transitively outdated, the struct type is recreated at a new `InternPool` index. The namespace's owner is updated (but not re-scanned, since that is handled by the mechanisms above), and the old type, while remaining a valid `Index`, is removed from the map metadata so it will never be found by lookups. `zirStructDecl` and `zirThis` store an `interned` dependency on the *new* type.
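Here is a tiny sketch of the generation check described above. The type, field, and function names are made up for illustration; the real bookkeeping in the compiler is more involved:
```zig
const std = @import("std");

const Namespace = struct {
    /// The update generation in which this namespace was last scanned.
    scanned_generation: u32 = 0,
};

const ToyZcu = struct {
    /// Incremented once per incremental update.
    generation: u32 = 0,

    fn ensureNamespaceUpToDate(zcu: *ToyZcu, ns: *Namespace) void {
        if (ns.scanned_generation == zcu.generation) return;
        // In the real compiler, the namespace's declarations would be
        // re-scanned here before stamping the new generation.
        ns.scanned_generation = zcu.generation;
    }
};

test "a stale namespace is brought up to date" {
    var zcu: ToyZcu = .{ .generation = 2 };
    var ns: Namespace = .{ .scanned_generation = 1 };
    zcu.ensureNamespaceUpToDate(&ns);
    try std.testing.expectEqual(@as(u32, 2), ns.scanned_generation);
}
```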
The old logic here had bitrotted, largely because there were some incorrect `else` cases. This is now implemented correctly for all current ZIR instructions. This prevents instructions being lost in incremental updates, which is important for updates to be minimal.
This function is slow and should only be called in compile error cases.
I'm like 80% sure this is correct
This results in correct reporting to the build system of file system inputs with `-fincremental --watch` on a `-fno-emit-bin` build.
This function now has to allocate anyway to resolve references, so we may as well just build the error bundle and check its length. Also remove some unnecessary calls of this function for efficiency.
The CI failure here is a timeout which is also happening on master, so I'm going to disregard it and merge now to avoid any conflicts with the Dwarf+link changes that I anticipate will continue coming in.
We're getting close now, folks!
-Sabrina Carpenter 2024
This PR makes progress towards incremental compilation. Various bugs and regressions have been fixed. The `incr-check` tool is expanded with some more options, such as passing through `-fno-emit-bin` to the compiler. With this option, the existing incremental compilation test, and the 3 added by this commit, all succeed. I don't know whether the right things are going to codegen, but some hacky CBE testing indicates... maybe?

The biggest blocker right now is linker work from @kubkon -- it's currently hard for me to know for sure whether the full pipeline is actually working correctly. Once one of the self-hosted linkers (presumably ELF) can perform basic incremental updates, we can ramp up the testing.
Trying to run the `hello` test without `-fno-emit-bin` currently results in the following linker crash: