Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

compiler: split Decl into Nav and Cau #20964

Merged
merged 2 commits into from
Aug 11, 2024

Conversation

mlugg
Copy link
Member

@mlugg mlugg commented Aug 6, 2024

The type Zcu.Decl in the compiler is problematic: over time it has gained many responsibilities. Every source declaration, container type, generic instantiation, and @extern has a Decl. The functions of these Decls are in some cases entirely disjoint.

After careful analysis, I determined that the two main responsibilities of Decl are as follows:

  • A Decl acts as the "subject" of semantic analysis at comptime. A single unit of analysis is either a runtime function body, or a Decl. It registers incremental dependencies, tracks analysis errors, etc.
  • A Decl acts as a "global variable": a pointer to it is consistent, and it may be lowered to a specific symbol by the codegen backend.

This commit eliminates Decl and introduces new types to model these responsibilities: Cau (Comptime Analysis Unit) and Nav (Named Addressable Value).

Every source declaration, and every container type requiring resolution (so not including opaque), has a Cau. For a source declaration, this Cau performs the resolution of its value. (When #131 is implemented, it is unsolved whether type and value resolution will share a Cau or have two distinct Caus.) For a type, this Cau is the context in which type resolution occurs.

Every non-comptime source declaration, every generic instantiation, and every distinct extern has a Nav. These are sent to codegen/link: the backends by definition do not care about Caus.

This commit has some minor technically-breaking changes surrounding usingnamespace. I don't think they'll impact anyone, since the changes are fixes around semantics which were previously inconsistent (the behavior changed depending on hashmap iteration order!).

Aside from that, this changeset has no user-facing changes. Instead, it is an internal refactor which makes it easier to correctly model the responsibilities of different objects, particularly regarding incremental compilation. The performance impact should be negligible, but I will take measurements before merging this work into master.

@mlugg mlugg requested a review from Snektron as a code owner August 6, 2024 21:12
@andrewrk
Copy link
Member

andrewrk commented Aug 6, 2024

Holy line count, Batman!

@kubkon kubkon force-pushed the the-great-decl-split-mk2 branch from e96b967 to 30cea80 Compare August 9, 2024 22:09
@jacobly0 jacobly0 force-pushed the the-great-decl-split-mk2 branch from f33ee13 to 1631b46 Compare August 11, 2024 02:32
mlugg and others added 2 commits August 11, 2024 07:29
The type `Zcu.Decl` in the compiler is problematic: over time it has
gained many responsibilities. Every source declaration, container type,
generic instantiation, and `@extern` has a `Decl`. The functions of
these `Decl`s are in some cases entirely disjoint.

After careful analysis, I determined that the two main responsibilities
of `Decl` are as follows:
* A `Decl` acts as the "subject" of semantic analysis at comptime. A
  single unit of analysis is either a runtime function body, or a
  `Decl`. It registers incremental dependencies, tracks analysis errors,
  etc.
* A `Decl` acts as a "global variable": a pointer to it is consistent,
  and it may be lowered to a specific symbol by the codegen backend.

This commit eliminates `Decl` and introduces new types to model these
responsibilities: `Cau` (Comptime Analysis Unit) and `Nav` (Named
Addressable Value).

Every source declaration, and every container type requiring resolution
(so *not* including `opaque`), has a `Cau`. For a source declaration,
this `Cau` performs the resolution of its value. (When ziglang#131 is
implemented, it is unsolved whether type and value resolution will share
a `Cau` or have two distinct `Cau`s.) For a type, this `Cau` is the
context in which type resolution occurs.

Every non-`comptime` source declaration, every generic instantiation,
and every distinct `extern` has a `Nav`. These are sent to codegen/link:
the backends by definition do not care about `Cau`s.

This commit has some minor technically-breaking changes surrounding
`usingnamespace`. I don't think they'll impact anyone, since the changes
are fixes around semantics which were previously inconsistent (the
behavior changed depending on hashmap iteration order!).

Aside from that, this changeset has no significant user-facing changes.
Instead, it is an internal refactor which makes it easier to correctly
model the responsibilities of different objects, particularly regarding
incremental compilation. The performance impact should be negligible,
but I will take measurements before merging this work into `master`.

Co-authored-by: Jacob Young <[email protected]>
Co-authored-by: Jakub Konka <[email protected]>
Eliding the namespace when a container type has no decls was an
experiment in saving memory, but it ended up causing more trouble than
it was worth in various places. So, take the small memory hit for
reified types, and just give every container type a namespace.
@mlugg mlugg force-pushed the the-great-decl-split-mk2 branch from 1631b46 to 153e7d6 Compare August 11, 2024 06:30
@mlugg mlugg requested review from andrewrk and removed request for Snektron August 11, 2024 16:16
@mlugg
Copy link
Member Author

mlugg commented Aug 11, 2024

Performance tests are indicating that analysis performance suffers considerably, but codegen is improved enough to make up for it. I'm not surprised that codegen improved, but I am surprised at how significant these changes are (both for analysis and codegen). I'd be inclined to go ahead with merge, since for actual compilations this PR doesn't seem to represent a perf regression, but if @andrewrk is uncomfortable with the -fno-emit-bin perf regressions I can take a look into them and see if improvements are possible.

Analyze Behavior

Benchmark 1 (41 runs): /home/mlugg/zig/master/stage4/bin/zig test [snip]
  measurement          mean ± σ            min … max           outliers         delta
  wall_time           488ms ± 4.81ms     480ms …  501ms          0 ( 0%)        0%
  peak_rss            120MB ±  173KB     119MB …  120MB          0 ( 0%)        0%
  cpu_cycles         2.00G  ± 20.5M     1.97G  … 2.09G           1 ( 2%)        0%
  instructions       3.14G  ± 21.0K     3.14G  … 3.14G           1 ( 2%)        0%
  cache_references    178M  ± 3.00M      176M  …  195M           2 ( 5%)        0%
  cache_misses       9.12M  ±  257K     8.59M  … 9.76M           0 ( 0%)        0%
  branch_misses      8.08M  ± 65.1K     7.95M  … 8.21M           0 ( 0%)        0%
Benchmark 2 (38 runs): /home/mlugg/zig/the-great-decl-split-mk2/stage4/bin/zig test [snip]
  measurement          mean ± σ            min … max           outliers         delta
  wall_time           535ms ± 6.75ms     524ms …  559ms          1 ( 3%)        💩+  9.6% ±  0.5%
  peak_rss            121MB ±  189KB     121MB …  122MB          2 ( 5%)        💩+  1.2% ±  0.1%
  cpu_cycles         2.20G  ± 22.7M     2.18G  … 2.31G           1 ( 3%)        💩+ 10.1% ±  0.5%
  instructions       3.50G  ± 17.3K     3.50G  … 3.50G           0 ( 0%)        💩+ 11.6% ±  0.0%
  cache_references    198M  ± 1.53M      196M  …  206M           1 ( 3%)        💩+ 10.9% ±  0.6%
  cache_misses       9.74M  ±  256K     9.29M  … 10.5M           1 ( 3%)        💩+  6.7% ±  1.3%
  branch_misses      8.09M  ± 52.5K     7.99M  … 8.23M           2 ( 5%)          +  0.2% ±  0.3%

Build Behavior (x86_64-linux selfhosted)

Benchmark 1 (32 runs): /home/mlugg/zig/master/stage4/bin/zig test [snip]
  measurement          mean ± σ            min … max           outliers         delta
  wall_time           630ms ± 7.13ms     617ms …  652ms          1 ( 3%)        0%
  peak_rss            141MB ±  450KB     140MB …  142MB          0 ( 0%)        0%
  cpu_cycles         3.84G  ± 55.6M     3.77G  … 4.03G           3 ( 9%)        0%
  instructions       7.16G  ±  144K     7.16G  … 7.16G           3 ( 9%)        0%
  cache_references    255M  ± 4.03M      251M  …  269M           4 (13%)        0%
  cache_misses       15.3M  ±  585K     14.2M  … 16.8M           1 ( 3%)        0%
  branch_misses      21.4M  ±  289K     21.0M  … 22.3M           3 ( 9%)        0%
Benchmark 2 (34 runs): /home/mlugg/zig/the-great-decl-split-mk2/stage4/bin/zig test [snip]
  measurement          mean ± σ            min … max           outliers         delta
  wall_time           600ms ± 7.38ms     582ms …  618ms          0 ( 0%)        ⚡-  4.7% ±  0.6%
  peak_rss            138MB ±  329KB     137MB …  138MB          0 ( 0%)        ⚡-  2.3% ±  0.1%
  cpu_cycles         3.91G  ± 49.1M     3.84G  … 4.04G           5 (15%)        💩+  1.8% ±  0.7%
  instructions       7.39G  ±  199K     7.39G  … 7.39G           1 ( 3%)        💩+  3.2% ±  0.0%
  cache_references    262M  ± 2.56M      260M  …  271M           3 ( 9%)        💩+  2.7% ±  0.6%
  cache_misses       15.1M  ±  510K     14.2M  … 16.4M           0 ( 0%)          -  1.7% ±  1.8%
  branch_misses      20.8M  ±  249K     20.5M  … 21.5M           3 ( 9%)        ⚡-  2.9% ±  0.6%

Analyze Hello World

Benchmark 1 (154 runs): /home/mlugg/zig/master/stage4/bin/zig build-exe --zig-lib-dir lib /home/mlugg/test/hello.zig -fno-emit-bin
  measurement          mean ± σ            min … max           outliers         delta
  wall_time           130ms ± 6.60ms     118ms …  162ms          5 ( 3%)        0%
  peak_rss           99.0MB ±  208KB    98.4MB … 99.5MB          1 ( 1%)        0%
  cpu_cycles          469M  ± 25.7M      435M  …  590M           6 ( 4%)        0%
  instructions        690M  ± 55.1K      689M  …  690M          14 ( 9%)        0%
  cache_references   38.7M  ± 1.01M     37.5M  … 45.3M           7 ( 5%)        0%
  cache_misses       2.73M  ±  254K     2.50M  … 4.23M           8 ( 5%)        0%
  branch_misses      2.23M  ± 56.8K     2.17M  … 2.55M           8 ( 5%)        0%
Benchmark 2 (145 runs): /home/mlugg/zig/the-great-decl-split-mk2/stage4/bin/zig build-exe --zig-lib-dir lib /home/mlugg/test/hello.zig -fno-emit-bin
  measurement          mean ± σ            min … max           outliers         delta
  wall_time           139ms ± 7.60ms     126ms …  168ms          7 ( 5%)        💩+  6.8% ±  1.2%
  peak_rss           99.3MB ±  207KB    98.7MB … 99.8MB          1 ( 1%)          +  0.3% ±  0.0%
  cpu_cycles          505M  ± 29.2M      466M  …  624M           5 ( 3%)        💩+  7.8% ±  1.3%
  instructions        749M  ± 48.7K      749M  …  749M          11 ( 8%)        💩+  8.6% ±  0.0%
  cache_references   41.3M  ± 1.27M     40.0M  … 48.2M           9 ( 6%)        💩+  7.0% ±  0.7%
  cache_misses       2.93M  ±  288K     2.66M  … 4.33M           7 ( 5%)        💩+  7.3% ±  2.2%
  branch_misses      2.26M  ± 63.4K     2.20M  … 2.58M           5 ( 3%)          +  1.3% ±  0.6%

Build Hello World (x86_64-linux selfhosted)

Benchmark 1 (100 runs): /home/mlugg/zig/master/stage4/bin/zig build-exe --zig-lib-dir lib /home/mlugg/test/hello.zig -fno-llvm -fno-lld
  measurement          mean ± σ            min … max           outliers         delta
  wall_time           200ms ± 4.89ms     193ms …  219ms          5 ( 5%)        0%
  peak_rss            107MB ±  198KB     106MB …  107MB          1 ( 1%)        0%
  cpu_cycles         1.03G  ± 29.4M     1.01G  … 1.16G          11 (11%)        0%
  instructions       1.94G  ± 63.3K     1.94G  … 1.94G           2 ( 2%)        0%
  cache_references   63.0M  ± 1.68M     61.3M  … 72.4M           8 ( 8%)        0%
  cache_misses       4.66M  ±  226K     4.37M  … 5.57M           5 ( 5%)        0%
  branch_misses      6.67M  ±  120K     6.53M  … 7.23M           5 ( 5%)        0%
Benchmark 2 (101 runs): /home/mlugg/zig/the-great-decl-split-mk2/stage4/bin/zig build-exe --zig-lib-dir lib /home/mlugg/test/hello.zig -fno-llvm -fno-lld
  measurement          mean ± σ            min … max           outliers         delta
  wall_time           199ms ± 4.45ms     191ms …  217ms          5 ( 5%)          -  0.5% ±  0.6%
  peak_rss            106MB ±  203KB     106MB …  107MB          0 ( 0%)          -  0.1% ±  0.1%
  cpu_cycles         1.04G  ± 30.2M     1.02G  … 1.19G           8 ( 8%)          +  0.9% ±  0.8%
  instructions       1.98G  ± 50.0K     1.98G  … 1.98G           3 ( 3%)        💩+  2.3% ±  0.0%
  cache_references   62.7M  ± 1.69M     61.6M  … 71.6M           7 ( 7%)          -  0.4% ±  0.7%
  cache_misses       4.56M  ±  214K     4.32M  … 5.57M           6 ( 6%)          -  2.0% ±  1.3%
  branch_misses      6.52M  ±  119K     6.42M  … 7.07M           7 ( 7%)        ⚡-  2.2% ±  0.5%

@andrewrk
Copy link
Member

can you take a data point for building the self-hosted compiler? behavior tests and hello world are a bit of outliers.

@mlugg
Copy link
Member Author

mlugg commented Aug 11, 2024

These data points are for building the self-hosted compiler with -Ddev=x86_64-linux. Overall:

  • Notable performance regression in analysis (-fno-emit-bin)
  • Notable performance improvement overall (self-hosted codegen)
  • Major RSS improvements across the board

Analyze

Benchmark 1 (7 runs): /home/mlugg/zig/master/stage4/bin/zig [snip]
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          2.94s  ± 14.1ms    2.92s  … 2.96s           0 ( 0%)        0%
  peak_rss            282MB ±  170KB     282MB …  282MB          0 ( 0%)        0%
  cpu_cycles         14.3G  ± 75.2M     14.2G  … 14.4G           0 ( 0%)        0%
  instructions       23.4G  ± 14.6K     23.4G  … 23.4G           0 ( 0%)        0%
  cache_references   1.09G  ± 5.82M     1.08G  … 1.10G           0 ( 0%)        0%
  cache_misses       50.2M  ±  991K     49.1M  … 51.6M           0 ( 0%)        0%
  branch_misses      50.0M  ±  315K     49.8M  … 50.7M           0 ( 0%)        0%
Benchmark 2 (7 runs): /home/mlugg/zig/the-great-decl-split-mk2/stage4/bin/zig [snip]
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          3.16s  ± 25.3ms    3.13s  … 3.20s           0 ( 0%)        💩+  7.7% ±  0.8%
  peak_rss            236MB ±  162KB     235MB …  236MB          0 ( 0%)        ⚡- 16.5% ±  0.1%
  cpu_cycles         13.7G  ± 87.6M     13.6G  … 13.8G           0 ( 0%)        ⚡-  4.5% ±  0.7%
  instructions       22.7G  ± 42.4K     22.7G  … 22.7G           0 ( 0%)        ⚡-  3.0% ±  0.0%
  cache_references   1.12G  ± 6.64M     1.11G  … 1.13G           0 ( 0%)        💩+  2.9% ±  0.7%
  cache_misses       50.3M  ± 1.52M     48.3M  … 52.4M           0 ( 0%)          +  0.2% ±  3.0%
  branch_misses      37.9M  ±  363K     37.4M  … 38.4M           0 ( 0%)        ⚡- 24.3% ±  0.8%

Build (x86_64-linux selfhosted)

Benchmark 1 (4 runs): /home/mlugg/zig/master/stage4/bin/zig [snip]
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          5.34s  ± 24.0ms    5.30s  … 5.36s           1 (25%)        0%
  peak_rss            391MB ± 2.55MB     389MB …  395MB          0 ( 0%)        0%
  cpu_cycles         33.6G  ±  125M     33.5G  … 33.7G           0 ( 0%)        0%
  instructions       65.7G  ±  950K     65.7G  … 65.7G           1 (25%)        0%
  cache_references   1.86G  ± 12.8M     1.85G  … 1.88G           0 ( 0%)        0%
  cache_misses        112M  ± 3.12M      109M  …  116M           0 ( 0%)        0%
  branch_misses       180M  ± 1.16M      179M  …  182M           0 ( 0%)        0%
Benchmark 2 (4 runs): /home/mlugg/zig/the-great-decl-split-mk2/stage4/bin/zig [snip]
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          5.06s  ± 34.1ms    5.01s  … 5.09s           0 ( 0%)        ⚡-  5.3% ±  1.0%
  peak_rss            302MB ±  662KB     301MB …  303MB          1 (25%)        ⚡- 22.8% ±  0.8%
  cpu_cycles         32.0G  ±  126M     31.9G  … 32.2G           0 ( 0%)        ⚡-  4.7% ±  0.6%
  instructions       64.3G  ± 1.42M     64.3G  … 64.3G           0 ( 0%)        ⚡-  2.2% ±  0.0%
  cache_references   1.82G  ± 5.17M     1.81G  … 1.83G           0 ( 0%)        ⚡-  2.3% ±  0.9%
  cache_misses        105M  ± 1.89M      102M  …  106M           1 (25%)        ⚡-  6.5% ±  4.0%
  branch_misses       161M  ±  931K      161M  …  163M           0 ( 0%)        ⚡- 10.4% ±  1.0%

@andrewrk andrewrk merged commit fc29240 into ziglang:master Aug 11, 2024
9 of 10 checks passed
mlugg added a commit to mlugg/zig that referenced this pull request Aug 12, 2024
This fixes the regressions introduced by ziglang#20964. It also cleans up the
"outdated file root" mechanism by virtue of deleting it. We now detect
outdated file roots just after updating ZIR refs, and re-scan their
namespaces.

@jacobly0, this deletes some asserts in `InternPool` which didn't look
appropriate to me, in order to deal with ziglang#20697.
mlugg added a commit to mlugg/zig that referenced this pull request Aug 13, 2024
This commit makes more progress towards incremental compilation, fixing
some crashes in the frontend. Notably, it fixes the regressions introduced
by ziglang#20964. It also cleans up the "outdated file root" mechanism, by
virtue of deleting it: we now detect outdated file roots just after
updating ZIR refs, and re-scan their namespaces.
mlugg added a commit to mlugg/zig that referenced this pull request Aug 17, 2024
This commit makes more progress towards incremental compilation, fixing
some crashes in the frontend. Notably, it fixes the regressions introduced
by ziglang#20964. It also cleans up the "outdated file root" mechanism, by
virtue of deleting it: we now detect outdated file roots just after
updating ZIR refs, and re-scan their namespaces.
jacobly0 pushed a commit to mlugg/zig that referenced this pull request Aug 17, 2024
This commit makes more progress towards incremental compilation, fixing
some crashes in the frontend. Notably, it fixes the regressions introduced
by ziglang#20964. It also cleans up the "outdated file root" mechanism, by
virtue of deleting it: we now detect outdated file roots just after
updating ZIR refs, and re-scan their namespaces.
richerfu pushed a commit to richerfu/zig that referenced this pull request Oct 28, 2024
This commit makes more progress towards incremental compilation, fixing
some crashes in the frontend. Notably, it fixes the regressions introduced
by ziglang#20964. It also cleans up the "outdated file root" mechanism, by
virtue of deleting it: we now detect outdated file roots just after
updating ZIR refs, and re-scan their namespaces.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants