Merge pull request #31 from frengor/dev

- Update to 0.5.0
- Implement tracing as a breadth-first traversal of the heap, so the collector no longer overflows the stack when analyzing very deep, nested structures
- Improve performance when weak pointers are enabled
- *(nightly only)* Derive `SmartPointer` for `Cc`
- Minor improvements and fixes
- Update iai-callgrind to 0.12.2
frengor · Aug 6, 2024 · 449c8cd
2 parents b61b5ed + d2051fa
Showing 20 changed files with 1,535 additions and 1,062 deletions.
diff --git a/.github/workflows/bench.yml b/.github/workflows/bench.yml
@@ -5,7 +5,6 @@ on:
     types: [labeled]
 
 env:
-  IAI_CALLGRIND_VERSION: 0.7.3
   CARGO_TERM_COLOR: always
   IAI_CALLGRIND_COLOR: never
   CARGO_REGISTRIES_CRATES_IO_PROTOCOL: sparse
@@ -27,7 +26,12 @@ jobs:
       - uses: taiki-e/install-action@valgrind
       - uses: taiki-e/install-action@cargo-binstall
       - name: Install iai-callgrind-runner
-        run: cargo binstall --no-confirm --no-symlinks iai-callgrind-runner --version ${IAI_CALLGRIND_VERSION}
+        run: |
+          version=$(cargo metadata --format-version=1 |\
+            jq '.packages[] | select(.name == "iai-callgrind").version' |\
+            tr -d '"'
+          )
+          cargo binstall --no-confirm --no-symlinks iai-callgrind-runner --version "${version}"
       - name: Bench base branch
         run: |
           cargo update

diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
@@ -10,7 +10,7 @@ Also, remember to open the pull requests toward the `dev` branch. The `main` bra
 
 The core idea behind the algorithm is the same as the one presented by Lins in ["Cyclic Reference Counting With Lazy Mark-Scan"](https://kar.kent.ac.uk/22347/1/CyclicLin.pdf)
 and by Bacon and Rajan in ["Concurrent Cycle Collection in Reference Counted Systems"](https://pages.cs.wisc.edu/~cymen/misc/interests/Bacon01Concurrent.pdf).  
-However, the implementation differs in order to make the collector faster and more resilient to random panics and failures in general.
+However, the implementation differs in order to make the collector faster and more resilient to panics and failures.
 
 > [!IMPORTANT]  
 > `rust-cc` is *not* strictly an implementation of the algorithm shown in the linked papers and it's never been
@@ -23,24 +23,33 @@ The `POSSIBLE_CYCLES` is an (intrusive) list which contains the possible roots o
 Sometimes (see [`crate::trigger_collection`](./src/lib.rs)), when creating a new `Cc` or when `collect_cycles` is called,
 the objects inside the `POSSIBLE_CYCLES` list are checked to see if they are part of a garbage cycle.
 
-Therefore, they undergo two tracing passes:
-- **Trace Counting:** during this phase, starting from the elements inside `POSSIBLE_CYCLES`,
-  objects are traced to count the amount of pointers to each `CcBox` that is reachable from the list's `Cc`s.  
+Therefore, they undergo two tracing phases:
+- **Trace Counting:** during this phase, a breadth-first traversal of the heap is performed starting from the elements inside
+  `POSSIBLE_CYCLES` to count the number of pointers to each `CcBox` that is reachable from the list's `Cc`s.  
   The `tracing_counter` "field" (see the [`counter_marker` module](./src/counter_marker.rs) for more info) is used to keep track of this number.
   <details>
   <summary>About tracing_counter</summary>
   <p>In the papers, Lins, Bacon and Rajan decrement the RC itself instead of using another counter. However, if during tracing there was a panic,
      it would be hard for `rust-cc` to restore the RC correctly. This is the reason for the choice of having another counter.
-     The invariant regarding this second counter is that it must always be between 0 and RC inclusively. 
+     The invariant regarding this second counter is that it must always be between 0 and RC (inclusively). 
   </p>
   </details>
 - **Trace Roots:** now, every `CcBox` which has the RC strictly greater than `tracing_counter` can be considered a root,
-  since it must exist a `Cc` pointer which points at it that hasn't been traced before. Thus, a trace is started from these roots,
-  and all objects not reached during this trace are finalized/deallocated (the story is more complicated because of possible
+  since it must exist a `Cc` pointer which points at it that hasn't been traced in the previous phase. Thus, a trace is started from these roots,
+  and all objects not reached during this trace are finalized/deallocated (this is actually more complex due to possible
   object resurrections, see the comments in [lib.rs](./src/lib.rs)).  
   Note that this second phase is correct only if the graph formed by the pointers is not changed between the two phases. Thus,
   this is a key requirement of the `Trace` trait and one of the reasons it is marked `unsafe`.
 
+## Using lists and queues
+
+When debug assertions are enabled, the `add` method may panic before actually updating the list to contain the added object.
+Similarly, the `remove` method may panic after having removed the object from the list.
+
+Thus, marking should be done only:
+  - after the call to the `add` method
+  - before the call to the `remove` method.
+
 ## Writing tests
 
 Every unit test should start with a call to `tests::reset_state()` to make sure errors in other tests don't impact the current one.
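
The two tracing phases described in the CONTRIBUTING.md change above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the crate's implementation: `Node`, its fields, the index-based graph, and the `collect` function are hypothetical stand-ins for `CcBox`, `tracing_counter`, and the collector. It ignores finalization and object resurrection entirely. Both phases use a heap-allocated queue rather than recursion, which is the property the breadth-first change is after.

```rust
use std::collections::VecDeque;

/// Toy stand-in for a `CcBox` (hypothetical layout, for illustration only).
struct Node {
    rc: u32,              // number of pointers to this node, from anywhere
    tracing_counter: u32, // pointers discovered while tracing from the list
    children: Vec<usize>, // indices of the nodes this node points to
}

/// Returns the indices of traced nodes that are not reachable from any root.
fn collect(nodes: &mut [Node], possible_cycles: &[usize]) -> Vec<usize> {
    let n = nodes.len();
    let mut queue: VecDeque<usize> = VecDeque::new();

    // Phase 1 (trace counting): breadth-first traversal from the possible roots.
    // Depth of the structure only grows the queue, never the call stack.
    let mut visited = vec![false; n];
    for &start in possible_cycles {
        if !visited[start] {
            visited[start] = true;
            queue.push_back(start);
        }
    }
    while let Some(i) = queue.pop_front() {
        for k in 0..nodes[i].children.len() {
            let c = nodes[i].children[k];
            nodes[c].tracing_counter += 1;
            // Invariant from the docs: 0 <= tracing_counter <= RC.
            debug_assert!(nodes[c].tracing_counter <= nodes[c].rc);
            if !visited[c] {
                visited[c] = true;
                queue.push_back(c);
            }
        }
    }

    // Phase 2 (trace roots): a node with RC > tracing_counter is pointed to by
    // something that phase 1 did not trace, so it is a root and stays alive.
    let mut live = vec![false; n];
    for i in 0..n {
        if visited[i] && nodes[i].rc > nodes[i].tracing_counter {
            live[i] = true;
            queue.push_back(i);
        }
    }
    while let Some(i) = queue.pop_front() {
        for &c in &nodes[i].children {
            if !live[c] {
                live[c] = true;
                queue.push_back(c);
            }
        }
    }

    // Everything traced in phase 1 but not reached in phase 2 is garbage.
    (0..n).filter(|&i| visited[i] && !live[i]).collect()
}
```

In this model, a two-node cycle whose only references come from inside the cycle is reported as garbage, while the same shape with one extra external reference (RC one higher than `tracing_counter`) is kept alive together with everything it points to.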

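The ordering rule from the new "Using lists and queues" section can be sketched as follows. All names here (`List`, `add_and_mark`, `unmark_and_remove`, the `in_list` flags) are hypothetical and only model the idea. The sketch also assumes that "marking" before `remove` means clearing the mark. The goal is that a set mark always implies the object really is in the list, even if `add` or `remove` panics.

```rust
/// Hypothetical stand-in for one of the collector's lists.
struct List {
    items: Vec<usize>,
}

impl List {
    fn add(&mut self, id: usize) {
        // With debug assertions enabled, this may panic before the list is updated.
        debug_assert!(!self.items.contains(&id), "object already in the list");
        self.items.push(id);
    }

    fn remove(&mut self, id: usize) {
        let before = self.items.len();
        self.items.retain(|&x| x != id);
        // With debug assertions enabled, this check runs after the list was modified.
        debug_assert!(self.items.len() < before, "object was not in the list");
    }
}

/// Mark only after `add` returned: if `add` panics, the mark is never set.
fn add_and_mark(list: &mut List, in_list: &mut [bool], id: usize) {
    list.add(id);
    in_list[id] = true;
}

/// Clear the mark before `remove`: if `remove` panics after removing,
/// the mark no longer claims the object is in the list.
fn unmark_and_remove(list: &mut List, in_list: &mut [bool], id: usize) {
    in_list[id] = false;
    list.remove(id);
}
```

With the opposite order, a panic in `add` would leave an object marked as listed without being in the list, and a panic in `remove` would leave a mark pointing at an already removed object.
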
diff --git a/Cargo.toml b/Cargo.toml
@@ -14,7 +14,7 @@ edition.workspace = true
 members = ["derive"]
 
 [workspace.package]
-version = "0.4.0" # Also update in [dependencies.rust-cc-derive.version]
+version = "0.5.0" # Also update in [dependencies.rust-cc-derive.version]
 authors = ["fren_gor <[email protected]>"]
 repository = "https://github.com/frengor/rust-cc"
 categories = ["memory-management", "no-std"]
@@ -49,12 +49,12 @@ std = ["slotmap?/std", "thiserror/std"]
 pedantic-debug-assertions = []
 
 [dependencies]
-rust-cc-derive = { path = "./derive", version = "=0.4.0", optional = true }
+rust-cc-derive = { path = "./derive", version = "=0.5.0", optional = true }
 slotmap = {  version = "1.0", optional = true }
 thiserror = { version = "1.0", package = "thiserror-core", default-features = false }
 
 [dev-dependencies]
-iai-callgrind = "=0.7.3" # Also update IAI_CALLGRIND_VERSION in .github/workflows/bench.yml
+iai-callgrind = "=0.12.2"
 rand = "0.8.3"
 trybuild = "1.0.85"
 test-case = "3.3.1"
@@ -64,6 +64,10 @@ name = "bench"
 harness = false
 required-features = ["std", "derive"]
 
+[profile.bench]
+debug = true # Required by iai-callgrind
+strip = false # Required by iai-callgrind
+
 [lints.rust]
 unexpected_cfgs = { level = "warn", check-cfg = ['cfg(doc_auto_cfg)'] }
 

diff --git a/benches/bench.rs b/benches/bench.rs
@@ -8,7 +8,7 @@ mod benches {
 }
 
 use std::hint::black_box;
-use iai_callgrind::{library_benchmark, library_benchmark_group, main};
+use iai_callgrind::{library_benchmark, library_benchmark_group, LibraryBenchmarkConfig, main};
 use crate::benches::binary_trees::count_binary_trees;
 use crate::benches::binary_trees_with_parent_pointers::count_binary_trees_with_parent;
 use crate::benches::large_linked_list::large_linked_list;
@@ -53,4 +53,7 @@ library_benchmark_group!(
     benchmarks = large_linked_list_bench
 );
 
-main!(library_benchmark_groups = stress_tests_group, binary_trees_group, linked_lists_group);
+main!(
+    config = LibraryBenchmarkConfig::default().raw_callgrind_args(["--branch-sim=yes"]);
+    library_benchmark_groups = stress_tests_group, binary_trees_group, linked_lists_group
+);