This is the artifact for:
- Jaehwang Jung, Jeonghyeon Kim, Matthew J. Parkinson and Jeehoon Kang, Concurrent Immediate Reference Counting, PLDI 2024 [15].
- Jeonghyeon Kim, Jaehwang Jung and Jeehoon Kang, Expediting Hazard Pointers with Bounded RCU Critical Sections, SPAA 2024 [14].
- Jaehwang Jung, Janggun Lee, Jeonghyeon Kim and Jeehoon Kang, Applying Hazard Pointers to More Concurrent Data Structures, SPAA 2023 [11].
- Jeehoon Kang and Jaehwang Jung, A Marriage of Pointer- and Epoch-Based Reclamation, PLDI 2020 [8].
On Ubuntu 22.04,
sudo apt install build-essential python3-pip clang
# Install the Rust compiler and register its executable path.
curl https://sh.rustup.rs -sSf | bash -s -- -y
pip3 install --user pandas plotnine
cargo build --release
# To run a single map data structure benchmark (see later sections for details),
./target/release/<reclamation-scheme> -d <data-structure> -t <threads> -g <get-rate> -r <key-range> -i <time-interval-to-run-seconds>
# For dedicated experiment scripts (e.g., `experiment.sh`),
# see subdirectories in `bench-scripts`.
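For example, the following invocation (parameter values chosen purely for illustration) runs the Harris-Michael list under EBR with 8 threads, a read-write workload, a key range of 10000, and a 10-second run:
./target/release/ebr -d hm-list -t 8 -g 1 -r 10000 -i 10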
To add your own implementations of SMR schemes and data structures, please refer to this guide.
We recommend the following environment:
- OS: Ubuntu 22.04
- Architecture: x86-64
- Python: 3.10.12
- Rustup: 1.26.0
- `smrs`: Reclamation schemes except EBR and PEBR.
  - `hp_pp`: An implementation of the original HP [9,10] and HP++ [11].
  - `hp-brcu`: An implementation of the HP-RCU and HP-BRCU schemes [14].
  - `nbr-rs`: An implementation of NBR+ [13] with signal optimization.
  - `vbr-rs`: An implementation of VBR [16].
  - `cdrc-rs`: An implementation of CDRC [12].
  - `circ`: An implementation of CIRC [15].
- `src`: An implementation of the benchmark suite.
  - `bin`: Benchmark drivers for each SMR.
  - `ds_impl`: Implementations of data structures based on each SMR.
For the implementations of EBR and PEBR, please refer to our dedicated repository kaist-cp/crossbeam.
- `crossbeam-ebr`: The original Crossbeam source code implementing EBR [1,7].
- `crossbeam-pebr`: A fork of Crossbeam that implements PEBR [8]; its main implementation lies under `./crossbeam-pebr/crossbeam-epoch`.
The benchmark requires Rust (for building the implementations of reclamation schemes and data structures), Python (for generating the figures), and other essential build tools.
Install build tools by simply running:
# `clang` is necessary to compile NBR and HP-BRCU.
sudo apt install build-essential clang
# Install the Rust compiler and register its executable path.
curl https://sh.rustup.rs -sSf | bash -s -- -y
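Before proceeding, you can optionally verify that the toolchain is available in the current shell (a quick sanity check; the printed versions will vary by system):
# Make the freshly installed Rust toolchain visible in the current shell.
source "$HOME/.cargo/env"
rustc --version
cargo --version
clang --version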
Some Python packages are necessary to execute end-to-end experiment scripts and generate plots. To install them locally, simply run:
sudo apt install python3-pip
pip3 install --user pandas plotnine
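To quickly confirm that the packages were installed and are importable (optional sanity check):
python3 -c "import pandas, plotnine; print('packages OK')"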
However, you might encounter issues due to version conflicts among Python packages. In that case, we suggest using pyenv
to set up a dedicated Python environment, minimizing potential conflicts with the pre-installed system Python environment.
# Install the system Python3 and all dependencies for `pyenv`.
sudo apt update && sudo apt install -y \
build-essential python3 python3-pip curl \
libssl-dev zlib1g-dev libbz2-dev libreadline-dev libsqlite3-dev curl \
libncursesw5-dev xz-utils tk-dev libxml2-dev libxmlsec1-dev libffi-dev liblzma-dev
# Install `pyenv`. If you already have `pyenv` in your system,
# skip these lines and set up `bench-venv`.
curl https://pyenv.run | bash
echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.bashrc
echo 'command -v pyenv >/dev/null || export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.bashrc
echo 'eval "$(pyenv init -)"' >> ~/.bashrc
echo 'eval "$(pyenv virtualenv-init -)"' >> ~/.bashrc
source ~/.bashrc
# Set up the `bench-venv` environment to generate figures.
pyenv install 3.10.12
pyenv virtualenv 3.10.12 bench-venv
pyenv local bench-venv
pip3 install -r requirements.txt
# Install the Rust compiler and register its executable path.
curl https://sh.rustup.rs -sSf | bash -s -- -y
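After the setup above, the repository directory should be pinned to the `bench-venv` interpreter (`pyenv local` records this in a `.python-version` file). You can optionally confirm it as follows; the expected outputs are based on the versions installed above:
pyenv version     # expected: bench-venv (set by .python-version)
python --version  # expected: Python 3.10.12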
After installing the necessary dependencies, build and test the benchmark binaries with the following commands:
cargo build --release
cargo test --release -- --test-threads=1
After successful testing, output like the following appears, printing `ok` for each test.
$ cargo test --release -- --test-threads=1
Finished release [optimized] target(s) in 0.05s
Running unittests src/lib.rs (target/release/deps/smr_benchmark-c42d60ebe9d214bd)
running 72 tests
test ds_impl::cdrc::bonsai_tree::tests::smoke_bonsai_tree_ebr ... ok
test ds_impl::cdrc::bonsai_tree::tests::smoke_bonsai_tree_hp ... ok
test ds_impl::cdrc::double_link::test::simple_all ... ok
test ds_impl::cdrc::double_link::test::smoke_all ... ok
... (omitted) ...
test result: ok. 72 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 25.24s
... (omitted) ...
Doc-tests smr-benchmark
running 0 tests
test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
Even with the small configuration, the end-to-end benchmark scripts in `bench-scripts` would take several hours. You can run a single benchmark by directly executing the benchmark binaries.
To run a single map data structure benchmark,
./target/release/<reclamation-scheme> -d <data-structure> -t <threads> -g <get-rate> -r <key-range> -i <time-interval-to-run-seconds>
where
- Data structure
  - `h-list`: Harris's linked list [1]
  - `hm-list`: Harris-Michael linked list [2]
  - `hhs-list`: Harris's list with a wait-free get() method (HP not applicable) [3]
  - `hash-map`: Chaining hash table using HMList (for HP) or HHSList (for others) for each bucket [2]
  - `nm-tree`: Natarajan-Mittal tree (HP not applicable) [4]
  - `skip-list`: Lock-free skiplist by Herlihy and Shavit, with wait-free get() for schemes other than HP [3]
  - `bonsai-tree`: A non-blocking variant of the Bonsai tree [5]
  - `efrb-tree`: Ellen et al.'s tree [6]
- Reclamation scheme
  - `nr`: A baseline that does not reclaim memory
  - `ebr`: Epoch-based RCU [1,7]
  - `pebr`: Pointer- and epoch-based reclamation [8]
  - `hp`: Hazard pointers with asymmetric fence optimization [9,10]
  - `hp-pp`: An extension of hazard pointers that supports optimistic traversal [11]
  - `nbr`: Neutralization-based reclamation with signal optimization (NBR+) [13], with a moderate reclamation threshold (256)
  - `nbr-large`: NBR+ [13] with a large reclamation threshold (8192)
  - `hp-rcu`: An extension of the HP scheme with RCU-expedited traversals [14]
  - `hp-brcu`: An extension of the HP scheme with BRCU-expedited traversals [14]
  - `vbr`: Version-based reclamation by Sheffi et al. [16]
  - `cdrc-ebr`: EBR flavor of CDRC [12]
  - `cdrc-ebr-flush`: EBR flavor of CDRC [12] that flushes its local garbage on each operation
  - `cdrc-hp`: HP flavor of CDRC [12]
  - `circ-ebr`: EBR flavor of CIRC [15]
  - `circ-hp`: HP flavor of CIRC [15]
- Get rate
  - `0`: Write-only (Insert 50%, Remove 50%)
  - `1`: Read-write (Get 50%, Insert 25%, Remove 25%)
  - `2`: Read-intensive (Get 90%, Insert 5%, Remove 5%)
  - `3`: Read-only (Get 100%)
It runs a single map data structure benchmark with the given configuration, and measures the throughput (operations per second) and memory usage (bytes).
$ ./target/release/circ-ebr -d nm-tree -t 64 -g 2 -r 10000 -i 10
nm-tree: 64 threads, n0, c1, E2, small bag
prefilling with 64 threads... prefilled... end
ops/s: 163839786, peak mem: 28.316 MiB, avg_mem: 4.970 MiB, peak garb: 99524, avg garb: 3589
For detailed usage information,
./target/release/<reclamation-scheme> -h
To run a single DoubleLink queue benchmark,
./target/release/double-link -t <threads> -m <reclamation-scheme> -i <time-interval-to-run-seconds>
where
- Reclamation scheme
  - `nr`: A baseline that does not reclaim memory
  - `ebr`: Epoch-based RCU
  - `hp`: Hazard pointers with asymmetric fence optimization
  - `cdrc-ebr`: EBR flavor of CDRC
  - `cdrc-ebr-flush`: EBR flavor of CDRC that flushes its local garbage on each operation
  - `cdrc-hp`: HP flavor of CDRC
  - `circ-ebr`: EBR flavor of CIRC
  - `circ-hp`: HP flavor of CIRC
It runs a single DoubleLink queue benchmark with the given configuration, and measures the throughput (operations per second) and memory usage (bytes).
$ ./target/release/double-link -t 64 -m circ-ebr -i 10
circ-ebr: 64 threads
end
ops/s: 732030, peak mem: 219205664, avg_mem: 110476847
For detailed usage information,
./target/release/double-link -h
To run the entire benchmark, execute the `experiment.sh` script in `bench-scripts`. This takes several hours and creates raw CSV data and figures under `./results/`.
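For example, run the corresponding script from the repository root (the subdirectory name below is a placeholder; see the subdirectories in `bench-scripts` for the actual experiment scripts):
./bench-scripts/<experiment-subdirectory>/experiment.sh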
We used AddressSanitizer to debug our implementations. For example, the command below runs the benchmark with the LLVM AddressSanitizer for Rust, using parameters that impose high stress on SMR by triggering more frequent retirements.
RUST_BACKTRACE=1 RUSTFLAGS="-Z sanitizer=address" cargo run --bin <Reclamation scheme> --target x86_64-unknown-linux-gnu --features sanitize -- -d<Data structure> -i3 -t64 -r10 -g1
Note that the sanitizer may report memory leaks when used against CIRC EBR. This is because we use the high bits of pointers for epoch tagging, and AddressSanitizer does not recognize those tagged pointers.
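As a concrete instantiation of the template above (the scheme and data structure here are picked arbitrarily from the lists in this document; note that `-Z sanitizer=address` is an unstable flag and requires a nightly Rust toolchain):
RUST_BACKTRACE=1 RUSTFLAGS="-Z sanitizer=address" cargo run --bin hp-pp --target x86_64-unknown-linux-gnu --features sanitize -- -d hm-list -i3 -t64 -r10 -g1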
- [1] Timothy L. Harris. 2001. A Pragmatic Implementation of Non-Blocking Linked-Lists. In Proceedings of the 15th International Conference on Distributed Computing (DISC ’01). Springer-Verlag, Berlin, Heidelberg, 300–314.
- [2] Maged M. Michael. 2002. High Performance Dynamic Lock-Free Hash Tables and List-Based Sets. In Proceedings of the Fourteenth Annual ACM Symposium on Parallel Algorithms and Architectures (Winnipeg, Manitoba, Canada) (SPAA ’02). Association for Computing Machinery, New York, NY, USA, 73–82. https://doi.org/10.1145/564870.564881
- [3] Maurice Herlihy and Nir Shavit. 2012. The Art of Multiprocessor Programming, Revised Reprint (1st ed.). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA.
- [4] Aravind Natarajan and Neeraj Mittal. 2014. Fast Concurrent Lock-Free Binary Search Trees. In Proceedings of the 19th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (Orlando, Florida, USA) (PPoPP ’14). Association for Computing Machinery, New York, NY, USA, 317–328. https://doi.org/10.1145/2555243.2555256
- [5] Austin T. Clements, M. Frans Kaashoek, and Nickolai Zeldovich. 2012. Scalable Address Spaces Using RCU Balanced Trees. In Proceedings of the Seventeenth International Conference on Architectural Support for Programming Languages and Operating Systems (London, England, UK) (ASPLOS XVII). Association for Computing Machinery, New York, NY, USA, 199–210. https://doi.org/10.1145/2150976.2150998
- [6] Faith Ellen, Panagiota Fatourou, Eric Ruppert, and Franck van Breugel. 2010. Non-Blocking Binary Search Trees. In Proceedings of the 29th ACM SIGACT-SIGOPS Symposium on Principles of Distributed Computing (Zurich, Switzerland) (PODC ’10). Association for Computing Machinery, New York, NY, USA, 131–140. https://doi.org/10.1145/1835698.1835736
- [7] Keir Fraser. 2004. Practical lock-freedom. Ph. D. Dissertation.
- [8] Jeehoon Kang and Jaehwang Jung. 2020. A Marriage of Pointer- and Epoch-Based Reclamation. In Proceedings of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation (London, UK) (PLDI 2020). Association for Computing Machinery, New York, NY, USA, 314–328. https://doi.org/10.1145/3385412.3385978
- [9] Maged M. Michael. 2002. Safe Memory Reclamation for Dynamic Lock-Free Objects Using Atomic Reads and Writes. In Proceedings of the Twenty-First Annual Symposium on Principles of Distributed Computing (Monterey, California) (PODC ’02). Association for Computing Machinery, New York, NY, USA, 21–30. https://doi.org/10.1145/571825.571829
- [10] Maged M. Michael. 2004. Hazard Pointers: Safe Memory Reclamation for Lock-Free Objects. IEEE Trans. Parallel Distrib. Syst. 15, 6 (June 2004), 491–504. https://doi.org/10.1109/TPDS.2004.8
- [11] Jaehwang Jung, Janggun Lee, Jeonghyeon Kim, and Jeehoon Kang. 2023. Applying Hazard Pointers to More Concurrent Data Structures. In Proceedings of the 35th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA '23). Association for Computing Machinery, New York, NY, USA, 213–226. https://doi.org/10.1145/3558481.3591102
- [12] Daniel Anderson, Guy E. Blelloch, and Yuanhao Wei. 2022. Turning Manual Concurrent Memory Reclamation into Automatic Reference Counting. In Proceedings of the 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation (San Diego, CA, USA) (PLDI 2022). Association for Computing Machinery, New York, NY, USA, 61–75. https://doi.org/10.1145/3519939.3523730
- [13] Ajay Singh, Trevor Brown, and Ali Mashtizadeh. 2021. NBR: Neutralization Based Reclamation. In Proceedings of the 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (Virtual Event, Republic of Korea) (PPoPP ’21). Association for Computing Machinery, New York, NY, USA, 175–190. https://doi.org/10.1145/3437801.3441625
- [14] Jeonghyeon Kim, Jaehwang Jung, and Jeehoon Kang. 2024. Expediting Hazard Pointers with Bounded RCU Critical Sections. In Proceedings of the 36th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA 2024), June 17–21, 2024, Nantes, France. ACM, New York, NY, USA, 34 pages. https://doi.org/10.1145/3626183.3659941
- [15] Jaehwang Jung, Jeonghyeon Kim, Matthew J. Parkinson, and Jeehoon Kang. 2024. Concurrent Immediate Reference Counting. Proc. ACM Program. Lang. 8, PLDI, Article 153 (June 2024), 24 pages. https://doi.org/10.1145/3656383
- [16] Gali Sheffi, Maurice Herlihy, and Erez Petrank. 2021. VBR: Version Based Reclamation. In Proceedings of the 33rd ACM Symposium on Parallelism in Algorithms and Architectures (Virtual Event, USA) (SPAA ’21). Association for Computing Machinery, New York, NY, USA, 443–445. https://doi.org/10.1145/3409964.3461817