Snapshot-Free, Transparent, and Robust Memory Reclamation for Lock-Free Data Structures
Wed 23 Jun 2021 21:25 - 21:30 at PLDI-A - Talks 1A: Concurrent and Distributed Programming
We present a family of safe memory reclamation schemes, Hyaline, which are fast, scalable, and transparent to the underlying lock-free data structures. Hyaline is based on reference counting – considered impractical for memory reclamation in the past due to high overheads. Hyaline uses reference counters only during reclamation, but not while accessing individual objects, which reduces overheads for object accesses. Since with reference counters, an arbitrary thread ends up freeing memory, Hyaline's reclamation workload is (almost) balanced across all threads, unlike most prior reclamation schemes such as epoch-based reclamation (EBR) or hazard pointers (HP). Hyaline often yields (excellent) EBR-grade performance with (good) HP-grade memory efficiency, which is a challenging trade-off with all existing schemes.
Hyaline schemes offer: (i) high performance; (ii) good memory efficiency; (iii) robustness: bounding memory usage even in the presence of stalled threads, a well-known problem with EBR; (iv) transparency: supporting virtually unbounded number of threads (or concurrent entities) that can be created and deleted dynamically, and effortlessly join existent workload; (v) autonomy: avoiding special OS mechanisms and being non-intrusive to runtime or compiler environments; (vi) simplicity: enabling easy integration into unmanaged C/C++ code; and (vii) generality: supporting many data structures. All existing schemes lack one or more properties.
We have implemented and tested Hyaline on x86(-64), ARM32/64, PowerPC, and MIPS. The general approach requires LL/SC or double-width CAS, while a specialized version also works with single-width CAS. Our evaluation reveals that Hyaline's throughput is very high – it steadily outperforms EBR by 10% in one test and yields 2x gains in oversubscribed scenarios. Hyaline's superior memory efficiency is especially evident in read-dominated workloads.