Technical Report PHD-2015-11

Title: Multi-Threaded Coordination Methods for Constructing Non-blocking Data Structures
Authors: Anastasia Braginsky
Supervisors: Erez Petrank
Abstract: Shared-memory multiprocessors concurrently execute multiple threads of computation that communicate and synchronize through shared memory. Typically, this communication and synchronization is done via concurrent data structures, whose efficiency is crucial to performance. Furthermore, new challenges arise in designing scalable concurrent data structures that can perform well with an increasing number of concurrent threads. Non-blocking data structures are scalable and provide a progress guarantee. If several threads attempt to concurrently apply an operation on the structure, it is guaranteed that one of the threads will eventually make progress. Consequently, the popular mutual exclusion synchronization protocol cannot be used in non-blocking data structures.

The first goal of this dissertation is to address the design of high-performance, concurrent, non-blocking data structures for shared memory multiprocessor platforms. Our intention is that the non-blocking data structure will become the primary choice for a concurrent data structure. To this end, we first present a high performance linked list. We extend state-of-the-art lock-free linked lists by building linked lists with special attention to locality of traversals. These linked lists are built of sequences of entries that reside on consecutive chunks of memory. When traversing such lists, subsequent entries typically reside on the same chunk and are thus close to each other, e.g., in same cache line or on the same virtual memory page. Such cache-conscious implementations of linked lists are frequently used in practice, but making them lock-free requires care. The basic component of this construction is a chunk of entries in the list that maintains a minimum and a maximum number of entries. This basic chunk component is an interesting tool on its own and is used to build the other lock-free data structures that we present.

Another high-performance, non-blocking data structure presented in this dissertation is a priority queue. Priority queues are an important algorithmic component and are ubiquitous in systems and software. With the rapid deployment of parallel platforms, concurrent versions of priority queues are becoming increasingly important. In this dissertation, we present a novel, concurrent, lock-free linearizable algorithm for priority queues that significantly outperforms all known (lock-based or lock-free) priority queues. Our algorithm employs recent advances, including lock-free chunks and the use of the efficient fetch-and-increment atomic instruction. Measurements demonstrate a performance improvement by a factor of up to 2 over existing approaches to concurrent priority queues.

The second goal of this dissertation is to expand the coverage of existing lock-free variants for the data structures whose lock-free implementations have not yet been discovered. This is because the lock-free data structures provide a progress guarantee and are known for facilitating scalability, avoiding deadlocks and live- locks, and providing guaranteed system responsiveness. In this dissertation we present a design for a lock-free balanced tree, specifically, a B+tree. The B+tree data structure has an important practical applications, and is used in various storage-system products. To the best of our knowledge, this is the first design of a lock-free, dynamic, and balanced tree that employs standard compare-and-swap operations.

The third and final dissertation goal is to support the efficient memory management for non-blocking data structures. Efficient memory management of dynamic non-blocking data structures remains an important open question. Existing methods either sacrifice the ability to deallocate objects or reduce performance notably. In this dissertation, we present a novel technique, called Drop the Anchor, which significantly reduces the overhead associated with the memory management while reclaiming objects even in the presence of thread failures. We demonstrate this memory management scheme on the common linked list data structure. Using extensive evaluation, we show that Drop the Anchor significantly outperforms Hazard Pointers, the widely used technique for non-blocking memory management.

All the above-mentioned algorithms were evaluated empirically and shown to provide high throughput and performance. They all utilize our new method of thread coordination for the case when threads need to be redirected from an obsolete part of the data structure to a new one. We denote this technique freezing. The freezing technique supports the restructuring of the lock-free data structures, in order to notify threads to move to another part of the data structure, usually because the part they are currently using is obsolete. This is done by setting a special freeze-bit on data or pointer words in the obsolete part, making the data unsuitable for updating. A thread that fails in its attempt to use a frozen pointer or data realizes that this part of the data structure is obsolete and restarts its operation. For performance optimization we later batch the freeze-bits into separate freeze-words.

CopyrightThe above paper is copyright by the Technion, Author(s), or others. Please contact the author(s) for more information

Remark: Any link to this technical report should be to this page (, rather than to the URL of the PDF files directly. The latter URLs may change without notice.

To the list of the PHD technical reports of 2015
To the main CS technical reports page

Computer science department, Technion