MULTIRACE

Efficient On-the-Fly

Data Race Detection Tool for
Multithreaded C++ Programs

 

 

Assaf Schuster and Eli Pozniansky

Computer Science Department

Technion – Israel Institute of Technology

 

Multithreading is a common programming paradigm, that is well-suited for multiprocessor environments. The advantages of multithreading over single threaded programming are obvious – it enables parallelism and improves performance. However, multithreading also introduces the problem of data races.

Data race occurs when two or more threads concurrently access a shared variable, and at least one is for writing. Such situation is usually considered as a mistake (or bug) and in most cases undesired, since it might lead to unpredictable results and incorrect execution. The data races usually stem from errors made by programmers, who fail to place appropriate synchronization at correct places in the program.

The detection of data races is extremely essential for debugging of multithreaded programs and assuring their correctness. Nevertheless, there is no single universal technique, which is capable of handling the task efficiently, since in general case data race detection is computationally hard. Hence, all currently available techniques, when applied to some general program, usually result rather in excessive false positives (false alarms) or in large amount of false negatives (undetected races).

We developed a novel testing tool, called MULTIRACE, which combines two very powerful algorithms for dynamic detection of apparent data races in multithreaded programs. The first algorithm is a revised version of DJIT, called DJIT+, which is based on Lamport's happens-before partial order relation and is capable of efficiently detecting apparent data races as they occur during the execution. The second algorithm is an improved variant of LOCKSET (first implemented in Eraser tool), that can warn about variables for which the locking discipline (policy common among programmers) is violated during the program's execution.

Both DJIT+ and improved LOCKSET detect data races in programs that use global and two-way synchronization primitives – barriers and locks. In fact, they can be easily extended to use other common synchronization primitives required by different programming models. In addition, these algorithms complete each other, in a sense that the one compensates for the faults of the other, by this making the data race detection problem a much easier task. For example, the set of locking discipline violations, reported by LOCKSET for some program, usually remains unchanged under differences in threads' interleaving (except for some bizarre cases). Yet, such violations are not always crucial bugs and not necessarily lead to occurrences of feasible data races. On the other hand, the DJIT+ algorithm detects only the actual data races that occurred during the execution. Still, it is very sensitive to differences in threads' interleaving and it can miss some or even all of data races due to some especially unfortunate execution.

By exploiting the novel techniques of views and pointer swizzling MULTIRACE detects data races in granularity of variables and objects in the program, rather than in granularity of fixed number of bytes (a small unit usually results in missing some of the data races, while a large one leads to false detection). As far as we know, it is the first fully functional tool for multithreaded environment capable of doing so. MULTIRACE achieves this task through the help of source automatic and transparent source code instrumentation. In this approach the code of the tested program is preprocessed, modified and recompiled, such that the calls to detection mechanisms are injected in places where accesses to shared variables are performed. It should be emphasized here, that in contrast to most currently available tools, which require some kind of programmer's cooperation, the detection in MULTIRACE is completely transparent and does not require any interference except for some specific adjustments and configurations.

Finally, MULTIRACE imposes only a small overhead, compared to other currently available tools. This fact makes it an even more attractive tool for on-the-fly data race detection. The slowdowns measured for six common benchmark applications are very low, whereas previous works report much higher overheads for similar kinds of applications.

 

The relatively low overhead of MULTIRACE, its transparency, its powerful detection algorithms and the ability to match the detection granularity to that of the objects used in the program, make our tool superior to all previously suggested on-the-fly detection techniques.

 

Links:

·        MultiRace – Efficient On-the-Fly Data Race Detection in Multithreaded C++ Programs (the draft version of the article)

·        Distributes Systems Lab site in Technion

·        Concurrent and Distributed Programming course site in Technion

·        Dynamic Data-Race Detection in Lock-Based Multithreaded Programs - PowerPoint presentation by Eli Pozniansky

·        Millipede Project site (Distributed Shared Memory system)

·        Assaf Schuster homepage

·        Stefan Savage - Eraser: A Dynamic Data Race Detector for Multithreaded Programs

·        Tim Brecht publications – Region Trap Library (RTL)

·        Sarita Adve homepage

·        References