Int J Parallel Prog
Automatic Generation of Unit Tests for Correlated
Variables in Parallel Programs
Ali Jannesari1,2 · Felix Wolf1,2
Received: 13 August 2014 / Accepted: 10 March 2015 © Springer Science+Business Media New York 2015
Abstract A notorious class of concurrency bugs are race condition related to correlated variables, which make up about 30 % of all non-deadlock concurrency bugs. A solution to prevent this problem is the automatic generation of parallel unit tests. This paper presents an approach to generate parallel unit tests for variable correlations in multithreaded code. We introduce a hybrid approach for identifying correlated variables. Furthermore, we estimate the number of potentially violated correlations for methods executed in parallel. In this way, we are capable of creating unit tests that are suited for race detectors considering correlated variables. We were able to identify more than 85 % of all race conditions on correlated variables in eight applications after applying our parallel unit tests. At the same time, we reduced the number of unnecessary generated unit tests. In comparison to a test generator unaware of variable correlations, redundant unit tests are reduced by up to 50 %, while maintaining the same precision and accuracy in terms of the number of detected races.
Keywords Unit tests · Automatic testing · Parallel programming · Debugging ·
Race detection · Program analysis · Correlated variables 1 Introduction
Nowadays, unit testing plays a major role in the sphere of software development. Software may consist of billions of lines of code and a full error analysis can be very time
B Ali Jannesari firstname.lastname@example.org
Felix Wolf email@example.com 1 German Research School for Simulation Sciences, Aachen, Germany 2 RWTH Aachen University, Aachen, Germany 123
Int J Parallel Prog consuming and is often unnecessary. Normally, only new and modified code regions have to be tested. For this reason, developers create unit tests to effectively test small parts of the program without executing more code than necessary. This is even more helpful when dealing with multithreaded software and concurrency bugs. In multithreaded software, the same bug can result in different behavior for different thread interleavings, making the debugging of multithreaded software hard and expensive.
A specific type of unit test designed to address this problem is the parallel unit test, which focuses on concurrency bugs.
A class of concurrency bugs which are extremely difficult to debug are race conditions related to correlated variables. These race conditions involve multiple variables.
Studies showed that over 30 % of all race conditions involve correlated variables .
To the best of our knowledge, there exists no work which covers the creation of parallel unit test for data races related to correlated variables. There is no unit-test framework available to support race detectors capable of considering correlated variables in their analysis. Given that non-deterministic thread scheduling makes such races generally hard to reproduce, there is a strong need for unit tests covering race conditions on correlated variables.
In this work, we want to combine the benefits of automatic parallel unit test generation with the advantages of race detection for correlated variables.
To achieve this, our work is based on the existing unit test generator AutoRT . In the scope of this paper, we introduce an extension called CorrRT which enhances AutoRT by identifying possible correlation violations in method pairs accessing correlated variables. The higher the number of correlations found, the higher the probability that a potential race condition violates a variable correlation.
We automatically generated 81 parallel unit tests for correlated variables on eight different applications. After analyzing the unit tests, HCorr , a race detector for correlated variables, reported more than 85 % of the race conditions violating variable correlations. Furthermore, we were able to reduce the number of redundantly generated tests by up to 50 % in comparison to the original AutoRT approach.
The rest of this paper is structured as follows: In Sect. 2, we introduce some basic terms of parallel unit tests, race conditions and correlated variables. Further on, Sect. 3 discusses related work, detailing previous approaches to parallel unit test generators.
Section 4 describes the core of our approach. We present the dynamic approach to enhance the unit test generation on correlated variables. Section 5 briefly explains our implementation of the approach. In Sect. 6, we present experimental results of our implementation on 8 different applications. Finally, Sect. 7 summarizes the paper and gives an outlook on future works. 2 Background
In this section we introduce terms which we use in the scope of this paper. Also, we present some basic techniques our algorithms apply and explain how we define correlations between variables. 123
Int J Parallel Prog
Acquire Lock len = x2 + y2; x = 1len ∗ x; y = 1len ∗ y; end end
Acquire Lock x = 2 ∗ x; end
Acquire Lock y = 2 ∗ y; end end
Fig. 1 A high-level data race violating the semantics of the vector (x, y) 2.1 High-Level Data Races
In our work, we define a race condition as an anomalous behaviour of a program due to a variable’s value unexpectedly depending on the scheduling of threads. We divide race conditions into high-level and low-level data races, according to whether we need a semantic understanding of the code for identifying the race. For low-level data races we can neglect semantics. A low-level data race occurs when two concurrent threads access a shared variable without synchronization and when at least one of these accesses is a write operation.
A high-level race condition can be harder to detect. Generally, when the anomalous behaviour of a race condition is caused by a violation of the underlying program semantics and if the anomaly is not a low-level data race, we speak of a high-level data race. Figure 1 gives an example for such a high-level data race. All accesses have been secured by locks. However, if the runtime normalizes the vector in between the doubling operation, the values of x and y are not correctly tuned to each other any more. We recognize that the semantics of those variables have been violated.