PAIN: A framework for PArallel fault INjection experiments

Fault injection experiments are a common approach to assess the robustness of component-based software systems or, more precisely, the absence thereof. To find out whether adverse behavior of less critical components affects the reliable operation of more critical components in the software system, faults are injected into the former and the behavior of the latter is observed in their presence. As these tests are usually conducted on fully integrated systems, individual fault injection experiments tend to have long run times. PAIN (PArallel fault INjection) exploits parallel hardware to increase fault injection experiment throughput by executing multiple experiments concurrently. PAIN is based on the GRINDER fault injection framework.
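
The idea of running several experiments concurrently can be illustrated with a minimal Python sketch. It is not PAIN's or GRINDER's actual code: run_experiment, run_campaign, and the simulated outcomes are hypothetical stand-ins, and a real experiment would boot and exercise a full system instance instead of sleeping.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def run_experiment(fault_id):
    """Hypothetical stand-in for one fault injection experiment."""
    # The sleep simulates the long run time of a full-system experiment.
    time.sleep(0.05)
    # Simulated outcome: odd fault ids "fail", even ones do not.
    outcome = "failure_detected" if fault_id % 2 else "no_failure"
    return fault_id, outcome


def run_campaign(fault_ids, parallelism):
    """Run all experiments, at most `parallelism` of them at a time."""
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        return dict(pool.map(run_experiment, fault_ids))


# Same campaign, executed one at a time and four at a time.
sequential = run_campaign(range(8), parallelism=1)
parallel = run_campaign(range(8), parallelism=4)
```

In this toy setting both campaigns yield identical results because the simulated experiments do not interfere with each other. Experiments on real systems that share hardware do not come with that guarantee.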

To demonstrate throughput improvements, we have conducted fault injections on the Linux-based Android OS kernel. Although experiment throughput was greatly improved by concurrent executions, we also observed significant deviations in the obtained result distributions, i.e., concurrent experiment executions led to different results than sequential executions. We identified time-sensitive failure detectors as a major cause of the observed deviations and proposed a systematic timeout calibration strategy to eliminate them.
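
Why a time-sensitive failure detector can skew results under parallel load is sketched below. The calibration rule (longest fault-free run time multiplied by a safety factor) is an assumption chosen for illustration and is not necessarily the strategy proposed in the paper. All names and numbers are hypothetical.

```python
def calibrate_timeout(fault_free_durations, safety_factor=2.0):
    """Derive a hang-detector timeout from fault-free run times (seconds).

    Assumed rule for illustration: longest observed run time multiplied
    by a safety factor.
    """
    if not fault_free_durations:
        raise ValueError("need at least one fault-free measurement")
    return max(fault_free_durations) * safety_factor


def classify(duration, timeout):
    """Timeout-based failure detector: runs exceeding the timeout count as hangs."""
    return "hang_detected" if duration > timeout else "completed"


# Hypothetical fault-free run times, measured without and with parallel load.
sequential_runs = [10.0, 11.0, 10.5]
parallel_runs = [14.0, 16.0, 15.0]

timeout_seq = calibrate_timeout(sequential_runs)
timeout_par = calibrate_timeout(parallel_runs)

# A run that is merely slowed down by contention (25 s) is misclassified
# as a hang when the timeout was calibrated for sequential execution.
with_seq_timeout = classify(25.0, timeout_seq)
with_par_timeout = classify(25.0, timeout_par)
```

With these numbers the sequentially calibrated timeout is 22 s and the one calibrated under parallel load is 32 s, so the same 25 s run is reported as a hang in the first case and as completed in the second.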

The details of our study have been published in a conference paper:

Stefan Winter, Oliver Schwahn, Roberto Natella, Neeraj Suri, and Domenico Cotroneo.
“No PAIN, No Gain? The Utility of PArallel Fault INjections”
Proc. of the 2015 IEEE/ACM 37th IEEE International Conference on Software Engineering (ICSE 2015)
Florence, Italy, May, 2015, pp. 494-505
DOI: 10.1109/ICSE.2015.67
ISBN: 978-1-4799-1934-5
IEEE Computer Society Conference Publishing Services

All source code and configurations used in our study have been released on GitHub under the AGPL v3 license:

In case you encounter any problems with the tool, please drop us a note: