What: Improving the Quality of Error-Handling Code in Systems Software using Function-Local Information
Who: Suman Saha, PhD in the Inria REGAL group
When: March 12th, 11am
Where: Inria B31
Adequate error-handling code is essential to the reliability of any systems software. On an error, such code is responsible for releasing acquired resources to restore the system to a viable state. Omitting such operations leads not only to memory leaks, but also to system crashes and deadlocks.
The C language provides no abstractions for exception handling or other forms of error handling, leaving programmers to devise their own conventions for detecting and handling errors. The Linux coding style guidelines suggest placing error-handling code at the end of each function, where it can be reached by gotos whenever an error is detected. This coding style has the advantage of putting all of the error-handling code in one place, which eases understanding and maintenance and reduces code duplication. Nevertheless, this coding style is not always applied. In the first part of the thesis, we propose an automatic program transformation that converts error-handling code into this style. We have implemented this transformation as a tool and have applied it to five directories (drivers, fs, net, arch, and sound) of the Linux 3.6 kernel source code, as well as to five widely used open-source systems software projects: PostgreSQL, Apache, Wine, Python, and PHP. Among the conditionals that contain state-restoring error-handling code and offer scope for merging that code, the tool successfully converts 22% from the basic strategy to the goto-based strategy.
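The difference between the two styles can be sketched on a small C function. This is only an illustration under stated assumptions: the struct, the function names, and the fail_late parameter are hypothetical, "basic strategy" is taken here to mean cleanup repeated inline at each error check, and the code is not output of the tool described above.

```c
#include <stdlib.h>

/* Hypothetical device state, used only for illustration. */
struct device {
    char *buf;
    char *table;
};

/* Basic strategy (as assumed here): each error check repeats the
 * cleanup of everything acquired so far. */
int setup_basic(struct device *dev, size_t n, int fail_late)
{
    dev->buf = malloc(n);
    if (!dev->buf)
        return -1;

    dev->table = malloc(n);
    if (!dev->table) {
        free(dev->buf);
        dev->buf = NULL;
        return -1;
    }

    if (fail_late) {            /* stands in for a failing init step */
        free(dev->table);
        dev->table = NULL;
        free(dev->buf);
        dev->buf = NULL;
        return -1;
    }
    return 0;
}

/* Goto-based strategy: the cleanup is written once, at the end of the
 * function, and each error check jumps to the label that undoes exactly
 * what has been acquired up to that point. */
int setup_goto(struct device *dev, size_t n, int fail_late)
{
    int ret = -1;

    dev->buf = malloc(n);
    if (!dev->buf)
        goto out;

    dev->table = malloc(n);
    if (!dev->table)
        goto free_buf;

    if (fail_late)              /* stands in for a failing init step */
        goto free_table;

    return 0;

free_table:
    free(dev->table);
    dev->table = NULL;
free_buf:
    free(dev->buf);
    dev->buf = NULL;
out:
    return ret;
}
```

Both functions behave identically; in the second, adding a new acquisition requires one new label rather than editing every later error check.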
Even when error-handling code is structured according to the Linux coding style guidelines, correctly releasing allocated resources remains a persistent problem for the robustness of systems software. Finding such faults is very challenging due to the difficulty of systematically reproducing system errors and the diversity of system resources and their associated resource-release operations. To address these issues, over 10 years of research has focused on macroscopic approaches that globally scan a code base for common resource-release operations. Such approaches are notorious for their high rates of false positives, while in practice they still leave many faults undetected.
In the second part of the thesis, we propose a novel microscopic approach to finding resource-release faults in systems software, taking into account such software’s diversity of resource types and resource-release operations. Rather than generalizing from the results of a complete scan of the source code, our approach achieves precision and scalability by focusing on the error-handling code of each function. Using a tool, Hector, that we have developed based on this approach, we have found 485 faults in 19 different C systems software projects, including Linux, Python, and Apache, with a false positive rate of 23%, well below the 30% that has been reported to be acceptable to developers. Some of these faults are exploitable by an unprivileged malicious user, making it possible to crash the entire system.
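The class of fault targeted here, a release omitted on one error path of a function, can be illustrated with a short C sketch. Everything in it is hypothetical (the counter, acquire, release, and the fail_step parameter), and it shows only the kind of fault involved, not how Hector detects it.

```c
/* Hypothetical resource counter, standing in for a real allocator,
 * lock, or reference count. */
int live_resources = 0;

int acquire(void) { live_resources++; return 0; }
void release(void) { live_resources--; }

/* Faulty version: one error path after the acquisition returns without
 * releasing, although a later error path in the same function does. */
int init_faulty(int fail_step)
{
    if (fail_step == 1)
        return -1;          /* nothing acquired yet: no release needed */

    acquire();

    if (fail_step == 2)
        return -1;          /* FAULT: resource is still held */

    if (fail_step == 3) {
        release();          /* this error path releases correctly */
        return -1;
    }

    return 0;               /* success: caller now owns the resource */
}

/* Fixed version: every error path after the acquisition goes through
 * the same release code at the end of the function. */
int init_fixed(int fail_step)
{
    if (fail_step == 1)
        return -1;

    acquire();

    if (fail_step == 2)
        goto err_release;

    if (fail_step == 3)
        goto err_release;

    return 0;               /* success: caller now owns the resource */

err_release:
    release();
    return -1;
}
```

In the faulty version, the inconsistency is visible without leaving the function: two error paths follow the same acquisition, and only one of them releases it.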