Computer programs are incorporating more and more safety features to protect users, but those features can also slow the programs down by 1,000 percent or more. Researchers at North Carolina State University have developed a software tool that helps these programs run much more efficiently without sacrificing their safety features.
“These safety features – or meta-functions – can slow a program down so much that software developers will often leave them out entirely,” says Dr. James Tuck, an assistant professor of electrical and computer engineering at NC State and leader of the research team that designed the new tool. “Leaving out those features can mean that you don’t identify a problem as soon as you could or should, which can be important – particularly if it’s a problem that puts your system at risk from attack.”
Historically, these safety features have been incorporated directly into a software program’s code, and are run through the same core – the central processing unit that serves as the brain of a computer chip – that the program itself uses. That is what slows the program down. Researchers at NC State have developed a tool that takes advantage of multi-core computer chips by running the safety features on a separate core in the same chip – most chips currently contain between four and eight cores – allowing the main program to run at close-to-normal operating speed.
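As a rough illustration of that idea (not the NC State tool itself, which works inside the compiler on compiled programs), the sketch below hands each safety check to a helper thread through a queue instead of running it inline. All names here (`check_access`, `helper`, `run_program`) are hypothetical, and the "memory" list is only a stand-in for a buffer that can be read past its end. Python threads do not run in true parallel on separate cores, so this shows the structure of the approach rather than its speedup.

```python
import queue
import threading


def check_access(index, buf_len):
    """Hypothetical safety check: is the access inside the buffer?"""
    return 0 <= index < buf_len


def helper(checks, violations):
    """Helper thread: runs the queued checks so the main loop does not have to."""
    while True:
        item = checks.get()
        if item is None:  # sentinel: the main program has finished
            break
        index, buf_len = item
        if not check_access(index, buf_len):
            violations.append(index)


def run_program(indices, memory, buf_len):
    """Main program: does its real work and only enqueues the checks."""
    checks = queue.Queue()
    violations = []
    worker = threading.Thread(target=helper, args=(checks, violations))
    worker.start()

    total = 0
    for i in indices:
        checks.put((i, buf_len))  # hand the check to the helper thread
        total += memory[i]        # the main work continues without waiting

    checks.put(None)              # tell the helper there is no more work
    worker.join()                 # wait for the remaining checks to finish
    return total, violations
```

In this sketch, calling `run_program([0, 1, 5, 3], [1, 2, 3, 4, 99, 99], 4)` lets the main loop read index 5 without pausing, while the helper thread reports it as a read past the end of the four-element buffer.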
“To give you some idea of the problem, we saw the application we were testing being slowed down by approximately 580 percent,” Tuck says. “Utilizing our software tool, we were able to incorporate safety meta-functions, while only slowing the program down by approximately 25 percent. That’s a huge difference.”
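To put those two figures side by side, the short calculation below converts each slowdown into a running-time multiplier. It assumes that "slowed down by N percent" means N percent of extra running time on top of the unprotected program, which is one common reading but is not spelled out in the quote. The function name `runtime_multiplier` is made up for this example.

```python
def runtime_multiplier(slowdown_percent):
    """Running time relative to the unprotected program.

    Assumption: a slowdown of N percent means N percent extra running time,
    so the program takes (1 + N / 100) times as long.
    """
    return 1 + slowdown_percent / 100


inline = runtime_multiplier(580)    # checks run on the same core
offloaded = runtime_multiplier(25)  # checks moved to another core
improvement = inline / offloaded    # how much faster the offloaded version is
```

Under that assumption, the test application takes about 6.8 times as long as normal with the checks inline and about 1.25 times as long with the tool, so the protected program runs roughly 5.4 times faster.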
This multi-core approach has been tried before, but previous efforts were unwieldy and involved replicating huge chunks of code – a process that was time-consuming and used a great deal of power. The new tool, Tuck says, “significantly streamlines the safety feature work being done by other cores.”
Tuck stresses that the tool functions automatically and does not involve manual reprogramming. In fact, Tuck’s team found that the tool is more effective than manual reprogramming for at least some applications, and is far less labor-intensive.
The software tool is implemented as a plug-in for the GNU Compiler Collection (GCC) of software tools, and Tuck’s team is working to fine-tune and extend the tool to support a wider range of applications and meta-functions. “We plan to release the first version of this tool as open-source software later this spring,” Tuck says.
A paper describing the research, “Automatic Parallelization of Fine-Grained Meta-Functions on a Chip Multiprocessor,” will be presented April 5 at the International Symposium on Code Generation and Optimization in Chamonix, France. The paper was co-authored by Tuck and NC State Ph.D. student Sanghoon Lee. The research was supported by the National Science Foundation.
NC State’s Department of Electrical and Computer Engineering is part of the university’s College of Engineering.
Note to Editors: The study abstract follows.
“Automatic Parallelization of Fine-Grained Meta-Functions on a Chip Multiprocessor”
Authors: Sanghoon Lee, James Tuck, North Carolina State University
Presented: April 5 at the International Symposium on Code Generation and Optimization in Chamonix, France
Abstract: Due to the importance of reliability and security, prior studies have proposed inlining meta-functions into applications for detecting bugs and security vulnerabilities. However, because these software techniques add frequent, fine-grained instrumentation to programs, they often incur large runtime overheads. In this work, we consider an automatic thread extraction technique for removing these fine-grained checks from a main application and scheduling them on helper threads. In this way, we can leverage the resources available on a CMP to reduce the latency and overhead of fine-grained checking codes. Our parallelization strategy automatically extracts metafunctions from the main application and executes them in customized helper threads — threads constructed to mirror relevant fragments of the main program’s behavior in order to keep communication and overhead low. To get good performance, we consider optimizations that reduce communication and balance work among many threads. We evaluate our parallelization strategy on Mudflap, a pointer-use checking tool in GCC. To show the benefits of our technique, we compare it to a manually parallelized version of Mudflap. We run our experiments on an architectural simulator with support for fast queueing operations. On a subset of SPECint 2000, our automatically parallelized code is only 29% slower, on average, than the manually parallelized version on a simulated 8-core system. Furthermore, two applications achieve better speedups using our algorithms than with the manual approach. Also, our approach introduces very little overhead in the main program — it is kept under 100%, which is more than a 5.3x reduction compared to serial Mudflap.