Research Streamlines Data Processing To Solve Problems More Efficiently

March 10, 2010 Matt Shipman

Researchers at North Carolina State University have developed a new analytical method that opens the door to faster processing of large amounts of information, with applications in fields as diverse as the military, medical diagnostics and homeland security.

“The problem we address here is this: When faced with a large amount of data, how do you determine which pieces of that information are relevant for solving a specific problem?” says Dr. Joel Trussell, a professor of electrical and computer engineering at NC State and co-author of a paper describing the research. “For example, how would you select the smallest number of features that would allow a robot to differentiate between water and solid ground, based on visual data collected by video?”

This matters because the more data you need to solve a problem, the more expensive the data are to collect and the longer they take to process. “The work we’ve done here allows for a more efficient collection of data by targeting exactly what information is most important to the decision-making process,” Trussell says. “Basically, we’ve created a new algorithm that can be used to determine how much data is needed to make a decision with a minimal rate of error.”

One application discussed in the paper is the development of programs that analyze hyperspectral data from military cameras to identify potential targets. Hyperspectral technology provides finer resolution of the wavelengths of light visible to the human eye, and it can also collect information from the infrared spectrum, which can be used to identify specific materials, among other things. The algorithm could be used to ensure that such a program operates efficiently, minimizing data storage needs and allowing the data to be processed more quickly.

But Trussell notes that “there are plenty of problems out there where people are faced with a vast amount of data, visual or otherwise – such as medical situations, where doctors may have the results from multiple imaging tests. For example, the algorithm would allow the development of a more efficient screening process for evaluating medical images – such as mammograms – from a large group of people.”

Another potential application is biometrics, such as homeland security efforts to identify terrorists and others on the Department of Homeland Security watchlist from video and camera images.

The research, “Constrained Dimensionality Reduction Using A Mixed-Norm Penalty Function With Neural Networks,” was funded by the U.S. Army Research Office and co-authored by Trussell and former NC State Ph.D. student Huiwen Zeng. The work is published in the March issue of IEEE Transactions on Knowledge and Data Engineering.

NC State’s Department of Electrical and Computer Engineering is part of the university’s College of Engineering.

-shipman-

Note to editors: The study abstract follows.

“Constrained Dimensionality Reduction Using A Mixed-Norm Penalty Function With Neural Networks”

Authors: Huiwen Zeng and H.J. Trussell, North Carolina State University

Published: March 2010, IEEE Transactions on Knowledge and Data Engineering

Abstract: Reducing the dimensionality of a classification problem produces a more computationally efficient system. Since the dimensionality of a classification problem is equivalent to the number of neurons in the first hidden layer of a network, this work shows how to eliminate neurons on that layer and simplify the problem. In the cases where the dimensionality cannot be reduced without some degradation in classification performance, we formulate and solve a constrained optimization problem that allows a trade-off between dimensionality and performance. We introduce a novel penalty function and combine it with bi-level optimization to solve the constrained problem. The performance of our method on synthetic and applied problems is superior to other known penalty functions such as weight decay, weight elimination, and Hoyer’s function. An example of dimensionality reduction for hyperspectral image classification demonstrates the practicality of the new method. Finally, we show how the method can be extended to multilayer and multiclass neural network problems.
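
Illustrative sketch: The abstract does not give the formula for the paper's penalty function, so the example below does not reproduce it. As a rough illustration of why a penalty's form matters for eliminating first-layer neurons, it compares two of the baseline penalties named in the abstract (weight decay and weight elimination, in their commonly cited textbook forms) with a generic group penalty that takes an L2 norm within each neuron and sums those norms across neurons. That group penalty is an assumption chosen for illustration, not the authors' function, and the weight matrices, function names and the w0 and tol parameters are hypothetical.

```python
import math

# Each row holds the incoming weights of one first-hidden-layer neuron.
# Both matrices are made-up examples, not data from the paper.
dense = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]
pruned = [[1.0, 1.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]


def weight_decay(weights):
    """Sum of squared weights over the whole layer."""
    return sum(w * w for row in weights for w in row)


def weight_elimination(weights, w0=1.0):
    """Per-weight penalty that saturates for large weights.

    w0 is a scale parameter; the value 1.0 is an arbitrary choice here.
    """
    return sum((w / w0) ** 2 / (1.0 + (w / w0) ** 2)
               for row in weights for w in row)


def group_mixed_norm(weights):
    """Assumed illustration: L2 norm per neuron, summed across neurons."""
    return sum(math.sqrt(sum(w * w for w in row)) for row in weights)


def active_neurons(weights, tol=1e-6):
    """Count neurons whose weight vector is not (numerically) zero."""
    return sum(1 for row in weights
               if math.sqrt(sum(w * w for w in row)) > tol)


for name, layer in (("dense", dense), ("pruned", pruned)):
    print(name,
          "neurons:", active_neurons(layer),
          "decay:", round(weight_decay(layer), 3),
          "elimination:", round(weight_elimination(layer), 3),
          "group:", round(group_mixed_norm(layer), 3))
```

In this toy case weight decay assigns the same cost to both layers, so it gives no incentive to switch off whole neurons, while the group penalty is lower for the layer that keeps only one neuron active. The paper's actual penalty and its bi-level optimization are described in the published article.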