
New Technique Could Help Self-Driving Cars See Their Surroundings Better

[Image: the interior of a driverless car, with a center console digitally mapping the surrounding traffic.]
Researchers developed a technique to help self-driving cars’ AI programs better map the spaces around them.

A technique developed by researchers at NC State University could one day help autonomous vehicles navigate roadways more reliably. The technique allows artificial intelligence programs to map three-dimensional spaces more accurately using two-dimensional images.

“Most autonomous vehicles use powerful AI programs called vision transformers to take 2D images from multiple cameras and create a representation of the 3D space around the vehicle,” says Tianfu Wu, an associate professor of electrical and computer engineering at NC State and corresponding author of a paper on the new technique. “However, while each of these AI programs takes a different approach, there is still substantial room for improvement.”

Wu and his collaborators’ new technique holds the potential to substantially improve all of them.

“Our technique, called Multi-View Attentive Contextualization (MvACon), is a plug-and-play supplement that can be used in conjunction with these existing vision transformer AIs to improve their ability to map 3D spaces,” Wu says. “The vision transformers aren’t getting any additional data from their cameras, they’re just able to make better use of the data.”
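The article does not describe MvACon’s internals, but the idea of a plug-and-play step that enriches a model’s existing multi-camera features, without any new sensor data, can be illustrated with a minimal sketch. The sketch below is purely hypothetical: the function name, the toy feature vectors, and the simple dot-product attention are illustrative assumptions, not the published method.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attentive_contextualize(features):
    """Hypothetical sketch: re-express each camera view's feature
    vector as an attention-weighted mix of ALL views' features, so
    every view gains context from the others. No new input data is
    used -- the existing features are simply combined more richly.

    `features` is a list of equal-length feature vectors, one per camera.
    """
    contextualized = []
    for query in features:
        # Attention score: dot product between this view and every view.
        scores = [sum(q * k for q, k in zip(query, key)) for key in features]
        weights = softmax(scores)
        # Convex (attention-weighted) combination of all views' features.
        dim = len(query)
        mixed = [sum(w * feat[d] for w, feat in zip(weights, features))
                 for d in range(dim)]
        contextualized.append(mixed)
    return contextualized

# Six cameras, each summarized here by a toy 3-dimensional feature vector.
cams = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0],
        [0.5, 0.5, 0.0], [0.0, 0.5, 0.5], [0.5, 0.0, 0.5]]
enriched = attentive_contextualize(cams)
```

Because the output for each view is a weighted average of all views, the module can be dropped between a vision transformer’s camera features and its 3D decoding head without changing the rest of the pipeline, which is what “plug-and-play” means here.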

The research team tested MvACon’s performance with three leading vision transformers currently on the market, all of which rely on a set of six cameras to collect the 2D images they transform. 

MvACon significantly improved the performance of all three vision transformers.

“Performance was particularly improved when it came to locating objects, as well as the speed and orientation of those objects,” Wu says.

The research team presented the paper “Multi-View Attentive Contextualization for Multi-View 3D Object Detection” at this year’s IEEE/CVF Conference on Computer Vision and Pattern Recognition. Xianpeng Liu, a recent NC State Ph.D. graduate, was first author. The paper was co-authored by Ce Zheng and Chen Chen of the University of Central Florida; Ming Qian and Nan Xue of the Ant Group; and Zhebin Zhang and Chen Li of the OPPO U.S. Research Center.

This article is based on a news release from NC State University.