Researchers Improve AI’s Ability to Learn New Tasks Without Sacrificing Performance

[Image: bright blue lines trace across a dark blue background, each ending in a tiny white light, representing strings of data. Image credit: Luke Jones.]

For Immediate Release

A new framework allows AI models that have already been trained to learn new tasks without sacrificing performance when performing old tasks. The framework, called CHEEM, also improves an AI model’s operating efficiency by using fewer computational steps to perform simpler tasks.

“CHEEM addresses two longstanding challenges for AI models: continual learning and adaptive intelligence,” says Tianfu Wu, corresponding author of a paper on the work and an associate professor of computer engineering at North Carolina State University.

Continual learning refers to the ability of an AI model to take in new data and learn to perform new tasks. The challenge with continual learning is that training an AI model to perform new tasks often results in the model getting worse at tasks it was already trained to perform.

Adaptive intelligence refers to the ability of an AI model to change its computation process depending on the complexity of the task it is asked to perform. For example, many prominent AI models – including large language models – run the same chain of computations regardless of what they are being asked to do, which is not very efficient. The challenge here is training an AI model so that it uses fewer computations to solve simple tasks and more computations to solve complex ones.

“We think these two challenges are intertwined, and that we can make progress toward adaptive intelligence by improving a model’s ability to engage in continual learning,” says Wu. “This is the fundamental idea behind CHEEM.”

CHEEM, which stands for Continual Hierarchical-Exploration-Exploitation Memory, gives models a great deal of flexibility in terms of how to use their existing computational architecture when learning a new task. A model can use an existing layer, modify an existing layer, skip an existing layer entirely, or add new layers. Ultimately, this flexibility helps a model find a good balance between leveraging its existing knowledge, integrating new data, and allocating computational resources depending on the complexity of the task it is being asked to perform.
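The four options described above (use, modify, skip, or add a layer) can be illustrated with a toy sketch. This is not the researchers' implementation: the three-layer numeric "backbone," the names `LayerChoice`, `build_task_path` and `run`, and the hand-picked choices are all hypothetical, and the search procedure that CHEEM uses to select an operation for each layer is not modeled here.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

Layer = Callable[[float], float]

# Hypothetical pre-trained backbone: three toy "layers" acting on one number.
BASE_LAYERS: List[Layer] = [
    lambda x: x + 1.0,
    lambda x: x * 2.0,
    lambda x: x - 3.0,
]


@dataclass
class LayerChoice:
    op: str                     # "reuse", "adapt", "skip", or "new"
    fn: Optional[Layer] = None  # task-specific function for "adapt" / "new"


def compose(base: Layer, extra: Layer) -> Layer:
    """Return a layer that applies `base` and then `extra`."""
    return lambda x: extra(base(x))


def build_task_path(choices: List[LayerChoice]) -> List[Layer]:
    """Turn one choice per backbone position into the layers a task runs."""
    if len(choices) != len(BASE_LAYERS):
        raise ValueError("need exactly one choice per backbone layer")
    path: List[Layer] = []
    for base, choice in zip(BASE_LAYERS, choices):
        if choice.op == "reuse":      # keep the existing layer unchanged
            path.append(base)
        elif choice.op == "adapt":    # existing layer plus a small task-specific tweak
            path.append(compose(base, choice.fn))
        elif choice.op == "new":      # fresh task-specific layer at this position
            path.append(choice.fn)
        elif choice.op == "skip":     # no computation at this position
            continue
        else:
            raise ValueError(f"unknown operation: {choice.op}")
    return path


def run(path: List[Layer], x: float) -> float:
    """Apply the task's layers in order."""
    for layer in path:
        x = layer(x)
    return x


# A simple task runs one layer; a more demanding task runs three.
simple_path = build_task_path(
    [LayerChoice("reuse"), LayerChoice("skip"), LayerChoice("skip")]
)
complex_path = build_task_path(
    [
        LayerChoice("reuse"),
        LayerChoice("adapt", lambda x: x * 0.5),
        LayerChoice("new", lambda x: x + 10.0),
    ]
)
print(len(simple_path), run(simple_path, 1.0))
print(len(complex_path), run(complex_path, 1.0))
```

In this toy version, the simple task's path is one layer long while the complex task's path is three layers long, which mirrors the idea of spending fewer computational steps on simpler tasks.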

To test the CHEEM framework, the researchers applied it to a state-of-the-art vision transformer model, a large, complex type of model that is already in widespread use. Specifically, they used CHEEM to train the vision transformer model on two benchmarks: MTIL and VDD.

“Both benchmarks are challenging, because they contain many different tasks and many different kinds of tasks,” says Wu. “That makes them good test cases.”

CHEEM significantly outperformed existing state-of-the-art continual learning methods on both benchmarks.

“CHEEM got very close to achieving the full fine-tuning upper bound for these new tasks, meaning that it was almost as good as if you had trained the model to only perform that one task,” says Wu.

“In addition, CHEEM improved the adaptive intelligence of the model significantly. The model tailored its computational structure depending on the complexity of the task, and it did so in a semantically meaningful way. In other words, if a new task was similar to a previous task, the model would use much of the pre-existing architecture; but if a new task was very different from any previous task, it would add new layers that allow it to perform the task.

“We’re excited about what we’ve been able to demonstrate with CHEEM,” says Wu. “At this point, we’re looking for collaborators who could help us access the computational resources necessary to evaluate CHEEM’s performance on large foundation models that have billions of parameters.”

The peer-reviewed paper, “CHEEM: Continual Learning by Reuse, New, Adapt and Skip – A Hierarchical Exploration-Exploitation Approach,” will be presented at the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), being held June 3-7 in Denver, Colo. First author of the paper is Chinmay Savadikar, a Ph.D. student at NC State.

This work was done with support from the Army Research Office, under grants W911NF1810295 and W911NF2210010; the National Science Foundation, under grants 1909644, 2024688 and 2013451; and an NC State Goodnight Early Career Award.

-shipman-

Note to Editors: The study abstract follows.

“CHEEM: Continual Learning by Reuse, New, Adapt and Skip – A Hierarchical Exploration-Exploitation Approach”

Authors: Chinmay Savadikar and Tianfu Wu, North Carolina State University; Michelle Dai, Johns Hopkins University

Presented: The IEEE/CVF Conference on Computer Vision and Pattern Recognition 2026 (CVPR), June 3-7, Denver, Colo.

DOI: 10.48550/arXiv.2303.08250

Abstract: To effectively manage the complexities of real-world dynamic environments, continual learning must incrementally acquire, update, and accumulate knowledge from a stream of tasks of different nature without suffering from catastrophic forgetting of prior knowledge. While this capability is innate to human cognition, it remains a significant challenge for modern deep learning systems. At the heart of this challenge lies the stability-plasticity dilemma: the need to balance leveraging prior knowledge, integrating novel information, and allocating model capacity adaptively based on task complexity and synergy. In this paper, we propose a novel exemplar-free class-incremental continual learning (ExfCCL) framework that addresses these issues through a Hierarchical Exploration-Exploitation (HEE) approach. The core of our method is a HEE-guided efficient neural architecture search (HEE-NAS) that enables a learning-to-adapt backbone via four primitive operations – reuse, new, adapt, and skip – thereby serving as an internal memory that dynamically updates selected components across streaming tasks. To address the task ID inference problem in ExfCCL, we exploit an external memory of task centroids proposed in the prior art. We term our method CHEEM (Continual Hierarchical-Exploration-Exploitation Memory). CHEEM is evaluated on the challenging MTIL and VDD benchmarks using both Tiny and Base Vision Transformers and a proposed holistic Figure-of-Merit (FoM) metric. It significantly outperforms state-of-the-art prompting-based continual learning methods, closely approaching full fine-tuning upper bounds. Furthermore, it learns adaptive model structures tailored to individual tasks in a semantically meaningful way. Our code is available at https://github.com/savadikarc/cheem.
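The abstract mentions inferring the task ID from an external memory of task centroids. A minimal sketch of that general idea follows, under stated assumptions: one centroid per task, Euclidean distance, and made-up two-dimensional features. The function names and data are hypothetical, and the prior method that the paper relies on may differ in all of these respects.

```python
import math
from typing import Dict, List, Sequence


def mean_vector(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Average a set of equal-length feature vectors into one centroid."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def infer_task_id(feature: Sequence[float],
                  centroids: Dict[str, List[float]]) -> str:
    """Return the task whose stored centroid is closest to `feature`."""
    return min(centroids, key=lambda task: math.dist(feature, centroids[task]))


# Hypothetical features seen while learning each task.
features_by_task = {
    "digits": [[0.0, 0.1], [0.2, -0.1]],
    "flowers": [[5.0, 5.2], [4.8, 4.8]],
}

# The "external memory" here is just one stored centroid per task.
centroids = {task: mean_vector(vecs) for task, vecs in features_by_task.items()}

# At test time the task is unknown, so it is guessed from the nearest centroid.
predicted = infer_task_id([4.5, 5.1], centroids)
print(predicted)
```

Once a task is guessed this way, the model can route the input through the structure it learned for that task.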