Cisco Research hosted its Efficient AI Summit in June, bringing together artificial intelligence (AI) researchers to discuss the latest AI innovations, share best practices, and explore new techniques that reduce computational costs and make AI systems practical to deploy on resource-constrained devices.
Leading minds from institutions including Purdue University, Georgia Tech, Michigan State University, and Italy’s University of Trento shared work and ideas around the theme of efficiency in artificial intelligence. The summit featured four presentations from these researchers.
Efficient AI helps make AI more practical and adaptable across different use cases and platforms. It addresses key challenges related to resource consumption, performance, and deployment, ultimately contributing to more widespread and effective use of AI technologies.
Elisa Ricci, an associate professor in the Department of Information Engineering and Computer Science (DISI) at the University of Trento and head of the Deep Visual Learning research unit at Fondazione Bruno Kessler, shared groundbreaking research on task-agnostic vision-language pruning (TA-VLP) [1].
This innovative approach aims to streamline the deployment of vision-language models (VLMs) by creating universally transferable pruned versions. By eliminating the need for task-specific pruning, TA-VLP promises to enhance efficiency and scalability across diverse AI applications.
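To make the idea concrete, here is a minimal sketch of pruning a model once so that the same sparse checkpoint can be reused across downstream tasks. It uses plain global magnitude pruning in PyTorch; the function name and the 50% sparsity target are illustrative assumptions, and TA-VLP itself relies on a more sophisticated task-agnostic scoring of parameters (see [1]).

```python
import torch
import torch.nn.utils.prune as prune

def prune_once_for_all_tasks(model: torch.nn.Module, sparsity: float = 0.5):
    """Apply a single, task-agnostic global magnitude prune to every linear
    layer, so the same sparse checkpoint can later be fine-tuned on any
    downstream task without re-pruning. Illustrative sketch only."""
    params_to_prune = [
        (module, "weight")
        for module in model.modules()
        if isinstance(module, torch.nn.Linear)
    ]
    prune.global_unstructured(
        params_to_prune,
        pruning_method=prune.L1Unstructured,
        amount=sparsity,
    )
    # Fold the pruning masks into the weights so the sparsity is permanent.
    for module, name in params_to_prune:
        prune.remove(module, name)
    return model
```

The key property this sketch shares with TA-VLP is that the pruning decision is made once, independently of any downstream task, so the cost of producing the sparse model is paid a single time.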
Sijia Liu, Ph.D., an assistant professor at Michigan State University, an affiliated professor at IBM Research, and an affiliated PI at the MIT-IBM Watson AI Lab, discussed the application of zeroth-order (ZO) optimization [2] to fine-tuning large language models (LLMs). ZO optimization offers a gradient-free way to fine-tune LLMs, significantly reducing memory usage and computational costs.
This approach is particularly advantageous for handling large models and extended input sequences, marking a significant advancement in memory efficiency for AI applications.
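The core idea behind ZO fine-tuning is to replace backpropagation with forward passes only, so no activations or gradients need to be stored. The sketch below shows the standard two-point, random-direction gradient estimator on a toy objective; the toy loss, step size, and smoothing parameter are illustrative assumptions, not details from [2].

```python
import torch

def zo_gradient_estimate(loss_fn, params: torch.Tensor, mu: float = 1e-3):
    """Estimate the gradient of loss_fn at `params` using only two forward
    passes (no backpropagation), via a random-direction finite difference."""
    u = torch.randn_like(params)            # random perturbation direction
    loss_plus = loss_fn(params + mu * u)    # forward pass 1
    loss_minus = loss_fn(params - mu * u)   # forward pass 2
    # Central-difference estimate projected onto the random direction.
    return ((loss_plus - loss_minus) / (2 * mu)) * u

# Illustrative use: gradient-free "SGD" steps on a toy quadratic objective.
theta = torch.zeros(10)
loss = lambda p: (p - 1.0).pow(2).sum()
for _ in range(200):
    theta = theta - 0.1 * zo_gradient_estimate(loss, theta)
```

Because the estimator only ever calls the loss function forward, the memory cost is that of inference rather than training, which is what makes the approach attractive for very large models and long input sequences.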
Ling Liu, a professor in the School of Computer Science at Georgia Tech, addressed challenges in fine-tuning pre-trained large models. Her research focuses on optimizing training efficiency while managing graphics processing unit (GPU) memory constraints and ensuring model robustness. By exploring strategies to overcome data scarcity and privacy concerns, Ling Liu's work [3] underscores the importance of resource efficiency in scaling AI deployments.
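As a rough illustration of why pruning before fine-tuning eases GPU memory pressure, the sketch below drops transformer layers from a pre-trained encoder before training begins, shrinking both the parameter count and the activations that must be cached for backpropagation. The every-other-layer heuristic and the model name are purely illustrative assumptions; the approach in [3] selects what to prune far more carefully.

```python
import torch
from transformers import AutoModelForSequenceClassification

# Illustrative model choice; any encoder exposing a list of layers works similarly.
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")

# Keep every other transformer layer before fine-tuning, reducing both the
# parameter count and the activation memory held on the GPU during training.
encoder = model.bert.encoder
encoder.layer = torch.nn.ModuleList(
    [layer for i, layer in enumerate(encoder.layer) if i % 2 == 0]
)
model.config.num_hidden_layers = len(encoder.layer)
```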
Yung-Hsiang Lu, a professor of electrical and computer engineering at Purdue University, presented innovative solutions for optimizing machine learning (ML) deployment on edge devices. His research leverages hardware-specific information to improve the efficiency and performance of ML models, catering specifically to the constraints and requirements of edge computing environments.
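One common way hardware constraints shape edge deployment is post-training quantization, which trades a small amount of accuracy for a smaller memory footprint and faster integer inference. The snippet below is a minimal sketch using PyTorch's dynamic quantization to convert a model's linear layers to int8; the choice of MobileNetV2 and of dynamic quantization is an illustrative assumption, not a description of Lu's hardware-aware methods.

```python
import torch
import torchvision

# Load a small CNN and quantize its linear layers to int8 for edge deployment.
model = torchvision.models.mobilenet_v2(weights="DEFAULT").eval()
quantized = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)
# The quantized model is smaller on disk and runs int8 matrix multiplies
# in its linear layers, which suits CPU-only edge devices.
```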
The summit also featured a dynamic roundtable discussion where the presenters explored key industry questions such as:
“When we look at efficient fine-tuning, most people focus on preserving accuracy, or sacrificing as little accuracy as possible, with a minimal memory footprint or efficient communication bandwidth. But if efficient AI included not only performance as a metric but also safety as another metric, it would stimulate other researchers to evaluate their solutions against those safety criteria as well.” —Ling Liu
The Efficient AI Summit provided a platform for thought leaders and practitioners to exchange insights, discuss challenges, and propose innovative solutions to enhance efficiency in AI applications. By exploring cutting-edge research and fostering collaboration, the summit contributes to advancing the frontier of AI technology. Stay tuned for more updates and join us in shaping the future of efficient AI!
Watch the highlights from the summit on our YouTube channel and subscribe to be notified of future Cisco Research summits.
References
[1] MULTIFLOW: Shifting Towards Task-Agnostic Vision-Language Pruning
Matteo Farina, Massimiliano Mancini, Elia Cunegatti, Gaowen Liu, Giovanni Iacca, Elisa Ricci
[2] A Primer on Zeroth-Order Optimization in Signal Processing and Machine Learning
Sijia Liu, Pin-Yu Chen, Bhavya Kailkhura, Gaoyuan Zhang, Alfred Hero, and Pramod K. Varshney
[3] Resource-Efficient Transformer Pruning for Finetuning of Large Models
Fatih Ilhan, Gong Su, Selim Furkan Tekin, Tiansheng Huang, Sihao Hu, Ling Liu