Published on 00/00/0000
Last updated on 00/00/0000
In 1955, a group of researchers led by John McCarthy submitted a conference proposal titled “A Proposal for the Dartmouth Summer Research Project on Artificial Intelligence.” Since that moment, the community of researchers hasn’t stopped developing ways to enable machines to perform tasks that would otherwise require human intelligence.
The launch of ChatGPT in late 2022 brought artificial intelligence (AI), specifically Generative AI (GenAI), to the mainstream. In the past few years, advancements in machine learning models, increased computational power, access to large datasets and other key factors all contributed to the rapid advancement of GenAI technologies, making it one of the most exciting areas in AI today.
GenAI has many use cases, such as content creation, process automation, chatbots, and virtual assistants. While many new GenAI services exist, vendors often provide GenAI capabilities as part of their existing SaaS offerings. To prepare for the initial adoption of GenAI technologies at Cisco, my colleagues in InfoSec and I conducted initial vendor security assessments of a few GenAI offerings from third-party vendors.
We learned that AI services are designed, developed, tested, deployed, and operated like any other cloud services. The existing security controls and best practices for cloud services also apply to AI services. However, there are additional security concerns when it comes to using Large Language Models (LLMs). Beyond security risks, there are also legal, privacy, and ethical concerns regarding LLMs and AI. Developing a comprehensive AI strategy is essential for enhancing security operations and ensuring responsible AI integration.
AI can be leveraged in various areas within an organization. However, not all organizations are ready to adopt AI in security operations. According to the Cisco Cybersecurity Readiness Index released in March 2024, more than half of organizations have yet to incorporate AI into their security operations to secure networks, identity, devices, and cloud.
My InfoSec colleagues and I set out to become early adopters of GenAI. As we learn more about GenAI, we are always thinking of ways to leverage these new technologies in enterprise security operations. Across Cisco, thousands of business applications are used every day to run the business. More than half of these applications hold sensitive data. The Cisco InfoSec team works hard to enforce measurable security controls for enterprise applications. This had us wondering how we could use AI to move faster and scale more effectively to meet the demands of our security operations. Conversations with others on the InfoSec teams led us to many potential use cases for AI in delivering automated and agile business processes.
Specifically, we identified these four types of use cases in security operations:
Among the four use cases, our team selected the security chatbot as the first GenAI project. Experimenting with the OpenAI GPT models, we quickly realized that using LLMs comes with challenges. These foundation models are powerful out of the box, but they were not trained on the specific knowledge needed to respond properly to our prompts. Foundation models are limited to what they learned during model training, known as parametric knowledge. They can’t account for current events or private data. To overcome this limitation, we needed to supply our own data.
These are four common ways to do it, listed from highest to lowest cost:
Both options three and four are part of a relatively new discipline called prompt engineering, which focuses on developing and optimizing prompts to use LLMs efficiently. They are not without limitations. The most common one is the token limit, a restriction on the number of tokens that an LLM can process in one interaction. LLMs can have different token limits; for example, GPT-3.5 has a token limit of 4,096, while the 32K variant of GPT-4 has a token limit of 32,768. A common English word may consume one to several tokens. In general, LLMs with larger token limits can provide better answers, but at a slower response speed and higher operating cost than LLMs with smaller token limits.
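To make the token limit concrete, here is a minimal sketch of how supplied data could be trimmed to fit a budget before it is added to a prompt. The four-characters-per-token ratio is only a rough rule of thumb for English text, and the names `estimate_tokens` and `fit_context` are hypothetical; a real application would count tokens with the tokenizer that matches its model.

```python
from typing import List

# Assumption: roughly four characters per token for English text.
# This is a crude heuristic, not a real tokenizer.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Estimate the token count of a string, rounding up, with a floor of 1."""
    return max(1, -(-len(text) // CHARS_PER_TOKEN))


def fit_context(question: str, chunks: List[str],
                token_limit: int, reserved_for_answer: int) -> List[str]:
    """Keep chunks, in order, until the next one would exceed the budget.

    The budget is the model's token limit minus the tokens reserved for
    the model's answer and the tokens used by the question itself.
    """
    budget = token_limit - reserved_for_answer - estimate_tokens(question)
    selected: List[str] = []
    for chunk in chunks:
        cost = estimate_tokens(chunk)
        if cost > budget:
            break  # stop at the first chunk that no longer fits
        selected.append(chunk)
        budget -= cost
    return selected
```

Calling `fit_context(question, chunks, token_limit=4096, reserved_for_answer=500)` would return the leading chunks that fit within a 4,096-token limit under this estimate; anything beyond that would have to be dropped, summarized, or sent in a separate interaction.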
We are also concerned about the consequences of sending private data to LLMs. It’s imperative to conduct a comprehensive assessment of LLM providers with respect to security, privacy, and regulatory compliance. Inquiries should include, among others:
For the last few months, I worked with a few other security engineers to prototype a proof-of-concept chatbot application. Our project provides a web chat interface, a grand agent that orchestrates multiple tools carrying out different functions based on user inputs, with backend API connection to LLMs and data sources. The entire application runs in a Docker container that can be deployed to any container platform.
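The orchestration idea can be sketched in a few lines. This is an illustration under stated assumptions, not our actual implementation: the `Agent` class, its methods, and `keyword_router` are hypothetical names, and the keyword matching stands in for the step where an LLM would choose a tool based on the user input.

```python
from typing import Callable, Dict

Tool = Callable[[str], str]
Router = Callable[[str, Dict[str, str]], str]


class Agent:
    """Routes each user input to one registered tool (hypothetical design)."""

    def __init__(self, choose_tool: Router) -> None:
        self._choose_tool = choose_tool
        self._tools: Dict[str, Tool] = {}
        self._descriptions: Dict[str, str] = {}

    def register(self, name: str, description: str, tool: Tool) -> None:
        """Add a tool along with the description the router uses to pick it."""
        self._tools[name] = tool
        self._descriptions[name] = description

    def handle(self, user_input: str) -> str:
        """Ask the router for a tool name, then run that tool on the input."""
        name = self._choose_tool(user_input, self._descriptions)
        tool = self._tools.get(name)
        if tool is None:
            # Unknown or empty tool name: refuse instead of guessing.
            return "Sorry, I can't help with that request."
        return tool(user_input)


def keyword_router(user_input: str, descriptions: Dict[str, str]) -> str:
    """Stand-in for an LLM call: pick the first tool sharing a word with the input."""
    words = set(user_input.lower().split())
    for name, description in descriptions.items():
        if words & set(description.lower().split()):
            return name
    return ""
```

Keeping the routing decision behind a single function means the keyword stub can be swapped for an LLM call without touching the tools, and the fallback branch ensures that an unexpected tool name from the model is never executed.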
In May 2023, the Open Worldwide Application Security Project (OWASP), a non-profit organization that works to improve the security of software, released a list of the 10 most critical vulnerabilities in LLM applications to inform people about the potential security risks when deploying and managing LLMs.
Among the OWASP top 10, prompt injection is listed as the most critical. As we worked on developing our own AI chatbot applications, we became familiar with the risk of prompt injection. Prompt injection is similar to other injection attacks commonly seen in applications. The AI chatbot interface makes it easier to inject malicious prompts that could override system prompts. This type of attack on LLMs was reported [1] by researchers in the field as early as 2022. While that exact prompt injection can no longer be reproduced today, there are many variants of prompt injection.
For example, in one of our AI tools, users can ask for security risk information, and the LLM in the backend is smart enough to formulate an SQL query to retrieve relevant data from a backend database. If we don’t put guardrails around the database access used by the LLM, users could potentially instruct the LLM to take more actions in the database than intended for the tool. Since the small group of developers working on the AI project are all security engineers, we consciously incorporated security best practices into our architectural designs and throughout the DevOps lifecycle. However, such practices might not be the standard among other teams.
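As one illustration of such guardrails, the sketch below combines a read-only database connection with a conservative check on the generated SQL. It assumes a SQLite database purely for the sake of a self-contained example; the function names `open_read_only` and `run_llm_query` are hypothetical, and a production setup would also rely on a database account limited to the tables and operations the tool needs.

```python
import sqlite3
from typing import List, Tuple


def open_read_only(db_path: str) -> sqlite3.Connection:
    """Open the SQLite file in read-only mode so writes fail at the database layer."""
    return sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)


def run_llm_query(conn: sqlite3.Connection, sql: str,
                  max_rows: int = 100) -> List[Tuple]:
    """Run an LLM-generated query only if it is a single SELECT statement.

    The checks are deliberately conservative: a semicolon anywhere in the
    statement (even inside a string literal) is rejected, as is anything
    that does not start with SELECT.
    """
    statement = sql.strip().rstrip(";").strip()
    if ";" in statement:
        raise ValueError("multiple statements are not allowed")
    if not statement.lower().startswith("select"):
        raise ValueError("only SELECT statements are allowed")
    cursor = conn.execute(statement)
    # Cap the result size so a broad query cannot dump the whole table.
    return cursor.fetchmany(max_rows)
```

The two layers are intentionally redundant. The statement check rejects obvious misuse early, while the read-only connection still blocks writes if a crafted query slips past the string checks.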
Hallucination is another big issue in LLMs. In our AI project, although we set the temperature to 0 and instructed the LLM not to make up answers, it still occasionally does. To measure the accuracy of outputs from AI tools, we implemented Python parametrized tests and used AI to verify output. In our test scripts, we asked the AI tools seven different questions, then repeated this process 10 times. On average, three out of 70 questions came back with wrong answers. That’s a 95.7% accuracy rate. This experiment taught us that output verification is essential in AI applications.
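The measurement loop can be sketched as follows. This is a simplified stand-in rather than our actual test suite: it compares answers by normalized exact match instead of using AI to verify output, and `measure_accuracy` and its parameters are hypothetical names.

```python
from typing import Callable, Dict


def measure_accuracy(ask: Callable[[str], str],
                     expected: Dict[str, str],
                     repeats: int) -> float:
    """Ask every question `repeats` times and return the share of correct answers.

    Assumption: an answer is correct when it matches the expected text after
    trimming whitespace and lowercasing. Free-form answers would need a more
    forgiving check.
    """
    total = 0
    wrong = 0
    for _ in range(repeats):
        for question, answer in expected.items():
            total += 1
            if ask(question).strip().lower() != answer.strip().lower():
                wrong += 1
    if total == 0:
        raise ValueError("no questions were asked")
    return (total - wrong) / total
```

With seven questions repeated 10 times, `total` is 70; three wrong answers give (70 - 3) / 70 ≈ 0.957, which is the 95.7% figure above. Repeating the same questions matters because an LLM can answer a question correctly in one run and incorrectly in the next.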
Through security research, prototyping an AI chatbot, and collaborating with other teams of AI adopters, we gained knowledge and expertise. For any team considering AI, we put together this short list of guidance on successful AI adoption:
To conclude, there are many potential use cases of GenAI in security operations. During the development of a prototype AI project, I gained valuable insights into GenAI's capabilities and challenges. Leveraging the guidelines above, I can help with the adoption of GenAI technologies within our teams. Given the unprecedented rate at which GenAI is evolving, it is imperative to start our engagement with these technologies early to remain at the forefront of this rapidly advancing field.
I regularly use GenAI tools for work and personal use. I strongly recommend that others do the same.
It can be as simple as using a chatbot powered by GenAI such as ChatGPT. For people who know how to code, look for opportunities to write simple scripts or apps using LLMs. Together, we will all benefit from learning more about GenAI to make informed decisions regarding these new technologies.
Learn more about how AI and security overlap here.