Published on 00/00/0000
Last updated on 00/00/0000
The Internet of Things (IoT) has been a transformative force over the last decade, connecting physical-world assets to the digital realm in unprecedented ways. However, as we move toward a more interconnected future, a new paradigm is emerging at the intersection of IoT, large language models (LLMs), and artificial intelligence (AI). This is known as the intent-driven internet of agents (IIOA), and its agents are referred to here as IIOAs.
These agents, powered by sophisticated LLMs and machine learning algorithms, are not just reactive but proactive entities that can understand and act upon the end user’s intent. As adoption of generative AI (GenAI) technology increases, agents have the potential to revolutionize how enterprise IT and IoT interact with the digital ecosystem.
IIOA is inspired by the internet’s success, together with IoT technology, in connecting people worldwide over the last decade. The long-term goal of IIOA is to create a similar platform for LLM-based AI agents, allowing them to collaborate on any generic task. IIOA introduces intent-driven ways for AI agents to integrate, communicate, and form teams, making collaboration more flexible and scalable. For the end customer, the value of IIOA is a more personalized user experience.
Collaboration across agents or agentic devices: Unlike traditional systems that run on a single device, IIOA supports AI agents working across multiple devices, device types, and locations, all using natural language as the mode of communication.
Flexible communication: IIOA enables AI agents to form teams and communicate dynamically, adapting to the needs of the task at hand. This flexibility allows for more efficient teamwork.
Integration of various agents: IIOA allows for the integration of different third-party AI agents, increasing the diversity and capabilities of the system. This means the platform can handle a wider range of tasks effectively.
As shown in the sequence flow illustration below for the architectural view of the intent-driven internet of agents (IIOA), realizing an end user's business intent consists of intent generation, intent translation, policy generation, and configuration deployment for developing a multi-agent system (MAS). The intent is mainly supplied by end users (for example, an IT customer persona) or directly by operators, using text data or natural language processing (NLP) constructs. The system infers this business intent automatically and dynamically builds a multi-agent application, composed of many agents, within the workflow.
IIOAs are advanced AI systems that combine the capabilities of LLMs with the connectivity of IoT devices. They are designed to comprehend, predict, and respond to human intentions through NLP and machine learning. Unlike traditional IoT devices that require explicit commands, IIOAs can interpret the context and nuances of human language, allowing for a more intuitive and seamless interaction.
Intent is the cornerstone of IIOAs. It refers to the purpose or goal behind a user's command or query. By understanding intent, IIOAs can provide more accurate and relevant responses, anticipate user needs, and automate tasks without explicit instructions. This level of understanding is achieved through the integration of LLMs, which have been trained on vast amounts of text data to grasp the subtleties of human communication.
Smart homes: IIOAs embedded within IoT devices can transform smart homes by understanding the routines and preferences of residents. For example, an IIOA could adjust the thermostat, lighting, and music based on the inferred mood or schedule of the homeowner.
Enterprise IT: This landscape is undergoing a significant transformation, driven by the advent of the intent-driven internet of agents (IIOA). This new wave of technology brings forth a network of intelligent agents that can communicate, learn, and make decisions autonomously. Common use cases include code generation, cybersecurity, asset inventory management, enterprise resource planning (ERP), customer relationship management (CRM), content creation, and financial forecasting.
While IIOAs hold immense promise, there are challenges to consider. Privacy and security are paramount, as IIOAs will handle sensitive data. Ensuring that these agents operate transparently and ethically is also crucial. Additionally, there is the need for robust error handling and the ability to deal with ambiguous or conflicting intents.
JavaScript Object Notation (JSON) is a text-based data format that is designed to be readable by humans and easy to parse for machines. It is based on a subset of the JavaScript language but is language-independent, with parsers available for virtually every programming environment. In the context of IIOAs, JSON serves as the backbone for data exchange between agents and the services they interact with, enabling them to communicate intent and context efficiently.
At the core of this sophisticated network lies the need for a robust and flexible data interchange format. JSON has emerged as the de facto standard for data modeling in this context, thanks to its lightweight nature, ease of use, and wide compatibility. JSON can be effectively utilized for data modeling in the development of intent-driven IIOAs.
To effectively model data for IIOAs using JSON, one must consider the types of data that agents will need to process. This typically includes user commands, NLP constructs, contextual information, sensor data, and the agents' responses. A well-designed JSON data model for IIOAs should:
JSON template:
{
  "intent": "hmm...I am feeling cold today, AdjustTemperature",
  "context": {
    "location": "LivingRoom",
    "user": "JohnDoe",
    "timeOfDay": "Evening"
  },
  "parameters": {
    "temperature": "22C"
  },
  "tools": {
    "name": "Real time change of Temperature based on End user’s Intent",
    "description": "{Temp} change during {Time of Day} based on {Threshold}.",
    "parameters": {
      "type": "object",
      "properties": {
        "Temp": {"type": "number", "description": "Temperature"},
        "Time of Day": {"type": "number", "description": "Time for Temperature change"},
        "Threshold": {"type": "number", "description": "Cutoff Threshold Value"}
      },
      "required": ["Temp", "Time of Day", "Threshold"]
    }
  }
}
In this example, the JSON object captures the end user’s intent as received through a voice or NLP prompt. The intent and context are passed to the LLM inside the agent, which generates a plan. During plan execution, an LLM-based agentic workflow uses an internal tool to adjust the thermostat. The adjustment depends on how the threshold compares with the difference between the outside temperature, measured by a tool backed by an external weather API, and the inside room temperature, measured by an internal tool. The intent, the context, the end user who issued the NLP command, the parameters for the temperature adjustment, the tool selection, and the security token for authentication are all part of this IIOA workflow, although the template above does not show the token.
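The threshold comparison described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the article's implementation: `get_outside_temp_c` and `get_inside_temp_c` are hypothetical stubs standing in for the external weather tool and the internal sensor tool, and the rule "adjust only when the inside/outside difference exceeds the threshold" is one assumed reading of how the threshold is applied.

```python
import json

# Hypothetical payload, modeled on the JSON template above.
PAYLOAD = """
{
  "intent": "AdjustTemperature",
  "context": {"location": "LivingRoom", "user": "JohnDoe", "timeOfDay": "Evening"},
  "parameters": {"temperature": "22C"}
}
"""


def parse_celsius(value: str) -> float:
    """Convert a string such as '22C' to a float in degrees Celsius."""
    return float(value.rstrip("Cc"))


def get_outside_temp_c(location: str) -> float:
    """Stub for an external weather API tool (assumed, returns a fixed value)."""
    return 5.0


def get_inside_temp_c(location: str) -> float:
    """Stub for an internal room sensor tool (assumed, returns a fixed value)."""
    return 18.0


def plan_adjustment(payload: dict, threshold_c: float) -> dict:
    """Return an action; adjust only if the inside/outside gap exceeds the threshold."""
    location = payload["context"]["location"]
    target = parse_celsius(payload["parameters"]["temperature"])
    gap = abs(get_inside_temp_c(location) - get_outside_temp_c(location))
    if gap > threshold_c:
        return {"action": "set_thermostat", "location": location, "target_c": target}
    return {"action": "no_op", "location": location}


action = plan_adjustment(json.loads(PAYLOAD), threshold_c=10.0)
print(action)
```

In a real agent, the LLM would choose the tool and fill in its arguments; the sketch hard-codes that choice so the data flow from payload to action stays visible.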
Existing LLM-based agentic application development frameworks such as LangChain, LlamaIndex, crewAI, AutoGen, and AutoGPT provide system prompt functionality, which can be used to define the intent-driven role of a specific agent. However, there are practical challenges in making this intent-driven role execution scalable when these frameworks are used to build a large-scale enterprise IT platform composed of multiple LLM-agentic workflows.
While JSON is a powerful tool for data modeling in IIOAs, developers must be mindful of potential challenges such as data validation, error handling, tool execution errors, tool security threats, and maintaining backward compatibility. Additionally, as IIOAs become more sophisticated, the data models must be designed to accommodate increasingly complex intents and contexts to achieve the business outcome for many enterprise IT use cases.
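As one illustration of the data validation concern, the sketch below checks tool-call arguments against the `required` list and the `number` types declared in a schema shaped like the template's `tools.parameters` section. The function `validate_arguments` is hypothetical and covers only those two checks; a production system would more likely rely on a full JSON Schema validator.

```python
# Hypothetical tool schema, following the "tools" section of the template above.
TOOL_SCHEMA = {
    "type": "object",
    "properties": {
        "Temp": {"type": "number"},
        "Time of Day": {"type": "number"},
        "Threshold": {"type": "number"},
    },
    "required": ["Temp", "Time of Day", "Threshold"],
}


def validate_arguments(args: dict, schema: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    # Every required parameter must be present.
    for name in schema.get("required", []):
        if name not in args:
            problems.append(f"missing required parameter: {name}")
    # Every supplied parameter must be declared and have the declared type.
    for name, value in args.items():
        spec = schema["properties"].get(name)
        if spec is None:
            problems.append(f"unknown parameter: {name}")
        elif spec["type"] == "number" and (
            isinstance(value, bool) or not isinstance(value, (int, float))
        ):
            problems.append(f"parameter {name} must be a number")
    return problems


good = validate_arguments({"Temp": 22, "Time of Day": 19, "Threshold": 3}, TOOL_SCHEMA)
bad = validate_arguments({"Temp": "warm", "Threshold": 3}, TOOL_SCHEMA)
print(good)
print(bad)
```

Returning a list of problems, rather than raising on the first one, lets an agent feed all of the issues back to the LLM in a single correction round.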
Let’s imagine how an LLM-based IIOA could be modeled and applied in a real-world scenario, using the example of a robotic arm in a manufacturing shop to illustrate the power of this intent-driven technology.
The arm is equipped with sensors that measure pressure and vibration levels, ensuring the machinery operates within safe parameters. Let’s assume the robotic arm is not a standalone device but part of an IIOA ecosystem that spans multiple geographies, with multiple arms running small language models (SLMs). With LLM-based planning and reasoning capabilities built on agentic design patterns, any IIOA-driven business task can be achieved. For example, using Agents for Amazon Bedrock hosted in the AWS cloud, or using the Bedrock API directly to access LLMs, the entire use case can be automated in an intent- and context-driven paradigm.
As per the AWS-based end-to-end architectural reference below, this use case can be implemented in the IIOA paradigm with an LLM agent based on the ReAct design pattern. Depending on the payload of the incoming query, the end user’s intent, the context, and the geospatial coordinates of the IoT sensor or device are all identified from the NLP text prompt. The agent then performs the action through the action group feature of Agents for Amazon Bedrock, invoking specific tools at runtime to complete the LLM-agentic workflow and deliver the business outcome for this robotic arm use case.
Achieving such an IIOA outcome at scale across geographies requires a ‘distributed agentic runtime’ at the edge of the network, which can be implemented with AWS services such as AWS IoT Greengrass. The basic LLM-agentic workflow could be implemented either with Agents for Amazon Bedrock or directly through Bedrock model API access. The actions that Agents for Amazon Bedrock selects from its action groups could define the different actionable workflows: triggering an AWS Lambda function, sending an Amazon SES email notification, and generating an IoT Device_ID-specific report with Amazon Athena for data visualization. Taken together, the functionality described below can be abstracted into components in a layered view, as illustrated in Fig (1) above for the IIOA paradigm.
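The ReAct pattern mentioned above alternates a reasoning step, a tool action, and an observation until the agent reaches a final answer. The loop can be sketched as follows. Everything here is an assumption for illustration: `scripted_llm` is a stand-in for a real model call, the two tools are stubs, and nothing in the sketch uses the Bedrock API.

```python
from typing import Callable


# Hypothetical tools; real ones would call device or cloud APIs.
def read_vibration(device_id: str) -> str:
    return "vibration=7.2"


def reduce_speed(device_id: str) -> str:
    return "speed reduced"


TOOLS: dict[str, Callable[[str], str]] = {
    "read_vibration": read_vibration,
    "reduce_speed": reduce_speed,
}


def scripted_llm(history: list[str]) -> tuple[str, str]:
    """Stand-in for an LLM call: returns (thought, next step) from the history."""
    observations = [h for h in history if h.startswith("Observation:")]
    if not observations:
        return "need a sensor reading first", "Action: read_vibration"
    if len(observations) == 1:
        return "reading is high, slow the arm", "Action: reduce_speed"
    return "the arm has been slowed", "Final: vibration handled"


def react_loop(query: str, device_id: str, max_steps: int = 5) -> tuple[str, list[str]]:
    """Alternate thought, action, and observation until a final answer appears."""
    history = [f"Query: {query}"]
    for _ in range(max_steps):
        thought, step = scripted_llm(history)
        history.append(f"Thought: {thought}")
        history.append(step)
        if step.startswith("Final:"):
            return step.removeprefix("Final:").strip(), history
        tool_name = step.removeprefix("Action:").strip()
        history.append(f"Observation: {TOOLS[tool_name](device_id)}")
    return "step limit reached", history


answer, trace = react_loop("Arm 12 is vibrating too much", device_id="arm-12")
print(answer)
```

The `max_steps` cap is a design choice worth keeping in any real loop, since a model that never emits a final answer would otherwise run indefinitely.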
To remain scalable and extensible, the IIOA's stack is designed with a layered approach:
Planning layer: The main LLM application programming interface (API) call happens in the cloud, where the Amazon Bedrock API processes the data and returns a planning strategy as the next steps. This layer is responsible for planning and reasoning over the input prompt text and for making informed decisions automatically.
Planning involving feedback (PIF): PIF is an iterative process in which the LLM agent learns from each interaction, improving its planning and decision-making over time. In this scenario, the LLM agent plans continuously based on feedback received during interactions. For instance, if the user asks the LLM about the best time to visit New York City, and the response includes outdated information or doesn't fully address the query, the user might provide feedback or ask follow-up questions. Upon receiving feedback, the LLM agent re-evaluates its response and plans its next action accordingly. It may refine its search criteria, tool selection, or tool execution, or adjust its communication strategy to better meet the end user's intent-driven requirements.
For example, if the end user's history and search-related cookie details show an interest in cricket matches, the IIOA would implicitly call a tool to search the ICC cricket website for scheduling information, filtered by temporal attributes, to see whether a cricket match is coming up in New York City. The final LLM response would be based on the result of that tool execution and would give the end user updated travel information for New York.
Planning without feedback (PWF): PWF requires the LLM agent to anticipate user needs and preferences, making informed decisions based on the context of the query and the available information. In this scenario, the LLM agent operates without immediate feedback from the end user. For example, if the user asks the LLM for a list of recommended activities near New York City, the LLM agent must plan its response based solely on the initial query and available data sources.
Without feedback, the LLM agent relies on pre-defined strategies, past historical data about the end user, heuristics, and its internal knowledge base to generate a response. It may consider factors such as popular tourist attractions, weather conditions, and user preferences inferred from past interactions.
Tooling layer: In the context of LLM agents, tools are external resources, services, or APIs that the agent can use to perform specific tasks or enhance its capabilities. They extend the functionality of the LLM agent beyond its inherent language generation capabilities. Tools can include search APIs such as SerpApi and DuckDuckGo (DDG), web crawlers, databases, knowledge bases, and external models. Examples of tools an agent can employ are a RAG pipeline for producing contextually relevant responses, a code interpreter for addressing programming challenges, an API for conducting internet searches, or straightforward API services such as weather updates or instant messaging applications.
Execution layer or run-time layer: A lightweight Kubernetes engine such as K3s orchestrates the deployment of the necessary tools and actions. The same functionality could be provided by agentic tool execution engines like Toolformer or Gorilla’s GoEx. This layer ensures that the robotic arm executes the planned tasks, such as adjusting the conveyor belt's tension to maintain optimal pressure and vibration levels. A distributed agentic runtime is needed to solve this specific robotic arm use case. A more detailed blog post on this layered approach will follow.
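The two planning modes described in the planning layer above can be contrasted in a short Python sketch. This is a simplified illustration, not a definitive implementation: `draft_plan` is a hypothetical stand-in for an LLM planning call, and `reviewer` is a hypothetical stand-in for the end user's feedback.

```python
from typing import Callable, Optional


def draft_plan(query: str, feedback: Optional[str] = None) -> str:
    """Stand-in for an LLM planning call (assumed, not a real API)."""
    if feedback is None:
        return f"plan for '{query}'"
    return f"plan for '{query}' revised after feedback: {feedback}"


def plan_without_feedback(query: str) -> str:
    """PWF: a single pass, so the first plan is the final plan."""
    return draft_plan(query)


def plan_with_feedback(
    query: str,
    get_feedback: Callable[[str], Optional[str]],
    max_rounds: int = 3,
) -> str:
    """PIF: revise the plan until the reviewer has no more feedback."""
    plan = draft_plan(query)
    for _ in range(max_rounds):
        feedback = get_feedback(plan)
        if feedback is None:
            break
        plan = draft_plan(query, feedback)
    return plan


def reviewer(plan: str) -> Optional[str]:
    """Hypothetical user: objects once, then accepts the revised plan."""
    return None if "revised" in plan else "information is outdated"


pwf = plan_without_feedback("activities near New York City")
pif = plan_with_feedback("best time to visit New York City", reviewer)
print(pwf)
print(pif)
```

The only structural difference between the two modes is the loop: PWF commits to its first plan, while PIF keeps a channel open for corrections and bounds the number of revision rounds.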
The IIOA ecosystem is designed to perform real-time stitching of workflows. This means that the system can dynamically adjust its behavior based on the data it receives. For example, if the robotic arm detects that the vibration or pressure of the conveyor belt exceeds the expected threshold, it can trigger an automated response to bring the values back to normal.
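A minimal sketch of this kind of data-driven behavior follows. The workflow is assembled at runtime from whichever limits a sensor reading exceeds. The limit values, the `Reading` type, and the action strings are all assumptions made for illustration; real limits depend on the machinery.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    device_id: str
    pressure: float
    vibration: float


# Assumed limits for illustration only.
LIMITS = {"pressure": 100.0, "vibration": 5.0}


def corrective_actions(reading: Reading) -> list[str]:
    """Build the list of workflow steps from the limits this reading exceeds."""
    actions = []
    if reading.pressure > LIMITS["pressure"]:
        actions.append(f"{reading.device_id}: reduce belt tension")
    if reading.vibration > LIMITS["vibration"]:
        actions.append(f"{reading.device_id}: slow conveyor")
    return actions


normal = corrective_actions(Reading("arm-12", pressure=80.0, vibration=3.0))
alarm = corrective_actions(Reading("arm-12", pressure=120.0, vibration=7.2))
print(normal)
print(alarm)
```

In an IIOA setting, each returned step would name a tool for the execution layer to run, and an LLM planner could replace the fixed rules with reasoning over the readings.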
A key aspect of IIOAs is the loosely coupled architecture, which allows for the flexible selection and execution of tools. Unlike a tightly coupled system where components are interdependent, a loosely coupled system enables late binding in real time. This programming strategy allows for components to be connected and interact with each other only when needed, enhancing the system's adaptability and scalability.
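Late binding of tools can be illustrated with a small registry in which a workflow stores only tool names and each name is resolved when the step runs. This is a sketch under stated assumptions: `Tool`, `REGISTRY`, `bind`, and `run_step` are hypothetical names, and the lambdas stand in for real notification services.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Tool:
    name: str
    description: str
    run: Callable[[float], str]


REGISTRY: dict[str, Tool] = {}


def bind(tool: Tool) -> None:
    """Register or replace a tool at runtime."""
    REGISTRY[tool.name] = tool


def run_step(tool_name: str, value: float) -> str:
    """Resolve the tool by name at call time, not when the workflow is defined."""
    tool = REGISTRY.get(tool_name)
    if tool is None:
        return f"no tool bound for '{tool_name}'"
    return tool.run(value)


# The workflow only stores tool names, so it is decoupled from implementations.
WORKFLOW = ["notify"]

before = [run_step(name, 7.2) for name in WORKFLOW]
bind(Tool("notify", "Email the operator", lambda v: f"email sent: vibration {v}"))
after = [run_step(name, 7.2) for name in WORKFLOW]
bind(Tool("notify", "Text the operator", lambda v: f"sms sent: vibration {v}"))
swapped = [run_step(name, 7.2) for name in WORKFLOW]
print(before, after, swapped)
```

Because the workflow never holds a direct reference to an implementation, a tool can be added or swapped without redefining the workflow, which is the adaptability that loose coupling is meant to provide.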
Intent-driven IIOAs represent a paradigm shift in how we approach enterprise IT and IoT ecosystems. By modeling IIOAs with a layered and loosely coupled architecture, we can create systems that are not only reactive but also proactive and capable of intelligent decision-making.
The example of the robotic arm in a manufacturing environment demonstrates how IIOAs can lead to more efficient, safe, and self-optimizing industrial processes with LLM-based agentic workflows.
As we continue to refine and implement these LLM-based agents in a layered architecture, the possibilities for innovation and optimization in IT are boundless, paving the way for a smarter and more connected future.
IIOAs represent a significant leap forward in how we interact with technology. By focusing on user intent, IIOAs promise to deliver more intuitive, efficient, and personalized experiences.
As we continue to develop and refine these agents, we stand on the cusp of a new era where the line between the physical and digital worlds becomes increasingly blurred, leading to a future where technology serves us in more natural and profound ways.