Published on 04/15/2025
Last updated on 04/15/2025

Composing event-driven multi-agent workflows with a gRPC-based distributed agent runtime

Distributed multi-agent software is a system composed of multiple autonomous agents that run on independent computing nodes and communicate over a network through event-driven workflows to answer a user query.

These applications appear to the end user as a single coherent system, even though the constituent agents operate independently within their own environments and are often unaware of each other’s role and involvement in the workflow. This design allows agent capabilities to be composed and orchestrated dynamically.

Key characteristics of such systems are: 

  • Decentralization: No single point of control, with agents making local decisions.
  • Scalability: Ability to add more agents to the system without significant reconfiguration.
  • Concurrency: Agents operate independently and in parallel.
  • Autonomy: Each agent can perceive its environment, make decisions, and take actions.
  • Event-driven communication: Agents emit and react to events without knowing which other agents will handle their messages, promoting loose coupling.
  • Coordination: Workflows can emerge from agent interactions, even when agents have no direct interdependencies. 

Scalable multi-agent software has evolved from single-process applications, where all agents run within the same process, to sophisticated distributed architectures. Central to this evolution is the agent runtime, a specialized program or API that manages agent identities, lifecycles, and communication patterns. Much like programming language runtimes provide the infrastructure needed for code execution, agent runtimes provide the communication infrastructure that agents need to interact, collaborate, and solve complex problems together.

The core constructs of the Internet of Agents (IoA) remote gateway distributed agent runtime (DAR) address the unique challenges of multi-agent systems. DAR enables the construction of multi-agent software that operates across network boundaries while maintaining the coordination necessary for collaborative problem-solving.

By providing the critical infrastructure for agent communication, lifecycle management, security enforcement, and operational monitoring, the agent gateway represents the next step in unlocking the full potential of multi-agent software.

Agent runtimes 

1. Standalone agent runtime 

The simplest form of multi-agent software operates within a standalone runtime, a single-process environment where specialized agents and tools execute collaboratively. In this configuration, data and memory sharing across agents and tools occurs naturally, similar to how methods within a class access shared attributes, as shown below. When Agent2 needs to recognize Agent1 or share data, no special mechanisms are required because both operate within the same memory space.

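    # The class name is illustrative; the original snippet showed only the methods.
    class StandaloneRuntime:
        """Single-process runtime in which both agents share one in-memory store."""

        def __init__(self):
            # Shared data store for all agents
            self.shared_data = {}

        def agent1(self, key, value):
            # Agent 1 stores data in the shared store
            self.shared_data[key] = value
            print(f"Agent 1: Stored '{value}' with key '{key}'")

        def agent2(self, key):
            # Agent 2 retrieves data from the shared store
            value = self.shared_data.get(key, "Not found")
            print(f"Agent 2: Retrieved '{value}' using key '{key}'")
            return value


    # Both agents read and write the same dictionary directly.
    runtime = StandaloneRuntime()
    runtime.agent1("topic", "multi-agent systems")
    runtime.agent2("topic")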

For straightforward use cases with limited complexity and well-defined agent interactions, a standalone runtime provides efficiency and simplicity. Development teams can rapidly prototype multi-agent software without addressing cross-process communication and lifecycle management challenges.

Limitations of standalone agent runtime 

Despite its simplicity, a standalone agent runtime becomes limiting as the multi-agent system scales:

Limited extensibility 

Extending multi-agent software requires substantial development effort: new agents must be built from the ground up, and the entire workflow (for example, a LangGraph graph) must be reconfigured to accommodate new message flows, including modifying existing edges and adding conditional edges. This process becomes increasingly complex as the number of agents grows.

Standalone architectures create silos that prevent leveraging externally developed components. Agents cannot be easily integrated across organizational boundaries, requiring redundant development of similar capabilities and limiting innovation. 

Limited reusability 

Organizations cannot effectively reuse agent components across projects or incorporate agents developed by third parties, resulting in duplicated efforts and inconsistent implementations. 

2. Distributed agent runtime (gRPC-based) - agent gateway and agent gateway protocol (AGP) 

A distributed agent runtime consists of agents and an agent gateway, which orchestrates inter-agent communication using agent gateway protocol (AGP) primitives. The AGP specification defines a standardized communication framework for AI agents.

It supports diverse messaging patterns, including request-response, publish-subscribe, fire-and-forget, and streaming. Built on gRPC, agent gateway exposes a host server which remote agents connect to. This gateway ensures secure, scalable, and efficient interactions between agents, enabling robust multi-agent collaboration.  

Such a distributed agent runtime enables developers to build event-driven multi-agent workflows that address the limitations of standalone agent runtimes by enabling seamless cross-process communication, agent lifecycle management, and privacy preservation in multi-agent software. This runtime allows agents—developed in different programming languages and frameworks—to interact across distributed environments, running on different host machines over a network. 

The AGP defines the following messaging patterns and security features, which can be implemented by the agent gateway and by third-party remote agents.

  • Request-response: Supports synchronous communication between agents.
  • Publish-subscribe: Allows agents to publish messages to topics and subscribe to receive messages from topics.
  • Fire-and-forget: Enables agents to send messages without waiting for a response.
  • Streaming: Supports both unidirectional and bidirectional streaming.
  • Security: Employs authentication, authorization, and end-to-end encryption to protect data privacy and integrity. 

The agent gateway consists of three primary components: 

  1. Control plane  
  2. Data plane  
  3. Gateway interface (developer-friendly Python bindings)

The control plane handles agent administration, including tenant management, namespace organization, agent categorization, and authentication. It features a registration service for agent onboarding, agent discovery for metadata, and token rotation for secure authentication with OAuth2 tokens. End-to-end encryption ensures message privacy via payload-level encryption by agents and transport-level encryption using HTTPS. 

The data plane is responsible for efficiently forwarding messages between agents, ensuring seamless communication. Designed for scalability, it is a critical component of the system, present in both the agent SDK and the backend service. 

The gateway interface (Python bindings) provides a convenient way for agents built using heterogeneous frameworks to communicate within a distributed system. These bindings act as a bridge between Python-based clients and the underlying gRPC-based gateway, enabling seamless interaction with both the control plane and data plane services.  

Key functions of the gateway interface 

  • Agent communication: The bindings allow agents to send and receive messages. Using gateway.publish() and gateway.receive(), agents can exchange messages asynchronously, ensuring efficient communication even in complex systems.
  • Session and agent management: The bindings provide methods to create and manage agent sessions. Through gateway.create_agent() and gateway.subscribe(), agents can register with the gateway server and establish communication channels for ongoing interactions.
  • Dynamic route management: With gateway.set_route(), agents can dynamically define routes for message delivery, ensuring messages are directed to the correct recipients across distributed systems.
  • Asynchronous operations: Built on Python’s asyncio library, the bindings enable non-blocking operations, allowing agents to handle multiple tasks concurrently without waiting for messages or replies, ensuring responsiveness in real-time environments.
  • OpenTelemetry integration: For observability, the gateway interface integrates with OpenTelemetry to trace agent activity and monitor performance, providing insights into the system’s health. 

The gateway server manages agent registrations and message routing, while the Python client interacts with these services, managing subscriptions, sending messages, and handling responses asynchronously. 
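The article names the binding methods but does not show their signatures, so the following is only a minimal sketch of the publish/receive pattern using an in-memory stand-in. The class `InMemoryGateway`, its parameters, and the topic and agent names are all hypothetical; only the method names `create_agent`, `subscribe`, `publish`, and `receive` come from the text above, and the real bindings may differ.

```python
import asyncio


class InMemoryGateway:
    """Hypothetical stand-in for the gateway bindings; NOT the real API."""

    def __init__(self):
        self._inboxes = {}      # agent name -> asyncio.Queue of messages
        self._subscribers = {}  # topic -> set of agent names

    async def create_agent(self, name):
        # Registering an agent gives it an inbox.
        self._inboxes[name] = asyncio.Queue()

    async def subscribe(self, name, topic):
        self._subscribers.setdefault(topic, set()).add(name)

    async def publish(self, topic, payload):
        # Deliver a copy of the message to every subscriber of the topic.
        for name in sorted(self._subscribers.get(topic, ())):
            await self._inboxes[name].put((topic, payload))

    async def receive(self, name):
        # Suspends the caller (without blocking the event loop) until a message arrives.
        return await self._inboxes[name].get()


async def demo():
    gateway = InMemoryGateway()
    for name in ("writer", "reviewer"):
        await gateway.create_agent(name)
        await gateway.subscribe(name, "group_chat")

    # Both receivers wait concurrently while the publisher sends one message.
    receivers = [asyncio.create_task(gateway.receive(n)) for n in ("writer", "reviewer")]
    await gateway.publish("group_chat", "draft v1")
    return await asyncio.gather(*receivers)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The publisher never names the receiving agents; it only names a topic, which is the loose coupling the event-driven model relies on.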

Agent identification and routing 

Each agent is uniquely identified using a hierarchical structure: 

AgentID = Organization/Namespace/Agent-type/Agent-UUID 

     Organization – The tenant that owns and registers the agent. 
     Namespace – A logical partition for traffic segmentation. 
     Agent-type – The category/role of the agent, such as "finance", "healthcare", etc.
     Agent-UUID – A unique identifier assigned per agent instance, changing with each restart. 
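As an illustration of the hierarchical identifier, here is a small sketch that builds and parses such IDs. The `AgentID` class and its method names are assumptions made for this example, not part of the gateway bindings; the only detail taken from the text is the four-part structure and the fact that the UUID changes on each restart.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentID:
    """Illustrative model of Organization/Namespace/Agent-type/Agent-UUID."""

    organization: str
    namespace: str
    agent_type: str
    agent_uuid: str

    @classmethod
    def new(cls, organization, namespace, agent_type):
        # A fresh UUID per instance start, mirroring "changing with each restart".
        return cls(organization, namespace, agent_type, str(uuid.uuid4()))

    @classmethod
    def parse(cls, text):
        parts = text.split("/")
        if len(parts) != 4 or not all(parts):
            raise ValueError(f"expected org/namespace/type/uuid, got {text!r}")
        return cls(*parts)

    def type_topic(self):
        # The three-part prefix shared by all instances of this agent type.
        return f"{self.organization}/{self.namespace}/{self.agent_type}"

    def __str__(self):
        return f"{self.type_topic()}/{self.agent_uuid}"
```

For example, `AgentID.parse("acme/prod/finance/1234")` yields an ID whose `type_topic()` is `acme/prod/finance`; the organization and namespace values here are made up.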

Topic structure and message routing 

Messages are routed based on topic structures, which define how they reach their destination: 

  1. One-to-many: Organization/Namespace/Agent-type
    • Targets all agents of a specific type within the namespace.
    • The fan-out parameter controls whether messages go to all instances or a subset.
  2. One-to-one: Organization/Namespace/Agent-type/Agent-UUID
    • Sends messages to a specific agent instance based on its unique identifier. 
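The two topic shapes can be told apart by the number of path segments. The sketch below assumes topics and agent IDs are plain slash-separated strings; the function names `split_topic` and `topic_matches` are hypothetical, and the fan-out parameter is left out because the article does not describe its form.

```python
def split_topic(topic):
    """Return (org, namespace, agent_type, agent_uuid); uuid is None for one-to-many."""
    parts = topic.split("/")
    if len(parts) == 3:
        return parts[0], parts[1], parts[2], None
    if len(parts) == 4:
        return parts[0], parts[1], parts[2], parts[3]
    raise ValueError(f"topic must have 3 or 4 segments: {topic!r}")


def topic_matches(topic, agent_id):
    """True if an agent with this fully qualified ID is addressed by the topic."""
    org, namespace, agent_type, agent_uuid = split_topic(topic)
    a_org, a_namespace, a_type, a_uuid = agent_id.split("/")
    if (org, namespace, agent_type) != (a_org, a_namespace, a_type):
        return False
    # A three-part topic addresses every instance; a four-part topic only one.
    return agent_uuid is None or agent_uuid == a_uuid
```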

The gateway maintains subscription tables to efficiently map incoming messages to their intended recipients. It also keeps track of active agents and their connections via two connection tables:

  • Agent-to-connection table: Maps agent identifiers to active network connections.
  • Reverse connection table: Allows efficient cleanup when connections drop by mapping connections back to their agents. 
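To show why the reverse table makes cleanup cheap, here is a minimal sketch of the two connection tables as plain dictionaries. The class and method names are assumptions, as are the string connection identifiers; the gateway's actual data structures are not shown in the article.

```python
class ConnectionTables:
    """Illustrative forward and reverse connection tables."""

    def __init__(self):
        self.agent_to_conn = {}   # agent ID -> connection ID
        self.conn_to_agents = {}  # connection ID -> set of agent IDs

    def register(self, agent_id, conn_id):
        # If the agent reconnects on a new connection, drop the stale reverse entry.
        old = self.agent_to_conn.get(agent_id)
        if old is not None and old != conn_id:
            self.conn_to_agents[old].discard(agent_id)
            if not self.conn_to_agents[old]:
                del self.conn_to_agents[old]
        self.agent_to_conn[agent_id] = conn_id
        self.conn_to_agents.setdefault(conn_id, set()).add(agent_id)

    def drop_connection(self, conn_id):
        # The reverse table lists exactly the agents to remove, so no full scan is needed.
        agents = self.conn_to_agents.pop(conn_id, set())
        for agent_id in agents:
            del self.agent_to_conn[agent_id]
        return sorted(agents)
```

Without the reverse table, handling a dropped connection would require scanning every entry in the agent-to-connection table to find the affected agents.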

Optimized subscription tables 

To ensure efficient routing, subscriptions are structured hierarchically: 

  • Main table: Maps organization/namespace pairs to agent-type-specific tables.
  • Agent-type tables: Track agents within each category for quick lookups. 

This structure optimizes message delivery by reducing lookup overhead and ensuring efficient memory access. 
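The two-level layout can be sketched as a dictionary of dictionaries. This is an assumption about one possible in-memory shape, with hypothetical class and method names; it is not the gateway's implementation.

```python
class SubscriptionTable:
    """Illustrative two-level table: (org, namespace) -> agent type -> instance UUIDs."""

    def __init__(self):
        self._main = {}

    def subscribe(self, org, namespace, agent_type, agent_uuid):
        type_tables = self._main.setdefault((org, namespace), {})
        type_tables.setdefault(agent_type, set()).add(agent_uuid)

    def unsubscribe(self, org, namespace, agent_type, agent_uuid):
        type_tables = self._main.get((org, namespace))
        if type_tables is None or agent_type not in type_tables:
            return
        type_tables[agent_type].discard(agent_uuid)
        # Prune empty tables so lookups never walk stale entries.
        if not type_tables[agent_type]:
            del type_tables[agent_type]
        if not type_tables:
            del self._main[(org, namespace)]

    def lookup(self, org, namespace, agent_type):
        # Two dictionary lookups reach the candidate set, regardless of table size.
        return sorted(self._main.get((org, namespace), {}).get(agent_type, ()))
```

A lookup touches only the entries for one organization/namespace pair and one agent type, which is the reduced lookup overhead described above.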

Communication patterns 

The gateway uses the following three message forwarding strategies to route messages efficiently based on the intended recipients.  

Unicast forwarding 

Messages are sent to a specific agent instance using a fully qualified topic. The system validates the destination and ensures point-to-point delivery. 

Broadcast forwarding 

Messages are sent to all agents of a particular type by omitting the agent-UUID and using a fan-out mechanism. This is useful for distributing system-wide updates or commands. 

Anycast forwarding 

A message is sent to one randomly selected instance of an agent type, ensuring load balancing while minimizing unnecessary message duplication. 
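The three strategies differ only in how recipients are chosen from the instances of an agent type. The following sketch makes that explicit; the function name, the strategy strings, and the injectable random source are assumptions for illustration.

```python
import random


def select_recipients(instances, strategy, target_uuid=None, rng=random):
    """Pick recipient UUIDs from the known instances of one agent type."""
    instances = list(instances)
    if strategy == "unicast":
        # Point-to-point: the destination must be a known instance.
        if target_uuid not in instances:
            raise LookupError(f"unknown agent instance: {target_uuid!r}")
        return [target_uuid]
    if strategy == "broadcast":
        # Fan out to every instance of the type.
        return instances
    if strategy == "anycast":
        # One randomly selected instance, which spreads load across instances.
        if not instances:
            raise LookupError("no instances available for anycast")
        return [rng.choice(instances)]
    raise ValueError(f"unknown strategy: {strategy!r}")
```

Passing a seeded `random.Random` as `rng` makes the anycast choice reproducible in tests.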

Event handling and message processing 

Connection events 

  • On connection: The system authenticates the agent and updates its connection tables.
  • On disconnection: The system removes the agent’s entries, updates subscription tables, and notifies other components if needed. 

Subscription management 

  • Subscribe events: Register an agent for receiving messages of a specific topic.
  • Unsubscribe events: Remove an agent from subscription lists and propagate updates. 

Message handling 

  • The system validates incoming messages, matches them against active subscriptions, and forwards them accordingly.
  • If no direct match is found, default routing strategies are applied to ensure message delivery. 

Proof of concept: Multi-agent software using distributed agent runtime (Agent Gateway and AGP) 

This section explores a distributed agent runtime for multi-agent software using an event-driven publish-subscribe model. The system leverages an agent gateway and the agent gateway protocol (AGP) to facilitate messaging between agents built on heterogeneous agentic frameworks. Four remote agents, running on different hosts, connect to a remote gateway and use topics for structured communication.
 
System design

One gateway host (gRPC server): Manages core logic and request processing. 

Four gateway clients (gRPC clients): Send requests to invoke gateway functions for agent interactions. Each client is part of a LangGraph-based or AutoGen-based agentic application.


Components 

| Agents & tools | Framework | Protocol | Publishes to | Subscribes to | Host/Deployment | Runtime |
|---|---|---|---|---|---|---|
| IOA Agent Gateway | gRPC | AGP | - | - | EC2 instance #1 | gRPC Agent Host Servicer |
| Writer Agent | LangGraph | AGP | GroupChat Topic | Group Chat Topic, Writer Topic | EC2 instance #4 | gRPC Agent Runtime Worker #1 |
| Reviewer Agent | AutoGen | AGP | GroupChat Topic | Group Chat Topic, Editor Topic | EC2 instance #2 | gRPC Agent Runtime Worker #2 |
| User Interface Agent | AutoGen | AGP | - | UI Topic | Local Machine | gRPC Agent Runtime Worker #3 |
| Central Orchestrator Agent | AutoGen | AGP | UI Topic | Group Chat Topic | EC2 instance #3 | gRPC Agent Runtime Worker #4 |
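To make the publish/subscribe wiring in the components table easier to trace, the sketch below transcribes the four agent rows into a dictionary and asks which agents receive a message on a given topic. The short topic keys and the helper names are assumptions, and the "GroupChat" and "Group Chat" spellings are treated as the same topic.

```python
# Rows transcribed from the components table. Topic names are shortened, and
# "GroupChat" / "Group Chat" are assumed to refer to one topic.
AGENTS = {
    "Writer Agent": {"publishes": {"GroupChat"}, "subscribes": {"GroupChat", "Writer"}},
    "Reviewer Agent": {"publishes": {"GroupChat"}, "subscribes": {"GroupChat", "Editor"}},
    "User Interface Agent": {"publishes": set(), "subscribes": {"UI"}},
    "Central Orchestrator Agent": {"publishes": {"UI"}, "subscribes": {"GroupChat"}},
}


def receivers(topic):
    """Agents that would be delivered a message published to this topic."""
    return sorted(name for name, cfg in AGENTS.items() if topic in cfg["subscribes"])


def publishers(topic):
    """Agents that publish to this topic."""
    return sorted(name for name, cfg in AGENTS.items() if topic in cfg["publishes"])


if __name__ == "__main__":
    for topic in ("GroupChat", "UI"):
        print(topic, "->", receivers(topic))
```

Under these assumptions, a message on the group chat topic reaches the writer, the reviewer, and the orchestrator, while a message on the UI topic reaches only the user interface agent.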

Flow diagram

[Figure: flow diagram of the proof-of-concept system]

Key characteristics 

  1. Distributed agents with gRPC interconnectivity
    • Agents run on different hosts/machines, with no direct knowledge of each other
    • gRPC provides high-performance, language-agnostic communication between components
    • Geographic distribution becomes possible, enabling global-scale agent systems
  2. Event-driven communication through pub-sub messaging
    • Agents utilize a publish-subscribe model where messages are published to topics
    • The runtime ensures message delivery based on subscriptions
    • This event-driven approach decouples senders from receivers, enhancing system flexibility
    • Agents can respond dynamically to system events without tight coupling
  3. Heterogeneous framework integration
    • Agents are implemented using different frameworks such as LangGraph and AutoGen
    • Each runtime instance typically hosts a single agent, enhancing fault isolation 

The proof of concept validates that these integrated technologies and architectural patterns work together to create a scalable, flexible system capable of supporting heterogeneous AI workflows across organizational boundaries.

Choosing the right architecture 

The decision between standalone and distributed runtime architectures depends on specific application requirements: 
 
Choose standalone runtime when: 

  • Building simple prototypes
  • Working with a limited number of agents that share common resources
  • Operating within a single development team
  • Performance is critical and inter-process communication would add unacceptable overhead

Choose distributed runtime when: 

  • Incorporating agents from multiple sources or organizations
  • Building complex systems that may scale beyond a single machine
  • Supporting diverse programming languages and frameworks
  • Creating resilient systems that require workload distribution
  • Developing enterprise-grade software with long-term extensibility requirements 

Benefits of the distributed runtime paradigm

The distributed agent runtime delivers multiple advantages for AI systems:

Cross-organizational integration 

  • Agents from different organizations can seamlessly interact
  • Third parties can leverage existing agents without rebuilding them
  • Agent capabilities become accessible services over networks 

Development and operational efficiency 

  • Teams can develop specialized agents independently
  • Workloads can be distributed across machines for better scaling
  • Runtime infrastructure handles agent lifecycle management 

Build with us at the AGNTCY

This architecture enables unprecedented integration possibilities while simplifying the development of complex, collaborative AI systems that work effectively across organizational boundaries. 

To learn more, collaborate with us at the AGNTCY, an open source collective building the infrastructure for the Internet of Agents.
