Deep Dive into Markus Architecture: Memory, A2A Protocol & Multi-Agent Runtime

Markus Team
15 min read

Meta Description: Explore the Markus multi-agent architecture — a production-grade cognitive runtime featuring Tulving three-tier memory, Agent-to-Agent (A2A) protocol, Cognitive Preparation Pipeline, 9-state task governance, and Heartbeat-driven autonomous agents. Learn how AI agents think, remember, and collaborate.


1. Introduction

As AI agents evolve from simple chatbots into autonomous digital employees, the underlying architecture must support memory persistence, inter-agent communication, task governance, and self-directed operation. Markus is an open-source multi-agent runtime that takes a principled approach to all of these challenges.

Inspired by cognitive psychology, distributed systems, and production-grade software engineering, Markus provides a complete infrastructure for deploying teams of AI agents that can remember past interactions, communicate with each other, delegate tasks, follow governance policies, and even initiate work on their own through a Heartbeat mechanism.

This deep dive explores the core architectural components that make Markus a compelling choice for developers building multi-agent systems in production. We’ll cover:

  • The three-layer architecture (Web UI → Org Manager → Agent Runtime)
  • Tulving three-tier memory (Procedural, Semantic, Episodic) and the Dream Cycle
  • The A2A protocol (Agent-to-Agent communication) with mailbox system and attention controller
  • Cognitive Preparation Pipeline (CPP) with four depth levels
  • Task governance: 9-state state machine, approval gates, trust levels, and workspace isolation
  • The Heartbeat mechanism for proactive agent behavior

2. The Three-Layer Architecture

Markus follows a clean separation of concerns with three distinct layers. Each layer has a clearly defined responsibility, and they communicate through well-defined interfaces.

┌──────────────────────────────────────────────────────────────┐
│               Web UI (React + Vite + Tailwind)               │
│   Dashboard · Chat · Project Management · Builder · Hub      │
└──────────────────────────────┬───────────────────────────────┘
                               │ REST + WebSocket
┌──────────────────────────────┴───────────────────────────────┐
│                   Org Manager (API Server)                   │
│   Authentication · Task Governance · Project Management      │
│   Reporting · User Management                                │
└──────────────────────────────┬───────────────────────────────┘
                               │
┌──────────────────────────────┴───────────────────────────────┐
│                 Agent Runtime (Core Runtime)                 │
│   Agent · LLM Router · Tool System · Memory · Heartbeat      │
│   Mailbox · Attention Controller · Context Engine            │
└──────────┬──────────────────────────────┬────────────────────┘
           │                              │
┌──────────┴──────────────────┐ ┌─────────┴────────────────────┐
│ Storage (SQLite/PostgreSQL) │ │ Comms Bridges                │
│                             │ │ Slack · Feishu · WhatsApp    │
│                             │ │ Telegram · WeCom             │
└─────────────────────────────┘ └──────────────────────────────┘

2.1 Web UI (Presentation Layer)

The frontend is built with React + Vite + Tailwind CSS, providing a responsive dashboard that works across desktop and mobile. It offers workspaces for chat, project management, agent configuration (Builder), capability discovery (Hub), and system settings. Communication with the server layer uses REST for standard CRUD operations and WebSocket for real-time updates like task status changes and agent messages.

2.2 Org Manager (API / Governance Layer)

The Org Manager serves as the central API server. It handles:

  • Authentication & Authorization — user and agent identity management
  • Task Governance — state machine transitions, approval routing, and policy enforcement
  • Project Management — project creation, milestone tracking, deliverable management
  • Reporting & User Management — audit logs, team organization

This layer is stateless from the agent perspective; it orchestrates governance without interfering with agent execution logic.

2.3 Agent Runtime (Core Cognitive Layer)

The Agent Runtime is where the actual intelligence lives. It manages:

  • Agent lifecycle — creation, session management, sub-agent spawning
  • LLM Router — intelligent model selection, failover, circuit breaker
  • Tool System — tool registration, execution, sandboxing
  • Memory System — Procedural, Semantic, and Episodic tiers
  • A2A Communication — mailbox, message routing, delegation
  • Heartbeat Scheduler — autonomous periodic task execution
  • Context Engine — 24-segment system prompt assembly with KV-cache optimization

The runtime is designed so that multiple agents can coexist, each with isolated workspaces and independent memory profiles.


3. Tulving Three-Tier Memory System

Markus implements a three-tier memory architecture, named after cognitive psychologist Endel Tulving, that mirrors human memory systems. This is a defining feature of its cognitive architecture and one of the key differentiators from simpler agent frameworks.

┌─────────────────────────────────────────────────────────────────┐
│                        Memory System Overview                    │
├─────────────┬─────────────────┬─────────────────────────────────┤
│  Procedural │    Semantic      │         Episodic                │
│  ("How to") │  ("Know what")   │      ("What happened")          │
├─────────────┼─────────────────┼─────────────────────────────────┤
│  ROLE.md    │  MEMORY.md      │  sessions/*.json (current)      │
│  Skills     │  memories.json  │  SQLite agent_activities         │
│  Behavior   │  Long-term      │  (past activities, on-demand)   │
│  Rules      │  Knowledge      │                                  │
└─────────────┴─────────────────┴─────────────────────────────────┘

3.1 Procedural Memory — “How to Act”

Procedural memory encodes the agent’s identity, behavioral rules, and skill definitions. It answers the question: Who am I, and how should I behave?

| Aspect | Detail |
|---|---|
| Storage | role/ROLE.md + Skill definition files |
| Content | Agent identity, system prompts, behavior boundaries, action policies |
| Loading | Prepended to the system prompt at every inference cycle |
| Mutability | ROLE.md is immutable by the agent; only human users can modify core identity |

This layer ensures that an agent cannot rewrite its own fundamental character. It creates a stable anchor for identity, preventing drift during extended autonomous operation.

3.2 Semantic Memory — “What I Know”

Semantic memory stores factual knowledge, verified patterns, workflows, and domain expertise. It is the agent’s accumulated long-term knowledge base.

| Aspect | Detail |
|---|---|
| Storage | MEMORY.md (curated, always in prompt) + memories.json (observation buffer, searchable) |
| Capacity | MEMORY.md: 3,000 characters per section, 15,000 total |
| Key Tools | memory_save (save observation), memory_search (retrieve), memory_update_longterm (consolidate to MEMORY.md) |

Semantic memory is the primary mechanism for learning from experience. The agent saves observations, searches for relevant knowledge during tasks, and periodically consolidates important patterns into its permanent knowledge base.
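To make the save, search, and consolidate loop concrete, here is a minimal in-memory sketch. Only the two character limits come from the table above. The class, the method names (camel-cased versions of the tool names), the tag field, and the keyword matching are illustrative assumptions, not the Markus implementation.

```typescript
// Limits taken from the capacity row above; everything else is an assumption.
const SECTION_LIMIT = 3000;
const TOTAL_LIMIT = 15000;

interface Observation {
  id: number;
  text: string;
  tags: string[];
}

class SemanticMemory {
  private observations: Observation[] = [];      // stands in for memories.json
  private longTerm = new Map<string, string>();  // stands in for MEMORY.md sections
  private nextId = 1;

  // Append an observation to the searchable buffer.
  memorySave(text: string, tags: string[] = []): number {
    const id = this.nextId++;
    this.observations.push({ id, text, tags });
    return id;
  }

  // Naive case-insensitive keyword match over text and tags.
  memorySearch(query: string): Observation[] {
    const q = query.toLowerCase();
    return this.observations.filter(
      (o) =>
        o.text.toLowerCase().indexOf(q) !== -1 ||
        o.tags.some((t) => t.toLowerCase() === q),
    );
  }

  // Replace one section, rejecting writes that would break either limit.
  memoryUpdateLongterm(section: string, content: string): boolean {
    if (content.length > SECTION_LIMIT) return false;
    let total = content.length;
    this.longTerm.forEach((body, name) => {
      if (name !== section) total += body.length;
    });
    if (total > TOTAL_LIMIT) return false;
    this.longTerm.set(section, content);
    return true;
  }
}
```

Rejecting an oversized write, rather than truncating it silently, keeps the curated file predictable, since its contents are always in the prompt.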

3.3 Episodic Memory — “What Happened”

Episodic memory records the agent’s past experiences — tasks it performed, messages it received, sessions it participated in.

| Aspect | Detail |
|---|---|
| Storage | sessions/*.json (current + recent sessions), SQLite agent_activities (historical) |
| Retrieval | recall_activity tool: query by task, type, or keyword |
| Use Case | Contextual awareness, learning from past outcomes, continuity across sessions |

Unlike semantic memory, which stores generalized knowledge, episodic memory preserves specific experiences. This allows agents to answer questions like “What did I work on yesterday?” or “How did I solve that similar problem last time?”
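A query by task, type, or keyword can be pictured as a simple filter over an activity log. The sketch below is hypothetical: the record shape and the recallActivity function are stand-ins for the recall_activity tool, whose real parameters are not documented here.

```typescript
// Hypothetical record shape; field names are illustrative assumptions.
interface Activity {
  taskId: string;
  type: string;     // e.g. "task", "message", "session"
  summary: string;
}

interface RecallQuery {
  taskId?: string;
  type?: string;
  keyword?: string;
}

// Return every activity that matches all of the criteria that were supplied.
function recallActivity(log: Activity[], query: RecallQuery): Activity[] {
  const keyword =
    query.keyword === undefined ? undefined : query.keyword.toLowerCase();
  return log.filter(
    (a) =>
      (query.taskId === undefined || a.taskId === query.taskId) &&
      (query.type === undefined || a.type === query.type) &&
      (keyword === undefined ||
        a.summary.toLowerCase().indexOf(keyword) !== -1),
  );
}
```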

3.4 The Dream Cycle — Memory Consolidation

Markus features an autonomous memory consolidation process called the Dream Cycle, inspired by how human brains consolidate memories during sleep.

Trigger: memories.json > 50 entries AND not run today
    │
    ▼
LLM reviews all observations
    ├── Merge duplicates
    ├── Prune outdated entries
    ├── Identify recurring patterns
    │
    ▼
Pattern appears 3+ times?
    ├── Yes → Promote to MEMORY.md
    └── No  → Retain or discard
    │
    ▼
Prune source entries from memories.json

The Dream Cycle ensures that:

  • Noise is filtered out — one-off events don’t clutter long-term memory
  • Patterns are promoted — recurring observations graduate to permanent knowledge
  • Storage is bounded — the observation buffer stays within reasonable limits

This is a critical feature for long-running agents that accumulate thousands of observations over time. Without consolidation, memory would become unwieldy and retrieval would degrade.
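The consolidation flow can be sketched as a pure function. One loud assumption: in Markus an LLM performs the review, whereas this sketch substitutes a normalized exact-match count so the example stays deterministic. The thresholds (more than 50 entries, 3 or more occurrences) come from the flow above. All type and function names are hypothetical.

```typescript
interface Entry {
  text: string;
}

interface DreamResult {
  promoted: string[];   // patterns that would be written to MEMORY.md
  remaining: Entry[];   // what stays in the observation buffer
}

const TRIGGER_SIZE = 50; // "memories.json > 50 entries"
const PROMOTE_AT = 3;    // "pattern appears 3+ times"

// Dates are plain strings such as "2025-01-01" in this sketch.
function shouldDream(entries: Entry[], lastRun: string | null, today: string): boolean {
  return entries.length > TRIGGER_SIZE && lastRun !== today;
}

// ASSUMPTION: normalized exact matching stands in for the LLM review step.
function dreamCycle(entries: Entry[]): DreamResult {
  const key = (e: Entry): string => e.text.trim().toLowerCase();

  const counts = new Map<string, number>();
  entries.forEach((e) => counts.set(key(e), (counts.get(key(e)) ?? 0) + 1));

  const promotedSet = new Set<string>();
  counts.forEach((n, k) => {
    if (n >= PROMOTE_AT) promotedSet.add(k);
  });

  // Prune the sources of promoted patterns and merge remaining duplicates.
  const seen = new Set<string>();
  const remaining = entries.filter((e) => {
    const k = key(e);
    if (promotedSet.has(k) || seen.has(k)) return false;
    seen.add(k);
    return true;
  });

  return { promoted: Array.from(promotedSet), remaining };
}
```

Pruning of outdated entries is left out because it depends on judgment the LLM supplies and no rule for it is given here.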


4. A2A Agent-to-Agent Communication Protocol

Agents don’t work in isolation; they communicate. Markus implements its own A2A (Agent-to-Agent) protocol, designed specifically for communication between AI agents and built on a persistent mailbox system.

Agent A                    Mailbox DB               Agent B
   │                          │                        │
   │── agent_send_message ──►│  (queued as INBOX)      │
   │                          │                        │
   │                          │── context switched ──► │
   │                          │   (picked from MAIL)   │
   │                          │                        │
   │◄── agent_send_message ──│─────────────────────────│
   │   (reply, wait_for_reply│                        │
   │    = true)              │                        │

4.1 Mailbox System

Every agent has a persistent mailbox stored in the database:

  • OUTBOX — Messages the agent has sent (for audit trail)
  • INBOX — Incoming messages waiting to be processed
  • MAIL — Processed messages (archived)
  • PARKED — Messages addressed but not yet picked up by their target agent

Messages are asynchronous by default — sending does not block either the sender or the receiver. The receiver processes messages on its own schedule during context switches.
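As a rough illustration of the asynchronous default, the sketch below models a mailbox store in memory. It covers only OUTBOX, INBOX, and MAIL. PARKED is omitted because its exact transitions are not described here. The class, its methods, and the message shape are assumptions, not the actual Markus schema.

```typescript
type Folder = "OUTBOX" | "INBOX" | "MAIL";

interface Message {
  id: number;
  from: string;
  to: string;
  body: string;
}

class MailboxStore {
  private boxes = new Map<string, Record<Folder, Message[]>>();
  private nextId = 1;

  private boxOf(agent: string): Record<Folder, Message[]> {
    let box = this.boxes.get(agent);
    if (!box) {
      box = { OUTBOX: [], INBOX: [], MAIL: [] };
      this.boxes.set(agent, box);
    }
    return box;
  }

  // Asynchronous by default: enqueue and return immediately.
  send(from: string, to: string, body: string): number {
    const msg: Message = { id: this.nextId++, from, to, body };
    this.boxOf(from).OUTBOX.push(msg); // audit trail for the sender
    this.boxOf(to).INBOX.push(msg);    // waits until the receiver is ready
    return msg.id;
  }

  // Called by the receiver on its own schedule; archives INBOX -> MAIL.
  processNext(agent: string): Message | undefined {
    const box = this.boxOf(agent);
    const msg = box.INBOX.shift();
    if (msg) box.MAIL.push(msg);
    return msg;
  }

  count(agent: string, folder: Folder): number {
    return this.boxOf(agent)[folder].length;
  }
}
```

Because send only appends to two queues, neither side waits on the other. The receiver drains its INBOX whenever its own loop reaches the mailbox.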

4.2 Synchronous vs. Asynchronous Communication

| Mode | Tool | Behavior | Use Case |
|---|---|---|---|
| Async | agent_send_message (default) | Fire-and-forget; sender continues immediately | Status updates, notifications, non-blocking coordination |
| Sync | agent_send_message({ wait_for_reply: true }) | Sender blocks until receiver responds | Questions requiring immediate answers, decisions |

The wait_for_reply: true mode is powerful but should be used judiciously — it pauses the sending agent’s execution until the receiver responds.

4.3 Attention Controller

Linked to the mailbox system is the Attention Controller, which determines how the agent spends its cognitive cycles. In each execution loop, the agent:

  1. Checks for high-priority tasks (blockers, reviews, urgent messages)
  2. Checks mailbox for new A2A messages
  3. Processes pending tasks in priority order

This ensures that an agent doesn’t get stuck on a single task while urgent messages pile up.
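The three-step loop above amounts to a small priority decision. The sketch below is one possible reading, not the actual controller. It treats urgency as a flag on tasks (urgent messages are not modelled separately) and assumes a larger priority number means more important. All names are hypothetical.

```typescript
interface PendingTask {
  id: string;
  priority: number; // ASSUMPTION: larger means more important
  urgent: boolean;  // blockers, reviews, and similar high-priority work
}

type Action =
  | { kind: "task"; taskId: string }
  | { kind: "mail" }
  | { kind: "idle" };

function highestPriority(tasks: PendingTask[]): PendingTask | undefined {
  return tasks.slice().sort((a, b) => b.priority - a.priority)[0];
}

// Decide what the agent should spend its next cycle on.
function nextAction(tasks: PendingTask[], unreadMessages: number): Action {
  // 1. High-priority work always wins.
  const urgent = highestPriority(tasks.filter((t) => t.urgent));
  if (urgent) return { kind: "task", taskId: urgent.id };

  // 2. New A2A messages come before ordinary tasks.
  if (unreadMessages > 0) return { kind: "mail" };

  // 3. Otherwise, take pending tasks in priority order.
  const next = highestPriority(tasks);
  return next ? { kind: "task", taskId: next.id } : { kind: "idle" };
}
```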


5. Cognitive Preparation Pipeline (CPP)

The Cognitive Preparation Pipeline assembles the agent’s running context before every inference call. It draws on an ordered list of context segments and operates at one of four depth levels, which determine how many of those segments are included.

Context Assembly

Incoming Request
        │
        ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ 1. System Message     — Role identity, behavioral rules, system context     │
│ 2. Profile            — Agent name, role, team, assigned skills             │
│ 3. Environment        — OS, tools, resources, installed packages           │
│ 4. State (varies)     — Task details, context window management             │
│ 5. Instructions       — Tool usage rules, process guidelines                │
│ 6. Notifications      — Task status, DAG resolution, system updates        │
│ 7. Skills             — Activated skill instructions / MCP definitions      │
│ 8. Memory             — Semantic (MEMORY.md) + Episodic (recall results)    │
│ 9. Conversation       — Recent tool calls, results, and decisions           │
│ 10. Long Context      — Past conversation history (compressed if large)    │
│ 11. Tools             — Available function definitions                      │
│ 12. Tool Results      — Results from tool calls in current cycle            │
│ 13+ ...               — Additional dynamic context                          │
└─────────────────────────────────────────────────────────────────────────────┘

CPP Depth Levels

| Level | Name | Segments | When Used |
|---|---|---|---|
| 0 | Minimal | 1-5 (skips memory search) | Simple tool calls, context switch decisions |
| 1 | Standard | 1-12 (with memory recall) | Normal task execution, most operations |
| 2 | Extended | 1-10 (without tools) | Planning, decomposition, reflection |
| 3 | Full | 1-10 (with long context) | Complex reasoning, architecture decisions, task creation |

The system optimizes context size by excluding irrelevant segments at shallow depths, reducing token consumption by up to 60% for simple operations.
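Depth selection can be read as a lookup from level to segment range. The sketch below transcribes the ranges from the table as-is. How the parenthetical qualifiers (for example "with long context") change what a segment contains is not specified here, so they are carried only as notes. The constant and function names are hypothetical.

```typescript
// Segment names in the order listed in the Context Assembly box.
const SEGMENTS = [
  "System Message", "Profile", "Environment", "State", "Instructions",
  "Notifications", "Skills", "Memory", "Conversation", "Long Context",
  "Tools", "Tool Results",
] as const;

type Depth = 0 | 1 | 2 | 3;

interface DepthSpec {
  name: string;
  last: number; // highest segment number included at this depth
  note: string; // qualifier from the table, kept verbatim
}

const DEPTHS: Record<Depth, DepthSpec> = {
  0: { name: "Minimal", last: 5, note: "skips memory search" },
  1: { name: "Standard", last: 12, note: "with memory recall" },
  2: { name: "Extended", last: 10, note: "without tools" },
  3: { name: "Full", last: 10, note: "with long context" },
};

// Return the names of the segments assembled at the given depth.
function segmentsFor(depth: Depth): string[] {
  return SEGMENTS.slice(0, DEPTHS[depth].last);
}
```

At depth 0 only five of the twelve named segments are assembled, which is where the token savings for simple operations come from.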

KV-Cache Optimization

One of Markus’s more advanced features is multi-agent KV-cache management:

  • Each agent maintains an independent KV-cache session with the LLM provider
  • Agent-specific caching ensures inference efficiency across an agent’s session lifespan
  • Context switching between agents preserves cache state, avoiding redundant computation when the agent resumes its session

This is a deceptively important optimization. In a 10-agent team, without proper caching, an agent would pay cold-start latency every time it’s called upon. With KV-caching, context switches are near-instantaneous.


6. Task Governance System

Task governance is the backbone of Markus’s reliability layer. Every piece of work — from code changes to content creation — flows through a precisely defined state machine.

6.1 9-State Finite State Machine

  pending ─► in_progress ─► review ─► completed
     │            │            │
     │            ▼            │
     └───► blocked ◄───────────┘


  (Also: rejected, archived, failed, cancelled)

| State | Description |
|---|---|
| pending | Created but not yet started |
| in_progress | Assigned and actively being worked on |
| blocked | Waiting on external dependency |
| review | Submitted for peer review |
| completed | Approved by reviewer |
| failed | Execution error, agent or tool failure |
| rejected | Requirement rejected; will not be pursued |
| cancelled | Explicitly cancelled by manager |
| archived | Stored for historical reference |
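A state machine like this is commonly enforced with a transition table. The sketch below encodes only the edges drawn in the diagram, plus the return from review to in_progress described under the review workflow. The edges out of blocked and into failed, rejected, cancelled, and archived are not given in this article, so they are left empty rather than guessed. TRANSITIONS and transition are hypothetical names.

```typescript
type TaskState =
  | "pending" | "in_progress" | "blocked" | "review" | "completed"
  | "failed" | "rejected" | "cancelled" | "archived";

// Allowed next states. Empty arrays mark edges this article does not specify.
const TRANSITIONS: Record<TaskState, TaskState[]> = {
  pending: ["in_progress", "blocked"],
  in_progress: ["review", "blocked"],
  review: ["completed", "in_progress", "blocked"],
  blocked: [],
  completed: [],
  failed: [],
  rejected: [],
  cancelled: [],
  archived: [],
};

// Return the new state, or throw if the move is not in the table.
function transition(from: TaskState, to: TaskState): TaskState {
  if (TRANSITIONS[from].indexOf(to) === -1) {
    throw new Error(`Illegal transition: ${from} -> ${to}`);
  }
  return to;
}
```

Keeping the table as data means a governance layer can reject an illegal jump, such as pending straight to completed, before any agent code runs.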

6.2 Submit-Review-Merge (SRM) Workflow

This is Markus’s built-in quality gate. No deliverable can be completed without going through this cycle:

  1. Submit: Worker agent calls task_submit_review with summary and deliverable references
  2. Review: The designated reviewer_agent_id inspects the submission
  3. Approve: Reviewer marks the task as completed (auto-completes)
  4. Reject: Task returns to in_progress with feedback
  5. Retry: Worker revises and resubmits

This enforces a four-eyes principle on every deliverable, preventing single-agent mistakes from reaching production.
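The submit, review, and retry loop can be sketched as two small functions. This is a hedged illustration: submitForReview and decide are hypothetical stand-ins (the article names only task_submit_review and reviewer_agent_id), and the task shape is an assumption.

```typescript
type SrmState = "in_progress" | "review" | "completed";

interface ReviewableTask {
  state: SrmState;
  reviewerAgentId: string; // the designated reviewer
  feedback: string[];      // accumulated rejection notes
}

// Step 1: the worker hands the deliverable over for review.
function submitForReview(task: ReviewableTask): ReviewableTask {
  if (task.state !== "in_progress") {
    throw new Error("Only in-progress work can be submitted");
  }
  return { ...task, state: "review" };
}

// Steps 2-4: only the designated reviewer may approve or reject.
function decide(
  task: ReviewableTask,
  reviewerId: string,
  approved: boolean,
  feedback: string = "",
): ReviewableTask {
  if (task.state !== "review") throw new Error("Task is not under review");
  if (reviewerId !== task.reviewerAgentId) {
    throw new Error("Not the designated reviewer");
  }
  if (approved) return { ...task, state: "completed" };
  // Rejection sends the task back to the worker with feedback attached.
  return {
    ...task,
    state: "in_progress",
    feedback: task.feedback.concat(feedback),
  };
}
```

A rejected task re-enters the same loop: the worker revises, calls submitForReview again, and the reviewer decides once more.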

6.3 Trust Levels and Approval Gates

Agents build trust over time based on their delivery track record:

| Level | Autonomy | Review Requirement | Promotion Criteria |
|---|---|---|---|
| Probation | Low | All tasks reviewed by senior agent | Successful deliveries |
| Standard | Medium | Complex tasks require review | Consistent quality |
| Trusted | High | Only significant tasks need review | Track record of first-pass approvals |
| Senior | Full | Can review others’ work | Mentorship and quality leadership |

Approval gates sit between task creation and execution, allowing organizations to define which operations require human sign-off before proceeding.
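One way to read the trust table is as a threshold: the higher the trust level, the larger a task must be before review is required. The sketch below rests on several assumptions that the article does not confirm: a three-step complexity scale in which "significant" outranks "complex", and no mandatory review for senior agents in this model. All names are illustrative.

```typescript
type TrustLevel = "probation" | "standard" | "trusted" | "senior";
type Complexity = "routine" | "complex" | "significant";

// ASSUMPTION: complexity is ordered routine < complex < significant.
const RANK: Record<Complexity, number> = {
  routine: 0,
  complex: 1,
  significant: 2,
};

// Smallest complexity that triggers review; null means none in this sketch.
const REVIEW_THRESHOLD: Record<TrustLevel, Complexity | null> = {
  probation: "routine",   // all tasks reviewed
  standard: "complex",    // complex tasks require review
  trusted: "significant", // only significant tasks need review
  senior: null,           // ASSUMPTION: "Full" autonomy means no mandatory review
};

function requiresReview(level: TrustLevel, complexity: Complexity): boolean {
  const threshold = REVIEW_THRESHOLD[level];
  return threshold !== null && RANK[complexity] >= RANK[threshold];
}

// Per the table, only senior agents review others' work.
function canReviewOthers(level: TrustLevel): boolean {
  return level === "senior";
}
```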

6.4 Workspace Isolation

Each agent gets an isolated workspace directory. The system enforces file access boundaries:

  • Agents can read and write outside their own workspace wherever OS permissions allow
  • The one exception: they cannot write to other agents’ workspace directories
  • A shared workspace is readable by all agents but writable only by designated ones

This prevents an agent from accidentally (or intentionally) corrupting another agent’s files while allowing collaboration via the shared space.
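The write rules above reduce to a path check. The sketch below is a simplified illustration using POSIX-style absolute paths and a hand-rolled normalizer. The directory layout, the Layout type, and canWrite are assumptions. A real implementation would also need to resolve symlinks and platform-specific path forms.

```typescript
interface Layout {
  workspaces: { agent: string; dir: string }[]; // one private directory per agent
  shared: string;                               // shared workspace directory
  sharedWriters: string[];                      // agents allowed to write there
}

// Collapse "." and ".." so that "/ws/a/../b" is treated as "/ws/b".
function normalize(p: string): string {
  const out: string[] = [];
  for (const part of p.split("/")) {
    if (part === "" || part === ".") continue;
    if (part === "..") out.pop();
    else out.push(part);
  }
  return "/" + out.join("/");
}

function isInside(dir: string, target: string): boolean {
  const d = normalize(dir);
  const t = normalize(target);
  return t === d || t.indexOf(d + "/") === 0;
}

function canWrite(agent: string, target: string, layout: Layout): boolean {
  // Shared space: writable only by designated agents.
  if (isInside(layout.shared, target)) {
    return layout.sharedWriters.indexOf(agent) !== -1;
  }
  // Another agent's workspace: never writable.
  const foreign = layout.workspaces.some(
    (w) => w.agent !== agent && isInside(w.dir, target),
  );
  if (foreign) return false;
  // Own workspace or elsewhere on disk: left to OS permissions.
  return true;
}
```

Normalizing before comparing matters: without it, a path such as /ws/alice/../bob/x would pass a naive prefix check on the caller's own directory.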


7. The Heartbeat Mechanism

The Heartbeat is what transforms Markus from a reactive system (waiting for input) into a proactive workforce that initiates work autonomously.

7.1 How It Works

Heartbeat Tick (configurable interval: every 60–300s)
    │
    ▼
Check agent's mailbox for unread messages
    │
    ▼
Check pending tasks (any assigned but unstarted?)
    │
    ▼
Check scheduled tasks (any recurring tasks due?)
    │
    ▼
Check own patrol items (defined in HEARTBEAT.md)
    │
    ▼
If nothing urgent: process next pending task

Each agent has a HEARTBEAT.md file where it defines its own personal patrol checklist — things it should regularly check or monitor.
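A single tick can be sketched as a function from a snapshot of the agent's state to a plan of work. Several things here are assumptions made for illustration: the snapshot shape, the string-encoded plan, treating 60 to 300 seconds as hard bounds, and reading "urgent" as unread mail or a due scheduled task.

```typescript
interface AgentSnapshot {
  unreadMessages: number;
  unstartedTasks: string[]; // assigned but not yet started
  dueScheduled: string[];   // recurring tasks that are due
  patrolItems: string[];    // entries from HEARTBEAT.md
}

// ASSUMPTION: the quoted 60-300 s range is treated as hard bounds.
function clampIntervalSeconds(seconds: number): number {
  return Math.min(300, Math.max(60, seconds));
}

// Run the checks in the order shown above and return a plan for this tick.
function heartbeatTick(s: AgentSnapshot): string[] {
  const plan: string[] = [];
  if (s.unreadMessages > 0) plan.push("process_mailbox");
  s.dueScheduled.forEach((id) => plan.push("run_scheduled:" + id));
  s.patrolItems.forEach((item) => plan.push("patrol:" + item));

  // ASSUMPTION: "urgent" means unread mail or a due scheduled task.
  const urgent = s.unreadMessages > 0 || s.dueScheduled.length > 0;
  if (!urgent && s.unstartedTasks.length > 0) {
    plan.push("start_task:" + s.unstartedTasks[0]);
  }
  return plan;
}
```

A scheduler would call heartbeatTick once per interval and hand the resulting plan to the agent's normal execution loop.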

7.2 What Heartbeat Enables

| Scenario | Without Heartbeat | With Markus Heartbeat |
|---|---|---|
| Codebase scan | Need to schedule via CI/CD | Agent scans daily on its own |
| Content publishing | Manual trigger required | Agent publishes on schedule |
| System monitoring | Requires external tools (Prometheus, Datadog) | Agent checks and reports hourly |
| Task management | Need human to assign | Agent picks up pending tasks autonomously |

The Heartbeat makes Markus fundamentally different from chat-based AI tools. Your AI team doesn’t wait for you to give it work. It actively looks for things to do, within the boundaries you’ve set.


8. Context Engine & System Prompt Assembly

Every time an agent processes an input, the Context Engine assembles a system prompt from up to 24 segments. This dynamic assembly ensures that the agent always has the right context for the current operation without wasting tokens on irrelevant information.

Assembly Priority

The system prompt is built in this order:

  1. System Message — Core role identity
  2. Context Window Management — Conversation length tracking, compression triggers
  3. State Overrides — Current attention state (pending mailbox items, task context)
  4. Announcements — System-wide directives from human operators
  5. Policies — Security, workspace, delivery rules
  6. Team Working Norms — Project-specific procedures (NORMS.md)
  7. Notifications — Task status changes, dependency resolutions
  8. Skill Instructions — Active skill documentation
  9. Memory — Curated knowledge (MEMORY.md) + goal-relevant memories
  10. Human Feedback — Direct manager comments
  11. Conversation History — Recent interaction log
  12. Tool Definitions — Available function signatures
  13. Tool Results — Return values from prior tool calls
  14+. Context from Prior Sessions — Compressed session summaries when context window is at risk

Dynamic Compression: When the total context exceeds the LLM’s context window, the system automatically compresses the oldest segments (typically conversation history and prior session summaries) into a condensed form.
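The compression step can be sketched as a loop that shrinks compressible segments until the prompt fits. The assumptions are stated plainly: tokens are estimated at roughly four characters each, the summarizer is passed in as a plain function (in Markus this would presumably be an LLM call), and the sketch walks backwards from the end of the assembly order, where prior-session context sits. None of these names come from the Markus codebase.

```typescript
interface Segment {
  name: string;
  text: string;
  compressible: boolean; // e.g. conversation history, prior session summaries
}

// Crude estimate; ASSUMPTION: about 4 characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

function totalTokens(segments: Segment[]): number {
  return segments.reduce((sum, s) => sum + estimateTokens(s.text), 0);
}

// Compress from the end of the assembly order until the budget is met.
function compressToFit(
  segments: Segment[],
  budget: number,
  summarize: (text: string) => string,
): Segment[] {
  const result = segments.map((s) => ({ ...s })); // leave the input untouched
  for (let i = result.length - 1; i >= 0 && totalTokens(result) > budget; i--) {
    const seg = result[i];
    if (seg && seg.compressible) seg.text = summarize(seg.text);
  }
  return result;
}
```

Segments marked as not compressible, such as the system message, are never touched, so the agent's identity and policies survive even under heavy compression.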


9. Conclusion

The Markus architecture represents a principled approach to building a production-grade multi-agent runtime. It doesn’t take shortcuts — memory is not a vector store hack, communication is not shared chat history, and governance is not an afterthought.

If you are building multi-agent systems for real work, the Markus architecture offers proven solutions to the hard problems:

  • Memory — Three-tier, self-consolidating system inspired by human cognition
  • Communication — A2A protocol with mailbox system and attention controller
  • Governance — 9-state task FSM with trust levels, approval gates, and SRM workflow
  • Proactivity — Heartbeat-driven autonomous operation
  • Extensibility — Skill system with Markus Hub marketplace

Markus is free and open source (AGPL-3.0), available at github.com/markus-global/markus.


Markus is an open source AI Workforce Platform. Install it today with curl -fsSL https://markus.global/install.sh | bash.

Keywords: Markus architecture, multi-agent system, AI agent memory, A2A protocol, agent-to-agent communication, cognitive architecture, task governance, Heartbeat system, agent runtime, Tulving memory, dream cycle, context engine, agent orchestration, open source AI platform
