
Multi-Agent Framework Patterns — Structural Survey


Audience: Wes, designing a 6-18 variant agent fleet
Question: What folder/file structure do production multi-agent frameworks use? Does anything resemble Active/Reference/Logs/Archive?
Method: Pulled docs plus actual GitHub example trees for each framework. Citations inline.
Date: 2026-05-11


Every production framework converges on roughly the same layout:

project-root/
├── src/<package>/
│   ├── agents/            # one file per agent (persona + tools + model)
│   ├── tools/             # one file per tool, shared across agents
│   ├── workflows/         # orchestration / graphs / flows
│   ├── config/            # YAML/JSON declarative configs (optional)
│   ├── state.py           # shared schema (LangGraph) or memory blocks (Letta)
│   └── index.ts | main.py # wire-up entry point
├── knowledge/ | data/     # static reference docs (optional)
├── tests/
├── .env.example
└── package.json | pyproject.toml | langgraph.json | mastra config

Wes’s Active/Reference/Logs/Archive pattern: NO framework recommends it as a code-organization scheme. What every framework DOES recommend is splitting by role/responsibility (agents, tools, workflows, config) — i.e., what a thing IS, not what its lifecycle state is.

That said, the Active/Archive lifecycle distinction maps cleanly to a separate dimension that frameworks handle outside the code tree: D1/database state, vault notes, transcript logs. Wes’s pattern is reinventing what these frameworks push to runtime stores. Specifics in the verdict section.


LangGraph

Source: docs.langchain.com/oss/python/langgraph/application-structure + github.com/langchain-ai/new-langgraph-project + react-agent template

One Python file. Default template puts the whole agent in src/<pkg>/graph.py as a compiled StateGraph. The richer react-agent template splits it into graph.py + state.py + tools.py + prompts.py + context.py + utils.py.

No subfolder-per-agent convention. Multi-agent systems are expressed as nodes in a graph, not as files. Each agent is a function (or sub-graph) added with builder.add_node("agent_name", agent_fn). The langgraph-supervisor library deprecates folder splitting in favor of supervisor agents using tool-handoff. (langgraph-supervisor-py README)

StateGraph(TypedDict) schema — a single shared dict that all nodes read/write. Reducers (Annotated[str, operator.add]) merge concurrent updates. Subgraphs can have private state keys plus shared keys with the parent. (langgraph state docs)
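The reducer mechanism can be illustrated without LangGraph: the merge function rides along as `Annotated` metadata, and any key without metadata is last-write-wins. This `State` and `apply_update` are an illustrative sketch of the idea, not LangGraph internals:

```python
import operator
from typing import Annotated, TypedDict, get_type_hints

# Shared state schema: keys may carry a reducer as Annotated metadata.
class State(TypedDict):
    messages: Annotated[list, operator.add]  # concurrent appends are concatenated
    summary: str                             # plain key: last write wins

def apply_update(state: dict, update: dict) -> dict:
    hints = get_type_hints(State, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        meta = getattr(hints[key], "__metadata__", ())
        if meta:                 # a reducer is attached: merge
            merged[key] = meta[0](merged[key], value)
        else:                    # no reducer: overwrite
            merged[key] = value
    return merged

state = {"messages": ["hi"], "summary": ""}
state = apply_update(state, {"messages": ["node A output"]})
state = apply_update(state, {"messages": ["node B output"], "summary": "done"})
print(state["messages"])  # ['hi', 'node A output', 'node B output']
```

The point: two nodes writing `messages` concurrently both land, while `summary` is simply replaced.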

Checkpointer pattern: MemorySaver, SqliteSaver, PostgresSaver, RedisSaver. State is serialized per thread_id. There is no agent identity file; identity = thread_id + checkpointer.
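The shape of that pattern, sketched with stdlib sqlite3 (a toy `TinyCheckpointer`, not the real SqliteSaver schema or API):

```python
import json
import sqlite3

# Toy checkpointer: serialized state keyed by thread_id, so "agent identity"
# is just an ID plus a row in a store.
class TinyCheckpointer:
    def __init__(self, path: str = ":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS checkpoints (thread_id TEXT PRIMARY KEY, state TEXT)"
        )

    def put(self, thread_id: str, state: dict) -> None:
        self.db.execute(
            "INSERT OR REPLACE INTO checkpoints VALUES (?, ?)",
            (thread_id, json.dumps(state)),
        )

    def get(self, thread_id: str):
        row = self.db.execute(
            "SELECT state FROM checkpoints WHERE thread_id = ?", (thread_id,)
        ).fetchone()
        return json.loads(row[0]) if row else None

cp = TinyCheckpointer()
cp.put("thread-42", {"messages": ["hello"]})
print(cp.get("thread-42"))  # {'messages': ['hello']}
print(cp.get("thread-99"))  # None
```

Resuming a session is just calling `get` with the same thread_id; nothing on disk identifies the agent.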

5. Recommended structure (canonical, from new-langgraph-project)
my-app/
├── src/
│   └── agent/
│       ├── __init__.py
│       └── graph.py      # the only required file
├── tests/
│   ├── integration_tests/
│   └── unit_tests/
├── .env.example
├── langgraph.json        # registers graphs + deps
├── pyproject.toml
└── Makefile

react-agent variant (more realistic):

my-app/
├── src/
│   └── react_agent/
│       ├── __init__.py
│       ├── graph.py      # wire nodes/edges
│       ├── state.py      # TypedDict schema
│       ├── tools.py      # tool fns
│       ├── prompts.py    # system prompts
│       ├── context.py    # runtime config
│       └── utils.py
├── tests/
├── langgraph.json
└── pyproject.toml

Agents share the same model instance and tool registry; specialization happens by:

  • Different system_message per node
  • Different subset of tools passed to create_react_agent(model, tools=[...])
  • Different sub-graph compiled per agent role
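A framework-free sketch of that specialization-by-configuration idea: one shared model, agents differing only in system message and tool subset. The `make_agent` factory and tools are illustrative stand-ins, not `create_react_agent`:

```python
# Two toy tools; each agent gets only its own subset.
def search(query: str) -> str:
    return f"results for {query}"

def calculator(expr: str) -> str:
    return f"computed {expr}"

def make_agent(name, system_message, tools, model):
    registry = {t.__name__: t for t in tools}
    def run(task):
        # A real ReAct loop would let the model pick tools; here we just
        # show that specialization is config, not separate files.
        return {"agent": name, "prompt": system_message, "tools": sorted(registry)}
    return run

model = object()  # stand-in for the shared model instance
researcher = make_agent("researcher", "You research topics.", [search], model)
analyst = make_agent("analyst", "You crunch numbers.", [calculator], model)
print(researcher("x")["tools"])  # ['search']
print(analyst("x")["tools"])     # ['calculator']
```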

CrewAI

Source: docs.crewai.com + github.com/crewAIInc/crewAI-examples + CLI crewai create crew

Split: YAML + Python decorator. Persona/role/goal/backstory in agents.yaml. Tools and behavior wiring in crew.py via @agent decorator that references the YAML key.

config/agents.yaml:

```yaml
lead_market_analyst:
  role: Lead Market Analyst
  goal: Conduct amazing analysis of products and competitors
  backstory: As the Lead Market Analyst...
```

crew.py:

```python
@agent
def lead_market_analyst(self) -> Agent:
    return Agent(
        config=self.agents_config['lead_market_analyst'],
        tools=[SerperDevTool(), ScrapeWebsiteTool()],
        verbose=True,
        memory=False,
    )
```

ALL agents in one YAML file. agents.yaml is a flat dict keyed by agent name. Same for tasks.yaml. Multiple crews in one project go into src/<pkg>/crews/<crew_name>/ (the Flow pattern).

  • knowledge/ top-level folder for files agents can read (PDFs, txts).
  • memory=True on crew enables shared short/long-term memory via ChromaDB.
  • Inputs flow through crew.kickoff(inputs={...}).

CrewAI Memory module: short-term (RAG of recent context), long-term (SQLite at ~/.crewai/), entity memory. trained_agents_data.pkl stores DPO-style training from crew.train(). No per-agent identity file beyond the YAML.
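The YAML-key/decorator handshake can be sketched in plain Python. A dict stands in for agents.yaml, and `agent`/`REGISTRY` are illustrative, not CrewAI's actual decorators:

```python
# Stand-in for config/agents.yaml: a flat dict keyed by agent name.
AGENTS_CONFIG = {
    "lead_market_analyst": {
        "role": "Lead Market Analyst",
        "goal": "Conduct analysis of products and competitors",
    },
}

REGISTRY = {}

def agent(fn):
    # The function name must match the YAML key -- that's the contract.
    REGISTRY[fn.__name__] = fn
    return fn

@agent
def lead_market_analyst():
    cfg = AGENTS_CONFIG["lead_market_analyst"]
    # Persona comes from config; tools and flags are wired in code.
    return {"role": cfg["role"], "tools": ["serper", "scrape"], "memory": False}

built = REGISTRY["lead_market_analyst"]()
print(built["role"])  # Lead Market Analyst
```

The design consequence: personas are editable without touching code, but renaming an agent means changing the key in two places.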

5. Recommended structure (canonical, from crewai create crew)

Single crew:

my_crew/
├── src/
│   └── my_crew/
│       ├── __init__.py
│       ├── main.py           # entry point
│       ├── crew.py           # @CrewBase class wiring agents + tasks
│       ├── config/
│       │   ├── agents.yaml   # ALL agents
│       │   └── tasks.yaml    # ALL tasks
│       └── tools/
│           ├── __init__.py
│           └── custom_tool.py
├── knowledge/                # static reference docs
├── .env
├── pyproject.toml
└── README.md

Flow with multiple crews (real example from flows/email_auto_responder_flow):

my_flow/
├── src/
│   └── my_flow/
│       ├── main.py       # Flow orchestration
│       ├── types.py      # Shared Pydantic models
│       ├── crews/
│       │   └── email_filter_crew/
│       │       ├── config/
│       │       │   ├── agents.yaml
│       │       │   └── tasks.yaml
│       │       └── email_filter_crew.py
│       ├── tools/
│       └── utils/
└── pyproject.toml
  • Tools are imported per-agent in crew.py (no global “all agents have X”).
  • YAML inheritance is not native — duplication is the norm.
  • Hierarchical mode adds a manager agent automatically.

AutoGen

Source: github.com/microsoft/autogen/python/samples + autogen-studio

Pure Python class. AssistantAgent(name="x", model_client=..., system_message="...", tools=[...]). No file convention — agents are instantiated inline.

Convention is “all agents in one file” for simple samples, split-by-runtime for distributed. The core_distributed-group-chat sample is the most structured:

core_distributed-group-chat/
├── _agents.py                # ALL agent classes (BaseGroupChatAgent, etc.)
├── _types.py                 # Shared message types
├── _utils.py                 # Shared helpers
├── config.yaml               # Model config
├── run_host.py               # gRPC host process
├── run_group_chat_manager.py
├── run_editor_agent.py       # ONE FILE PER RUNTIME (when distributed)
├── run_writer_agent.py
├── run_ui.py
└── public/avatars/           # Per-agent assets
    ├── editor.png
    └── writer.png

Note: when agents run as separate processes (gRPC), each gets its own run_<name>.py launcher. This is the closest the survey gets to “folder per variant.”

  • GroupChat keeps the shared message history in memory.
  • task_centric_memory sample uses a separate memory store.
  • Topic-based pub/sub in distributed mode.

Weak. AutoGen agents are mostly stateless in their core; persistence is opt-in via custom Memory implementations. autogen-studio saves team configs as JSON.

No CLI scaffolder. Manual. The distributed sample above is the de-facto reference.

Subclass BaseGroupChatAgent (seen in _agents.py). All inherit _chat_history, _model_client, _system_message. Override handle_request_to_speak for role-specific behavior.
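A minimal sketch of that inheritance pattern (names mirror the sample loosely; this is not AutoGen code):

```python
# Shared base owns chat history, model client, and system message.
class BaseGroupChatAgent:
    def __init__(self, name, system_message, model_client=None):
        self.name = name
        self._system_message = system_message
        self._model_client = model_client
        self._chat_history = []

    def handle_request_to_speak(self):
        # Role-specific behavior lives in the subclass override.
        raise NotImplementedError

class EditorAgent(BaseGroupChatAgent):
    def handle_request_to_speak(self):
        reply = f"{self.name}: reviewing the draft"
        self._chat_history.append(reply)
        return reply

editor = EditorAgent("editor", "You edit text.")
print(editor.handle_request_to_speak())  # editor: reviewing the draft
```

Contrast with the composition-first frameworks: here specialization is a subclass per role, not a config entry.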


Mastra

Source: mastra.ai/docs/getting-started/project-structure + github.com/mastra-ai/mastra/examples

One TypeScript file per agent: src/mastra/agents/weather-agent.ts. Exports a named Agent instance with model, instructions, tools, memory wired inline.

One file per agent, all in src/mastra/agents/. Then re-exported from src/mastra/agents/index.ts and registered in src/mastra/index.ts. The real examples/agent sample has this:

src/mastra/agents/
├── index.ts # barrel re-export
├── dynamic-tools-agent.ts
├── gateway.ts
├── model-v2-agent.ts
├── request-context-demo-agent.ts
└── slack-agent.ts

src/mastra/index.ts then imports and registers them all on the Mastra constructor.

  • @mastra/memory — per-agent working memory + semantic recall via LibSQL/Postgres/Upstash.
  • Workflows pass context through typed steps.
  • MCP servers (src/mastra/mcp/) expose tools to external agents.

LibSQL/Postgres store via MastraCompositeStore. Threads + messages persisted per agent. Agent ID is its export name.

5. Recommended structure (canonical, from the create mastra CLI)
my-app/
├── src/
│   ├── mastra/
│   │   ├── agents/
│   │   │   └── weather-agent.ts
│   │   ├── tools/
│   │   │   └── weather-tool.ts
│   │   ├── workflows/
│   │   │   └── weather-workflow.ts
│   │   ├── scorers/          # optional: eval scorers
│   │   │   └── weather-scorer.ts
│   │   ├── mcp/              # optional: custom MCP servers
│   │   ├── public/           # optional: copied to build output
│   │   └── index.ts          # Mastra config (REQUIRED)
│   └── index.ts              # app entry
├── .env.example
├── package.json
└── tsconfig.json

Real example structure (examples/agent) adds:

src/mastra/
├── agents/ # 5 agents, one file each
├── auth/ # auth providers (better-auth, okta, workos, ...)
├── mcp/ # MCP servers + app-tools
├── processors/ # content filters (PII detection, moderation)
├── tools/
├── workflows/ # multi-step workflows
└── index.ts
  • Tools imported per-agent in the agent’s file.
  • A “supervisor agent” pattern: supervisorAgent references other agents as tools via agentTool().
  • Shared model client / shared memory store passed into each new Agent({...}).
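The agents-as-tools idea in miniature, shown in plain Python for consistency with the rest of this note (`agent_tool` is an illustrative stand-in for Mastra's agentTool() wrapping, not its API):

```python
# Worker agents are plain callables...
def weather_agent(query: str) -> str:
    return f"weather answer: {query}"

def slack_agent(query: str) -> str:
    return f"slack answer: {query}"

# ...wrapped so the supervisor sees them as tools.
def agent_tool(agent_fn):
    return {"name": agent_fn.__name__, "call": agent_fn}

class Supervisor:
    def __init__(self, tools):
        self.tools = {t["name"]: t["call"] for t in tools}

    def route(self, tool_name, task):
        # A real supervisor would let the model choose the tool.
        return self.tools[tool_name](task)

sup = Supervisor([agent_tool(weather_agent), agent_tool(slack_agent)])
print(sup.route("weather_agent", "SF tomorrow"))  # weather answer: SF tomorrow
```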

Pydantic AI

Source: ai.pydantic.dev + github.com/pydantic/pydantic-ai/examples

Either:

  • Inline Python: agent = Agent('openai:gpt-4o', instructions='...', output_type=Foo)
  • Declarative YAML (newer feature): agent.yaml with model, instructions, capabilities. Loaded via Agent.from_file('agent.yaml').
```yaml
model: anthropic:claude-opus-4-6
instructions: You are a helpful research assistant.
model_settings:
  max_tokens: 8192
capabilities:
  - WebSearch
  - Thinking:
      effort: high
```

One Python file = one whole multi-agent system in the example repo. medical_agent_delegation.py defines triage_agent, cardiology_agent, neurology_agent, senior_doctor_agent all in the same file. Delegation via @agent.tool functions that call other agents.

For real apps the slack_lead_qualifier example splits by responsibility, not by agent:

slack_lead_qualifier/
├── __init__.py
├── agent.py # ALL agents here
├── app.py # FastAPI app
├── functions.py # business logic
├── models.py # Pydantic models
├── modal.py # deployment
├── slack.py # Slack integration
└── store.py # persistence

RunContext[Deps] — a typed dependency container. Agents receive a shared Deps dataclass with DB connections, API clients, user info.

Built-in message_history parameter on Agent.run(). Pydantic AI ships pydantic_graph for stateful flows. No automatic identity file.

No CLI scaffold. The official example layout is the closest thing:

my-app/
├── my_app/
│   ├── __init__.py
│   ├── agent.py      # one or many agents
│   ├── models.py     # Pydantic output schemas
│   ├── functions.py  # @agent.tool implementations
│   ├── store.py      # persistence
│   └── app.py        # web/API entry
├── tests/
└── pyproject.toml

Agents share the Deps type and base model. Each agent declares its own output_type (Pydantic model) and system_prompt. Tools are decorated per-agent: @triage_agent.tool vs @senior_doctor_agent.tool.
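The per-agent decorator pattern, sketched without Pydantic AI (this toy `Agent.tool` is illustrative, not the real API):

```python
# Each agent owns a decorator that fills its private tool registry.
class Agent:
    def __init__(self, name):
        self.name = name
        self.tools = {}

    def tool(self, fn):
        self.tools[fn.__name__] = fn
        return fn

triage = Agent("triage_agent")
senior = Agent("senior_doctor_agent")

@triage.tool
def route_to_specialist(symptom: str) -> str:
    return f"routing: {symptom}"

@senior.tool
def final_diagnosis(notes: str) -> str:
    return f"diagnosis from {notes}"

print(sorted(triage.tools))  # ['route_to_specialist']
print(sorted(senior.tools))  # ['final_diagnosis']
```

Because the decorator is a method on the agent instance, tool ownership is explicit at the definition site, even with every agent in one file.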


Letta

Source: docs.letta.com + github.com/letta-ai/letta

Agents are server-side objects, not files. Created via API: client.agents.create(model="...", memory_blocks=[...], tools=[...]). Returns an agent_id.

That said, Letta DOES ship a file-based persona convention from its MemGPT roots:

letta/
├── personas/examples/
│   ├── sam.txt
│   ├── memgpt_starter.txt
│   ├── google_search_persona.txt
│   ├── sleeptime_memory_persona.txt
│   └── voice_memory_persona.txt
└── humans/examples/
    ├── basic.txt
    └── cs_phd.txt

One .txt file = one persona block. Loaded into a memory_block at agent creation.

Server-managed. Listed via client.agents.list(). No local file convention beyond personas/ + humans/.

Memory Blocks API — standalone, attachable to multiple agents:

```python
block = client.blocks.create(
    label="company_info",
    value="Acme Corp...",
    tags=["shared", "company"],
)
client.agents.blocks.attach(agent_id=..., block_id=block.id)
```

Plus Folders (formerly “sources”) — collections of files (PDFs, .txt) attached to an agent for vector search. Same folder can attach to multiple agents.
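The attach-once, share-everywhere behavior in sketch form (a toy `BlockStore`, not the Letta SDK):

```python
import itertools

# A block exists once; many agents attach to it; editing it updates
# every attached agent's view.
class BlockStore:
    _ids = itertools.count(1)

    def __init__(self):
        self.blocks = {}       # block_id -> {"label", "value"}
        self.attachments = {}  # agent_id -> set of block_ids

    def create(self, label, value):
        block_id = next(self._ids)
        self.blocks[block_id] = {"label": label, "value": value}
        return block_id

    def attach(self, agent_id, block_id):
        self.attachments.setdefault(agent_id, set()).add(block_id)

    def context_for(self, agent_id):
        return {self.blocks[b]["label"]: self.blocks[b]["value"]
                for b in self.attachments.get(agent_id, ())}

store = BlockStore()
company = store.create("company_info", "Acme Corp...")
store.attach("agent-1", company)
store.attach("agent-2", company)
store.blocks[company]["value"] = "Acme Corp, updated"  # one edit...
print(store.context_for("agent-2"))  # ...visible to every attached agent
```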

This is Letta’s headline feature. Every agent has:

  • Core memory blocks (persona, human, custom) — always in context, editable by the agent itself
  • Archival memory — vector store, unlimited, semantic search
  • Recall memory — full message history
  • Agent state persisted in Postgres

agent_id is the durable identity. Sessions resume by passing the same ID.

No app-side scaffold — Letta is server-first. Client apps look like:

my-app/
├── personas/         # optional: .txt persona library
│   ├── researcher.txt
│   └── writer.txt
├── humans/           # optional: user profiles
│   └── wes.txt
├── src/
│   ├── client.py     # letta_client setup
│   ├── bootstrap.py  # one-time agent creation
│   └── handlers.py   # message routing
└── .env
  • Shared memory blocks (one block, many agents).
  • Shared folders (one knowledge source, many agents).
  • Per-agent persona block + tool list.
  • Letta Cloud has “agent templates” you can clone.

Mem0

Source: docs.mem0.ai + github.com/mem0ai/mem0

Not a multi-agent framework — it’s a memory layer


Mem0 is the persistence backbone you bolt onto another framework. No project scaffold.

Memory hierarchy (this IS the interesting structural pattern)

graph LR
A[Conversation turn] --> B[Session memory]
B --> C[User memory]
C --> D[Org memory]

Scoping identifiers:

  • user_id — long-term per-user (preferences, history)
  • agent_id — agent-specific knowledge
  • app_id — app-level isolation (multi-tenant)
  • run_id — session-scoped (ephemeral, deletable on session end)

Usage pattern:

# User-level: persistent preferences
client.add(messages, user_id="alice")
# Session-level: temporary context
client.add(messages, user_id="alice", run_id="session_123")
# Agent-level: agent-specific knowledge
client.add(messages, agent_id="support_bot", app_id="helpdesk")
# Multi-tenant: full isolation
client.add(messages, user_id="alice", agent_id="bot", app_id="acme_corp", run_id="ticket_42")

Key insight: the lifecycle dimension (session vs persistent) lives in runtime scopes, NOT folder structure. Sessions are tagged-and-deletable, not archived-on-disk.
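The whole scoping model fits in a few lines of plain Python (a toy `ScopedMemory` that mimics the add/delete shape above, not Mem0's client):

```python
# Memories are tagged with (user_id, agent_id, app_id, run_id); ending a
# session means deleting by run_id -- no archive folder involved.
class ScopedMemory:
    def __init__(self):
        self.items = []  # list of (scope_dict, payload)

    def add(self, payload, **scope):
        self.items.append((scope, payload))

    def search(self, **scope):
        return [p for s, p in self.items
                if all(s.get(k) == v for k, v in scope.items())]

    def delete_run(self, run_id):
        # End-of-session cleanup: ephemeral memories vanish, persistent stay.
        self.items = [(s, p) for s, p in self.items if s.get("run_id") != run_id]

mem = ScopedMemory()
mem.add("prefers dark mode", user_id="alice")                   # persistent
mem.add("debugging ticket 42", user_id="alice", run_id="s123")  # ephemeral
mem.delete_run("s123")
print(mem.search(user_id="alice"))  # ['prefers dark mode']
```

This maps directly onto the machine / variant / project / session split discussed in the verdict.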


CONVERGENT (strong signals — every framework does this)

| Pattern | LangGraph | CrewAI | AutoGen | Mastra | PydanticAI | Letta |
| --- | --- | --- | --- | --- | --- | --- |
| src/<pkg>/ package root | yes | yes | n/a | yes | yes | n/a |
| agents/ folder OR all-in-one agents file | one file | yaml | one file | folder | one file | server |
| Separate tools/ folder | yes (template) | yes | implicit | yes | implicit | server |
| Workflows/orchestration distinct from agents | graph.py | crew.py | manual | workflows/ | pydantic_graph | server |
| Shared state schema (typed) | TypedDict | inputs dict | message types | TS types | dataclass Deps | memory blocks |
| .env / .env.example | yes | yes | yes | yes | yes | yes |
| Config file at root (langgraph.json / mastra config / pyproject.toml) | yes | yes | manual | yes | yes | n/a |
| Tests folder | yes | yes | yes | yes | yes | n/a |
| Shared knowledge in dedicated dir | optional | knowledge/ | manual | implicit | manual | folders/blocks |
| Persistence via runtime store (DB/checkpointer), NOT files | yes | yes | yes | yes | yes | yes |

Universal split: by responsibility (agent vs tool vs workflow vs config), not by lifecycle state.

Universal storage rule: durable state (memory, identity, message history, archival) lives in a database, NOT in folders next to code.

| Question | Answers |
| --- | --- |
| One agent file vs one agent folder? | Mastra: file-per-agent. CrewAI: all agents in one YAML. LangGraph: one node per agent (no file). AutoGen: all in one .py. PydanticAI: all in agent.py. |
| Declarative (YAML) or imperative (code)? | CrewAI: YAML+code split. PydanticAI: optional YAML. Mastra/LangGraph/AutoGen: code only. |
| Persona stored where? | Letta: .txt in personas/. CrewAI: YAML backstory. Everyone else: inline system_message string. |
| Multi-crew / nested teams? | CrewAI Flows: crews/<crew_name>/ subdir. AutoGen: distributed run_<agent>.py files. LangGraph: subgraphs. |
| Shared base for specialization? | AutoGen: inheritance (BaseGroupChatAgent). Others: composition + passing shared deps. |

Verdict on Wes’s Active/Reference/Logs/Archive Pattern


I searched docs and 20+ example trees across 7 frameworks. None of them organize code by lifecycle state. They all organize by role/responsibility.

The Active/Reference/Logs/Archive distinction is a runtime-state concern, and every framework pushes runtime state out of the code tree into a store:

| Wes's dimension | Framework equivalent | Where it lives |
| --- | --- | --- |
| Active/ | running agent instances | runtime memory + checkpointer (Postgres/SQLite/LibSQL) |
| Reference/ | knowledge sources | CrewAI knowledge/, Letta folders/blocks, vector stores |
| Logs/ | message history, traces | DB tables (Letta recall memory, LangGraph checkpoints), observability tools |
| Archive/ | inactive agents, old threads | soft-delete flags in DB, or vector-store TTL |

The convergent answer: state lives in databases, code is organized by what it IS.

The instinct that every variant folder should follow the same layout is correct and matches every framework. Mastra is the strongest analog — every example has the same src/mastra/{agents,tools,workflows}/index.ts skeleton. Predictability of file location across variants is a real production virtue.

  1. Logs/ next to code blurs immutable code with mutable state. Logs grow without bound. Mixing them into git is painful. Every framework treats logs as runtime output, not source.
  2. Archive/ invites code rot. Mastra/CrewAI/LangGraph have no equivalent because deprecated agents get deleted (or feature-flagged in config), not moved. The vault is a better archive surface than the code tree.
  3. Active/ vs Reference/ — what’s the difference for an agent’s persona doc? A persona file is BOTH active (loaded into context) AND reference (read-only). The split forces a false binary.

The better pattern (still consistent at every level)


Borrow from Mastra + Letta + CrewAI. Same skeleton everywhere, role-based, with runtime state pushed out:

<variant-or-project>/
├── CLAUDE.md # persona / identity (the "Active" identity surface)
├── agents/ # sub-agents, one file each
├── tools/ # tool definitions
├── workflows/ | flows/ # orchestration
├── knowledge/ # immutable reference docs (what Wes calls "Reference")
├── config/ # YAML or JSON declarative configs
├── tests/
├── .env.example
└── README.md

Then outside the tree:

  • State (the “Active” runtime): D1 / fleet-node KV / checkpointer
  • Logs (the “Logs”): ~/.claude/projects/*.jsonl + Cheesegrater central log + Cloudflare Worker telemetry
  • Archive (the “Archive”): vault Fleet/working/ for prior canon; git mv deprecated agents into a top-level archived/ only if you really need them on disk

The least bad way to keep Active/Reference/Logs/Archive is to make it the top-level dimension with role-based subdirs underneath, NOT the other way around:

fleet-root/
├── Active/
│   ├── pepper/
│   │   ├── agents/
│   │   ├── tools/
│   │   ├── workflows/
│   │   └── CLAUDE.md
│   └── nagatha/
│       └── ... same skeleton
├── Reference/  # shared docs, knowledge sources, persona templates
├── Logs/       # symlink to runtime log dirs, NOT committed
└── Archive/    # deprecated variants, full skeleton preserved

That gets you:

  • Consistent inner skeleton (matches Mastra/CrewAI)
  • Lifecycle dimension visible at the top
  • Logs/ as a symlink avoids the git pollution problem
  • Archive/ as a freezer (whole variant moved, not files within a variant)

This is closest in spirit to AutoGen’s run_<agent>.py distributed pattern (whole launcher per agent) and Mastra’s per-variant src/mastra/ skeleton.


  1. Best fit for Wes’s fleet: Mastra-style agents/ tools/ workflows/ index.ts skeleton replicated per variant, with state/logs/archive handled by D1 + vault + symlinks. Reason: TS or not, Mastra’s pattern is the cleanest production analog of what Wes is doing (per-variant identity files, shared tool library, declarative wire-up). It’s also the most consistent across examples.
  2. Strong alternative: CrewAI YAML-first pattern. Persona/role/goal/backstory in agents.yaml, tools wired in crew.py. Reason: Forces declarative persona separation from code. Maps perfectly to Wes’s idea of a variant-mapping doc as canon.
  3. For memory/identity: Steal Letta’s memory-block + folder model. Shared blocks attachable to multiple agents = perfect for fleet-wide “company info,” “client info,” “active priorities.”
  4. For state scoping: Steal Mem0’s 4-level scoping (user_id / agent_id / app_id / run_id). Maps to Wes’s machine / variant / project / session naturally.
  5. Do not adopt: AutoGen’s distributed multi-process layout. It’s the closest structural match but the operational complexity is wrong for solo-operator scale.