# Cross-Framework Comparison
All 8 frameworks, side by side, across every dimension I could measure.
- Architecture: Most do request-response; HermitClaw runs continuously; Spacebot delegates; the Agents SDK orchestrates multi-agent handoffs.
- Memory: Three use hybrid search (RRF), two have none, HermitClaw uses 3-factor retrieval, and the Agents SDK gives you 10+ pluggable session backends.
- Security: IronClaw goes deepest with 5-layer defense; Hermes adds prompt injection scanning; the Agents SDK uses guardrails + human-in-the-loop; pi has none by design.
- Channels: PicoClaw supports 10, OpenClaw 7, Hermes 6.
- Footprint: PicoClaw runs in <10MB RAM on a $10 board.
## Architecture
### Core Loop Patterns
Every framework has a main loop, and they're all different:
| Framework | Loop Pattern | Language | Concurrency Model |
|---|---|---|---|
| OpenClaw | Event-driven request-response with async tool loops | TypeScript | Lane-based command queues (per-session + global) |
| IronClaw | Event-driven message loop with tokio::select! | Rust | Single async loop, tools in-line |
| PicoClaw | Message bus with goroutine consumers | Go | Goroutines + channel-based bus |
| HermitClaw | Continuous autonomous loop (5s tick) | Python | Single asyncio task per crab |
| Spacebot | Concurrent process model (Channel/Branch/Worker) | Rust | tokio::spawn per process, broadcast events |
| pi (pi.dev) | Classic agentic loop with steering queues | TypeScript | Single-threaded with interrupt queues |
| Hermes Agent | Synchronous agentic loop with thread interrupts | Python | Synchronous core, subagent parallelism (up to 3) |
| OpenAI Agents SDK | Runner turn loop with guardrails and handoffs | Python / TypeScript | async/await, asyncio.gather for parallel guardrails |
Request-response (OpenClaw, IronClaw, PicoClaw, pi, Hermes): Wait for input, run it through an LLM + tool loop, return a response. Most frameworks do this.
Continuous (HermitClaw): No human trigger needed. Every 5 seconds it runs a think cycle -- picks topics, researches, writes. Human messages are just nudges it overhears.
Delegation (Spacebot): The user-facing Channel never executes work itself. It dispatches to Branches (thinking) and Workers (execution) running as concurrent tasks. Results flow back as events.
Orchestration (OpenAI Agents SDK): The Runner manages a turn loop where agents hand off to other agents, tools can pause for human approval, and guardrails validate inputs/outputs. Run state is fully serializable -- you can persist it and resume later.
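HermitClaw's continuous pattern is worth a sketch. Everything below is illustrative -- the class and method names are invented, and the real framework calls an LLM inside each think cycle -- but it shows the shape: a tick loop that runs whether or not a human said anything, with inbound messages treated as nudges rather than triggers.

```python
import asyncio

class Crab:
    """Hypothetical sketch of a HermitClaw-style continuous agent."""

    def __init__(self, tick_seconds=5.0):
        self.tick_seconds = tick_seconds
        self.inbox = []      # human messages land here as "nudges"
        self.thoughts = []

    async def think_cycle(self):
        # Pick a topic: a pending nudge if one exists, otherwise self-directed.
        topic = self.inbox.pop(0) if self.inbox else "self-directed research"
        # The real framework would research and write via the LLM here;
        # this sketch just records what it thought about.
        self.thoughts.append(f"thought about: {topic}")

    async def run(self, max_cycles=None):
        cycles = 0
        while max_cycles is None or cycles < max_cycles:
            await self.think_cycle()
            cycles += 1
            await asyncio.sleep(self.tick_seconds)

crab = Crab(tick_seconds=0.01)   # fast tick for demonstration
crab.inbox.append("user asked about tides")
asyncio.run(crab.run(max_cycles=3))
```

Note the inversion versus request-response: the loop owns the schedule, and human input merely changes what the next cycle thinks about.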
### Module Structure Comparison
| Framework | Source Files | Lines (est.) | Key Abstraction |
|---|---|---|---|
| OpenClaw | ~2,100 TS | ~100K+ | Channel plugins, tool registry, session lanes |
| IronClaw | ~88 RS | ~30K+ | Trait-based tools/channels, WASM sandbox |
| PicoClaw | ~125 Go | ~20K | Interface-based tools/channels, message bus |
| HermitClaw | ~14 Py | ~3K | Single Brain class, memory stream |
| Spacebot | ~88 RS | ~25K+ | Process types (Channel/Branch/Worker), memory graph |
| pi (pi.dev) | ~50 TS | ~15K+ | Operations interfaces, extension events |
| Hermes Agent | ~100+ Py | ~20K+ | AIAgent class, ToolRegistry singleton, BasePlatformAdapter |
| OpenAI Agents SDK | ~230+ Py / ~100+ TS | ~30K+ (Py) | Agent, Runner, Handoff, Guardrail, RunState |
### Framework Type Spectrum
Most frameworks here are standalone applications -- you run them, you interact with them. The OpenAI Agents SDK is a library -- you import it and build agents in your own code. Hermes Agent lands somewhere in between: it's an application, but the tool registry and platform adapter patterns make it easy to extend.
## Memory Systems
### Memory Architecture Comparison
| Framework | Short-term | Long-term | Semantic Search | Compaction |
|---|---|---|---|---|
| OpenClaw | Session JSON transcript | Workspace files (MEMORY.md, daily notes) | Hybrid BM25 + vector (SQLite + sqlite-vec) | LLM-based summarization when context fills |
| IronClaw | In-memory session + DB persistence | Workspace filesystem in DB | Hybrid FTS + vector via RRF (PostgreSQL/pgvector) | Context summarization |
| PicoClaw | Session JSON files | File-based (MEMORY.md + daily notes) | None -- no embeddings, no vector DB | LLM summarization when >20 msgs or 75% tokens |
| HermitClaw | Last N thoughts in context | Append-only JSONL memory stream | 3-factor retrieval: recency + importance + cosine similarity | None -- memory grows unbounded |
| Spacebot | Channel conversation history | Typed memory graph (8 types) in SQLite + LanceDB | Hybrid FTS + vector + graph via RRF | Tiered: 80% background, 85% aggressive, 95% emergency |
| pi (pi.dev) | JSONL session with tree structure | AGENTS.md files (no cross-session memory) | None | LLM-based compaction with file tracking |
| Hermes Agent | Session JSON transcripts | MEMORY.md + USER.md (§-delimited entries) | Honcho AI cross-session memory; SQLite session search | ContextCompressor (middle turns summarized by auxiliary model) |
| OpenAI Agents SDK | Session interface (pluggable backends) | None built-in (use tools + external stores) | None built-in (file_search hosted tool available) | Server-side compaction via responses.compact API |
### Memory Retrieval Approaches
Hybrid Search (RRF) -- OpenClaw, IronClaw, and Spacebot all do this. They combine full-text search (BM25/FTS) with vector similarity, then merge results via Reciprocal Rank Fusion. If something shows up in both result sets, it gets a boosted score.
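The merge step is simple enough to sketch. The constant k=60 comes from the original RRF formulation; the frameworks may tune it, and the function below is a generic illustration rather than any framework's actual code.

```python
def rrf_merge(fts_results, vector_results, k=60):
    """Merge two ranked lists of doc ids via Reciprocal Rank Fusion.

    Each list contributes 1 / (k + rank) per document, so a doc that
    appears in both lists accumulates score from both -- the "boost"
    described above.
    """
    scores = {}
    for ranked in (fts_results, vector_results):
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# "b" ranks mid-list in FTS and top in vector search, so it wins overall.
merged = rrf_merge(["a", "b", "c"], ["b", "d"])
```

Because scores depend only on rank, RRF needs no tuning to reconcile BM25 scores with cosine similarities -- which is presumably why three independent frameworks landed on it.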
Three-Factor Retrieval -- HermitClaw's approach, straight from the Generative Agents paper: score = recency + importance + relevance. Each factor is 0-1. Recency decays exponentially, importance is LLM-scored 1-10, relevance is cosine similarity.
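A minimal sketch of that scoring, assuming a decay constant and field names of my own choosing (the paper and HermitClaw pick their own):

```python
import math

def retrieval_score(memory, now_hours, query_similarity):
    """Generative Agents-style 3-factor score; each factor lands in 0-1."""
    # Recency: exponential decay with age (decay rate 0.1/hour is assumed).
    recency = math.exp(-0.1 * (now_hours - memory["created_hours"]))
    # Importance: LLM-scored 1-10, normalized to 0-1.
    importance = (memory["importance"] - 1) / 9.0
    # Relevance: cosine similarity of query vs. memory embedding, assumed 0-1.
    relevance = query_similarity
    return recency + importance + relevance

fresh = {"created_hours": 99.0, "importance": 5}   # 1 hour old
stale = {"created_hours": 0.0, "importance": 5}    # 100 hours old
s_fresh = retrieval_score(fresh, now_hours=100.0, query_similarity=0.5)
s_stale = retrieval_score(stale, now_hours=100.0, query_similarity=0.5)
```

Equal importance and relevance, but the fresh memory wins on recency -- the mechanism that keeps an unbounded memory stream usable without compaction.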
External Memory Services -- Hermes Agent uses Honcho for cross-session user modeling, plus SQLite-indexed session search for conversation history.
Pluggable Session Backends -- The OpenAI Agents SDK defines a Session protocol with 10+ backends (in-memory, SQLite, Redis, SQLAlchemy, MongoDB, Dapr, encrypted, OpenAI Conversations). Long-term memory is your problem.
No Retrieval -- PicoClaw and pi skip semantic search entirely. PicoClaw just injects MEMORY.md + the last 3 days into the system prompt. pi relies on AGENTS.md context files and the LLM reading files via tools.
### The Workspace File Pattern
Six of the eight frameworks converge on the same file pattern for persistent identity and memory (pi adopts only AGENTS.md):
| File | Purpose | Used By |
|---|---|---|
| AGENTS.md | Operational instructions | OpenClaw, IronClaw, PicoClaw, Spacebot, pi, Hermes |
| SOUL.md | Personality/values | OpenClaw, IronClaw, PicoClaw, Spacebot, Hermes |
| USER.md | Info about the human | OpenClaw, IronClaw, PicoClaw, Spacebot, Hermes |
| MEMORY.md | Curated long-term memory | OpenClaw, IronClaw, PicoClaw, Hermes |
| memory/YYYY-MM-DD.md | Daily logs | OpenClaw, IronClaw, PicoClaw |
HermitClaw does its own thing -- identity.json (genome-derived traits) and memory_stream.jsonl (append-only with embeddings).
The OpenAI Agents SDK has no workspace file convention. Agent instructions live in code, and memory is delegated to the Session interface.
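For the frameworks that do use it, the pattern amounts to prompt assembly. A sketch -- the file names come from the table above, but the ordering, headers, and separators are assumptions:

```python
import tempfile
from pathlib import Path

# Identity files, in an assumed injection order.
WORKSPACE_FILES = ["SOUL.md", "AGENTS.md", "USER.md", "MEMORY.md"]

def build_system_prompt(workspace: Path) -> str:
    """Concatenate whichever workspace files exist into one system prompt."""
    sections = []
    for name in WORKSPACE_FILES:
        path = workspace / name
        if path.exists():
            sections.append(f"## {name}\n{path.read_text().strip()}")
    return "\n\n".join(sections)

ws = Path(tempfile.mkdtemp())
(ws / "SOUL.md").write_text("Be curious.")
(ws / "USER.md").write_text("The human likes Go.")
prompt = build_system_prompt(ws)
```

The appeal is that "memory" stays human-auditable: the agent edits the same markdown files you can open and correct yourself.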
## Tool / Function Calling
### Tool Inventory
| Tool Category | OpenClaw | IronClaw | PicoClaw | HermitClaw | Spacebot | pi | Hermes | Agents SDK |
|---|---|---|---|---|---|---|---|---|
| Shell exec | ✅ | ✅ | ✅ | ✅ (sandboxed) | ✅ (worker) | ✅ | ✅ (5 backends) | ✅ (ShellTool) |
| File R/W | ✅ | ✅ | ✅ | Via shell | ✅ (worker) | ✅ | ✅ | Via tools |
| File edit | ✅ | Via write | ✅ | Via shell | Via file tool | ✅ | ✅ (patch) | ✅ (ApplyPatchTool) |
| Web search | โ | โ | โ | ✅ (OpenAI built-in) | ✅ (worker) | Via extension | ✅ (Brave/Tavily) | ✅ (WebSearchTool) |
| Web fetch | โ | โ | โ | โ | Via shell | Via extension | โ | Via tools |
| Browser automation | ✅ (Playwright) | โ | ❌ | ❌ | ✅ (headless Chrome) | Via extension | ✅ (Playwright) | ✅ (ComputerTool) |
| Memory search | ✅ | ✅ | ❌ | Automatic (retrieval) | ✅ (branch) | ❌ | ✅ (Honcho + SQLite) | ✅ (FileSearchTool) |
| Message/channel | ✅ | Via channels | ✅ | ✅ (respond) | ✅ (reply) | Via extension | ✅ (send_message) | Via tools |
| Sub-agents | ✅ | โ | โ | ❌ | ✅ (branch/worker) | Via extension | ✅ (delegate_task) | ✅ (agent-as-tool) |
| Cron/scheduling | ✅ | ✅ (routines) | โ | ❌ | โ | Via extension | ✅ (natural language) | ❌ |
| TTS | โ | โ | โ | ❌ | โ | Via extension | ✅ (OpenAI/ElevenLabs) | ✅ (voice pipeline) |
| Image analysis | โ | โ | โ | ❌ | โ | ✅ (read) | ✅ (vision models) | Via tools |
| Image generation | โ | โ | โ | ❌ | โ | ❌ | ✅ (fal.ai) | ✅ (ImageGenerationTool) |
| Hardware I/O | ❌ | ❌ | ✅ (I2C/SPI) | ❌ | ❌ | ❌ | ✅ (Home Assistant) | ❌ |
| Code execution | โ | โ | โ | โ | โ | โ | ✅ (Python RPC) | ✅ (CodeInterpreterTool) |
| MCP support | โ | โ | โ | ❌ | โ | โ | โ | ✅ (local + hosted) |
### Sandboxing Spectrum
| Framework | Isolation Level | Technology | Notes |
|---|---|---|---|
| IronClaw | Strongest | WASM sandbox (Wasmtime) + capability model | Fuel metering, no filesystem, credential injection at boundary |
| OpenClaw | Strong | Optional Docker sandbox | Multi-layer tool policy pipeline, exec security modes |
| Hermes Agent | Strong | Docker (cap-drop ALL, no-new-privileges, PID limits) + SSH + Modal | 5 terminal backends with varying isolation levels |
| Spacebot | Moderate | Process type separation | Channel can't exec, Worker can't access memory, path restrictions |
| OpenAI Agents SDK | Moderate | Container-based ShellTool + sandbox providers | Docker, E2B, Modal, Vercel, Cloudflare, Daytona sandbox clients |
| HermitClaw | Best-effort | Command blocklist + Python monkey-patching | Explicitly "not a security boundary" |
| PicoClaw | Basic | Regex deny patterns (40+ rules) | Workspace restriction, no process isolation |
| pi | None | No sandboxing | "YOLO by default" -- full filesystem and shell access |
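PicoClaw's deny-pattern approach is easy to picture. A sketch with three illustrative stand-ins for its 40+ rules -- the actual patterns are its own:

```python
import re

# Hypothetical deny rules in the spirit of PicoClaw's regex filter.
DENY_PATTERNS = [
    r"\brm\s+-rf\s+/",          # recursive delete from the filesystem root
    r"\bcurl\b.*\|\s*(ba)?sh",  # pipe-to-shell installs
    r"\bmkfs\b",                # filesystem formatting
]

def is_denied(command: str) -> bool:
    """Reject a shell command if any deny pattern matches it."""
    return any(re.search(p, command) for p in DENY_PATTERNS)
```

This is also why the table calls it "basic": pattern matching catches the obvious disasters but is trivially bypassed by encoding or indirection, which is the gap the WASM and Docker approaches above exist to close.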
### Tool Definition Patterns
Every framework defines tools as name + JSON schema + execute function -- that's just what LLMs expect. The registration patterns differ though:
- Registry pattern (OpenClaw, IronClaw, PicoClaw, Spacebot, Hermes): Tools register in a central registry, filtered by policy before reaching the LLM
- Direct assembly (HermitClaw): Tools defined inline as OpenAI function schemas
- Operations pattern (pi): Tools wrap pluggable I/O interfaces, enabling tool redirection (e.g., SSH)
- Declarative pattern (OpenAI Agents SDK): Tools declared on agent construction via a `tools=[]` array; the Runner resolves them per turn. Supports the `@function_tool` decorator (Python) or `tool()` builder (JS) for automatic schema generation from type annotations
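The registry pattern the first five share can be sketched in a few lines. Names here are illustrative, not any framework's actual API; the point is the policy filter that runs before the tool list reaches the model:

```python
class ToolRegistry:
    """Hypothetical central registry: name + JSON schema + execute fn."""

    def __init__(self):
        self._tools = {}

    def register(self, name, schema, execute):
        self._tools[name] = {"name": name, "parameters": schema, "execute": execute}

    def for_llm(self, policy):
        """Return only the schemas the policy allows -- the filtering step
        that happens before the tool list is handed to the LLM."""
        return [
            {"name": t["name"], "parameters": t["parameters"]}
            for t in self._tools.values()
            if policy(t["name"])
        ]

registry = ToolRegistry()
registry.register("read_file", {"path": "string"}, lambda path: open(path).read())
registry.register("shell_exec", {"cmd": "string"}, lambda cmd: None)

# A restrictive policy: the model never even sees shell_exec.
visible = registry.for_llm(policy=lambda name: name != "shell_exec")
```

Filtering at the registry rather than at call time means a denied tool is invisible to the model, not merely refused -- which avoids wasting turns on calls that would be rejected anyway.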
## Security
### Security Model Comparison
| Feature | OpenClaw | IronClaw | PicoClaw | HermitClaw | Spacebot | pi | Hermes | Agents SDK |
|---|---|---|---|---|---|---|---|---|
| Process sandbox | Docker (optional) | WASM (Wasmtime) | None | Subprocess | Process separation | None | Docker/SSH/Modal | Container ShellTool |
| Credential encryption | N/A | AES at rest, inject at boundary | Plaintext JSON | Env vars only | Encrypted SQLite | Env vars/config | .env (sandboxed from agent) | Env vars |
| Prompt injection defense | External content wrapping | Aho-Corasick + regex detection | None | None | None | None | 25+ pattern scanner | None |
| Leak detection | N/A | Pattern scanning (pre/post request) | None | None | None | None | Memory entry scanning + log redaction | Trace data controls |
| Endpoint allowlisting | N/A | Host + path + method allowlist | None | Command blocklist | Path restrictions | None | Per-platform user allowlists | None |
| Tool policy layers | 5-layer pipeline | Capability-based + approval | AllowFrom per channel | Command blocklist | Process type isolation | None (extension-based) | Dangerous command approval (25+ patterns) | Guardrails (input/output/tool) + approval |
| Exec security modes | deny/allowlist/full | Approval per tool call | Regex deny patterns | Blocklist + env restriction | Workspace restriction | Full access | Confirmation (CLI) / approval (messaging) | needsApproval per tool |
IronClaw goes the deepest on security -- five layers (WASM sandbox, credential injection, prompt injection defense, leak detection, endpoint allowlisting). Hermes Agent adds prompt injection scanning and multi-backend isolation. The OpenAI Agents SDK takes a different angle: guardrails (tripwire-based safety checks at input, output, and tool levels) plus human-in-the-loop tool approval with serializable state.
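The tripwire idea behind guardrails is worth a sketch. This mimics the shape of an input guardrail without using the SDK's real API -- the exception, decorator, and function names below are invented:

```python
class TripwireTriggered(Exception):
    """Raised when a guardrail check fires; the run is aborted."""

def input_guardrail(check):
    """Wrap an agent runner so `check` vets input before the agent sees it."""
    def guarded(agent_fn):
        def run(user_input):
            if check(user_input):
                raise TripwireTriggered(f"guardrail fired on: {user_input!r}")
            return agent_fn(user_input)
        return run
    return guarded

@input_guardrail(lambda text: "credit card" in text.lower())
def run_agent(user_input):
    # Stand-in for the real agent turn loop.
    return f"agent handled: {user_input}"

ok = run_agent("summarize this doc")
```

The real SDK generalizes this in two ways the sketch omits: guardrails also run on outputs and tool calls, and they can run concurrently with the agent so a fast check can kill an expensive run early.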
## LLM Integration
### Provider Support
| Provider | OpenClaw | IronClaw | PicoClaw | HermitClaw | Spacebot | pi | Hermes | Agents SDK |
|---|---|---|---|---|---|---|---|---|
| Anthropic | โ | โ | ✅ (native) | Via OpenRouter | โ | โ | Via OpenRouter | Via LiteLLM / AI SDK |
| OpenAI | โ | โ | โ | ✅ (primary) | โ | โ | Via OpenRouter | ✅ (primary, Responses + Chat Completions) |
| Google | โ | โ | โ | โ | โ | โ | Gemini Flash (aux) | Via LiteLLM / AI SDK |
| Ollama | โ | โ | โ | ✅ (custom) | โ | โ | ✅ (custom endpoint) | Via LiteLLM |
| OpenRouter | โ | โ | โ | ✅ | โ | โ | ✅ (primary, 200+ models) | Via LiteLLM |
| Groq | โ | โ | โ | โ | โ | โ | Via OpenRouter | Via LiteLLM |
| DeepSeek | โ | โ | โ | โ | โ | โ | Via OpenRouter | Via LiteLLM |
| Bedrock | โ | โ | โ | โ | โ | โ | โ | Via LiteLLM |
| Chinese providers | โ | โ | ✅ (Zhipu, Moonshot, etc.) | โ | ✅ (Zhipu) | โ | Via OpenRouter | Via LiteLLM |
| Provider count | 7+ | 5 | 15+ | 3 | 11 | 10+ | 200+ (via OpenRouter) | 100+ (via LiteLLM) |
### Resilience Features
| Feature | OpenClaw | IronClaw | PicoClaw | HermitClaw | Spacebot | pi | Hermes | Agents SDK |
|---|---|---|---|---|---|---|---|---|
| Streaming | โ | โ | โ | โ | โ (stubbed) | โ | โ (complete before send) | ✅ (RunResultStreaming) |
| Fallback chains | โ | ✅ (circuit breaker) | โ | โ | โ | โ | Graceful tool failure | ✅ (retry with backoff + jitter) |
| Cost tracking | ✅ (per-message) | ✅ (Decimal precision) | โ | โ | โ | ✅ (per-message) | ✅ (per-session) | ✅ (Usage tracking) |
| Auth rotation | ✅ (multi-key) | โ | โ | โ | โ | โ | OAuth token management | โ |
| Context handoff | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ (cross-provider) | ❌ | ✅ (handoffs between agents) |
pi does something no other framework does here: you can start a session with Claude, switch to GPT mid-conversation, and continue with Gemini. It handles the message format conversion automatically. The Agents SDK's handoffs are between agents, not providers -- a different concept entirely.
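The conversion work that cross-provider handoff requires looks roughly like this. The sketch handles one real difference between two common chat formats -- OpenAI-style requests carry system messages inline in the message list, while Anthropic's Messages API takes the system prompt as a separate top-level field. pi's actual converter covers far more (tool calls, content blocks); this is only the simplest case:

```python
def openai_to_anthropic(messages):
    """Split OpenAI-style messages into Anthropic's (system, messages) shape."""
    system_parts, converted = [], []
    for m in messages:
        if m["role"] == "system":
            # Anthropic takes the system prompt outside the message list.
            system_parts.append(m["content"])
        else:
            converted.append({"role": m["role"], "content": m["content"]})
    return {"system": "\n".join(system_parts), "messages": converted}

payload = openai_to_anthropic([
    {"role": "system", "content": "You are terse."},
    {"role": "user", "content": "hi"},
    {"role": "assistant", "content": "hello"},
])
```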
## Multi-Channel Support
| Platform | OpenClaw | IronClaw | PicoClaw | HermitClaw | Spacebot | pi | Hermes | Agents SDK |
|---|---|---|---|---|---|---|---|---|
| CLI/REPL | โ | โ | โ | ❌ | โ | ✅ (TUI) | ✅ (TUI) | ✅ (run_demo_loop) |
| Telegram | ✅ | ✅ (WASM) | ✅ | ❌ | โ | โ | ✅ | ❌ |
| Discord | ✅ | ✅ (WASM) | ✅ | ❌ | ✅ | โ | ✅ | ❌ |
| Slack | ✅ | ✅ (WASM) | โ | ❌ | ✅ | ✅ (mom) | ✅ | ❌ |
| WhatsApp | ✅ | โ | โ | ❌ | โ | โ | ✅ (Node bridge) | ❌ |
| Signal | ✅ | โ | โ | ❌ | โ | โ | ❌ | ❌ |
| iMessage | ✅ | โ | โ | ❌ | โ | โ | ❌ | ❌ |
| Web UI | ✅ | โ | โ | ✅ (pixel art) | โ | ✅ (web-ui) | ❌ | ❌ |
| Home Assistant | โ | โ | โ | ❌ | โ | โ | ✅ | ❌ |
| QQ | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
| DingTalk | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
| Feishu/Lark | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
| LINE | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
| Realtime/Voice | โ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ (WebRTC, WebSocket, SIP) |
| Total | 7 | 4-5 | 10 | 1 | 4 | 3-4 | 6 | 1-2 |
PicoClaw leads with 10 channels -- mostly because of Chinese platform support (QQ, DingTalk, Feishu, OneBot). OpenClaw has the broadest Western platform coverage at 7. Hermes covers the core Western platforms plus Home Assistant for IoT.
The OpenAI Agents SDK is a library, not an application, so it doesn't ship platform adapters. What it does have is a Realtime API integration for voice agents (WebRTC, WebSocket, SIP) that you wire into your own app.
### Channel Abstraction Patterns
Every multi-channel framework abstracts messaging behind a common interface:
- OpenClaw: `ChannelPlugin` with optional adapters (setup, status, auth, messaging, streaming, etc.)
- IronClaw: `Channel` trait (`start() -> MessageStream`, `respond()`, `health_check()`)
- PicoClaw: `Channel` interface (Name/Start/Stop/Send/IsAllowed)
- Spacebot: `Messaging` trait (start/respond/broadcast/fetch_history)
- Hermes Agent: `BasePlatformAdapter` ABC producing a normalized `MessageEvent` dataclass
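The shared shape of these interfaces can be sketched in Python, loosely after Hermes' `BasePlatformAdapter` / `MessageEvent` naming -- the fields and methods below are assumptions, not any framework's real signatures:

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass

@dataclass
class MessageEvent:
    """Normalized inbound message, whatever the platform."""
    channel: str
    sender: str
    text: str

class ChannelAdapter(ABC):
    name: str

    @abstractmethod
    def start(self):
        """Connect to the platform and begin receiving."""

    @abstractmethod
    def send(self, recipient: str, text: str):
        """Deliver agent output in platform-native form."""

    def normalize(self, raw: dict) -> MessageEvent:
        # Each platform overrides this to map its payload to the shared shape.
        return MessageEvent(channel=self.name, sender=raw["from"], text=raw["body"])

class CLIAdapter(ChannelAdapter):
    name = "cli"

    def __init__(self):
        self.outbox = []

    def start(self):
        pass  # nothing to connect for a local CLI

    def send(self, recipient, text):
        self.outbox.append((recipient, text))

cli = CLIAdapter()
event = cli.normalize({"from": "alice", "body": "hello"})
cli.send("alice", "hi back")
```

The payoff is that the agent core only ever sees `MessageEvent`s, so adding a platform means writing one adapter, not touching the loop.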
## Identity & Personality
| Framework | Identity Source | Personality Mechanism | Unique Aspect |
|---|---|---|---|
| OpenClaw | SOUL.md + AGENTS.md + USER.md + MEMORY.md | Workspace files injected into system prompt | Heartbeat system for continuous presence |
| IronClaw | SOUL.md + AGENTS.md + USER.md + VOICE.md | Workspace filesystem in system prompt | Voice profile in VOICE.md |
| PicoClaw | SOUL.md + AGENTS.md + USER.md + IDENTITY.md | Files + memory context in system prompt | OpenClaw-compatible workspace migration |
| HermitClaw | identity.json (genome-derived) | Cryptographic trait derivation from keyboard entropy | Personality is deterministic from a 32-byte genome |
| Spacebot | SOUL.md + IDENTITY.md + USER.md | Files + pre-computed memory bulletin | Per-process model routing (different models for different tasks) |
| pi | ~150 word default + AGENTS.md | Minimal system prompt, extensible via extensions | Stealth mode (mimics Claude Code tool names) |
| Hermes Agent | SOUL.md + AGENTS.md + USER.md + MEMORY.md | Workspace files + skills injected into system prompt | Skills system (agentskills.io compatible markdown docs with progressive disclosure) |
| OpenAI Agents SDK | Code-defined instructions string/function | Dynamic instructions via RunContext | Server-managed Prompt objects for centralized prompt management |
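HermitClaw's genome trick can be sketched with a hash. Only "32-byte genome, deterministic personality" comes from the framework; the trait names and the mapping below are invented for illustration:

```python
import hashlib

# Illustrative trait list -- not HermitClaw's actual traits.
TRAITS = ["curiosity", "caution", "humor", "stubbornness"]

def derive_traits(genome: bytes) -> dict:
    """Map a fixed 32-byte genome to stable trait values in 0-1.

    Hashing genome + trait index makes each trait independent while
    keeping the whole personality a pure function of the genome.
    """
    assert len(genome) == 32
    traits = {}
    for i, name in enumerate(TRAITS):
        digest = hashlib.sha256(genome + bytes([i])).digest()
        traits[name] = digest[0] / 255.0
    return traits

genome = bytes(range(32))
a = derive_traits(genome)
b = derive_traits(genome)   # same genome, identical personality
```

Determinism is the point: the personality can be regenerated from the genome at any time instead of being stored, and no prompt edit can drift it.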
## Resource Footprint
| Framework | RAM (typical) | Boot Time | Binary/Install Size | External Dependencies |
|---|---|---|---|---|
| OpenClaw | ~1GB+ | 5-10s | ~200MB (node_modules) | Node.js, npm |
| IronClaw | ~50-200MB | 1-3s | ~20-50MB binary | PostgreSQL (optional) |
| PicoClaw | <10MB | <1s | ~15-25MB binary | None (single static binary) |
| HermitClaw | ~100-200MB | 2-5s | ~50MB (Python + deps) | Python, pip |
| Spacebot | ~50-200MB | 1-3s | ~20-50MB binary | None (single Rust binary) |
| pi | ~200-500MB | 2-5s | ~100MB (node_modules) | Node.js, npm |
| Hermes Agent | ~200-500MB | 2-5s | ~100MB (Python + deps) | Python, pip, many optional services |
| OpenAI Agents SDK | ~50-200MB | <1s | ~20MB (pip) / ~10MB (npm) | Python or Node.js; OpenAI API key |
PicoClaw wins on footprint by a wide margin -- it was designed for $10 single-board computers, and it shows. The Agents SDK is light as a library, but it still needs an LLM API connection to do anything.
## Best For: When to Use Which
| If you want... | Use this framework | Why |
|---|---|---|
| Personal assistant across all messaging platforms | OpenClaw | 7 channels, rich tool suite, production-hardened |
| Maximum security for agent tool execution | IronClaw | 5-layer security with WASM sandbox, leak detection |
| Agent on embedded/IoT hardware | PicoClaw | <10MB RAM, single binary, I2C/SPI hardware tools |
| Autonomous research agent / digital pet | HermitClaw | Continuous thinking loop, Generative Agents memory |
| Multi-user team agent (Discord/Slack server) | Spacebot | Concurrent delegation model, multi-agent support |
| Minimalist coding assistant | pi (pi.dev) | 4 tools, cross-provider handoff, powerful extension system |
| Self-improving personal agent with skills | Hermes Agent | Persistent memory, skills system, cron, 5 terminal backends |
| Multi-agent workflows in your own app | OpenAI Agents SDK | Handoffs, guardrails, human-in-the-loop, serializable state |
| Learning agent architecture basics | HermitClaw or PicoClaw | Smallest codebases, easiest to read |
| Building a custom agent framework | pi (pi.dev) | Clean extension system, pluggable operations pattern |
| Maximum LLM provider support | Hermes Agent | 200+ models via OpenRouter |
| Production multi-agent with typed safety | OpenAI Agents SDK | Type-safe context, structured output, tracing, 10+ session backends |
| Voice / realtime agents | OpenAI Agents SDK | Voice pipeline (Python), Realtime API, WebRTC/SIP (JS) |