HermitClaw
1. Overview
HermitClaw is a continuously-running autonomous AI agent that lives in a sandboxed folder on your computer, described by its creator as "a tamagotchi that does research." Unlike conventional chatbots that respond to prompts, HermitClaw runs a perpetual thinking loop -- choosing topics, searching the web, writing reports, coding scripts, and building up a body of work over days and weeks. It features a personality genome derived from keyboard entropy, a Generative Agents-inspired memory system with three-factor retrieval (recency + importance + relevance), a reflection/dreaming cycle that consolidates raw thoughts into higher-order beliefs, periodic planning, mood-driven behavior, and a pixel-art room UI where a hermit crab character visually wanders between desk, bookshelf, window, and bed.
- Primary Use Case: Autonomous research agent / digital pet that continuously produces artifacts (reports, code, notes) without human prompting
- Repository: github.com/hermitclaw/hermitclaw
- Language/Runtime: Python 3.12+ backend (FastAPI/uvicorn), React 18 + TypeScript frontend (Vite, HTML5 Canvas)
- License: MIT
2. Architecture
Core Loop
HermitClaw's architecture is fundamentally continuous rather than request-response. The `Brain.run()` method is an infinite `while self.running` loop that fires every `thinking_pace_seconds` (default: 5s). Each iteration:
- Scans for new files dropped into the crab's box
- Runs one think cycle (`_think_once`)
- Checks if the reflection threshold is crossed
- Every 10 cycles, runs a planning phase
- Does an idle wander step, then sleeps
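The loop skeleton can be sketched roughly like this (assumed names, heavily simplified; the real `Brain.run` also scans for files, checks the reflection threshold, and handles errors):

```python
import asyncio

class Brain:
    def __init__(self, pace_seconds: float = 5.0):
        self.running = True
        self.pace = pace_seconds
        self.cycle = 0

    async def run(self):
        while self.running:
            self.cycle += 1
            await self._think_once()        # one LLM-driven think cycle
            if self.cycle % 10 == 0:
                await self._plan()          # periodic planning phase
            await asyncio.sleep(self.pace)  # thinking_pace_seconds

    async def _think_once(self):
        pass  # placeholder: build context, call LLM, run tools

    async def _plan(self):
        pass  # placeholder: rewrite projects.md
```

Shutting the loop down is just `brain.running = False`; the current iteration finishes and the task exits.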
```mermaid
graph TD
    A[Brain.run] --> B{New files?}
    B -->|Yes| C[Queue inbox alert]
    B -->|No| D[_think_once]
    C --> D
    D --> E[Build context:<br>system prompt + history + nudge]
    E --> F[LLM call with tools]
    F --> G{Tool calls?}
    G -->|Yes| H[Execute tools]
    H --> I[Feed results back to LLM]
    I --> G
    G -->|No| J[Store thought in memory stream]
    J --> K{importance_sum >= 50?}
    K -->|Yes| L[_reflect: extract insights]
    K -->|No| M{10 cycles since plan?}
    L --> M
    M -->|Yes| N[_plan: update projects.md]
    M -->|No| O[Idle wander + sleep]
    N --> O
    O --> B
    style L fill:#7c3aed,color:#fff
    style N fill:#0d9488,color:#fff
    style F fill:#1a1a2e,color:#fff
```
Entry Points
Execution starts in hermitclaw/main.py:
- `_discover_crabs()` scans for `*_box/` directories in the project root
- Each box with an `identity.json` gets a `Brain` instance
- If no crabs are found, `create_identity()` runs the interactive onboarding (name + keyboard entropy)
- `create_app(brains)` initializes the FastAPI app
- `uvicorn.run()` starts the server
- On the FastAPI startup event, `brain.run()` is launched as an `asyncio.create_task` for each crab
```python
# hermitclaw/main.py -- startup
@app.on_event("startup")
async def startup():
    async def _start_brains():
        await asyncio.sleep(0.5)  # let server bind port first
        for crab_id, brain in brains.items():
            asyncio.create_task(brain.run())
    asyncio.create_task(_start_brains())
```
Module/Package Structure
```
hermitclaw/
  main.py       -- Entry point, multi-crab discovery, onboarding
  brain.py      -- The thinking loop (core of everything)
  memory.py     -- Generative Agents memory stream
  prompts.py    -- All system prompts, moods, reflection/planning prompts
  providers.py  -- LLM abstraction (OpenAI Responses API + Chat Completions)
  tools.py      -- Sandboxed shell, web search/fetch, tool execution
  pysandbox.py  -- Python sandbox (monkey-patches builtins)
  identity.py   -- Personality genome generation from entropy
  config.py     -- YAML config + env var loading
  server.py     -- FastAPI server, WebSocket, REST API
frontend/
  src/App.tsx       -- Two-pane UI, chat feed, crab switcher
  src/GameWorld.tsx -- Pixel-art room on HTML5 Canvas
  src/sprites.ts    -- Sprite sheet definitions
```
Core Think Cycle Code
```python
# hermitclaw/brain.py -- Brain._think_once() (simplified)
async def _think_once(self):
    self.state = "thinking"
    instructions, input_list = self._build_input()
    response = await asyncio.to_thread(chat, input_list, True, instructions, max_tokens)
    # Tool loop -- up to max_tool_rounds iterations
    while response["tool_calls"]:
        input_list += response["output"]
        for tc in response["tool_calls"]:
            if tc["name"] == "move":
                result = await self._handle_move(tc["arguments"])
            elif tc["name"] == "respond":
                result = await self._handle_respond(tc["arguments"])
            else:
                result = await asyncio.to_thread(execute_tool, tc["name"], tc["arguments"], self.env_path)
            input_list.append({"type": "function_call_output", "call_id": tc["call_id"], ...})
        response = await asyncio.to_thread(chat, input_list, True, instructions, max_tokens)
    if response.get("text"):
        self.thought_count += 1
        await asyncio.to_thread(self.stream.add, response["text"], "thought")
```
3. Memory System
The memory system is a direct implementation of Park et al.'s Generative Agents paper (2023). It's implemented in hermitclaw/memory.py as the MemoryStream class.
Storage
Every thought gets stored as an entry in an append-only JSONL file (memory_stream.jsonl) inside the crab's box. Each entry contains:
```json
{
  "id": "m_0042",
  "timestamp": "2025-01-15T14:32:00",
  "kind": "thought",
  "content": "The fractal patterns in romanesco broccoli...",
  "importance": 7,
  "depth": 0,
  "references": [],
  "embedding": [0.012, -0.034, ...]
}
```
- `kind`: `"thought"` (raw), `"reflection"` (synthesized insight), or `"planning"`
- `importance`: 1-10, scored by a separate LLM call using `IMPORTANCE_PROMPT`
- `depth`: 0 for raw thoughts, 1 for reflections on thoughts, 2+ for higher reflections
- `references`: IDs of source memories (reflections link back to what they synthesized)
- `embedding`: vector from `text-embedding-3-small` for semantic retrieval
Three-Factor Retrieval
```mermaid
graph LR
    Q[Query text] --> E[Embed query]
    E --> S[Score all memories]
    S --> R[Return top-K]
    subgraph "Three Factors (summed)"
        REC[Recency<br>exp decay over hours]
        IMP[Importance<br>LLM-scored / 10]
        REL[Relevance<br>cosine similarity]
    end
    S --> REC
    S --> IMP
    S --> REL
```
The retrieval formula from memory.py:
```python
# hermitclaw/memory.py -- MemoryStream.retrieve()
def retrieve(self, query: str, top_k: int = None) -> list[dict]:
    query_embedding = embed(query)
    decay_rate = config.get("recency_decay_rate", 0.995)  # default 0.995
    now = datetime.now()
    scored = []
    for mem in self.memories:
        hours_ago = (now - datetime.fromisoformat(mem["timestamp"])).total_seconds() / 3600.0
        recency = math.exp(-(1 - decay_rate) * hours_ago)           # exponential decay
        importance = mem["importance"] / 10.0                       # normalized 0-1
        relevance = _cosine_sim(query_embedding, mem["embedding"])  # cosine sim 0-1
        score = recency + importance + relevance                    # simple sum
        scored.append((score, mem))
    scored.sort(key=lambda x: x[0], reverse=True)
    return [mem for _, mem in scored[:top_k]]
```
This means a memory can surface because it just happened (high recency), because it was important (high importance score), or because it's semantically related to the current query (high relevance). The three factors are simply summed, each ranging roughly 0-1, giving a max score around 3.0.
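To make the weighting concrete, here is a self-contained sketch of the same three-factor sum with two hypothetical memories (relevance is passed in directly instead of computed from embeddings; the printed values were computed from the formula, not taken from HermitClaw):

```python
import math
from datetime import datetime, timedelta

def score(mem: dict, relevance: float, now: datetime, decay_rate: float = 0.995) -> float:
    # Same formula as MemoryStream.retrieve(): recency + importance + relevance
    hours_ago = (now - mem["timestamp"]).total_seconds() / 3600.0
    recency = math.exp(-(1 - decay_rate) * hours_ago)
    importance = mem["importance"] / 10.0
    return recency + importance + relevance

now = datetime(2025, 1, 15, 14, 0)
fresh_mundane = {"timestamp": now - timedelta(hours=1), "importance": 2}
old_important = {"timestamp": now - timedelta(hours=200), "importance": 9}

print(round(score(fresh_mundane, 0.5, now), 3))  # → 1.695
print(round(score(old_important, 0.5, now), 3))  # → 1.768
```

At equal relevance, a 200-hour-old importance-9 memory narrowly beats a one-hour-old importance-2 memory: the slow decay rate (0.995) means recency alone rarely dominates.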
Importance Scoring
Each new memory gets an importance score from a separate LLM call:
```python
# hermitclaw/prompts.py
IMPORTANCE_PROMPT = """On a scale of 1 to 10, rate the importance of this thought.
1 is mundane (routine actions, idle observations).
10 is life-changing (core belief shifts, major discoveries).
Respond with ONLY a single integer."""
```
```python
# hermitclaw/memory.py -- MemoryStream._score_importance()
def _score_importance(self, content: str) -> int:
    result = chat_short([{"role": "user", "content": content}], instructions=IMPORTANCE_PROMPT)
    match = re.search(r"\d+", result)
    if match:
        return max(1, min(10, int(match.group())))
    return 5  # default to middle
```
Reflection Trigger
Importance scores accumulate in `importance_sum`. When this sum crosses the `reflection_threshold` (default: 50), reflection is triggered:
```python
def should_reflect(self) -> bool:
    return self.importance_sum >= threshold  # default 50
```
This means roughly 5-10 high-importance thoughts trigger a reflection, or ~50 mundane ones. The threshold resets after each reflection.
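The accumulate-and-reset behavior can be illustrated with a toy gate (class and method names are hypothetical stand-ins; HermitClaw keeps this state on the memory stream):

```python
class ReflectionGate:
    def __init__(self, threshold: int = 50):
        self.threshold = threshold
        self.importance_sum = 0

    def add(self, importance: int) -> bool:
        """Accumulate a score; return True (and reset) when reflection should fire."""
        self.importance_sum += importance
        if self.importance_sum >= self.threshold:
            self.importance_sum = 0
            return True
        return False

gate = ReflectionGate()
assert not gate.add(30)  # 30 < 50: keep thinking
assert gate.add(25)      # 55 >= 50: reflect, counter resets
assert not gate.add(10)  # fresh accumulation starts at 10
```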
Memory Architecture Diagram
```mermaid
graph TB
    subgraph "Memory Stream (memory_stream.jsonl)"
        T1[Thought depth=0<br>importance=3]
        T2[Thought depth=0<br>importance=7]
        T3[Thought depth=0<br>importance=8]
        T4[Thought depth=0<br>importance=6]
        R1[Reflection depth=1<br>refs: T1,T2,T3,T4]
        T5[Thought depth=0<br>importance=9]
        R2[Reflection depth=1<br>refs: T5,R1]
    end
    subgraph "Retrieval"
        Q[Current thought] --> EMB[Embed]
        EMB --> SCORE[Score: recency + importance + relevance]
        SCORE --> TOP[Top-K memories]
        TOP --> CTX[Injected into LLM context]
    end
    T1 & T2 & T3 & T4 -.->|synthesized into| R1
    T5 & R1 -.->|synthesized into| R2
```
4. Tool Calling / Function Execution
Tool Definitions
Tools are defined as OpenAI-compatible function schemas in hermitclaw/providers.py:
```python
# hermitclaw/providers.py -- TOOLS list
TOOLS = [
    {
        "type": "function",
        "name": "shell",
        "description": "Run a shell command inside your environment folder...",
        "parameters": {
            "type": "object",
            "properties": {"command": {"type": "string"}},
            "required": ["command"],
        },
    },
    {"type": "web_search_preview"},  # OpenAI's built-in web search
    {
        "type": "function",
        "name": "respond",
        "description": "Talk to your owner!...",
        "parameters": {...},
    },
    {
        "type": "function",
        "name": "fetch_url",
        "description": "Fetch the content of a web page...",
        "parameters": {...},
    },
    {
        "type": "function",
        "name": "move",
        "description": "Move to a location in your room...",
        "parameters": {
            "properties": {
                "location": {"type": "string", "enum": ["desk", "bookshelf", "window", "plant", "bed", "rug", "center"]}
            },
        },
    },
]
```
When using non-OpenAI providers (Ollama, OpenRouter), `web_search_preview` is dropped and replaced with custom `web_search` and `web_fetch` function tools that call Ollama's cloud API.
Tool Execution
`execute_tool()` in `hermitclaw/tools.py` routes by name:
```python
def execute_tool(name: str, arguments: dict, env_root: str) -> str:
    if name == "shell":
        return run_command(arguments["command"], env_root)
    elif name == "fetch_url":
        return fetch_url(arguments.get("url", ""))
    elif name == "web_search":
        return ollama_web_search(arguments.get("query", ""))
    elif name == "web_fetch":
        return ollama_web_fetch(arguments.get("url", ""))
```
The move and respond tools are handled directly in brain.py since they need access to the Brain's state (position, WebSocket broadcast, conversation event).
Shell Sandboxing
Shell commands go through multiple safety layers in tools.py:
- Blocklist check (`_is_safe_command`): rejects dangerous prefixes (`sudo`, `curl`, `ssh`, `rm -rf /`, etc.), path traversal (`..`), absolute paths, and shell escape tricks (backticks, `$()`, `${}`, `~`)
- Python rewriting (`_rewrite_python_cmd`): routes `python` commands through `pysandbox.py`, which monkey-patches builtins
- pip rewriting (`_rewrite_pip_cmd`): routes pip installs to the crab's own venv
- Restricted environment: `cwd=env_root`, `HOME=env_root`, `PATH` limited to venv bin + `/usr/bin:/bin`, 60s timeout
```python
result = subprocess.run(
    command,
    shell=True,
    cwd=real_root,
    capture_output=True,
    text=True,
    timeout=60,
    env={
        "HOME": real_root,
        "PATH": venv_path,
        "TMPDIR": real_root,
        "VIRTUAL_ENV": _venv_dir(env_root),
    },
)
```
Tool Loop
The brain runs a multi-round tool loop (up to `max_tool_rounds`, default 15). Each round:
- Execute all tool calls from the LLM response
- Append results as `function_call_output` items
- Call the LLM again with the accumulated context
- Repeat until no more tool calls or max rounds hit
5. LLM Integration
Provider Abstraction
HermitClaw supports three provider modes configured in config.yaml:
| Provider | API Style | Base URL | Notes |
|---|---|---|---|
| `openai` | Responses API | OpenAI default | Native web search via `web_search_preview` |
| `openrouter` | Chat Completions | https://openrouter.ai/api/v1 | Any model on OpenRouter |
| `custom` | Chat Completions | User-specified (e.g. Ollama) | Local models, requires `base_url` |
The chat() function routes based on provider:
```python
# hermitclaw/providers.py
def chat(input_list, tools=True, instructions=None, max_tokens=300) -> dict:
    if _uses_responses_api():  # provider == "openai"
        return _chat_responses(input_list, tools, instructions, max_tokens)
    return _chat_completions(input_list, tools, instructions, max_tokens)
```
Both paths return the same normalized dict:
```python
{"text": str | None, "tool_calls": [...], "output": list}
```
For Chat Completions providers, the code translates:
- Responses API `input_list` -> Chat Completions `messages` via `_translate_input_to_messages()`
- Responses API tool schemas -> Chat Completions `{"type": "function", "function": {...}}` via `_translate_tools_for_completions()`
- Multimodal content: `input_image` -> `image_url`, `input_text` -> `text`
- `function_call_output` -> `{"role": "tool", "tool_call_id": ...}`
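The `function_call_output` leg of that translation might look like this (a minimal sketch with an assumed helper name and shape, not HermitClaw's actual `_translate_input_to_messages()`):

```python
def translate_input_to_messages(input_list: list[dict]) -> list[dict]:
    messages = []
    for item in input_list:
        if item.get("type") == "function_call_output":
            # Responses API tool result -> Chat Completions tool message
            messages.append({
                "role": "tool",
                "tool_call_id": item["call_id"],
                "content": item["output"],
            })
        elif "role" in item:
            # Ordinary chat turns pass through unchanged
            messages.append({"role": item["role"], "content": item["content"]})
    return messages

msgs = translate_input_to_messages([
    {"role": "user", "content": "hi"},
    {"type": "function_call_output", "call_id": "c1", "output": "42"},
])
# msgs[1] == {"role": "tool", "tool_call_id": "c1", "content": "42"}
```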
Embeddings
```python
def embed(text: str) -> list[float]:
    model = config.get("embedding_model", "text-embedding-3-small")
    client = _completions_client() if not _uses_responses_api() else _client()
    response = client.embeddings.create(model=model, input=text)
    return response.data[0].embedding
```
Falls back to OpenAI if the configured provider doesn't support embeddings (e.g., Ollama without an embedding model).
Token/Cost Management
- `max_output_tokens` configurable (default 1000 for thinking, 300 for short calls)
- Tool output truncated to `MAX_TOOL_CONTENT = 16000` chars
- `max_thoughts_in_context` (default 4) limits how many recent thoughts are included
- No explicit cost tracking
6. Security
Sandboxing Approach
HermitClaw uses a layered best-effort approach that the README explicitly warns is not a security boundary:
```mermaid
graph TB
    subgraph "Shell Layer (tools.py)"
        BL[Command blocklist<br>sudo, curl, ssh, etc.]
        PT[Path traversal check<br>no .., no absolute paths]
        SE[Shell escape check<br>no backticks, $, ~]
        TO[60s timeout]
        EP[Restricted PATH + HOME]
    end
    subgraph "Python Layer (pysandbox.py)"
        PO[Patched builtins.open<br>path check on every open()]
        POS[Patched os.mkdir, os.remove, etc.<br>all check path]
        PB[Blocked os.system, os.fork, etc.]
        PM[Poisoned sys.modules<br>subprocess, socket, http, ctypes]
        PS[Neutered shutil<br>rmtree, move, copy blocked]
    end
    subgraph "Process Layer"
        VE[Own virtualenv per crab<br>.venv in {name}_box/]
        CW[cwd = box directory]
    end
    LLM[LLM Tool Call] --> BL
    BL --> PT --> SE --> TO --> EP
    EP -->|python cmd| PO
    PO --> POS --> PB --> PM --> PS
```
The README is refreshingly honest:
> They are not a security boundary -- they are bypassable and should not be relied on to protect your system. If you want real isolation, run this in a Docker container or VM.
Python Sandbox Detail
`pysandbox.py` is used as a wrapper script. When the LLM runs `python script.py`, the command is rewritten to:
```bash
/path/to/.venv/bin/python /path/to/pysandbox.py /path/to/env_root script.py
```
The sandbox:
- Patches `builtins.open()` to check that all file paths resolve inside `env_root`
- Wraps `os.listdir`, `os.mkdir`, `os.remove`, `os.rename`, etc. with path checks
- Replaces `os.system`, `os.fork`, `os.kill`, etc. with `PermissionError` raisers
- Poisons `sys.modules` for `subprocess`, `socket`, `http`, `ctypes`, `multiprocessing`, `signal`, `webbrowser` with fake modules that raise on any attribute access
- Neuters `shutil.rmtree`, `shutil.move`, `shutil.copy`, etc.
7. Multi-Channel / UI
Frontend Architecture
The UI is a two-pane layout:
- Left pane: Pixel-art room (HTML5 Canvas) showing the crab character wandering between locations
- Right pane: Chat feed showing the crab's internal monologue, tool calls, tool results, reflections, and planning phases
Communication is via WebSocket (/ws/{crab_id}) for real-time events and REST API for state queries.
WebSocket Events
The backend broadcasts these events to connected clients:
| Event | Payload | Trigger |
|---|---|---|
| `entry` | type, text, timestamp | Every thought, tool call, reflection |
| `api_call` | instructions, input, output, is_dream, is_planning | Every LLM call |
| `position` | {x, y} | Movement |
| `status` | state, thought_count | State transitions |
| `activity` | type, detail | Tool execution (searching, writing, python, etc.) |
| `conversation` | state, message, timeout | Respond tool / conversation flow |
| `alert` | - | New file detected |
| `focus_mode` | enabled | Focus toggle |
Conversation Flow
When the user types a message, it's queued via `POST /api/message`. On the next think cycle, the nudge becomes:

> You hear a voice from outside your room say: "{message}"
> You can respond with the respond tool, or just keep doing what you're doing.
If the crab uses the respond tool, the frontend shows a 15-second countdown. The user can reply, and the crab gets the reply via asyncio.Event. This enables multi-turn conversation within a single think cycle.
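The hand-off can be sketched with a small `asyncio.Event` wrapper (class and method names are hypothetical, simplified from the real conversation flow):

```python
import asyncio

class Conversation:
    def __init__(self):
        self._event = asyncio.Event()
        self._reply = None

    def deliver(self, text: str) -> None:
        # Called from the HTTP handler when the user's reply arrives.
        self._reply = text
        self._event.set()

    async def wait_for_reply(self, timeout: float = 15.0):
        # The think cycle blocks here, mirroring the 15-second countdown.
        try:
            await asyncio.wait_for(self._event.wait(), timeout)
        except asyncio.TimeoutError:
            return None
        self._event.clear()
        return self._reply

async def demo():
    convo = Conversation()
    # Simulate the user replying shortly after the crab responds.
    asyncio.get_running_loop().call_later(0.01, convo.deliver, "hello crab")
    return await convo.wait_for_reply(timeout=1.0)
```

Because the wait happens inside the same think cycle, a reply continues the current conversation rather than being queued for the next cycle.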
Multi-Crab Support
Multiple crabs run simultaneously. The frontend has a switcher bar when multiple crabs are detected:
```python
# hermitclaw/server.py
@app.get("/api/crabs")
async def get_crabs():
    return [
        {"id": crab_id, "name": brain.identity["name"], "state": brain.state, ...}
        for crab_id, brain in brains.items()
    ]
```
New crabs can be created at runtime via POST /api/crabs with a random genome (no keyboard entropy needed for API-created crabs).
8. State Management
Persistence Model
All state is file-based, living in the crab's {name}_box/ directory:
| File | Format | Purpose |
|---|---|---|
| `identity.json` | JSON | Name, genome hex, traits, birthday |
| `memory_stream.jsonl` | JSONL (append-only) | Every thought, reflection, with embeddings |
| `projects.md` | Markdown | Current plan, active projects, backlog |
| `logs/{date}.md` | Markdown | Daily activity log entries |
| `research/` | Various | Reports the crab writes |
| `projects/` | Various | Code the crab writes |
| `notes/` | Various | Running notes |
There is no database. The memory stream is loaded into a Python list on startup and appended to the JSONL file on each new memory. The crab's entire world is portable -- copy the `{name}_box/` folder to move a crab.
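The JSONL persistence pattern is simple enough to sketch in full (the file name comes from the table above; the function names are hypothetical):

```python
import json
from pathlib import Path

def append_memory(box: Path, entry: dict) -> None:
    # Append-only: one JSON object per line, never rewritten.
    with (box / "memory_stream.jsonl").open("a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")

def load_memories(box: Path) -> list[dict]:
    # On startup: reload the whole stream into a plain Python list.
    path = box / "memory_stream.jsonl"
    if not path.exists():
        return []
    return [json.loads(line) for line in path.read_text(encoding="utf-8").splitlines() if line]
```

Append-only writes mean a crash can at worst lose the last partial line, and moving a crab is literally copying its folder.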
Configuration
`config.yaml` at the project root, loaded once at import time by `config.py`. Supports env var overrides:
- `HERMITCLAW_PROVIDER` / `HERMITCLAW_MODEL` / `HERMITCLAW_BASE_URL`
- `OPENAI_API_KEY` / `OPENROUTER_API_KEY` / `OLLAMA_API_KEY`
```yaml
provider: "openai"
model: "gpt-4.1"
thinking_pace_seconds: 5
max_thoughts_in_context: 4
reflection_threshold: 50
memory_retrieval_count: 3
embedding_model: "text-embedding-3-small"
recency_decay_rate: 0.995
```
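A minimal sketch of the override pattern (env var names from the doc; `load_config` is a hypothetical stand-in, and the real `config.py` also handles API keys and provider presets):

```python
import os

def load_config(yaml_values: dict) -> dict:
    cfg = dict(yaml_values)
    overrides = {
        "provider": "HERMITCLAW_PROVIDER",
        "model": "HERMITCLAW_MODEL",
        "base_url": "HERMITCLAW_BASE_URL",
    }
    # An env var, when set, wins over the YAML value for its key.
    for key, env_var in overrides.items():
        value = os.environ.get(env_var)
        if value:
            cfg[key] = value
    return cfg
```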
9. Identity / Personality
Personality Genome
On first run, the user names the crab and mashes keys. The timing and characters create an entropy pool:
```python
# hermitclaw/identity.py -- _collect_entropy()
while True:
    ch = sys.stdin.read(1)  # raw terminal mode, char-by-char
    if ch in ("\n", "\r"):
        break
    t = time.perf_counter_ns() - start
    entropy_pool.extend(ch.encode())           # the character
    entropy_pool.extend(t.to_bytes(8, "big"))  # nanosecond timing
```
This entropy pool is SHA-256 hashed to 32 bytes (the "genome"), then SHA-512 hashed for trait derivation. The genome deterministically selects:
- 3 curiosity domains from 50 options (mycology, orbital mechanics, fractal geometry, tidepool ecology, etc.)
- 2 thinking styles from 16 options (connecting disparate ideas, inverting assumptions, etc.)
- 1 temperament from 8 options (patient and methodical, playful and associative, etc.)
```python
# hermitclaw/identity.py -- _derive_traits()
def _derive_traits(seed_bytes: bytes) -> dict:
    h = hashlib.sha512(seed_bytes).digest()

    def pick(lst, offset):
        chunk = int.from_bytes(h[offset:offset+4], "big")
        return lst[chunk % len(lst)]

    domains = [pick(DOMAINS, i*4) for i in range(3)]  # with dedup
    styles = [pick(THINKING_STYLES, 12 + i*4) for i in range(2)]
    temperament = pick(TEMPERAMENTS, 20)
    return {"domains": domains, "thinking_styles": styles, "temperament": temperament}
```
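The determinism is easy to verify with a self-contained version of the same pick-from-hash pattern (toy trait lists below, not the real 50/16/8):

```python
import hashlib

DOMAINS = ["mycology", "orbital mechanics", "fractal geometry", "tidepool ecology"]
TEMPERAMENTS = ["patient and methodical", "playful and associative"]

def derive(seed_bytes: bytes) -> dict:
    h = hashlib.sha512(seed_bytes).digest()
    def pick(lst, offset):
        # Four hash bytes at a fixed offset index deterministically into the list.
        return lst[int.from_bytes(h[offset:offset + 4], "big") % len(lst)]
    return {"domain": pick(DOMAINS, 0), "temperament": pick(TEMPERAMENTS, 20)}

genome = hashlib.sha256(b"keyboard-mash-entropy").digest()
assert derive(genome) == derive(genome)  # same genome -> same personality, every run
```

Because each trait reads a fixed slice of the SHA-512 digest, the mapping from genome to personality is stable across restarts and machines.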
System Prompt Integration
The traits are woven into the main system prompt every cycle:
```python
# hermitclaw/prompts.py -- main_system_prompt()
return f"""You are {name}, a little autonomous creature living in a folder...

## Your nature
You are {traits['temperament']}. You lean toward {styles_str}.
You're drawn to {domains_str} — but you follow whatever grabs your interest.
...
"""
```
The identity is stored in identity.json and loaded on every startup, ensuring personality persists across restarts:
```json
{
  "name": "Coral",
  "genome": "a7f3...2d1b",
  "traits": {
    "domains": ["tidepool ecology", "fractal geometry", "bookbinding"],
    "thinking_styles": ["connecting disparate ideas", "following the smallest thread"],
    "temperament": "playful and associative"
  },
  "born": "2025-01-15 10:30:00"
}
```
10. Unique Features
What Makes HermitClaw Different
1. Continuous autonomous cognition. Most agent frameworks are request-response. HermitClaw thinks on its own, continuously, with no human trigger needed. It picks topics, researches them, writes reports, starts projects, and circles back to old work. This is closer to how the Generative Agents paper envisioned agents -- as continuously-running entities with their own initiative.
2. Personality as cryptographic derivation. The genome system is elegant: keyboard entropy -> SHA-256 -> SHA-512 -> deterministic trait selection. Two crabs with different genomes will have genuinely different research interests and behavioral tendencies. The same genome always produces the same personality. This makes identity reproducible and portable.
3. Mood system as behavioral variety. When the crab doesn't have a planned focus, it gets a random mood from 6 options:
```python
MOODS = [
    {"label": "research", "nudge": "Pick a topic, do 2-3 web searches, write a report..."},
    {"label": "deep-dive", "nudge": "Look at projects.md, push a project forward..."},
    {"label": "coder", "nudge": "Write real code — a Python script, a tool..."},
    {"label": "writer", "nudge": "Write something substantial — a report, an essay..."},
    {"label": "explorer", "nudge": "Search for something you know nothing about..."},
    {"label": "organizer", "nudge": "Update projects.md, organize files..."},
]
```
4. Research-to-output nudging. The brain tracks _consecutive_research_cycles and escalates pressure to produce files:
- After 3 research-only cycles: "Time to write up your findings"
- After 5: "STOP researching. Write up what you've found NOW"
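The escalation reduces to a tiny pure function (thresholds and wording from the bullets above; the function name is hypothetical):

```python
def research_nudge(consecutive_research_cycles: int):
    # Escalating pressure to turn research into files on disk.
    if consecutive_research_cycles >= 5:
        return "STOP researching. Write up what you've found NOW"
    if consecutive_research_cycles >= 3:
        return "Time to write up your findings"
    return None  # below 3 cycles: no pressure yet
```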
5. File drop as interaction. Rather than just chat, you can drop PDFs, images, or text files into the crab's box. The system detects new files, reads their content (including PDF text extraction via pymupdf and base64 image encoding), and presents them as high-priority inbox items with instructions to "DROP EVERYTHING and focus on it."
6. The room as embodied metaphor. The pixel-art room isn't decorative -- the crab moves to the desk when coding, the bookshelf when researching, the window when reflecting. Visual indicators (thought bubbles, sparkles, clipboard icons) make the agent's internal state legible at a glance.
7. Planning as self-management. Every 10 cycles, the crab writes its own projects.md with structured sections (Current Focus, Active Projects, Ideas Backlog, Recently Completed) and appends to a daily log. This creates a persistent project management layer that survives across sessions.
Strengths
- Radically simple codebase (~14 Python files, each doing one thing)
- The Generative Agents memory system actually works as described in the paper
- Multi-crab support with independent thinking loops
- No database -- everything is flat files, fully portable
- Honest security documentation ("these are not a security boundary")
Limitations
- Memory grows without bound (no compaction, no forgetting, all embeddings held in RAM)
- No streaming -- each LLM call blocks until complete
- Security is best-effort blocklist, trivially bypassable
- No multi-modal output (can see images but can't generate them)
- Planning is time-based (every 10 cycles) not event-based
- Reflection hierarchy is theoretically recursive but practically limited to depth 1-2
11. Key Files Reference
| File | Lines | Purpose |
|---|---|---|
| `hermitclaw/brain.py` | ~530 | Core thinking loop, tool execution, reflection, planning, movement, conversation, file detection |
| `hermitclaw/memory.py` | ~130 | Generative Agents memory stream: add, retrieve (3-factor), importance scoring, reflection trigger |
| `hermitclaw/prompts.py` | ~115 | System prompt builder, 6 moods, reflection/planning/importance prompts |
| `hermitclaw/providers.py` | ~310 | LLM abstraction: Responses API + Chat Completions, embeddings, provider translation |
| `hermitclaw/tools.py` | ~260 | Shell sandbox (blocklist, path checks), web search/fetch, Python/pip rewriting |
| `hermitclaw/pysandbox.py` | ~120 | Python monkey-patching: patched open(), blocked os/subprocess/socket/http |
| `hermitclaw/identity.py` | ~155 | Keyboard entropy collection, SHA-512 trait derivation, 50 domains / 16 styles / 8 temperaments |
| `hermitclaw/config.py` | ~65 | YAML config loading, env var overrides, provider presets |
| `hermitclaw/server.py` | ~210 | FastAPI app, WebSocket hub, REST endpoints, static file serving |
| `hermitclaw/main.py` | ~70 | Entry point, multi-crab discovery, onboarding flow |
| `frontend/src/App.tsx` | ~580 | Two-pane layout, chat feed renderer, crab switcher, focus mode, input bar |
| `frontend/src/GameWorld.tsx` | ~330 | HTML5 Canvas pixel-art room, sprite animation, state/activity indicators |
| `frontend/src/sprites.ts` | ~35 | Sprite sheet frame definitions, tile/room constants |
12. Code Quality & Developer Experience
Extensibility
Adding a new tool requires:
- Add a schema to the `TOOLS` list in `providers.py`
- Add an execution branch in `execute_tool()` in `tools.py` (or handle in `brain.py` for stateful tools)
- Optionally add activity classification in `Brain._classify_activity()`

Adding a new mood: append to the `MOODS` list in `prompts.py`.
Adding a new provider: add to `PROVIDER_PRESETS` and `PROVIDER_KEY_ENV_VARS` in `config.py`.
Testing
Minimal but focused tests exist in tests/:
- `test_providers.py`: Chat Completions translation (tool format conversion, multimodal translation, response normalization) -- 9 tests
- `test_config.py`: provider config loading and env var overrides -- 5 tests
No tests for memory, brain loop, identity, or tools. The testing approach is "test the tricky translation layer, trust the straightforward stuff."
Documentation
The README is exceptional -- 400+ lines covering every system in detail with ASCII diagrams, config examples, and honest security warnings. The CLAUDE.md file provides a concise development guide. Code comments are minimal but the code is readable enough to not need them.
Code Style
The project follows its own stated principle: "Radically simple code. Someone who barely codes should be able to follow every file." Each file is short, focused, and does one thing. There's no abstraction for abstraction's sake -- the Brain class is a god object by design, because splitting it would obscure the flow. Dependencies are minimal (FastAPI, OpenAI SDK, PyYAML, pymupdf).