Architecture Deep Dive

Building Openclad: A 24/7 Personal AI Agent That Actually Does Things

How I built an autonomous AI assistant that runs on my Mac, talks through Telegram, controls my smart home, tracks my job search, and learns from every interaction — using Claude, MCP servers, and about 2,000 lines of Python.

There's a gap between what AI chatbots can do and what they actually do for you day-to-day. ChatGPT can write a poem, but it can't turn off your living room lights when you're already in bed. Claude can analyze a document, but it won't proactively send you a news briefing at 9 AM.

I wanted something different: an AI agent that runs continuously, has access to my real tools and services, makes decisions autonomously for low-risk tasks, and asks permission before doing anything destructive. Something I could message from my phone and get things done — not just get answers.

So I built Jarvis. This post walks through the architecture, key design decisions, and how you can build your own. The full source is at github.com/foodlbs/openclad.

01 What Jarvis Can Do

Before diving into architecture, here's what a typical day looks like:

- 9:00 AM: Jarvis sends a news briefing to Telegram — top headlines, tech news, market movements. No prompt needed; it's a scheduled skill.
- 10:30 AM: "What's the status of my job applications?" → Jarvis queries the SQLite-backed tracker: 3 applied, 1 interview scheduled, 2 rejected.
- 2:00 PM: I send a PDF: "Summarize this and save the key points." → Jarvis reads it, generates a summary, stores it in vector memory, saves a markdown file to Documents.
- 11:00 PM: "Turn off all the lights." → Approval request with inline Telegram button. I tap Approve. Lights off via Home Assistant.

The key insight: Jarvis doesn't just respond to questions. It executes tasks, remembers context, and operates on a schedule — all while a risk classification system keeps me in control of anything with real-world side effects.

02 Architecture Overview

The system breaks into four layers: the Telegram interface (how I talk to it), the agent core (how it thinks and acts), MCP servers (what it can do), and persistent state (what it remembers).

System Architecture — Component Map
- User interface: Telegram app (iOS / macOS)
- Telegram layer: TelegramBot (aiogram 3.x) · auth middleware (allowlist) · command handlers (/task, /status, /skill) · approval manager (inline keyboards)
- Agent core: PersonalAgent (Claude SDK subprocess) · risk classifier (two-tier system, policy.yaml) · ContextLoader (system prompt assembly) · scheduler (cron from .md) · RetryQueue (exponential backoff)
- MCP servers (FastMCP): filesystem (read/write) · browser (Playwright) · sandbox (code exec) · job tracker (SQLite) · memory (ChromaDB) · smart home (Home Assistant) · Calendar/Gmail (Google OAuth)
- Persistent state: Redis · personality.md + skills/ + schedules.md · ChromaDB files · memory/*.md

03 The Tech Stack

| Layer | Technology | Rationale |
|---|---|---|
| Language | Python 3.12 + uv workspace | Fast dependency resolution, monorepo support |
| Agent | Claude Agent SDK (subprocess) | Process isolation, crash recovery, tool use built-in |
| Chat interface | aiogram 3.x | Async Telegram, inline keyboards for approvals |
| State & events | Redis | Task state, conversation buffer, retry queue |
| Vector memory | ChromaDB (local) | No cloud dependency, free, cosine similarity search |
| Embeddings | OpenAI text-embedding-3-small | Cost-effective, high quality for semantic search |
| MCP framework | FastMCP | Simple Python MCP server scaffolding |
| Config | pydantic-settings + YAML | Type-safe config with env var overrides |
| Logging | structlog | Structured JSON logs for production |
| Daemon | macOS LaunchAgent | Auto-start on boot, background execution |

04 Deep Dive: How Each Piece Works

1. The Conversation Flow

When I send a message on Telegram, here's exactly what happens:

Sequence Diagram — "Turn off the lights"
1. User → TelegramBot: "Turn off the lights" (auth middleware checks the allowlist, model is selected)
2. main.py dispatches the task; PersonalAgent starts run_agent_task() while the bot shows a "typing…" indicator
3. The PreToolUse hook hands the pending tool call to the RiskClassifier
4. Classification requires approval → the bot sends an "Approve / Deny" inline keyboard
5. User taps "Approve" → the MCP tool runs call_service(light, off)
6. On success, "Done — lights are off." goes back to the user and the interaction is saved

Smart model selection — Short messages auto-route to Haiku instead of Sonnet. Saves cost and latency for trivial queries while preserving Sonnet's reasoning capacity for demanding work.
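As a rough sketch of that routing heuristic (the function name, threshold, and keyword list are illustrative assumptions, not Openclad's exact implementation):

```python
# Hypothetical length/keyword-based model router. Short, simple messages go
# to the cheap model; long or analysis-heavy ones get the stronger model.
HAIKU = "claude-haiku"
SONNET = "claude-sonnet"

def pick_model(message: str, *, threshold: int = 80) -> str:
    """Route trivial queries to Haiku, demanding work to Sonnet."""
    heavy = ("summarize", "analyze", "write", "research")
    if len(message) > threshold or any(k in message.lower() for k in heavy):
        return SONNET
    return HAIKU
```

The exact cutoff matters less than having one at all: most phone-typed messages are short commands, so the cheap path wins by default.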

Conversation buffer — Last 10 turns in Redis with a 1-hour TTL. Follow-ups like "What about the bedroom lights?" work because Jarvis remembers the topic.
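The buffer itself can be as small as a capped Redis list per chat. A minimal sketch, assuming a redis-py-style client passed in as `r` (key name and shapes are assumptions):

```python
import json

def push_turn(r, chat_id: int, role: str, text: str, max_turns: int = 10) -> None:
    """Append a turn to the chat's buffer, keeping only the last max_turns."""
    key = f"conv:{chat_id}"
    r.rpush(key, json.dumps({"role": role, "text": text}))
    r.ltrim(key, -max_turns, -1)  # drop everything older than the last N turns
    r.expire(key, 3600)           # 1-hour TTL: stale context evaporates

def load_turns(r, chat_id: int) -> list[dict]:
    """Return the buffered turns, oldest first."""
    return [json.loads(x) for x in r.lrange(f"conv:{chat_id}", 0, -1)]
```

The TTL refreshes on every message, so an active conversation keeps its context while an abandoned one cleans itself up.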

2. The Risk System

An autonomous agent with access to your filesystem, email, and smart home needs guardrails. Jarvis uses a two-tier classification:

Risk Classification — Two-Tier Model
✓ Autonomous — No Approval Needed: file reads, web search, memory queries, code sandbox, browser navigation, job tracker reads.

🔒 Require Approval — Inline Button: file writes / deletes, email sends, calendar edits, smart home control, phone calls, purchases.

Classification examines input parameters, not just tool names. Reading ~/Documents is autonomous; writing to /etc/ always requires approval. Configured via risk_policy.yaml:

# risk_policy.yaml
risk_overrides:
  mcp__filesystem__write_file: autonomous
  mcp__smart_home__call_service: require_approval

context_escalation:
  dangerous_paths: [/system, /etc, /usr/bin]
  sensitive_entities: [lock, alarm, security]
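In code, the two tiers can be sketched as a parameter check that runs before the per-tool default. This is an illustrative reconstruction of the idea, not Openclad's actual classifier; the function name and parameter keys are assumptions:

```python
# Tier 2 (context escalation) is checked first so that dangerous parameters
# override even a tool whose default is "autonomous".
RISK_OVERRIDES = {
    "mcp__filesystem__write_file": "autonomous",
    "mcp__smart_home__call_service": "require_approval",
}
DANGEROUS_PATHS = ("/system", "/etc", "/usr/bin")
SENSITIVE_ENTITIES = ("lock", "alarm", "security")

def classify(tool: str, params: dict) -> str:
    path = str(params.get("path", ""))
    if any(path.startswith(p) for p in DANGEROUS_PATHS):
        return "require_approval"          # escalate on dangerous paths
    entity = str(params.get("entity_id", ""))
    if any(e in entity for e in SENSITIVE_ENTITIES):
        return "require_approval"          # escalate on sensitive devices
    # Tier 1: per-tool default; unknown tools fail closed.
    return RISK_OVERRIDES.get(tool, "require_approval")
```

Failing closed on unknown tools is the important design choice: a new MCP server gets approval gating for free until you explicitly mark its tools autonomous.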

3. The Memory System

Memory Architecture — Dual-Layer Design
Short-term (system prompt):
- preferences.md: learned preferences
- projects.md: active project status
- chat_history.md: recent sessions, auto-trimmed at 30
- journal.md: agent reflections
- ContextLoader assembles all of these into every prompt sent to PersonalAgent

Long-term (vector search):
- ChromaDB store: local, no cloud dependency
- OpenAI embeddings: text-embedding-3-small
- memory_store_tool: 2-3 sentence task summaries
- memory_search_tool: semantic recall before complex tasks
- auto-trim: compresses history beyond 30 sessions

The combination gives Jarvis both working memory (always loaded into every prompt) and recall (searchable when needed).
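The long-term half boils down to "store short summaries, recall by similarity". Here is a toy stand-in for the ChromaDB + embeddings pair: a bag-of-words cosine search that keeps the example self-contained (the real system uses text-embedding-3-small vectors, not word counts):

```python
import math
from collections import Counter

class MemoryStore:
    """Toy vector memory: store summaries, recall nearest by cosine similarity."""

    def __init__(self):
        self._docs: list[str] = []

    @staticmethod
    def _vec(text: str) -> Counter:
        return Counter(text.lower().split())  # stand-in for a real embedding

    @staticmethod
    def _cosine(a: Counter, b: Counter) -> float:
        dot = sum(a[t] * b[t] for t in a)
        na = math.sqrt(sum(v * v for v in a.values()))
        nb = math.sqrt(sum(v * v for v in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    def store(self, summary: str) -> None:
        self._docs.append(summary)

    def search(self, query: str, k: int = 1) -> list[str]:
        qv = self._vec(query)
        ranked = sorted(self._docs,
                        key=lambda d: self._cosine(qv, self._vec(d)),
                        reverse=True)
        return ranked[:k]
```

Swap `_vec` for an embedding call and `_docs` for a ChromaDB collection and the shape of the interface stays the same.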

4. The Skill Framework

Skills are Markdown files defining triggers, steps, and required tools. When Jarvis spots a repeatable pattern (3+ similar tool call sequences), it suggests creating a new skill — the system grows organically.

## Daily News Briefing
Trigger: "news update", "daily news", "morning briefing"
Schedule: Every day at 9:00 AM EST

### Steps
1. Search for current top headlines
2. Search for tech industry news
3. Format into clean briefing with sections
4. Output ONLY the briefing — no meta-commentary
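A skill file like the one above only needs a tiny parser to become routable. This is a hypothetical sketch of such a loader, matching the field names in the example (the parsing rules are my assumption, not Openclad's actual code):

```python
import re

def parse_skill(md: str) -> dict:
    """Extract name, trigger phrases, and schedule from a skill Markdown file."""
    name = re.search(r"^##\s+(.+)$", md, re.M)
    triggers = re.search(r"^Trigger:\s*(.+)$", md, re.M)
    schedule = re.search(r"^Schedule:\s*(.+)$", md, re.M)
    return {
        "name": name.group(1).strip() if name else None,
        "triggers": [t.strip().strip('"') for t in triggers.group(1).split(",")]
                    if triggers else [],
        "schedule": schedule.group(1).strip() if schedule else None,
    }
```

Everything after the header stays as free-form Markdown steps, which the agent reads directly; only the routing metadata needs structure.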

5. The Scheduler

The scheduler parses a human-readable schedules.md into cron-like entries. Each job runs as an async task, with turns capped at 15 to keep output concise and free of meta-commentary.
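Translating a phrase like "Every day at 9:00 AM EST" into a cron expression can be done with a small pattern match. A sketch handling just that one pattern (the real grammar in schedules.md presumably supports more):

```python
import re

def to_cron(line: str) -> str:
    """Turn 'Every day at H:MM AM/PM' into a 5-field cron expression."""
    m = re.match(r"Every day at (\d{1,2}):(\d{2})\s*(AM|PM)", line, re.I)
    if not m:
        raise ValueError(f"unrecognized schedule: {line!r}")
    hour, minute, ampm = int(m.group(1)), int(m.group(2)), m.group(3).upper()
    if ampm == "PM" and hour != 12:
        hour += 12                     # 1 PM -> 13, etc.
    if ampm == "AM" and hour == 12:
        hour = 0                       # midnight edge case
    return f"{minute} {hour} * * *"
```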

6. Resilience & Fallback

Retry queue — Failed tasks enter a Redis sorted set with exponential backoff (30s → 60s → 120s). After 3 retries, abandoned with failure notification. Model downgrades to Haiku on retry.

API fallback — Rate-limit or downtime detected via error keywords → falls back to Claude Code CLI with a cached OAuth token.

Circuit breaker — Closed → open → half-open. After 5 consecutive failures, rejects calls for 60 seconds before allowing a probe.

05 Project Structure

openclad/
├── main.py                   # Entry point & orchestrator
├── pyproject.toml            # uv workspace root
├── compose.yaml              # Docker Compose (Redis)
├── packages/
│   ├── core/                 # Agent, config, state, risk, retry, scheduler
│   ├── interfaces/           # Telegram bot, handlers, approval flow
│   └── mcp_servers/          # Job tracker, smart home, memory
├── data/
│   ├── agent_context/        # personality.md, skills/, schedules.md
│   ├── memory/chroma/        # Vector store persistence
│   └── secrets/              # OAuth credentials (gitignored)
├── configs/
│   ├── agent.yaml            # Runtime config
│   └── risk_policy.yaml      # Risk classification overrides
└── tests/                    # 20+ unit tests

The monorepo keeps things modular — core has no Telegram dependency, interfaces has no MCP dependency, and mcp_servers are standalone FastMCP processes. Want Discord instead of Telegram? Replace interfaces without touching core.

06 Key Design Decisions

Why local ChromaDB over Pinecone?
No cloud dependency. The vector store lives at data/memory/chroma/ — just files on disk. Zero cost, zero round-trip latency, fully git-backupable.
Why the Claude Agent SDK subprocess model?
Each agent invocation is isolated. If it crashes, nothing leaks. SDK upgrade? Restart the process. Simplest possible isolation boundary.
Why Telegram over a custom UI?
Already on my phone, laptop, and watch. Inline keyboards, file attachments, voice messages, rich formatting — all built-in. A custom UI would have taken weeks for a worse experience.
Why Markdown for schedules and skills?
Editable with any text editor, version-controlled with git, and readable by the agent itself. When Jarvis creates a new skill, it writes a .md file.
Why Redis for all stateful data?
One dependency, in-memory speed, TTL for auto-cleanup, pub/sub for future real-time features. For a single-user agent, one Redis instance is plenty.

07 Running It Yourself

# Clone
git clone https://github.com/foodlbs/openclad.git
cd openclad

# Configure
cp .env.example .env

# Install
uv sync --all-packages

# Start Redis
docker compose up -d redis

# Run
uv run python main.py

You'll need an Anthropic API key, a Telegram bot token (from @BotFather), and your Telegram chat ID. Optionally: an OpenAI key (embeddings), Google OAuth (Calendar/Gmail), and a Home Assistant URL + token.

08 What I'd Do Differently

1. Dedicated voice pipeline. Telegram voice transcription works but is clunky. A proper streaming voice pipeline would transform the experience.
2. Parameterized skills. Skills as templates: Research {topic} with depth {shallow|deep} instead of flat instruction sets.
3. Multi-agent orchestration. Some tasks need parallel sub-agents (researcher + writer). The single-agent model hits turn limits on complex workflows.
4. Observability dashboard. Task history, tool usage, cost tracking, memory growth. The event stream is there but underutilized.

09 Wrapping Up

Jarvis has been running 24/7 on my Mac for about two weeks — delivering morning news, tracking job applications, helping with research, and controlling my apartment, all through Telegram.

The total codebase is ~2,000 lines of Python across three packages, plus Markdown files for personality, skills, and schedules. The agent framework does the heavy lifting; the surrounding infrastructure — risk classification, retry logic, conversation persistence, skill routing — is what transforms a chatbot into an actual assistant.

Star the repo and check out the source at github.com/foodlbs/openclad. If you build something cool with it, I'd love to hear about it.

Built with:
Claude Agent SDK aiogram 3.x Redis ChromaDB FastMCP Python 3.12 macOS LaunchAgent