How I built an autonomous AI assistant that runs on my Mac, talks through Telegram, controls my smart home, tracks my job search, and learns from every interaction — using Claude, MCP servers, and about 2,000 lines of Python.
There's a gap between what AI chatbots can do and what they actually do for you day-to-day. ChatGPT can write a poem, but it can't turn off your living room lights when you're already in bed. Claude can analyze a document, but it won't proactively send you a news briefing at 9 AM.
I wanted something different: an AI agent that runs continuously, has access to my real tools and services, makes decisions autonomously for low-risk tasks, and asks permission before doing anything destructive. Something I could message from my phone and get things done — not just get answers.
So I built Jarvis. This post walks through the architecture, key design decisions, and how you can build your own. The full source is at github.com/foodlbs/openclad.
Before diving into architecture, here's what a typical day looks like:
The key insight: Jarvis doesn't just respond to questions. It executes tasks, remembers context, and operates on a schedule — all while a risk classification system keeps me in control of anything with real-world side effects.
The system breaks into four layers: the Telegram interface (how I talk to it), the agent core (how it thinks and acts), MCP servers (what it can do), and persistent state (what it remembers).
| Layer | Technology | Rationale |
|---|---|---|
| Language | Python 3.12 + uv workspace | Fast dependency resolution, monorepo support |
| Agent | Claude Agent SDK (subprocess) | Process isolation, crash recovery, tool use built-in |
| Chat Interface | aiogram 3.x | Async Telegram, inline keyboards for approvals |
| State & Events | Redis | Task state, conversation buffer, retry queue |
| Vector Memory | ChromaDB (local) | No cloud dependency, free, cosine similarity search |
| Embeddings | OpenAI text-embedding-3-small | Cost-effective, high quality for semantic search |
| MCP Framework | FastMCP | Simple Python MCP server scaffolding |
| Config | pydantic-settings + YAML | Type-safe config with env var overrides |
| Logging | structlog | Structured JSON logs for production |
| Daemon | macOS LaunchAgent | Auto-start on boot, background execution |
When I send a message on Telegram, here's exactly what happens:
Smart model selection — Short messages auto-route to Haiku instead of Sonnet. Saves cost and latency for trivial queries while preserving Sonnet's reasoning capacity for demanding work.
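In sketch form, the routing rule is just a length check. The threshold and model labels below are placeholders, not the exact values used in the repo:

```python
def pick_model(message: str, short_threshold: int = 80) -> str:
    """Route short/trivial messages to the cheap model (threshold is illustrative)."""
    if len(message.strip()) <= short_threshold:
        return "haiku"   # fast + cheap for one-liners
    return "sonnet"      # full reasoning capacity for longer requests
```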
Conversation buffer — Last 10 turns in Redis with a 1-hour TTL. Follow-ups like "What about the bedroom lights?" work because Jarvis remembers the topic.
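A minimal sketch of that buffer, assuming a redis-py-style client (key names and helper functions here are illustrative):

```python
import json

MAX_TURNS = 10
TTL_SECONDS = 3600

def append_turn(r, chat_id: str, role: str, text: str) -> None:
    """Push a turn onto the buffer, trim to the last 10, reset the 1-hour TTL."""
    key = f"chat:{chat_id}:buffer"
    r.rpush(key, json.dumps({"role": role, "text": text}))
    r.ltrim(key, -MAX_TURNS, -1)   # keep only the most recent turns
    r.expire(key, TTL_SECONDS)     # buffer evaporates after an hour of silence

def load_turns(r, chat_id: str) -> list[dict]:
    """Read the buffer back for prompt assembly."""
    key = f"chat:{chat_id}:buffer"
    return [json.loads(x) for x in r.lrange(key, 0, -1)]
```

Because every `append_turn` resets the TTL, an active conversation never expires mid-thread; the buffer only disappears after a full hour of inactivity.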
An autonomous agent with access to your filesystem, email, and smart home needs guardrails. Jarvis uses a two-tier classification:
Classification examines input parameters, not just tool names. Reading ~/Documents is autonomous; writing to /etc/ always requires approval. Configured via risk_policy.yaml:
```yaml
# risk_policy.yaml
risk_overrides:
  mcp__filesystem__write_file: autonomous
  mcp__smart_home__call_service: require_approval
context_escalation:
  dangerous_paths: [/system, /etc, /usr/bin]
  sensitive_entities: [lock, alarm, security]
```
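Classification itself amounts to a two-step lookup: apply the per-tool override, then escalate if any parameter value touches a dangerous path or sensitive entity. The function below is a sketch of that logic, not the repo's implementation; in particular, the `autonomous` default for unlisted tools is my assumption here:

```python
POLICY = {
    "risk_overrides": {
        "mcp__filesystem__write_file": "autonomous",
        "mcp__smart_home__call_service": "require_approval",
    },
    "context_escalation": {
        "dangerous_paths": ["/system", "/etc", "/usr/bin"],
        "sensitive_entities": ["lock", "alarm", "security"],
    },
}

def classify(tool: str, params: dict, policy: dict = POLICY) -> str:
    # Tier 1: per-tool override (default assumed, not from the repo)
    risk = policy["risk_overrides"].get(tool, "autonomous")
    # Tier 2: inspect the actual parameters, not just the tool name
    blob = " ".join(str(v) for v in params.values()).lower()
    esc = policy["context_escalation"]
    if any(p in blob for p in esc["dangerous_paths"]) or any(
        e in blob for e in esc["sensitive_entities"]
    ):
        return "require_approval"
    return risk
```

This is why writing to `~/Documents` sails through while the same tool pointed at `/etc/` triggers an approval prompt.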
The combination gives Jarvis both working memory (always loaded into every prompt) and recall (searchable when needed).
Skills are Markdown files defining triggers, steps, and required tools. When Jarvis spots a repeatable pattern (3+ similar tool call sequences), it suggests creating a new skill — the system grows organically.
```markdown
## Daily News Briefing
Trigger: "news update", "daily news", "morning briefing"
Schedule: Every day at 9:00 AM EST

### Steps
1. Search for current top headlines
2. Search for tech industry news
3. Format into clean briefing with sections
4. Output ONLY the briefing — no meta-commentary
```
The scheduler parses a human-readable schedules.md into cron-like entries. Each job runs as an async task with max turns capped at 15 to enforce conciseness and suppress meta-commentary.
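As a sketch, parsing entries like `Every day at 9:00 AM: Daily News Briefing` takes a few lines of regex. The line format here is assumed for illustration and is simpler than what the real file supports:

```python
import re

# Assumed line format, e.g. "Every day at 9:00 AM: Daily News Briefing"
LINE = re.compile(r"Every day at (\d{1,2}):(\d{2}) (AM|PM): (.+)")

def parse_schedule(text: str) -> list[tuple[int, int, str]]:
    """Return (hour_24, minute, task) entries for each recognized line."""
    jobs = []
    for line in text.splitlines():
        m = LINE.match(line.strip())
        if not m:
            continue  # skip prose, headers, blank lines
        hour, minute, ampm, task = int(m[1]), int(m[2]), m[3], m[4]
        if ampm == "PM" and hour != 12:
            hour += 12
        elif ampm == "AM" and hour == 12:
            hour = 0
        jobs.append((hour, minute, task))
    return jobs
```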
Retry queue — Failed tasks enter a Redis sorted set with exponential backoff (30s → 60s → 120s). After 3 retries, the task is abandoned and a failure notification is sent. Retries also downgrade the model to Haiku.
API fallback — Rate-limit or downtime detected via error keywords → falls back to Claude Code CLI with a cached OAuth token.
Circuit breaker — Closed → open → half-open. After 5 consecutive failures, rejects calls for 60 seconds before allowing a probe.
```
openclad/
├── main.py                  # Entry point & orchestrator
├── pyproject.toml           # uv workspace root
├── compose.yaml             # Docker Compose (Redis)
├── packages/
│   ├── core/                # Agent, config, state, risk, retry, scheduler
│   ├── interfaces/          # Telegram bot, handlers, approval flow
│   └── mcp_servers/         # Job tracker, smart home, memory
├── data/
│   ├── agent_context/       # personality.md, skills/, schedules.md
│   ├── memory/chroma/       # Vector store persistence
│   └── secrets/             # OAuth credentials (gitignored)
├── configs/
│   ├── agent.yaml           # Runtime config
│   └── risk_policy.yaml     # Risk classification overrides
└── tests/                   # 20+ unit tests
```
The monorepo keeps things modular — core has no Telegram dependency, interfaces has no MCP dependency, and mcp_servers are standalone FastMCP processes. Want Discord instead of Telegram? Replace interfaces without touching core.
ChromaDB persists to `data/memory/chroma/` — just files on disk. Zero cost, zero round-trip latency, fully git-backupable.

```bash
# Clone
git clone https://github.com/foodlbs/openclad.git
cd openclad

# Configure
cp .env.example .env

# Install
uv sync --all-packages

# Start Redis
docker compose up -d redis

# Run
uv run python main.py
```
You'll need an Anthropic API key, a Telegram bot token (from @BotFather), and your Telegram chat ID. Optionally: an OpenAI key (embeddings), Google OAuth (Calendar/Gmail), and a Home Assistant URL + token.
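For reference, a filled-in `.env` might look something like this. The variable names below are illustrative; check `.env.example` in the repo for the real ones:

```bash
# Required (names are illustrative; see .env.example)
ANTHROPIC_API_KEY=sk-ant-...
TELEGRAM_BOT_TOKEN=123456:ABC...
TELEGRAM_CHAT_ID=123456789

# Optional integrations
OPENAI_API_KEY=sk-...                       # embeddings for vector memory
HASS_URL=http://homeassistant.local:8123    # Home Assistant
HASS_TOKEN=...
```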
Next on my list: parameterized skills, so a skill can read `Research {topic} with depth {shallow|deep}` instead of a flat instruction set.

Jarvis has been running 24/7 on my Mac for about two weeks — delivering morning news, tracking job applications, helping with research, and controlling my apartment, all through Telegram.
The total codebase is ~2,000 lines of Python across three packages, plus Markdown files for personality, skills, and schedules. The agent framework does the heavy lifting; the surrounding infrastructure — risk classification, retry logic, conversation persistence, skill routing — is what transforms a chatbot into an actual assistant.
Star the repo and check out the source at github.com/foodlbs/openclad. If you build something cool with it, I'd love to hear about it.