Building JARVIS: AI Agent for Raspberry Pi
On January 1st I woke up with a simple idea: what if I could talk to Claude through Telegram, and it could actually do things for me—not just answer questions?
What I Actually Built
Here's what JARVIS can do:
- Respond via voice or text — I send a voice message, it transcribes and responds. I can ask for audio replies too.
- Remember things across conversations — "Remember I like oat milk" actually persists.
- Run shell commands — Safe ones, in a sandbox.
- Publish blog posts — It can write and publish to my blog via API.
- Update itself — I can push code from my phone via Claude Code, and JARVIS pulls the changes and restarts. No SSH required.
All of this runs on 512MB of RAM, inside a locked-down Docker container with no host access—except for a single trigger file that signals "please update me."
The "Aha" Moment
The magic clicked when I realized Claude's tool-use isn't just for answering questions. It's for taking action.
When I message JARVIS, here's what happens:
Me: "Remember to buy coffee tomorrow"
↓
JARVIS receives message via Telegram
↓
Claude thinks: "I should use the remember tool"
↓
Tool executes: saves "buy coffee" to my todo list
↓
JARVIS: "Got it, I'll remind you about coffee"
Claude isn't just generating text. It's deciding which tool to use, calling it, reading the result, and responding. That's the difference between a chatbot and an agent.
For the Curious: How It Works
The core loop:
- Message comes in from Telegram
- Add it to conversation history (keeps last 50 messages for context)
- Send to Claude with a list of available tools
- Claude either responds with text OR requests a tool
- If tool requested → execute it → send result back to Claude → repeat
- Final response goes back to Telegram
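The steps above can be sketched in a few lines. This is a minimal, synchronous illustration and not the project's actual code: the `Message`, `ModelReply`, `Model`, and `Tool` types, the `handleMessage` function, and the `MAX_TOOL_ROUNDS` cap are all hypothetical names, and the model is a stub rather than a real Claude API call (which would be asynchronous).

```typescript
// Hypothetical shapes for illustration; the real SDK types differ.
type Message = { role: "user" | "assistant" | "tool"; content: string };
type ModelReply =
  | { kind: "text"; text: string }
  | { kind: "tool_use"; name: string; input: string };
type Model = (history: Message[], toolNames: string[]) => ModelReply;
type Tool = (input: string) => string;

const MAX_HISTORY = 50; // "keeps last 50 messages" from the post
const MAX_TOOL_ROUNDS = 8; // assumed safety cap, not from the post

function handleMessage(
  text: string,
  history: Message[],
  model: Model,
  tools: Map<string, Tool>,
): string {
  history.push({ role: "user", content: text });
  for (let round = 0; round < MAX_TOOL_ROUNDS; round++) {
    // Keep only the most recent messages as context.
    if (history.length > MAX_HISTORY) {
      history.splice(0, history.length - MAX_HISTORY);
    }
    const reply = model(history, [...tools.keys()]);
    if (reply.kind === "text") {
      history.push({ role: "assistant", content: reply.text });
      return reply.text; // final response goes back to Telegram
    }
    // Tool requested: execute it and feed the result back to the model.
    const tool = tools.get(reply.name);
    const result = tool ? tool(reply.input) : `unknown tool: ${reply.name}`;
    history.push({ role: "tool", content: result });
  }
  return "Stopped: too many tool calls.";
}

// Usage with a stub model that asks for one tool call, then answers.
const saved: string[] = [];
const tools = new Map<string, Tool>([
  ["remember", (input) => { saved.push(input); return "saved"; }],
]);
const stubModel: Model = (history) => {
  const last = history[history.length - 1];
  return last !== undefined && last.role === "tool"
    ? { kind: "text", text: "Got it, I'll remind you about coffee" }
    : { kind: "tool_use", name: "remember", input: "buy coffee" };
};
const history: Message[] = [];
const answer = handleMessage("Remember to buy coffee tomorrow", history, stubModel, tools);
```

The cap on tool rounds is a design choice worth keeping in any version of this loop: without it, a model that keeps requesting tools would never return a reply.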
The tools JARVIS has:
| Tool | What it does |
|---|---|
| `bash` | Runs safe shell commands (ls, cat, date, git; no rm or sudo) |
| `remember` / `recall` | Persistent memory across sessions |
| `create_note` | Publishes markdown to my blog |
| `self_update` | Pulls latest code from GitHub and restarts |
| `speak` | Sends audio response via text-to-speech |
The stack:
- Claude TypeScript Agent SDK for tool orchestration
- Telegram bot via grammY framework
- Groq for voice transcription (Whisper) and TTS (PlayAI)
- Docker for isolation
- Raspberry Pi Zero 2 W as the host (512MB RAM, ARM64)
Project structure:
jarvis/
├── src/
│   ├── index.ts              # Entry point
│   ├── agent/
│   │   ├── agent.ts          # Claude SDK integration
│   │   ├── memory.ts         # Conversation history
│   │   └── tools/            # bash, file, memory, update, speak...
│   ├── telegram/
│   │   ├── bot.ts            # grammY client
│   │   ├── handlers.ts       # Message routing
│   │   └── middleware.ts     # Auth + rate limiting
│   └── security/
│       └── whitelist.ts      # User validation
├── docker/
│   ├── Dockerfile            # Production (ARM64, hardened)
│   └── docker-compose.yml
└── scripts/
    ├── auto-update.sh        # Polls git + watches trigger
    └── jarvis-updater.service  # systemd unit
How it all connects:
┌──────────────────────────────────────────────────┐
│  Raspberry Pi Zero 2 W (512MB)                   │
│                                                  │
│  ┌─────────────────────────────────────────────┐ │
│  │  Docker Container                           │ │
│  │  ┌─────────┐  ┌─────────┐  ┌─────────────┐  │ │
│  │  │ grammY  │→ │  Agent  │→ │ Claude API  │  │ │
│  │  │Telegram │  │  Loop   │  │   (tools)   │  │ │
│  │  └─────────┘  └─────────┘  └─────────────┘  │ │
│  │       ↓            ↓                        │ │
│  │  ┌─────────────────────────────────────┐    │ │
│  │  │  Tools: bash, memory, speak, etc.   │    │ │
│  │  └─────────────────────────────────────┘    │ │
│  └──────────────────┬──────────────────────────┘ │
│                     │ writes                     │
│                     ▼                            │
│  ┌─────────────────────────────────────────────┐ │
│  │  /var/jarvis/update-trigger                 │ │
│  └─────────────────────────────────────────────┘ │
│                     ▲                            │
│                     │ watches                    │
│  ┌─────────────────────────────────────────────┐ │
│  │  jarvis-updater.service (systemd)           │ │
│  │  → git pull → docker compose restart        │ │
│  └─────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────┘
The Self-Update Trick
This is my favorite part. The bot runs in a sandboxed container with no host access—so how can it update itself?
The answer: a trigger file.
The container can write to exactly one location on the host: /var/jarvis/update-trigger. Meanwhile, a systemd service on the host watches that file.
Me: "Update yourself"
↓
Claude calls self_update() tool
↓
Tool writes timestamp to /var/jarvis/update-trigger
↓
Host's jarvis-updater.service detects the file change
↓
Runs: git pull → npm install → npm run build → docker compose restart
↓
JARVIS comes back online with the new code
The bot can't execute arbitrary code on the host. It can only say "please update me now." The host decides whether to honor that.
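Both sides of that handshake fit in a few lines. The sketch below is an assumption-heavy illustration, not the project's code: `selfUpdate`, `shouldUpdate`, and the `WriteFile` type are hypothetical names, the ISO 8601 timestamp format is a guess (the post only says "timestamp"), and the file writer is injected so the example runs without touching the filesystem. In a Node process you would pass something like `fs.writeFileSync`.

```typescript
// Hypothetical writer signature; in Node this could be fs.writeFileSync.
type WriteFile = (path: string, data: string) => void;

const TRIGGER_PATH = "/var/jarvis/update-trigger"; // path from the post

// Container side: the tool can only write a timestamp, nothing else.
function selfUpdate(writeFile: WriteFile, now: Date = new Date()): string {
  writeFile(TRIGGER_PATH, now.toISOString() + "\n"); // assumed format
  return "Update requested; the host decides when to apply it.";
}

// Host side: treat the trigger as new only if its content changed.
function shouldUpdate(triggerContent: string, lastHandled: string | null): boolean {
  const stamp = triggerContent.trim();
  return stamp.length > 0 && stamp !== lastHandled;
}

// Usage: capture the write in memory instead of hitting the disk.
const writes: Array<{ path: string; data: string }> = [];
const message = selfUpdate(
  (path, data) => { writes.push({ path, data }); },
  new Date("2025-01-01T00:00:00.000Z"),
);
const firstWrite = writes[0];
const isNew = firstWrite !== undefined && shouldUpdate(firstWrite.data, null);
```

Because the trigger carries no commands or arguments, a compromised container can at most cause an extra `git pull` of code that already lives on the remote.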
There's also a polling fallback. The updater checks GitHub every 60 seconds anyway:
# In auto-update.sh daemon mode
while true; do
  git fetch origin main
  # Quote both sides so an empty result is a comparison, not a syntax error
  if [ "$(git rev-parse HEAD)" != "$(git rev-parse origin/main)" ]; then
    git pull && docker compose up -d --build
  fi
  sleep "${POLL_INTERVAL:-60}"
done
So I have two paths:
- Push to GitHub → Pi polls and auto-deploys within 60 seconds
- Tell JARVIS to update → Instant trigger via Telegram
Why this matters: I can push code from my phone using Claude Code iOS, and JARVIS picks it up automatically. No SSH, no manual deploys, no laptop required.
Voice Flow
When I send a voice message:
- Telegram sends the audio file
- JARVIS downloads it and sends to Groq's Whisper API
- Transcribed text goes to Claude
- If I said "reply with voice," Claude uses the `speak` tool
- PlayAI generates audio, JARVIS sends it back
I added this feature while my car was driving itself. One voice message to JARVIS: "Add a feature so you can reply with audio when I ask." Worked first try.
Security (Because I'm Not Crazy)
Running an AI with shell access sounds terrifying. Here's how I locked it down:
Access Control:
- Whitelist only — Only my Telegram ID can talk to it
- Rate limited — Token bucket: 10 requests, 0.5/sec refill
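For readers unfamiliar with token buckets, here is a minimal sketch using the numbers above: a capacity of 10 requests that refills at 0.5 tokens per second. The `TokenBucket` class and its `tryTake` method are hypothetical names, and time is passed in explicitly (in milliseconds) so the behavior is easy to check; the project's middleware may be structured differently.

```typescript
class TokenBucket {
  private tokens: number;
  private lastMs: number;
  private readonly capacity: number;
  private readonly refillPerSec: number;

  constructor(capacity: number, refillPerSec: number, nowMs: number) {
    this.capacity = capacity;
    this.refillPerSec = refillPerSec;
    this.tokens = capacity; // start full
    this.lastMs = nowMs;
  }

  // Returns true if the request may proceed, false if it is rate limited.
  tryTake(nowMs: number): boolean {
    const elapsedSec = Math.max(0, nowMs - this.lastMs) / 1000;
    // Refill for the time that passed, never above capacity.
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec);
    this.lastMs = nowMs;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}

// Usage: a burst of 10 is allowed, the 11th is rejected,
// and one more request is allowed two seconds later.
const bucket = new TokenBucket(10, 0.5, 0);
const burst: boolean[] = [];
for (let i = 0; i < 11; i++) burst.push(bucket.tryTake(0));
const afterTwoSeconds = bucket.tryTake(2000);
const immediatelyAgain = bucket.tryTake(2000);
```

The practical effect is that short bursts feel instant, while sustained traffic is held to one request every two seconds.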
Sandboxed Execution:
- Command whitelist: only `echo`, `ls`, `pwd`, `cat`, `head`, `tail`, `wc`, `date`, `whoami`, `git`
- Path restricted: file operations limited to `/app`, `/tmp`, `/app/data`
- No network tools: no `curl`, `wget`, or anything that could exfiltrate data
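One way to enforce rules like these is sketched below. The command list and directories come from the post; everything else is an assumption for illustration, including the function names `isCommandAllowed` and `isPathAllowed` and the choice to reject any command containing shell metacharacters. The path check normalizes `.` and `..` by hand but does not resolve symlinks, which a real implementation would also need to consider.

```typescript
const ALLOWED_COMMANDS = new Set([
  "echo", "ls", "pwd", "cat", "head", "tail", "wc", "date", "whoami", "git",
]);
const ALLOWED_ROOTS = ["/app", "/tmp", "/app/data"];

// Assumed policy: refuse anything that could chain or substitute commands.
const SHELL_META = /[;&|`$<>(){}\\\n]/;

function isCommandAllowed(command: string): boolean {
  if (SHELL_META.test(command)) return false;
  const [program] = command.trim().split(/\s+/);
  return program !== undefined && ALLOWED_COMMANDS.has(program);
}

function isPathAllowed(path: string): boolean {
  if (!path.startsWith("/")) return false; // absolute paths only
  // Collapse "." and ".." so "/app/../etc" cannot sneak through.
  const parts: string[] = [];
  for (const segment of path.split("/")) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") parts.pop();
    else parts.push(segment);
  }
  const normalized = "/" + parts.join("/");
  return ALLOWED_ROOTS.some(
    (root) => normalized === root || normalized.startsWith(root + "/"),
  );
}

// Usage
const checks = {
  listing: isCommandAllowed("ls -la /app"),
  removal: isCommandAllowed("rm -rf /"),
  chained: isCommandAllowed("ls; rm -rf /"),
  dataFile: isPathAllowed("/app/data/notes.json"),
  traversal: isPathAllowed("/app/../etc/passwd"),
  lookalike: isPathAllowed("/application/secrets"),
};
```

Comparing against `root + "/"` rather than a bare prefix is what keeps `/application` from matching `/app`.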
Docker Hardening:
- Non-root user: container runs as UID 1001, not root
- Read-only filesystem: container can't modify its own code
- Dropped capabilities: minimal Linux capabilities
- Single write point: only `/var/jarvis/update-trigger` is writable on the host
Audit Trail:
- Every tool call logged with timestamps
- Every message logged (encrypted at rest)
The self-update mechanism is the only bridge between container and host, and it's one-way: the bot can request an update, but it can't control what code gets deployed. That always comes from the git remote.
The Memory System
Each user gets their own categorized memory:
/app/data/memory/{userId}/
├── todo/items.json
├── today/items.json
├── memory/items.json
└── posts/items.json
Example:
Me: "Remember that I prefer dark mode"
→ Saved to memory/items.json with timestamp
Me: "What are my preferences?"
→ Claude calls recall(), gets the data, responds naturally
JARVIS can also move items between categories—"move 'buy coffee' from todo to today."
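The memory operations described above (remember, recall, and moving an item between categories) can be modeled roughly as follows. This sketch keeps everything in memory so it runs anywhere; the `MemoryStore` class, the `Item` shape, and the `itemsPath` helper are hypothetical, and the real implementation presumably reads and writes the `items.json` files shown in the layout.

```typescript
type Category = "todo" | "today" | "memory" | "posts";
type Item = { text: string; savedAt: string };
type UserMemory = Record<Category, Item[]>;

// Where a category would live on disk, following the layout in the post.
function itemsPath(userId: string, category: Category): string {
  return `/app/data/memory/${userId}/${category}/items.json`;
}

class MemoryStore {
  private users = new Map<string, UserMemory>();

  private forUser(userId: string): UserMemory {
    let mem = this.users.get(userId);
    if (mem === undefined) {
      mem = { todo: [], today: [], memory: [], posts: [] };
      this.users.set(userId, mem);
    }
    return mem;
  }

  remember(userId: string, category: Category, text: string, now: Date = new Date()): Item {
    const item: Item = { text, savedAt: now.toISOString() };
    this.forUser(userId)[category].push(item);
    return item;
  }

  recall(userId: string, category: Category): Item[] {
    return [...this.forUser(userId)[category]]; // copy, so callers cannot mutate
  }

  // Moves the first item whose text matches; returns false if none is found.
  move(userId: string, text: string, from: Category, to: Category): boolean {
    const mem = this.forUser(userId);
    const index = mem[from].findIndex((item) => item.text === text);
    if (index === -1) return false;
    const [item] = mem[from].splice(index, 1);
    if (item === undefined) return false;
    mem[to].push(item);
    return true;
  }
}

// Usage
const store = new MemoryStore();
store.remember("42", "todo", "buy coffee");
const moved = store.move("42", "buy coffee", "todo", "today");
const todayItems = store.recall("42", "today");
```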
"Think Hard" Mode
Most messages use Claude Haiku (fast, cheap). But sometimes I need real reasoning:
Me: "think hard: analyze my spending patterns from last month"
The "think hard:" prefix switches to Claude Opus. Same tools, more intelligence.
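The routing itself can be as small as a prefix check. In this sketch, `routeMessage` and the `Route` type are hypothetical, `"haiku"` and `"opus"` are placeholder labels rather than real API model identifiers, and both the case-insensitive match and the stripping of the prefix before the prompt is sent are assumptions.

```typescript
type Route = { model: "haiku" | "opus"; prompt: string };

const THINK_HARD_PREFIX = "think hard:";

function routeMessage(text: string): Route {
  const trimmed = text.trim();
  if (trimmed.toLowerCase().startsWith(THINK_HARD_PREFIX)) {
    // Drop the prefix so the model only sees the actual request.
    return { model: "opus", prompt: trimmed.slice(THINK_HARD_PREFIX.length).trim() };
  }
  return { model: "haiku", prompt: trimmed };
}

// Usage
const hard = routeMessage("think hard: analyze my spending patterns from last month");
const quick = routeMessage("What's on my todo list?");
```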
What's Next
Right now JARVIS is reactive—it waits for me to message it. The next version will be proactive:
- Break down goals into subtasks
- Execute multi-step plans without supervision
- Learn from outcomes and adjust
The loop becomes: Observe → Plan → Act → Reflect → Learn → Repeat
That's when things get really interesting.
Try It Yourself
The full code is on GitHub:
https://github.com/acrosa/alfajor-raspberry
You'll need:
TELEGRAM_BOT_TOKEN=xxx
ANTHROPIC_API_KEY=xxx
TELEGRAM_ALLOWED_USERS=your_telegram_id
# Optional: GROQ_API_KEY for voice
# Optional: COLLECTED_NOTES_API_KEY for blog publishing
Then:
docker compose up
The Takeaway
The gap between "AI that answers questions" and "AI that does things" is smaller than I thought. Claude's tool-use, a Raspberry Pi, and a weekend of hacking got me an agent that:
- Lives in my pocket (via Telegram)
- Remembers everything I tell it
- Can update its own code
- Runs securely in a sandbox
We're at the very beginning of personal AI agents. This is a proof of concept, but the pattern scales. What would you build if your AI could take action?
Built with Claude, deployed on a Pi, controlled from a Tesla. The future is weird and I'm here for it.