Initial research: OpenClaw on AgentCore architecture

- Architecture comparison (OpenClaw daemon vs AgentCore serverless)
- Component compatibility analysis
- Fargate analysis
- AgentCore rebuild plan (Telegram, zero always-on compute)
- Memory strategy: AgentCore Memory + factbase as structured KB
- Serverless relay patterns per channel
- All open questions resolved
- OpenClaw feature delta March→May 2026
- Build phases and cost estimates
daniel
2026-05-04 08:28:52 -05:00
parent 4afa16a9cd
commit 0369a74ac1
13 changed files with 1876 additions and 1 deletions

feasibility-verdict.md

@@ -0,0 +1,109 @@
# Feasibility Verdict: OpenClaw on AgentCore
## TL;DR
**Can OpenClaw run on AgentCore Runtime?** Not as-is. The architectures are fundamentally different. OpenClaw is a **long-lived daemon** with persistent connections; AgentCore is a **serverless, request-driven container** with ephemeral sessions.
You could run a **subset** of OpenClaw on AgentCore — specifically, the agent reasoning/tool-calling core — but you'd need to completely redesign the messaging layer, state management, and scheduling. At that point, you're essentially building a new system that borrows OpenClaw's agent logic.
## The Core Tension
| Dimension | OpenClaw | AgentCore |
|---|---|---|
| Process model | Always-on daemon | Request-invoked container |
| Session lifetime | Indefinite | 8 hours max; terminated after 15 min idle |
| State | Local filesystem | Ephemeral (use AgentCore Memory) |
| Connections | Persistent WS to channels | No persistent outbound connections |
| Scheduling | Internal cron/heartbeat | None (use EventBridge) |
| User model | Single user, single host | Multi-user, multi-session |
## What Would Actually Work
### Realistic Architecture: "Split Gateway"
```
┌─────────────────────────────┐
│  Channel Relay              │  ← ECS Fargate (always-on)
│  WhatsApp, Discord, Slack,  │    Maintains persistent channel connections
│  Telegram, Signal, etc.     │    Translates messages → InvokeAgentRuntime
└──────────────┬──────────────┘
               │ InvokeAgentRuntime
┌─────────────────────────────┐
│  AgentCore Runtime          │  ← Serverless container (ARM64)
│  Pi-mono agent loop         │    Handles reasoning, tool calls, LLM calls
│  /invocations + /ping       │    Ephemeral per-session
└──────────────┬──────────────┘
    ┌──────────┼──────────┐
    ▼          ▼          ▼
┌────────┐ ┌────────┐ ┌──────────────┐
│   S3   │ │DynamoDB│ │  AgentCore   │
│ State  │ │Sessions│ │    Memory    │
└────────┘ └────────┘ └──────────────┘
```
**Channel Relay** (must be always-on):
- Runs on ECS Fargate, EC2, or similar
- Extracted from OpenClaw's gateway — just the channel plugins + message routing
- On inbound message → calls `InvokeAgentRuntime` with session ID
- On agent response → routes back to the correct channel
**Agent Container** (runs on AgentCore):
- Pi-mono agent runtime wrapped in an HTTP server
- Implements `/invocations`, `/ping`, optionally `/ws`
- Loads workspace files from S3 on session start
- Writes session state to DynamoDB/AgentCore Memory
- Makes LLM calls, web searches, etc.
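A bare-bones version of the HTTP wrapper might look like the sketch below, using only the Python standard library. Several things here are assumptions: `run_agent` is a hypothetical stand-in for the Pi-mono loop, the request and response fields (`prompt`, `reply`) are invented for illustration, and the `{"status": "Healthy"}` ping body and port 8080 reflect my reading of the AgentCore HTTP contract rather than anything verified in this research.

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def run_agent(prompt: str) -> str:
    """Hypothetical stand-in for the agent loop; a real build would call it here."""
    return f"echo: {prompt}"


class AgentHandler(BaseHTTPRequestHandler):
    def _send_json(self, status: int, body: dict) -> None:
        data = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def do_GET(self) -> None:
        # Health check endpoint.
        if self.path == "/ping":
            self._send_json(200, {"status": "Healthy"})
        else:
            self._send_json(404, {"error": "not found"})

    def do_POST(self) -> None:
        if self.path != "/invocations":
            self._send_json(404, {"error": "not found"})
            return
        length = int(self.headers.get("Content-Length", "0"))
        try:
            request = json.loads(self.rfile.read(length) or b"{}")
        except json.JSONDecodeError:
            request = None
        if not isinstance(request, dict):
            self._send_json(400, {"error": "expected a JSON object"})
            return
        self._send_json(200, {"reply": run_agent(request.get("prompt", ""))})

    def log_message(self, format, *args) -> None:
        pass  # keep the sketch quiet


def make_server(port: int = 8080) -> ThreadingHTTPServer:
    """Build (but do not start) the server; call serve_forever() to run it."""
    return ThreadingHTTPServer(("0.0.0.0", port), AgentHandler)
```

To run it locally, call `make_server().serve_forever()` and POST a JSON body to `/invocations`. Loading workspace files and persisting session state would hang off `run_agent`, which is where the real work lives.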
**External Scheduling** (EventBridge):
- Heartbeat: EventBridge rule every 30m → Lambda → InvokeAgentRuntime
- Cron: dynamic EventBridge rules managed via an API
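The heartbeat path could be sketched as follows. `rate_expression` builds the `rate(...)` schedule string EventBridge expects (singular unit when the value is 1), and `heartbeat_handler` is a Lambda-style function that turns the scheduled event into one agent invocation. The handler's keyword arguments, the `{"type": "heartbeat"}` payload, and the idea of passing the client in explicitly are all illustrative assumptions; `client` is again assumed to behave like `boto3.client("bedrock-agentcore")`.

```python
import json


def rate_expression(minutes: int) -> str:
    """Build an EventBridge rate() schedule expression for a whole number of minutes."""
    if minutes < 1:
        raise ValueError("minutes must be >= 1")
    unit = "minute" if minutes == 1 else "minutes"
    return f"rate({minutes} {unit})"


def heartbeat_handler(event, context, *, client, runtime_arn: str, session_id: str) -> dict:
    """Turn one scheduled EventBridge event into one agent invocation."""
    # Hypothetical payload; the agent container decides what a heartbeat means.
    payload = json.dumps({"type": "heartbeat", "scheduled_at": event.get("time")})
    client.invoke_agent_runtime(
        agentRuntimeArn=runtime_arn,
        runtimeSessionId=session_id,
        payload=payload.encode("utf-8"),
    )
    return {"invoked": True}
```

For the 30-minute heartbeat above, the rule's schedule would be `rate_expression(30)`. In a real Lambda, the client, ARN, and session ID would more likely come from module-level setup and environment variables than from keyword arguments.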
## Pros of This Approach
- **No infrastructure management** for the agent runtime (scaling, patching, etc.)
- **Cost-efficient**: CPU is billed only while the agent is actively computing (time spent waiting on I/O incurs no CPU charge)
- **Security isolation** — each session in its own microVM
- **Built-in auth** — SigV4/OAuth for agent endpoints
- **Built-in observability** — agent tracing, tool invocations
- **Bedrock-native** — direct IAM-role access to Bedrock models
## Cons / Risks
- **Massive refactoring effort** — this is not a "deploy and go" situation
- **Channel relay still needs always-on infra** — you don't eliminate ops completely
- **Session continuity is harder**: the 15-minute idle timeout means sessions are short-lived, so multi-turn conversations need careful state management
- **Cold start latency** — new sessions need to spin up a microVM
- **Loss of local features** — no macOS integrations, no device nodes, no browser extension relay
- **WhatsApp is the hardest** — Baileys requires persistent WebSocket + auth state; this alone might need a dedicated EC2 instance
- **Agent workspace semantics change** — MEMORY.md, daily notes, etc. need to be loaded from S3 and written back; the "personal local assistant" feel is lost
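To make the workspace round-trip concrete, here is a rough sketch of "load on session start, write back what changed". The `ObjectStore` interface, `load_workspace`, and `save_workspace` are hypothetical names; in practice the store would be a thin wrapper over S3 list/get/put calls. The sketch does not handle deletions or concurrent sessions writing the same key, both of which a real design would have to address.

```python
import hashlib
from pathlib import Path
from typing import Dict, List, Protocol


class ObjectStore(Protocol):
    """Hypothetical minimal storage interface (an S3 wrapper in practice)."""

    def list_keys(self, prefix: str) -> List[str]: ...
    def get(self, key: str) -> bytes: ...
    def put(self, key: str, data: bytes) -> None: ...


def _digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def load_workspace(store: ObjectStore, prefix: str, workdir: Path) -> Dict[str, str]:
    """Copy every object under `prefix` into `workdir`; return key -> content hash."""
    seen: Dict[str, str] = {}
    for key in store.list_keys(prefix):
        rel = key[len(prefix):]
        if not rel or rel.endswith("/"):
            continue  # skip the prefix itself and directory placeholders
        data = store.get(key)
        target = workdir / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(data)
        seen[key] = _digest(data)
    return seen


def save_workspace(store: ObjectStore, prefix: str, workdir: Path,
                   seen: Dict[str, str]) -> List[str]:
    """Upload files that are new or changed since load; return the keys written."""
    written: List[str] = []
    for path in sorted(p for p in workdir.rglob("*") if p.is_file()):
        key = prefix + path.relative_to(workdir).as_posix()
        data = path.read_bytes()
        digest = _digest(data)
        if seen.get(key) != digest:
            store.put(key, data)
            seen[key] = digest
            written.append(key)
    return written
```

Tracking content hashes keeps write-back cheap when a session only touches MEMORY.md or a single daily note, but every session still pays the full download on start.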
## Alternative: Don't Do This
Honestly? OpenClaw's design philosophy is **personal, local-first, always-on**. AgentCore's philosophy is **serverless, multi-user, request-driven**. These are almost diametrically opposed.
### Better alternatives for "OpenClaw on AWS":
1. **EC2/ECS + Docker** — Run the full OpenClaw gateway as a container on EC2 or ECS. This is what the existing Docker support does. You get the full feature set, persistent connections, local filesystem. Just add an EBS volume for state.
2. **Lightsail** — Cheap VPS that runs the gateway exactly as designed.
3. **ECS Fargate** — Run the gateway as a Fargate task with EFS for persistence. More serverless-y without the architecture mismatch.
### Where AgentCore _would_ make sense for OpenClaw:
- **Sub-agent offloading** — Run expensive coding agent tasks on AgentCore (Codex-style), keeping the gateway local but offloading heavy compute.
- **Tool hosting** — Host MCP tool servers on AgentCore (e.g., browser tool, code interpreter) and connect them to a local OpenClaw gateway.
- **Multi-user deployment** — If you wanted to offer OpenClaw-as-a-service to multiple users, AgentCore's per-session isolation would be valuable. But you'd still need the channel relay.
## Verdict
| Question | Answer |
|---|---|
| Can OpenClaw run on AgentCore? | Not without fundamental redesign |
| Is the agent core (reasoning loop) compatible? | Yes, with an HTTP wrapper |
| Can channel connections run on AgentCore? | No — need separate always-on infra |
| Is the effort worth it? | Probably not for personal use. Maybe for multi-user SaaS. |
| Best AWS hosting for OpenClaw today? | EC2 or ECS with Docker |
| Where does AgentCore add value? | Sub-agent compute, MCP tool hosting |
---
*Research completed 2026-03-10. Sources: OpenClaw docs (docs.openclaw.ai), OpenClaw GitHub (github.com/openclaw/openclaw), AWS AgentCore docs (docs.aws.amazon.com/bedrock-agentcore), installed OpenClaw v2026.3.2 source inspection.*