Initial research: OpenClaw on AgentCore architecture
- Architecture comparison (OpenClaw daemon vs AgentCore serverless)
- Component compatibility analysis
- Fargate analysis
- AgentCore rebuild plan (Telegram, zero always-on compute)
- Memory strategy: AgentCore Memory + factbase as structured KB
- Serverless relay patterns per channel
- All open questions resolved
- OpenClaw feature delta March→May 2026
- Build phases and cost estimates
feasibility-verdict.md
# Feasibility Verdict: OpenClaw on AgentCore

## TL;DR

**Can OpenClaw run on AgentCore Runtime?** Not as-is. The architectures are fundamentally different: OpenClaw is a **long-lived daemon** with persistent connections, while AgentCore is a **serverless, request-driven container** runtime with ephemeral sessions.

You could run a **subset** of OpenClaw on AgentCore (specifically, the agent reasoning/tool-calling core), but you would need to completely redesign the messaging layer, state management, and scheduling. At that point, you are essentially building a new system that borrows OpenClaw's agent logic.

## The Core Tension

| Dimension | OpenClaw | AgentCore |
|---|---|---|
| Process model | Always-on daemon | Request-invoked container |
| Session lifetime | Indefinite | 8 hours max, 15 min idle kill |
| State | Local filesystem | Ephemeral (use AgentCore Memory) |
| Connections | Persistent WS to channels | No persistent outbound connections |
| Scheduling | Internal cron/heartbeat | None (use EventBridge) |
| User model | Single user, single host | Multi-user, multi-session |
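
The session-lifetime row has a practical consequence: any caller has to decide whether an existing session can still be reused or whether state must be rehydrated into a fresh one. A minimal sketch of that check, assuming the 8-hour and 15-minute limits from the table are passed in as configuration (the function and constant names are hypothetical):

```python
from datetime import datetime, timedelta, timezone

# Limits copied from the comparison table; treat them as configuration,
# not as values guaranteed by the service.
MAX_SESSION_LIFETIME = timedelta(hours=8)
IDLE_TIMEOUT = timedelta(minutes=15)


def session_is_live(started_at: datetime, last_activity: datetime, now: datetime) -> bool:
    """Return True if a session could still be running under both limits."""
    within_lifetime = now - started_at < MAX_SESSION_LIFETIME
    within_idle = now - last_activity < IDLE_TIMEOUT
    return within_lifetime and within_idle
```

A caller that gets `False` should assume the container is gone and reload conversation state from durable storage before the next turn.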

## What Would Actually Work

### Realistic Architecture: "Split Gateway"

```
┌─────────────────────────────┐
│  Channel Relay              │  ← ECS Fargate (always-on)
│  WhatsApp, Discord, Slack,  │    Maintains persistent channel connections
│  Telegram, Signal, etc.     │    Translates messages → InvokeAgentRuntime
└──────────────┬──────────────┘
               │ InvokeAgentRuntime
               ▼
┌─────────────────────────────┐
│  AgentCore Runtime          │  ← Serverless container (ARM64)
│  Pi-mono agent loop         │    Handles reasoning, tool calls, LLM calls
│  /invocations + /ping       │    Ephemeral per-session
└──────────────┬──────────────┘
               │
    ┌──────────┼──────────┐
    ▼          ▼          ▼
┌────────┐ ┌────────┐ ┌──────────────┐
│   S3   │ │DynamoDB│ │  AgentCore   │
│ State  │ │Sessions│ │  Memory      │
└────────┘ └────────┘ └──────────────┘
```

**Channel Relay** (must be always-on):

- Runs on ECS Fargate, EC2, or similar
- Extracted from OpenClaw's gateway: only the channel plugins and message routing
- On inbound message → calls `InvokeAgentRuntime` with a session ID
- On agent response → routes back to the correct channel
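
The inbound half of the relay can be sketched as follows. This is an illustration under assumptions, not OpenClaw code: `client` is assumed to be `boto3.client("bedrock-agentcore")`, whose `invoke_agent_runtime` operation and parameter names should be verified against the current SDK, and the payload shape and both helper names are hypothetical. The session ID is derived deterministically so every message in one chat lands in the same session.

```python
import hashlib
import json


def build_session_id(channel: str, chat_id: str) -> str:
    """Derive a stable session ID so one chat always maps to one session.

    A SHA-256 hex digest is 64 characters, which comfortably covers
    services that enforce a minimum session-ID length.
    """
    return hashlib.sha256(f"{channel}:{chat_id}".encode("utf-8")).hexdigest()


def relay_inbound(client, agent_runtime_arn: str, channel: str, chat_id: str, text: str):
    """Forward one inbound channel message to the agent runtime.

    Assumption: `client` exposes invoke_agent_runtime with these keyword
    arguments, as boto3's bedrock-agentcore client is expected to.
    """
    payload = json.dumps({"channel": channel, "chat_id": chat_id, "text": text})
    return client.invoke_agent_runtime(
        agentRuntimeArn=agent_runtime_arn,
        runtimeSessionId=build_session_id(channel, chat_id),
        payload=payload.encode("utf-8"),
    )
```

Passing the client in as an argument keeps the relay testable without AWS credentials; a channel plugin would call `relay_inbound` from its message callback and route the returned response back to the originating chat.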

**Agent Container** (runs on AgentCore):

- Pi-mono agent runtime wrapped in an HTTP server
- Implements `/invocations`, `/ping`, optionally `/ws`
- Loads workspace files from S3 on session start
- Writes session state to DynamoDB/AgentCore Memory
- Makes LLM calls, web searches, etc.
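
The HTTP wrapper itself is small. The sketch below uses only the Python standard library and implements the two required routes named above; `run_agent_turn` is a hypothetical stand-in for the real agent loop, and the port (8080) and the `/ping` response body are assumptions to check against the AgentCore runtime contract.

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def run_agent_turn(request: dict) -> dict:
    """Hypothetical stand-in for the agent loop (reasoning, tools, LLM calls)."""
    return {"reply": f"echo: {request.get('text', '')}"}


class AgentHandler(BaseHTTPRequestHandler):
    def _send_json(self, status: int, body: dict) -> None:
        data = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def do_GET(self) -> None:
        if self.path == "/ping":
            # Health-check body shape is an assumption.
            self._send_json(200, {"status": "healthy"})
        else:
            self._send_json(404, {"error": "not found"})

    def do_POST(self) -> None:
        if self.path != "/invocations":
            self._send_json(404, {"error": "not found"})
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            request = json.loads(self.rfile.read(length) or b"{}")
        except json.JSONDecodeError:
            request = None
        if not isinstance(request, dict):
            self._send_json(400, {"error": "expected a JSON object"})
            return
        self._send_json(200, run_agent_turn(request))

    def log_message(self, format, *args) -> None:
        pass  # keep the sketch quiet; real code would log requests


def make_server(port: int = 8080) -> ThreadingHTTPServer:
    """Bind on all interfaces so the container platform can reach the server."""
    return ThreadingHTTPServer(("0.0.0.0", port), AgentHandler)
```

The container entrypoint would call `make_server().serve_forever()`. Loading workspace files and persisting session state would hang off `run_agent_turn`, which is why that function is left as a placeholder here.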

**External Scheduling** (EventBridge):

- Heartbeat: EventBridge rule every 30m → Lambda → InvokeAgentRuntime
- Cron: dynamic EventBridge rules managed via an API
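
The heartbeat path could look roughly like this. It is a sketch under assumptions: the rule name is hypothetical, the handler takes an injected client assumed to behave like `boto3.client("bedrock-agentcore")`, and the heartbeat payload shape is invented for illustration. The `rate(30 minutes)` schedule expression and the `time` field on scheduled events are standard EventBridge behavior.

```python
import json

# Keyword arguments for EventBridge's put_rule; the rule name is hypothetical.
HEARTBEAT_RULE = {
    "Name": "openclaw-heartbeat",
    "ScheduleExpression": "rate(30 minutes)",
    "State": "ENABLED",
}


def make_heartbeat_handler(client, agent_runtime_arn: str, session_id: str):
    """Build a Lambda-style handler that wakes the agent on each scheduled event.

    Assumption: `client` exposes invoke_agent_runtime with these keyword
    arguments; verify against the current SDK.
    """

    def handler(event, context):
        payload = json.dumps({"type": "heartbeat", "scheduled_at": event.get("time")})
        client.invoke_agent_runtime(
            agentRuntimeArn=agent_runtime_arn,
            runtimeSessionId=session_id,
            payload=payload.encode("utf-8"),
        )
        return {"ok": True}

    return handler
```

Creating the rule would be a call such as `boto3.client("events").put_rule(**HEARTBEAT_RULE)`, followed by `put_targets` to attach the Lambda. Dynamic cron entries would reuse the same pattern with one rule per job and a `cron(...)` expression in place of `rate(...)`.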

## Pros of This Approach

- **No infrastructure management** for the agent runtime (scaling, patching, etc.)
- **Cost-efficient**: pay only for active agent CPU time (I/O wait is free)
- **Security isolation**: each session runs in its own microVM
- **Built-in auth**: SigV4/OAuth for agent endpoints
- **Built-in observability**: agent tracing, tool invocations
- **Bedrock-native**: direct IAM-role access to Bedrock models

## Cons / Risks

- **Massive refactoring effort**: this is not a "deploy and go" situation
- **Channel relay still needs always-on infra**: you don't eliminate ops completely
- **Session continuity is harder**: the 15 min idle timeout means sessions are short-lived, so multi-turn conversations need careful state management
- **Cold start latency**: new sessions need to spin up a microVM
- **Loss of local features**: no macOS integrations, no device nodes, no browser extension relay
- **WhatsApp is the hardest**: Baileys requires a persistent WebSocket plus auth state; this alone might need a dedicated EC2 instance
- **Agent workspace semantics change**: MEMORY.md, daily notes, etc. need to be loaded from S3 and written back; the "personal local assistant" feel is lost
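
To make the workspace point concrete, here is a minimal sketch of the load-on-start and write-back cycle. The `s3` argument is assumed to be `boto3.client("s3")` (`get_object`, `put_object`, and `exceptions.NoSuchKey` are real client members); the file list, key layout, and function names are hypothetical.

```python
# Hypothetical subset of the workspace; daily notes would be handled the same way.
WORKSPACE_FILES = ("MEMORY.md",)


def load_workspace(s3, bucket: str, prefix: str, names=WORKSPACE_FILES) -> dict:
    """Fetch workspace files at session start; missing files load as empty text."""
    files = {}
    for name in names:
        try:
            obj = s3.get_object(Bucket=bucket, Key=f"{prefix}/{name}")
            files[name] = obj["Body"].read().decode("utf-8")
        except s3.exceptions.NoSuchKey:
            files[name] = ""
    return files


def save_workspace(s3, bucket: str, prefix: str, before: dict, after: dict) -> list:
    """Write back only the files whose content changed; return the names written."""
    written = []
    for name, text in after.items():
        if before.get(name) != text:
            s3.put_object(Bucket=bucket, Key=f"{prefix}/{name}", Body=text.encode("utf-8"))
            written.append(name)
    return written
```

Because a session can be killed after the idle timeout, write-back would need to happen at the end of every turn rather than at session shutdown, which is part of why the local-assistant feel is hard to preserve.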

## Alternative: Don't Do This

Honestly? OpenClaw's design philosophy is **personal, local-first, always-on**. AgentCore's philosophy is **serverless, multi-user, request-driven**. These are almost diametrically opposed.

### Better alternatives for "OpenClaw on AWS":

1. **EC2/ECS + Docker**: Run the full OpenClaw gateway as a container on EC2 or ECS. This is what the existing Docker support does. You get the full feature set, persistent connections, and a local filesystem. Just add an EBS volume for state.
2. **Lightsail**: A cheap VPS that runs the gateway exactly as designed.
3. **ECS Fargate**: Run the gateway as a Fargate task with EFS for persistence. More serverless-y without the architecture mismatch.

### Where AgentCore _would_ make sense for OpenClaw:

- **Sub-agent offloading**: Run expensive coding agent tasks on AgentCore (Codex-style), keeping the gateway local but offloading heavy compute.
- **Tool hosting**: Host MCP tool servers on AgentCore (e.g., browser tool, code interpreter) and connect them to a local OpenClaw gateway.
- **Multi-user deployment**: If you wanted to offer OpenClaw-as-a-service to multiple users, AgentCore's per-session isolation would be valuable. But you would still need the channel relay.

## Verdict

| Question | Answer |
|---|---|
| Can OpenClaw run on AgentCore? | Not without fundamental redesign |
| Is the agent core (reasoning loop) compatible? | Yes, with an HTTP wrapper |
| Can channel connections run on AgentCore? | No; they need separate always-on infra |
| Is the effort worth it? | Probably not for personal use. Maybe for multi-user SaaS. |
| Best AWS hosting for OpenClaw today? | EC2 or ECS with Docker |
| Where does AgentCore add value? | Sub-agent compute, MCP tool hosting |

---

*Research completed 2026-03-10. Sources: OpenClaw docs (docs.openclaw.ai), OpenClaw GitHub (github.com/openclaw/openclaw), AWS AgentCore docs (docs.aws.amazon.com/bedrock-agentcore), installed OpenClaw v2026.3.2 source inspection.*