Initial research: OpenClaw on AgentCore architecture

- Architecture comparison (OpenClaw daemon vs AgentCore serverless)
- Component compatibility analysis
- Fargate analysis
- AgentCore rebuild plan (Telegram, zero always-on compute)
- Memory strategy: AgentCore Memory + factbase as structured KB
- Serverless relay patterns per channel
- All open questions resolved
- OpenClaw feature delta March→May 2026
- Build phases and cost estimates
This commit is contained in:
daniel
2026-05-04 08:28:52 -05:00
parent 4afa16a9cd
commit 0369a74ac1
13 changed files with 1876 additions and 1 deletions

103
compatibility-analysis.md Normal file
View File

@@ -0,0 +1,103 @@
# Component-by-Component Compatibility Analysis
## 🔴 Incompatible (fundamental architecture mismatch)
### 1. Gateway (Long-Lived Daemon)
**OpenClaw**: Single always-on process that multiplexes WS server + HTTP + channel connections.
**AgentCore**: Container is invoked per-request/session, idle-killed at 15min, max 8hr.
**Verdict**: 🔴 **Cannot run as-is.** The Gateway assumes it's a long-running daemon. AgentCore will kill it after 15 minutes of no inbound invocations. Even if you keep it warm with pings, the 8-hour max session kills any long-running process.
### 2. Channel Connections (WhatsApp, Discord, Telegram, etc.)
**OpenClaw**: Maintains persistent outbound WebSocket/polling connections to each messaging service. WhatsApp (Baileys) requires a persistent session with auth state. Discord uses a persistent gateway WebSocket.
**AgentCore**: Ephemeral sessions. No persistent outbound connections survive session termination.
**Verdict**: 🔴 **Fundamentally incompatible.** WhatsApp's Baileys library maintains a stateful WebSocket with auth keys that must persist. Discord.js maintains a real-time gateway connection. These cannot be started/stopped per request — they need to be always-on or you lose the connection and have to re-auth.
### 3. Filesystem Persistence (Session Transcripts, Config, Workspace)
**OpenClaw**: Stores everything on local filesystem — session JSONL files, config, WhatsApp auth state, pairing store, agent workspace (MEMORY.md, daily notes), secrets.
**AgentCore**: Filesystem is ephemeral. Destroyed when session terminates.
**Verdict**: 🔴 **All persistent state must be externalized.** Every file that OpenClaw writes and expects to read later would need to be backed by S3, DynamoDB, or AgentCore Memory.
### 4. Shell Exec / PTY (Agent Tool)
**OpenClaw**: The `exec` tool spawns real shell processes, supports PTY for interactive commands, runs coding agents (Codex, Claude Code) as child processes.
**AgentCore**: Runs inside a container, so basic exec is possible, but:
- No host-level access
- Container filesystem is ephemeral
- PTY support depends on container config
- Long-running background processes die with session (15min idle / 8hr max)
**Verdict**: 🟡 **Partially possible.** Basic shell commands work in containers. But coding agent subprocesses that run for extended periods will be killed. No access to host-level tools.
### 5. Heartbeat System
**OpenClaw**: Gateway-driven timer that fires periodic agent turns in the main session (default every 30m). Relies on the gateway being continuously running.
**AgentCore**: No built-in periodic task scheduler. Container only runs when invoked.
**Verdict**: 🔴 **Must be offloaded.** Would need EventBridge Scheduler or a Lambda cron to periodically invoke the agent. The heartbeat logic itself could run, but the trigger mechanism must be external.
### 6. Cron Jobs
**OpenClaw**: Built-in cron scheduler (`croner` library) that runs inside the gateway process.
**AgentCore**: No built-in scheduler.
**Verdict**: 🔴 **Must be offloaded.** Same as heartbeat — EventBridge Scheduler → InvokeAgentRuntime.
---
## 🟡 Partially Compatible (needs adaptation)
### 7. Agent Runtime (Pi-Mono)
**OpenClaw**: Embedded pi-mono agent runtime with RPC-based tool calling, streaming, and multi-turn sessions.
**AgentCore**: Expects your container to implement `/invocations` (POST) and `/ping` (GET). Returns JSON or SSE.
**Verdict**: 🟡 **Core agent loop could work.** The pi-mono agent runtime could be wrapped behind the AgentCore HTTP contract. The tool-calling loop would need to be adapted to the HTTP request/response pattern instead of internal RPC. The main challenge is the session model (see below).
### 8. Session Management
**OpenClaw**: Sessions are long-lived, stored as JSONL, persist indefinitely. The "main session" for a user is eternal and accumulates context over days/weeks.
**AgentCore**: Sessions max 8 hours. State is ephemeral. Cross-session continuity requires AgentCore Memory or external storage.
**Verdict**: 🟡 **Needs redesign.** Could use AgentCore Memory for cross-session context, but OpenClaw's JSONL-based session model (with compaction) would need to be completely rewritten to use AgentCore Memory or a database.
### 9. Browser Tool (Playwright)
**OpenClaw**: Launches a managed Chromium instance via Playwright, controls via CDP.
**AgentCore**: Containers can run headless browsers, but:
- ARM64 container (Chromium ARM builds exist)
- Ephemeral — browser state lost on session end
- Network access needed for web browsing (VPC + NAT or default internet)
**Verdict**: 🟡 **Possible but fragile.** Headless Chrome can run in containers, but you need the right base image, enough memory, and network egress. Browser sessions won't persist. AgentCore actually has a built-in Browser Tool you might use instead.
### 10. Web Search / Web Fetch
**OpenClaw**: Makes HTTP requests to Brave Search API, fetches web pages.
**AgentCore**: Outbound HTTP works fine (with internet access via VPC+NAT or default).
**Verdict**: 🟢 **Compatible.** Just needs outbound internet access.
### 11. TTS (Text-to-Speech)
**OpenClaw**: Calls external TTS APIs (ElevenLabs, Edge TTS, etc.)
**AgentCore**: Outbound API calls work fine.
**Verdict**: 🟢 **Compatible.**
### 12. LLM Provider Calls
**OpenClaw**: Calls Bedrock, Anthropic, OpenAI, etc. via HTTP APIs.
**AgentCore**: Outbound API calls work. Bedrock calls can use IAM roles.
**Verdict**: 🟢 **Compatible, and potentially better.** Bedrock calls from AgentCore can use the execution role directly — no API keys needed.
---
## 🟢 Compatible (works as-is or with minimal changes)
### 13. Model Provider Abstraction
The model routing/failover/selection logic is pure application code — works anywhere.
### 14. System Prompt Construction
Building system prompts from workspace files is pure logic — works anywhere (but workspace files need external storage).
### 15. Context Engine (Compaction, Pruning)
Session compaction/pruning logic is algorithmic — works in any runtime. But needs adapted storage backend.
---
## Node / Platform Features (N/A for AgentCore)
These features are inherently tied to physical devices and cannot run on AgentCore:
- **macOS app** (menu bar, Voice Wake, Talk Mode)
- **iOS/Android nodes** (camera, screen, location, voice)
- **iMessage** (requires macOS + Messages.app)
- **Signal** (requires signal-cli subprocess)
- **Canvas** (visual workspace rendered on client device)
- **Bonjour discovery** (LAN-based device pairing)
- **WhatsApp QR pairing** (requires interactive QR scan flow)
These would remain on the user's device, connecting to an AgentCore-hosted agent via API.