Initial research: OpenClaw on AgentCore architecture

- Architecture comparison (OpenClaw daemon vs AgentCore serverless)
- Component compatibility analysis
- Fargate analysis
- AgentCore rebuild plan (Telegram, zero always-on compute)
- Memory strategy: AgentCore Memory + factbase as structured KB
- Serverless relay patterns per channel
- All open questions resolved
- OpenClaw feature delta March→May 2026
- Build phases and cost estimates
2026-05-04 08:28:52 -05:00
parent 4afa16a9cd
commit 0369a74ac1
13 changed files with 1876 additions and 1 deletions
--- a/compatibility-analysis.md
+++ b/compatibility-analysis.md
@@ -0,0 +1,103 @@
+# Component-by-Component Compatibility Analysis
+
+## 🔴 Incompatible (fundamental architecture mismatch)
+
+### 1. Gateway (Long-Lived Daemon)
+**OpenClaw**: Single always-on process that multiplexes WS server + HTTP + channel connections.
+**AgentCore**: Container is invoked per request/session, terminated after 15 min idle, and capped at an 8 hr session lifetime.
+**Verdict**: 🔴 **Cannot run as-is.** The Gateway assumes it is a long-running daemon, but AgentCore terminates a session after 15 minutes without inbound invocations. Even if pings keep it warm, the 8-hour session cap still kills any long-running process.
+
+### 2. Channel Connections (WhatsApp, Discord, Telegram, etc.)
+**OpenClaw**: Maintains persistent outbound WebSocket/polling connections to each messaging service. WhatsApp (Baileys) requires a persistent session with auth state. Discord uses a persistent gateway WebSocket.
+**AgentCore**: Ephemeral sessions. No persistent outbound connections survive session termination.
+**Verdict**: 🔴 **Fundamentally incompatible.** WhatsApp's Baileys library maintains a stateful WebSocket with auth keys that must persist. Discord.js maintains a real-time gateway connection. These cannot be started/stopped per request — they need to be always-on or you lose the connection and have to re-auth.
+
+### 3. Filesystem Persistence (Session Transcripts, Config, Workspace)
+**OpenClaw**: Stores everything on local filesystem — session JSONL files, config, WhatsApp auth state, pairing store, agent workspace (MEMORY.md, daily notes), secrets.
+**AgentCore**: Filesystem is ephemeral. Destroyed when session terminates.
+**Verdict**: 🔴 **All persistent state must be externalized.** Every file that OpenClaw writes and expects to read later would need to be backed by S3, DynamoDB, or AgentCore Memory.
+
+### 4. Shell Exec / PTY (Agent Tool)
+**OpenClaw**: The `exec` tool spawns real shell processes, supports PTY for interactive commands, runs coding agents (Codex, Claude Code) as child processes.
+**AgentCore**: Runs inside a container, so basic exec is possible, but:
+- No host-level access
+- Container filesystem is ephemeral
+- PTY support depends on container config
+- Long-running background processes die with the session (15 min idle / 8 hr max)
+**Verdict**: 🟡 **Partially possible.** Basic shell commands work in containers. But coding agent subprocesses that run for extended periods will be killed. No access to host-level tools.
+
+### 5. Heartbeat System
+**OpenClaw**: Gateway-driven timer that fires periodic agent turns in the main session (default every 30m). Relies on the gateway being continuously running.
+**AgentCore**: No built-in periodic task scheduler. Container only runs when invoked.
+**Verdict**: 🔴 **Must be offloaded.** An external trigger such as EventBridge Scheduler or a scheduled Lambda would need to invoke the agent periodically. The heartbeat logic itself could run inside the container, but the trigger mechanism must be external.
+
+### 6. Cron Jobs
+**OpenClaw**: Built-in cron scheduler (`croner` library) that runs inside the gateway process.
+**AgentCore**: No built-in scheduler.
+**Verdict**: 🔴 **Must be offloaded.** Same as heartbeat — EventBridge Scheduler → InvokeAgentRuntime.
+
+---
+
+## 🟡 Partially Compatible (needs adaptation)
+
+### 7. Agent Runtime (Pi-Mono)
+**OpenClaw**: Embedded pi-mono agent runtime with RPC-based tool calling, streaming, and multi-turn sessions.
+**AgentCore**: Expects your container to implement `/invocations` (POST) and `/ping` (GET). Returns JSON or SSE.
+**Verdict**: 🟡 **Core agent loop could work.** The pi-mono agent runtime could be wrapped behind the AgentCore HTTP contract. The tool-calling loop would need to be adapted to the HTTP request/response pattern instead of internal RPC. The main challenge is the session model (see item 8, Session Management).
+
+### 8. Session Management
+**OpenClaw**: Sessions are long-lived, stored as JSONL, persist indefinitely. The "main session" for a user is eternal and accumulates context over days/weeks.
+**AgentCore**: Sessions max 8 hours. State is ephemeral. Cross-session continuity requires AgentCore Memory or external storage.
+**Verdict**: 🟡 **Needs redesign.** Could use AgentCore Memory for cross-session context, but OpenClaw's JSONL-based session model (with compaction) would need to be completely rewritten to use AgentCore Memory or a database.
+
+### 9. Browser Tool (Playwright)
+**OpenClaw**: Launches a managed Chromium instance via Playwright, controls via CDP.
+**AgentCore**: Containers can run headless browsers, but:
+- The container is ARM64, so an ARM build of Chromium is required (such builds exist)
+- Ephemeral — browser state lost on session end
+- Network access needed for web browsing (VPC + NAT or default internet)
+**Verdict**: 🟡 **Possible but fragile.** Headless Chrome can run in containers, but you need the right base image, enough memory, and network egress. Browser sessions won't persist. AgentCore actually has a built-in Browser Tool you might use instead.
+
+### 10. Web Search / Web Fetch
+**OpenClaw**: Makes HTTP requests to Brave Search API, fetches web pages.
+**AgentCore**: Outbound HTTP works fine (with internet access via VPC+NAT or default).
+**Verdict**: 🟢 **Compatible.** Just needs outbound internet access.
+
+### 11. TTS (Text-to-Speech)
+**OpenClaw**: Calls external TTS APIs (ElevenLabs, Edge TTS, etc.).
+**AgentCore**: Outbound API calls work fine.
+**Verdict**: 🟢 **Compatible.**
+
+### 12. LLM Provider Calls
+**OpenClaw**: Calls Bedrock, Anthropic, OpenAI, etc. via HTTP APIs.
+**AgentCore**: Outbound API calls work. Bedrock calls can use IAM roles.
+**Verdict**: 🟢 **Compatible, and potentially better.** Bedrock calls from AgentCore can use the execution role directly — no API keys needed.
+
+---
+
+## 🟢 Compatible (works as-is or with minimal changes)
+
+### 13. Model Provider Abstraction
+The model routing/failover/selection logic is pure application code — works anywhere.
+
+### 14. System Prompt Construction
+Building system prompts from workspace files is pure logic — works anywhere (but workspace files need external storage).
+
+### 15. Context Engine (Compaction, Pruning)
+Session compaction/pruning logic is algorithmic and works in any runtime, but it needs an adapted storage backend.
+
+---
+
+## Node / Platform Features (N/A for AgentCore)
+
+These features are inherently tied to physical devices and cannot run on AgentCore:
+
+- **macOS app** (menu bar, Voice Wake, Talk Mode)
+- **iOS/Android nodes** (camera, screen, location, voice)
+- **iMessage** (requires macOS + Messages.app)
+- **Signal** (requires signal-cli subprocess)
+- **Canvas** (visual workspace rendered on client device)
+- **Bonjour discovery** (LAN-based device pairing)
+- **WhatsApp QR pairing** (requires interactive QR scan flow)
+
+These would remain on the user's device, connecting to an AgentCore-hosted agent via API.
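
Reviewer sketch (not part of the committed file): item 7 says the container must implement `/invocations` (POST) and `/ping` (GET) and return JSON or SSE. The block below is a minimal illustration of wrapping an agent loop behind those two endpoints, using only the Python standard library. Several details are assumptions rather than facts from the analysis: `run_agent_turn` is a hypothetical stand-in for the pi-mono loop, the `prompt` and `response` field names are invented for illustration, and the `{"status": "Healthy"}` body and port 8080 should be checked against the AgentCore Runtime service contract before relying on them.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def run_agent_turn(prompt: str) -> str:
    """Hypothetical stand-in for the wrapped agent loop (not an OpenClaw API)."""
    return f"echo: {prompt}"


def route(method: str, path: str, body: bytes) -> tuple[int, dict]:
    """Map one HTTP request onto the two endpoints named in item 7."""
    if method == "GET" and path == "/ping":
        # Health-check body shape is an assumption; verify against the contract.
        return 200, {"status": "Healthy"}
    if method == "POST" and path == "/invocations":
        try:
            payload = json.loads(body or b"{}")
        except ValueError:  # covers malformed JSON and undecodable bytes
            return 400, {"error": "body must be JSON"}
        if not isinstance(payload, dict):
            return 400, {"error": "body must be a JSON object"}
        # The "prompt" and "response" field names are illustrative assumptions.
        prompt = str(payload.get("prompt", ""))
        return 200, {"response": run_agent_turn(prompt)}
    return 404, {"error": "not found"}


class Handler(BaseHTTPRequestHandler):
    """Thin HTTP adapter; all decisions live in route() so they stay testable."""

    def _reply(self, status: int, payload: dict) -> None:
        data = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def do_GET(self) -> None:
        self._reply(*route("GET", self.path, b""))

    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        self._reply(*route("POST", self.path, self.rfile.read(length)))


if __name__ == "__main__":
    # Port 8080 is an assumption about what the runtime expects.
    HTTPServer(("0.0.0.0", 8080), Handler).serve_forever()
```

Keeping the routing in a plain function means the contract can be exercised without opening a socket, for example `route("POST", "/invocations", b'{"prompt": "hi"}')`. A real adapter would also need to load session state from external storage at the start of each invocation, since items 3 and 8 establish that nothing on the container filesystem survives the session.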