Initial research: OpenClaw on AgentCore architecture

- Architecture comparison (OpenClaw daemon vs AgentCore serverless) - Component compatibility analysis - Fargate analysis - AgentCore rebuild plan (Telegram, zero always-on compute) - Memory strategy: AgentCore Memory + factbase as structured KB - Serverless relay patterns per channel - All open questions resolved - OpenClaw feature delta March→May 2026 - Build phases and cost estimates
2026-05-04 08:28:52 -05:00
parent 4afa16a9cd
commit 0369a74ac1
13 changed files with 1876 additions and 1 deletions
--- a/agentcore-rebuild.md
+++ b/agentcore-rebuild.md
@@ -0,0 +1,308 @@
+# AgentCore Rebuild: What's Reusable vs What's New
+
+## The Premise
+
+Instead of porting OpenClaw's monolithic gateway to AgentCore, build an **AgentCore-native personal assistant** that reuses the best parts of OpenClaw's design. Think of it as "the OpenClaw experience, built on AWS primitives."
+
+---
+
+## ✅ Directly Reusable (copy/adapt, no rewrite)
+
+### 1. Personality & Workspace Files
+All of these are just text that gets injected into the system prompt:
+- `SOUL.md` — persona, tone, boundaries
+- `AGENTS.md` — operating instructions
+- `IDENTITY.md` — name, emoji, vibe
+- `USER.md` — human's profile
+- `TOOLS.md` — tool notes
+- `MEMORY.md` — long-term memory
+- `HEARTBEAT.md` — periodic task checklist
+- `BOOTSTRAP.md` — first-run ritual
+- `memory/YYYY-MM-DD.md` — daily notes
+
+**Where they live**: S3 bucket (one prefix per user/agent). Loaded on each invocation, written back after mutations.
+
+**Or**: AgentCore Memory for the conversational parts, S3 for the static persona files.
+
+### 2. System Prompt Construction Logic
+OpenClaw's context engine builds a rich system prompt from workspace files + tool descriptions + channel context + runtime metadata. This logic is pure string templating — framework-independent. Could be extracted and reused in a Strands/LangGraph agent or a custom Python agent.
+
+Key pieces:
+- Bootstrap file injection (with truncation markers for large files)
+- Runtime context block (timezone, channel, OS, model, capabilities)
+- Inbound message metadata (sender, group, timestamps)
+- Tool policy injection
+- Heartbeat/cron prompt variants
+- Reply tag system (`[[reply_to_current]]` etc.)
+
+### 3. Tool Definitions & Schemas
+OpenClaw defines ~20+ tools. The **schemas** (parameters, descriptions) can be translated to any framework's tool format:
+
+| OpenClaw Tool | AgentCore Equivalent | Notes |
+|---|---|---|
+| `read` | Custom tool (S3 or container FS) | Read files from workspace |
+| `write` | Custom tool (S3 or container FS) | Write workspace files |
+| `edit` | Custom tool | String replacement in files |
+| `exec` | Custom tool (container shell) | Limited vs OpenClaw's full PTY |
+| `web_search` | Custom tool or AgentCore Gateway | Brave API wrapper |
+| `web_fetch` | Custom tool | HTTP fetch + readability extraction |
+| `browser` | **AgentCore Browser Tool** | Built-in! Better than rolling your own |
+| `message` (Discord/Slack) | **AgentCore Gateway** → Slack/Discord tool | 1-click integrations available |
+| `memory_search` | **AgentCore Memory** | Semantic search over memory |
+| `tts` | Custom tool (ElevenLabs API call) | Straightforward |
+| `sessions_spawn` | AgentCore Runtime (A2A) | Agent-to-agent protocol |
+| `canvas` | Custom tool or drop | Needs client-side renderer |
+| `nodes` | Drop or custom | Requires physical devices |
+| `cron` | EventBridge Scheduler API | Via custom tool |
+
+### 4. Channel Integration Patterns
+The **logic** of how to handle group chats, mentions, reply threading, chunking, etc. is reusable even if the transport changes:
+- Group chat rules (when to speak, when to stay silent)
+- Reply tag system
+- Message chunking for long responses
+- Typing indicators / presence
+- Platform-specific formatting (Discord markdown vs WhatsApp formatting)
+
+### 5. Skills System (Concept)
+The skill discovery pattern (scan descriptions → load SKILL.md → follow instructions) works in any agent framework. The actual skill files are just markdown instructions.
+
+### 6. Heartbeat Logic
+The prompt, the HEARTBEAT_OK ack contract, the "check inbox/calendar/weather" patterns — all reusable. Just the **trigger mechanism** changes (EventBridge instead of internal timer).
+
+---
+
+## 🔧 Needs Rebuilding (new code, same concept)
+
+### 7. Agent Loop
+**OpenClaw**: pi-mono TypeScript (LLM call → tool parse → execute → loop)
+**AgentCore**: Use **Strands Agents** (Python, AWS-native) or **LangGraph** or custom.
+
+Strands is the path of least resistance on AgentCore — it's AWS-built, has native Bedrock integration, and the AgentCore SDK wraps it cleanly.
+
+```python
+from strands import Agent, tool
+from bedrock_agentcore.runtime import BedrockAgentCoreApp
+
+app = BedrockAgentCoreApp()
+
+@tool
+def read_workspace_file(path: str) -> str:
+    """Read a file from the agent workspace."""
+    # Load from S3
+    ...
+
+agent = Agent(tools=[read_workspace_file, ...])
+
+@app.entrypoint
+def main(payload):
+    prompt = payload.get("prompt")
+    system_prompt = build_system_prompt()  # ← Reuse OpenClaw's logic
+    return {"message": agent(prompt, system_prompt=system_prompt).message}
+```
+
+### 8. Session / Memory Management
+**OpenClaw**: JSONL files on disk, compaction algorithm
+**AgentCore**: 
+- **Short-term**: AgentCore Memory (per-session turn history)
+- **Long-term**: AgentCore Memory (extracted insights, preferences)
+- **Workspace files**: S3 (MEMORY.md, SOUL.md, etc.)
+- **Daily notes**: S3 or DynamoDB
+
+The compaction algorithm could be reimplemented as a post-session hook that summarizes and stores to long-term memory.
+
+### 9. Channel Relay Service
+This is the **biggest new piece**. Options:
+
+**Option A: Lightweight Fargate relay (recommended)**
+- Small ECS Fargate task running a stripped-down Node.js service
+- Maintains WS connections to WhatsApp/Discord/Telegram/Slack
+- On inbound message → `InvokeAgentRuntime` (AgentCore)
+- On agent response → route back to channel
+- ~$10-15/mo for a tiny Fargate task
+
+**Option B: Webhook-only channels**
+- Telegram (webhook mode), Slack (Events API), Discord (interactions endpoint)
+- API Gateway → Lambda → InvokeAgentRuntime
+- No always-on infra needed
+- But: no WhatsApp (Baileys needs persistent WS), no real-time Discord
+
+**Option C: AgentCore Gateway integrations**
+- AgentCore Gateway has 1-click Slack integration
+- Could handle Slack as a tool (agent → Slack) but not as an inbound channel
+- Would still need a relay for inbound messages
+
+**Option D: SNS/SQS fan-out**
+- Channels → SQS → Lambda → InvokeAgentRuntime
+- Good for decoupling, adds latency
+
+### 10. Scheduling (Heartbeat + Cron)
+**EventBridge Scheduler** replaces OpenClaw's internal cron:
+
+```
+EventBridge Rule (every 30m)
+  → Lambda function
+    → InvokeAgentRuntime(prompt="Read HEARTBEAT.md...")
+    → Route response to last channel
+```
+
+For dynamic cron (agent creates its own schedules), the agent needs a tool that creates/deletes EventBridge rules via the SDK.
+
+### 11. File Operations on Workspace
+**OpenClaw**: Direct filesystem read/write/edit
+**AgentCore**: S3-backed workspace
+
+```python
+@tool
+def write_file(path: str, content: str) -> str:
+    """Write content to a workspace file."""
+    s3.put_object(Bucket=WORKSPACE_BUCKET, Key=f"{agent_id}/{path}", Body=content)
+    return f"Written {len(content)} bytes to {path}"
+```
+
+The `edit` tool (find-and-replace) needs to download, modify, re-upload. Slightly more complex but straightforward.
+
+---
+
+## ❌ Must Drop or Significantly Redesign
+
+### 12. Shell Exec (Full PTY)
+AgentCore containers can run basic commands, but:
+- No persistent background processes (session dies)
+- No PTY for interactive CLIs
+- No host-level access
+- **Coding agent sub-processes** (Codex, Claude Code) don't fit the session model
+
+**Alternative**: Use AgentCore's A2A protocol to spin up specialized coding agent sessions, or use a separate Fargate task for heavy compute.
+
+### 13. Device Nodes (Camera, Screen, Location)
+Physical device features can't run on AgentCore. But:
+- iOS/Android/macOS nodes could connect to the channel relay
+- The relay could expose node commands as tools via AgentCore Gateway
+- This is a stretch — likely better to keep nodes connecting to a local gateway
+
+### 14. Browser Extension Relay
+The Chrome extension relay requires a persistent WS connection to the gateway. Would need the relay service to proxy this.
+
+### 15. Canvas / A2UI
+Requires a client-side renderer (macOS app, browser). The AgentCore agent could generate canvas commands, but delivery depends on having a client.
+
+---
+
+## Architecture: "OpenClaw Experience on AgentCore"
+
+```
+┌─────────────────────┐
+│  Channel Relay      │ ECS Fargate (tiny, always-on, ~$10/mo)
+│  WA/Discord/TG/Slack│ Inbound msgs → InvokeAgentRuntime
+│  + webhook endpoints │ Agent responses → route to channel
+└──────────┬──────────┘
+           │
+           ▼
+┌─────────────────────┐
+│  AgentCore Runtime  │ Serverless container (pay per use)
+│  Strands Agent      │
+│  ├─ System prompt   │ ← SOUL.md, AGENTS.md from S3
+│  ├─ Tools           │ ← read/write (S3), web_search, browser, message
+│  ├─ Memory          │ ← AgentCore Memory (short + long term)
+│  └─ LLM (Bedrock)   │ ← Direct IAM role access
+└──────────┬──────────┘
+           │
+     ┌─────┼─────┬──────────┐
+     ▼     ▼     ▼          ▼
+┌──────┐ ┌───┐ ┌─────────┐ ┌───────────────┐
+│  S3  │ │DDB│ │AgentCore│ │  AgentCore    │
+│ Work-│ │Cron│ │ Memory  │ │  Gateway      │
+│ space│ │State│ │         │ │  (MCP tools)  │
+└──────┘ └───┘ └─────────┘ └───────────────┘
+                               │
+                         ┌─────┼─────┐
+                         ▼     ▼     ▼
+                      Slack  Jira  Custom
+                      Tool   Tool  Lambda
+                                   Tools
+```
+
+### EventBridge Triggers
+```
+┌─────────────────────┐
+│  EventBridge        │
+│  ├─ Heartbeat (30m) │ → Lambda → InvokeAgentRuntime
+│  ├─ Cron jobs       │ → Lambda → InvokeAgentRuntime
+│  └─ Webhook events  │ → Lambda → InvokeAgentRuntime
+└─────────────────────┘
+```
+
+---
+
+## Effort Estimate (Ground-Up Build)
+
+| Component | Effort | Tech |
+|---|---|---|
+| Agent container (Strands + tools) | 2-3 weeks | Python, bedrock-agentcore SDK |
+| System prompt builder | 3-5 days | Port from OpenClaw TS → Python |
+| S3 workspace tools (read/write/edit) | 2-3 days | boto3 |
+| Web search + fetch tools | 2-3 days | Brave API, readability |
+| AgentCore Memory integration | 3-5 days | AgentCore Memory SDK |
+| Channel relay (Fargate) | 2-3 weeks | Node.js (reuse OpenClaw channel code) |
+| EventBridge scheduling | 2-3 days | CDK/Terraform |
+| Webhook ingress (API GW) | 2-3 days | CDK/Terraform |
+| AgentCore Gateway tools | 1 week | Slack, custom Lambda tools |
+| IaC (CDK or Terraform) | 1 week | Full stack deployment |
+| Testing + integration | 1-2 weeks | End-to-end |
+| **Total** | **~8-12 weeks** | For one person, part-time |
+
+---
+
+## Cost Estimate (Monthly)
+
+| Service | Cost |
+|---|---|
+| AgentCore Runtime (agent compute) | ~$5-15 (consumption-based, depends on usage) |
+| Channel relay (Fargate 0.25 vCPU) | ~$9 |
+| NAT Gateway | ~$3 |
+| S3 (workspace files) | ~$0.02 |
+| DynamoDB (cron state, metadata) | ~$1 |
+| AgentCore Memory | TBD (managed service pricing) |
+| EventBridge | ~$0.01 |
+| Bedrock LLM calls | $20-100+ (model-dependent, same as today) |
+| **Infrastructure total (ex-LLM)** | **~$20-30/mo** |
+
+Comparable to Fargate-only ($26/mo) but with better scaling characteristics and per-use billing for the agent compute.
+
+---
+
+## What You Gain vs Fargate-Only
+
+| Benefit | Fargate-Only | AgentCore Rebuild |
+|---|---|---|
+| Effort to deploy | Days | Months |
+| Full OpenClaw feature set | ✅ Yes | ~70% (no PTY, no nodes, no canvas) |
+| Per-invocation billing | ❌ Always-on | ✅ Pay per use |
+| Session isolation (security) | ❌ Shared process | ✅ Per-session microVM |
+| Built-in observability | ❌ DIY logging | ✅ AgentCore tracing |
+| Built-in auth (OAuth/SigV4) | ❌ DIY | ✅ AgentCore Identity |
+| Multi-user scalability | ❌ Single user | ✅ Designed for it |
+| AgentCore Memory | ❌ File-based | ✅ Managed, semantic |
+| AgentCore Gateway tools | ❌ N/A | ✅ 1-click Slack/Jira/etc |
+| Browser Tool | DIY Playwright | ✅ Built-in |
+| Future AWS integrations | Manual | ✅ First-class |
+
+---
+
+## When the Rebuild Makes Sense
+
+**Do it if**:
+- You want to offer this as a **multi-user product/service** (AgentCore's per-session isolation is purpose-built for this)
+- You want to go deep on **AWS-native agent infra** (Memory, Gateway, Identity, observability)
+- You're OK with a Python agent (Strands) instead of the pi-mono TypeScript stack
+- You want consumption-based billing instead of always-on compute
+- This is a learning/exploration project for AgentCore itself
+
+**Don't do it if**:
+- You just want your personal assistant running on AWS (Fargate in a day)
+- You need the full OpenClaw feature set (nodes, canvas, PTY, coding agents)
+- You want to stay on the OpenClaw upgrade path (community updates, new channels, skills)
+
+---
+
+*Added 2026-03-10*