agent-claw/agentcore-rebuild.md
daniel 0369a74ac1 Initial research: OpenClaw on AgentCore architecture
- Architecture comparison (OpenClaw daemon vs AgentCore serverless)
- Component compatibility analysis
- Fargate analysis
- AgentCore rebuild plan (Telegram, zero always-on compute)
- Memory strategy: AgentCore Memory + factbase as structured KB
- Serverless relay patterns per channel
- All open questions resolved
- OpenClaw feature delta March→May 2026
- Build phases and cost estimates
2026-05-04 08:28:52 -05:00


# AgentCore Rebuild: What's Reusable vs What's New
## The Premise
Instead of porting OpenClaw's monolithic gateway to AgentCore, build an **AgentCore-native personal assistant** that reuses the best parts of OpenClaw's design. Think of it as "the OpenClaw experience, built on AWS primitives."
---
## ✅ Directly Reusable (copy/adapt, no rewrite)
### 1. Personality & Workspace Files
All of these are just text that gets injected into the system prompt:
- `SOUL.md` — persona, tone, boundaries
- `AGENTS.md` — operating instructions
- `IDENTITY.md` — name, emoji, vibe
- `USER.md` — human's profile
- `TOOLS.md` — tool notes
- `MEMORY.md` — long-term memory
- `HEARTBEAT.md` — periodic task checklist
- `BOOTSTRAP.md` — first-run ritual
- `memory/YYYY-MM-DD.md` — daily notes
**Where they live**: S3 bucket (one prefix per user/agent). Loaded on each invocation, written back after mutations.
**Alternative**: keep the conversational parts (`MEMORY.md`, daily notes) in AgentCore Memory and only the static persona files in S3.
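A minimal sketch of the "load on each invocation" step, assuming an `<agent_id>/<filename>` key layout in the bucket. The function name `load_workspace`, the `PERSONA_FILES` tuple, and the key layout are illustrative choices, not part of OpenClaw or AgentCore. The S3 client is passed in so the function can run against `boto3.client("s3")` or a test double.

```python
from typing import Any

# Hypothetical default set; mirrors the static persona files listed above.
PERSONA_FILES = ("SOUL.md", "AGENTS.md", "IDENTITY.md", "USER.md", "TOOLS.md")


def load_workspace(s3: Any, bucket: str, agent_id: str,
                   filenames: tuple = PERSONA_FILES) -> dict:
    """Fetch workspace files for one agent; files that do not exist are skipped."""
    files = {}
    for name in filenames:
        try:
            obj = s3.get_object(Bucket=bucket, Key=f"{agent_id}/{name}")
        except s3.exceptions.NoSuchKey:
            continue  # e.g. BOOTSTRAP.md is absent after the first run
        files[name] = obj["Body"].read().decode("utf-8")
    return files
```

Skipping missing keys keeps a fresh workspace usable before every file has been created. Write-back after mutations would be the mirror image with `put_object`.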
### 2. System Prompt Construction Logic
OpenClaw's context engine builds a rich system prompt from workspace files, tool descriptions, channel context, and runtime metadata. This logic is pure string templating and framework-independent, so it can be extracted and reused in a Strands or LangGraph agent, or in a custom Python agent.
Key pieces:
- Bootstrap file injection (with truncation markers for large files)
- Runtime context block (timezone, channel, OS, model, capabilities)
- Inbound message metadata (sender, group, timestamps)
- Tool policy injection
- Heartbeat/cron prompt variants
- Reply tag system (`[[reply_to_current]]` etc.)
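The first two pieces (file injection with truncation markers, plus a runtime context block) could look roughly like this in Python. This is a sketch, not a port of OpenClaw's code: the function names, the section heading format, the truncation marker text, and the 4000-character budget are all assumptions.

```python
MAX_FILE_CHARS = 4000  # assumed per-file budget, not OpenClaw's actual limit


def inject_file(name: str, text: str, limit: int = MAX_FILE_CHARS) -> str:
    """Wrap one workspace file in a labeled section, truncating if too long."""
    if len(text) > limit:
        omitted = len(text) - limit
        text = text[:limit] + f"\n[... truncated {omitted} chars ...]"
    return f"## {name}\n{text}"


def render_system_prompt(files: dict, runtime: dict,
                         limit: int = MAX_FILE_CHARS) -> str:
    """Join bootstrap files and a runtime context block into one prompt."""
    sections = [inject_file(name, text, limit) for name, text in files.items()]
    context = "\n".join(f"- {key}: {value}" for key, value in runtime.items())
    sections.append(f"## Runtime\n{context}")
    return "\n\n".join(sections)
```

Called as `render_system_prompt({"SOUL.md": soul_text}, {"channel": "telegram", "timezone": "America/Chicago"})`, it returns a single string suitable for the agent's system prompt. The remaining pieces (tool policy, heartbeat variants, reply tags) would be further sections appended the same way.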
### 3. Tool Definitions & Schemas
OpenClaw defines 20+ tools. The **schemas** (parameters, descriptions) can be translated to any framework's tool format:
| OpenClaw Tool | AgentCore Equivalent | Notes |
|---|---|---|
| `read` | Custom tool (S3 or container FS) | Read files from workspace |
| `write` | Custom tool (S3 or container FS) | Write workspace files |
| `edit` | Custom tool | String replacement in files |
| `exec` | Custom tool (container shell) | Limited vs OpenClaw's full PTY |
| `web_search` | Custom tool or AgentCore Gateway | Brave API wrapper |
| `web_fetch` | Custom tool | HTTP fetch + readability extraction |
| `browser` | **AgentCore Browser Tool** | Built-in! Better than rolling your own |
| `message` (Discord/Slack) | **AgentCore Gateway** → Slack/Discord tool | 1-click integrations available |
| `memory_search` | **AgentCore Memory** | Semantic search over memory |
| `tts` | Custom tool (ElevenLabs API call) | Straightforward |
| `sessions_spawn` | AgentCore Runtime (A2A) | Agent-to-agent protocol |
| `canvas` | Custom tool or drop | Needs client-side renderer |
| `nodes` | Drop or custom | Requires physical devices |
| `cron` | EventBridge Scheduler API | Via custom tool |
### 4. Channel Integration Patterns
The **logic** of how to handle group chats, mentions, reply threading, chunking, etc. is reusable even if the transport changes:
- Group chat rules (when to speak, when to stay silent)
- Reply tag system
- Message chunking for long responses
- Typing indicators / presence
- Platform-specific formatting (Discord markdown vs WhatsApp formatting)
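Message chunking is the most mechanical of these and ports directly. A minimal sketch, assuming plain-text replies: it splits at the last newline or space inside the limit and falls back to a hard cut. The function name is illustrative. The default of 2000 matches Discord's per-message character limit; Telegram's is 4096.

```python
def chunk_message(text: str, limit: int = 2000) -> list:
    """Split text into chunks of at most `limit` chars, preferring line breaks."""
    if limit < 1:
        raise ValueError("limit must be positive")
    chunks = []
    remaining = text
    while len(remaining) > limit:
        # Prefer the last newline inside the window, then the last space.
        cut = remaining.rfind("\n", 0, limit + 1)
        if cut <= 0:
            cut = remaining.rfind(" ", 0, limit + 1)
        if cut <= 0:
            cut = limit  # no natural break: hard split
        piece = remaining[:cut].rstrip()
        if piece:
            chunks.append(piece)
        remaining = remaining[cut:].lstrip()
    if remaining:
        chunks.append(remaining)
    return chunks
```

This version does not try to keep markdown code fences balanced across chunks, which a production relay would likely need for Discord.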
### 5. Skills System (Concept)
The skill discovery pattern (scan descriptions → load SKILL.md → follow instructions) works in any agent framework. The actual skill files are just markdown instructions.
### 6. Heartbeat Logic
The prompt, the HEARTBEAT_OK ack contract, the "check inbox/calendar/weather" patterns — all reusable. Just the **trigger mechanism** changes (EventBridge instead of internal timer).
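On the relay side, the ack contract reduces to a small filter. The sketch below assumes the contract is "a reply consisting only of `HEARTBEAT_OK` means nothing needs attention, so send nothing to the user"; the function name is hypothetical and the exact matching rules in OpenClaw may differ.

```python
from typing import Optional

HEARTBEAT_ACK = "HEARTBEAT_OK"


def heartbeat_outbound(reply: str) -> Optional[str]:
    """Return the text to deliver to the user, or None if nothing needs sending."""
    text = reply.strip()
    if not text or text == HEARTBEAT_ACK:
        return None  # silent ack: the heartbeat ran and found nothing to report
    return text
```

The Lambda that EventBridge triggers would call this on the agent's response and route to the last channel only when the result is not `None`.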
---
## 🔧 Needs Rebuilding (new code, same concept)
### 7. Agent Loop
**OpenClaw**: pi-mono TypeScript (LLM call → tool parse → execute → loop)
**AgentCore**: Use **Strands Agents** (Python, AWS-native) or **LangGraph** or custom.
Strands is the path of least resistance on AgentCore — it's AWS-built, has native Bedrock integration, and the AgentCore SDK wraps it cleanly.
```python
from strands import Agent, tool
from bedrock_agentcore.runtime import BedrockAgentCoreApp

app = BedrockAgentCoreApp()


@tool
def read_workspace_file(path: str) -> str:
    """Read a file from the agent workspace."""
    # Load from S3
    ...


@app.entrypoint
def main(payload):
    prompt = payload.get("prompt")
    system_prompt = build_system_prompt()  # ← Reuse OpenClaw's logic (Python port, not shown)
    # Build the agent per invocation so the freshly built system prompt applies.
    agent = Agent(system_prompt=system_prompt, tools=[read_workspace_file])  # add further tools here
    return {"message": agent(prompt).message}


if __name__ == "__main__":
    app.run()
```
### 8. Session / Memory Management
**OpenClaw**: JSONL files on disk, compaction algorithm
**AgentCore**:
- **Short-term**: AgentCore Memory (per-session turn history)
- **Long-term**: AgentCore Memory (extracted insights, preferences)
- **Workspace files**: S3 (MEMORY.md, SOUL.md, etc.)
- **Daily notes**: S3 or DynamoDB
The compaction algorithm could be reimplemented as a post-session hook that summarizes and stores to long-term memory.
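One way to shape that hook, as a sketch only: keep the most recent turns verbatim and hand everything older to a summarizer. The function name, the `keep_recent` default, and the turn format are assumptions, and this is not OpenClaw's actual compaction algorithm. The summarizer is injected so it can be an LLM call in production and a stub in tests.

```python
from typing import Callable, Optional


def compact_history(turns: list, summarize: Callable[[list], str],
                    keep_recent: int = 10) -> tuple:
    """Split history into (summary of older turns or None, recent turns kept verbatim)."""
    if keep_recent < 0:
        raise ValueError("keep_recent must be non-negative")
    if len(turns) <= keep_recent:
        return None, list(turns)  # nothing old enough to compact
    split = len(turns) - keep_recent
    older, recent = turns[:split], turns[split:]
    summary: Optional[str] = summarize(older)
    return summary, recent
```

The returned summary is what would be written to long-term memory; the recent turns stay in the short-term session history.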
### 9. Channel Relay Service
This is the **biggest new piece**. Options:
**Option A: Lightweight Fargate relay (recommended)**
- Small ECS Fargate task running a stripped-down Node.js service
- Maintains WS connections to WhatsApp/Discord/Telegram/Slack
- On inbound message → `InvokeAgentRuntime` (AgentCore)
- On agent response → route back to channel
- ~$10-15/mo for a tiny Fargate task
**Option B: Webhook-only channels**
- Telegram (webhook mode), Slack (Events API), Discord (interactions endpoint)
- API Gateway → Lambda → InvokeAgentRuntime
- No always-on infra needed
- But: no WhatsApp (Baileys needs persistent WS), no real-time Discord
**Option C: AgentCore Gateway integrations**
- AgentCore Gateway has 1-click Slack integration
- Could handle Slack as a tool (agent → Slack) but not as an inbound channel
- Would still need a relay for inbound messages
**Option D: SNS/SQS fan-out**
- Channels → SQS → Lambda → InvokeAgentRuntime
- Good for decoupling, adds latency
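For Option B, the Lambda body is small. The sketch below handles a Telegram webhook update and forwards the text to the runtime. It assumes the agent accepts a `{"prompt": ...}` payload as in the agent loop above, and that the client is `boto3.client("bedrock-agentcore")`, passed in as a parameter. The helper names and the hash-based session ID scheme are illustrative; the hash is used because the API expects a fairly long session ID (verify the minimum length against the current API reference).

```python
import hashlib
import json
from typing import Any, Optional


def session_id_for(chat_id: int) -> str:
    """Derive a stable session ID per chat (SHA-256 hex digest, 64 chars)."""
    return hashlib.sha256(f"telegram:{chat_id}".encode("utf-8")).hexdigest()


def handle_update(update: dict, agentcore: Any, runtime_arn: str) -> Optional[str]:
    """Forward one Telegram update to the agent; return the raw response body."""
    message = update.get("message") or {}
    text = message.get("text")
    if not text:
        return None  # this sketch ignores non-text updates (photos, joins, edits)
    chat_id = message["chat"]["id"]
    response = agentcore.invoke_agent_runtime(
        agentRuntimeArn=runtime_arn,
        runtimeSessionId=session_id_for(chat_id),
        payload=json.dumps({"prompt": text}).encode("utf-8"),
    )
    # The response body is a stream; read it fully before replying to the chat.
    return response["response"].read().decode("utf-8")
```

Using one session ID per chat means each conversation maps to its own runtime session. Sending the reply back through the Telegram Bot API is left out here.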
### 10. Scheduling (Heartbeat + Cron)
**EventBridge Scheduler** replaces OpenClaw's internal cron:
```
EventBridge Scheduler schedule (every 30m)
  → Lambda function
    → InvokeAgentRuntime(prompt="Read HEARTBEAT.md...")
      → Route response to last channel
```
For dynamic cron (the agent creates its own schedules), the agent needs a tool that creates and deletes EventBridge Scheduler schedules via the SDK.
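A sketch of such a tool pair, assuming the client is `boto3.client("scheduler")` and that each schedule targets the same relay Lambda with a `{"prompt": ...}` input. The function names, the input shape, and the two ARNs are placeholders; the IAM role must allow EventBridge Scheduler to invoke the target.

```python
import json
from typing import Any


def create_agent_schedule(scheduler: Any, name: str, expression: str, prompt: str,
                          target_lambda_arn: str, role_arn: str) -> str:
    """Create a schedule that invokes the relay Lambda with a prompt."""
    scheduler.create_schedule(
        Name=name,
        ScheduleExpression=expression,       # e.g. "rate(30 minutes)" or "cron(0 9 * * ? *)"
        FlexibleTimeWindow={"Mode": "OFF"},  # fire at the scheduled time, no jitter window
        Target={
            "Arn": target_lambda_arn,
            "RoleArn": role_arn,             # role the scheduler assumes to invoke the target
            "Input": json.dumps({"prompt": prompt}),
        },
    )
    return f"Scheduled '{name}' ({expression})"


def delete_agent_schedule(scheduler: Any, name: str) -> str:
    """Remove a schedule previously created by the agent."""
    scheduler.delete_schedule(Name=name)
    return f"Deleted '{name}'"
```

Wrapped with the framework's tool decorator, these give the agent the equivalent of OpenClaw's `cron` tool. A DynamoDB record per schedule would let the agent list what it has created.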
### 11. File Operations on Workspace
**OpenClaw**: Direct filesystem read/write/edit
**AgentCore**: S3-backed workspace
```python
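import boto3
from strands import tool

s3 = boto3.client("s3")
WORKSPACE_BUCKET, AGENT_ID = "my-workspace-bucket", "my-agent"  # placeholders

@tool
def write_file(path: str, content: str) -> str:
    """Write content to a workspace file."""
    body = content.encode("utf-8")  # encode first so the byte count is accurate
    s3.put_object(Bucket=WORKSPACE_BUCKET, Key=f"{AGENT_ID}/{path}", Body=body)
    return f"Wrote {len(body)} bytes to {path}"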
```
The `edit` tool (find-and-replace) needs to download, modify, re-upload. Slightly more complex but straightforward.
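The download, modify, re-upload cycle could be sketched as follows. The "exactly one match" rule is an assumption about how a safe edit tool should behave, not a statement of OpenClaw's semantics, and the function name and signature are illustrative. The S3 client is injected (`boto3.client("s3")` in production).

```python
from typing import Any


def edit_file(s3: Any, bucket: str, key: str, old: str, new: str) -> str:
    """Replace exactly one occurrence of `old` with `new` in an S3 text object."""
    if not old:
        return "Edit failed: old text must not be empty"
    text = s3.get_object(Bucket=bucket, Key=key)["Body"].read().decode("utf-8")
    count = text.count(old)
    if count != 1:
        # Refuse missing or ambiguous matches rather than guess which one was meant.
        return f"Edit failed: expected 1 match for the old text, found {count}"
    updated = text.replace(old, new, 1)
    s3.put_object(Bucket=bucket, Key=key, Body=updated.encode("utf-8"))
    return f"Edited {key}"
```

Returning an error string instead of raising lets the model see the failure and retry with a more specific `old` value. The sketch has no protection against two sessions editing the same object concurrently.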
---
## ❌ Must Drop or Significantly Redesign
### 12. Shell Exec (Full PTY)
AgentCore containers can run basic commands, but:
- No persistent background processes (session dies)
- No PTY for interactive CLIs
- No host-level access
- **Coding agent sub-processes** (Codex, Claude Code) don't fit the session model
**Alternative**: Use AgentCore's A2A protocol to spin up specialized coding agent sessions, or use a separate Fargate task for heavy compute.
### 13. Device Nodes (Camera, Screen, Location)
Physical device features can't run on AgentCore. But:
- iOS/Android/macOS nodes could connect to the channel relay
- The relay could expose node commands as tools via AgentCore Gateway
- This is a stretch — likely better to keep nodes connecting to a local gateway
### 14. Browser Extension Relay
The Chrome extension relay requires a persistent WS connection to the gateway. Would need the relay service to proxy this.
### 15. Canvas / A2UI
Requires a client-side renderer (macOS app, browser). The AgentCore agent could generate canvas commands, but delivery depends on having a client.
---
## Architecture: "OpenClaw Experience on AgentCore"
```
┌─────────────────────┐
│ Channel Relay       │  ECS Fargate (tiny, always-on, ~$10/mo)
│ WA/Discord/TG/Slack │  Inbound msgs → InvokeAgentRuntime
│ + webhook endpoints │  Agent responses → route to channel
└──────────┬──────────┘
           │
           ▼
┌─────────────────────┐
│ AgentCore Runtime   │  Serverless container (pay per use)
│ Strands Agent       │
│ ├─ System prompt    │  ← SOUL.md, AGENTS.md from S3
│ ├─ Tools            │  ← read/write (S3), web_search, browser, message
│ ├─ Memory           │  ← AgentCore Memory (short + long term)
│ └─ LLM (Bedrock)    │  ← Direct IAM role access
└──────────┬──────────┘
           │
   ┌───────┴──┬────────────┬───────────────┐
   ▼          ▼            ▼               ▼
┌──────┐  ┌───────┐  ┌───────────┐  ┌─────────────┐
│ S3   │  │ DDB   │  │ AgentCore │  │ AgentCore   │
│ Work-│  │ Cron  │  │ Memory    │  │ Gateway     │
│ space│  │ State │  │           │  │ (MCP tools) │
└──────┘  └───────┘  └───────────┘  └──────┬──────┘
                                    ┌──────┼──────┐
                                    ▼      ▼      ▼
                                  Slack  Jira  Custom
                                  Tool   Tool  Lambda
                                               Tools
```
### EventBridge Triggers
```
┌─────────────────────┐
│ EventBridge         │
│ ├─ Heartbeat (30m)  │ → Lambda → InvokeAgentRuntime
│ ├─ Cron jobs        │ → Lambda → InvokeAgentRuntime
│ └─ Webhook events   │ → Lambda → InvokeAgentRuntime
└─────────────────────┘
```
---
## Effort Estimate (Ground-Up Build)
| Component | Effort | Tech |
|---|---|---|
| Agent container (Strands + tools) | 2-3 weeks | Python, bedrock-agentcore SDK |
| System prompt builder | 3-5 days | Port from OpenClaw TS → Python |
| S3 workspace tools (read/write/edit) | 2-3 days | boto3 |
| Web search + fetch tools | 2-3 days | Brave API, readability |
| AgentCore Memory integration | 3-5 days | AgentCore Memory SDK |
| Channel relay (Fargate) | 2-3 weeks | Node.js (reuse OpenClaw channel code) |
| EventBridge scheduling | 2-3 days | CDK/Terraform |
| Webhook ingress (API GW) | 2-3 days | CDK/Terraform |
| AgentCore Gateway tools | 1 week | Slack, custom Lambda tools |
| IaC (CDK or Terraform) | 1 week | Full stack deployment |
| Testing + integration | 1-2 weeks | End-to-end |
| **Total** | **~8-12 weeks** | For one person, part-time |
---
## Cost Estimate (Monthly)
| Service | Cost |
|---|---|
| AgentCore Runtime (agent compute) | ~$5-15 (consumption-based, depends on usage) |
| Channel relay (Fargate 0.25 vCPU) | ~$9 |
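| NAT Gateway | ~$3 (likely low: a dedicated NAT Gateway runs about $0.045/hour, roughly $33/mo in us-east-1 before data charges; running the relay in a public subnet avoids it) |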
| S3 (workspace files) | ~$0.02 |
| DynamoDB (cron state, metadata) | ~$1 |
| AgentCore Memory | TBD (managed service pricing) |
| EventBridge | ~$0.01 |
| Bedrock LLM calls | $20-100+ (model-dependent, same as today) |
| **Infrastructure total (ex-LLM)** | **~$20-30/mo** |
Comparable to Fargate-only ($26/mo) but with better scaling characteristics and per-use billing for the agent compute.
---
## What You Gain vs Fargate-Only
| Dimension | Fargate-Only | AgentCore Rebuild |
|---|---|---|
| Effort to deploy | Days | Months |
| Full OpenClaw feature set | ✅ Yes | ~70% (no PTY, no nodes, no canvas) |
| Per-invocation billing | ❌ Always-on | ✅ Pay per use |
| Session isolation (security) | ❌ Shared process | ✅ Per-session microVM |
| Built-in observability | ❌ DIY logging | ✅ AgentCore tracing |
| Built-in auth (OAuth/SigV4) | ❌ DIY | ✅ AgentCore Identity |
| Multi-user scalability | ❌ Single user | ✅ Designed for it |
| AgentCore Memory | ❌ File-based | ✅ Managed, semantic |
| AgentCore Gateway tools | ❌ N/A | ✅ 1-click Slack/Jira/etc |
| Browser Tool | DIY Playwright | ✅ Built-in |
| Future AWS integrations | Manual | ✅ First-class |
---
## When the Rebuild Makes Sense
**Do it if**:
- You want to offer this as a **multi-user product/service** (AgentCore's per-session isolation is purpose-built for this)
- You want to go deep on **AWS-native agent infra** (Memory, Gateway, Identity, observability)
- You're OK with a Python agent (Strands) instead of the pi-mono TypeScript stack
- You want consumption-based billing instead of always-on compute
- This is a learning/exploration project for AgentCore itself
**Don't do it if**:
- You just want your personal assistant running on AWS (Fargate in a day)
- You need the full OpenClaw feature set (nodes, canvas, PTY, coding agents)
- You want to stay on the OpenClaw upgrade path (community updates, new channels, skills)
---
*Added 2026-03-10*