Files

daniel 0369a74ac1 Initial research: OpenClaw on AgentCore architecture

- Architecture comparison (OpenClaw daemon vs AgentCore serverless)
- Component compatibility analysis
- Fargate analysis
- AgentCore rebuild plan (Telegram, zero always-on compute)
- Memory strategy: AgentCore Memory + factbase as structured KB
- Serverless relay patterns per channel
- All open questions resolved
- OpenClaw feature delta March→May 2026
- Build phases and cost estimates

2026-05-04 08:28:52 -05:00

13 KiB

Raw Blame History

AgentCore Rebuild: What's Reusable vs What's New

The Premise

Instead of porting OpenClaw's monolithic gateway to AgentCore, build an AgentCore-native personal assistant that reuses the best parts of OpenClaw's design. Think of it as "the OpenClaw experience, built on AWS primitives."

✅ Directly Reusable (copy/adapt, no rewrite)

1. Personality & Workspace Files

All of these are just text that gets injected into the system prompt:

SOUL.md — persona, tone, boundaries
AGENTS.md — operating instructions
IDENTITY.md — name, emoji, vibe
USER.md — human's profile
TOOLS.md — tool notes
MEMORY.md — long-term memory
HEARTBEAT.md — periodic task checklist
BOOTSTRAP.md — first-run ritual
memory/YYYY-MM-DD.md — daily notes

Where they live: S3 bucket (one prefix per user/agent). Loaded on each invocation, written back after mutations.

Or: AgentCore Memory for the conversational parts, S3 for the static persona files.

2. System Prompt Construction Logic

OpenClaw's context engine builds a rich system prompt from workspace files + tool descriptions + channel context + runtime metadata. This logic is pure string templating — framework-independent. Could be extracted and reused in a Strands/LangGraph agent or a custom Python agent.

Key pieces:

Bootstrap file injection (with truncation markers for large files)
Runtime context block (timezone, channel, OS, model, capabilities)
Inbound message metadata (sender, group, timestamps)
Tool policy injection
Heartbeat/cron prompt variants
Reply tag system ([[reply_to_current]] etc.)

3. Tool Definitions & Schemas

OpenClaw defines ~20+ tools. The schemas (parameters, descriptions) can be translated to any framework's tool format:

OpenClaw Tool	AgentCore Equivalent	Notes
`read`	Custom tool (S3 or container FS)	Read files from workspace
`write`	Custom tool (S3 or container FS)	Write workspace files
`edit`	Custom tool	String replacement in files
`exec`	Custom tool (container shell)	Limited vs OpenClaw's full PTY
`web_search`	Custom tool or AgentCore Gateway	Brave API wrapper
`web_fetch`	Custom tool	HTTP fetch + readability extraction
`browser`	AgentCore Browser Tool	Built-in! Better than rolling your own
`message` (Discord/Slack)	AgentCore Gateway → Slack/Discord tool	1-click integrations available
`memory_search`	AgentCore Memory	Semantic search over memory
`tts`	Custom tool (ElevenLabs API call)	Straightforward
`sessions_spawn`	AgentCore Runtime (A2A)	Agent-to-agent protocol
`canvas`	Custom tool or drop	Needs client-side renderer
`nodes`	Drop or custom	Requires physical devices
`cron`	EventBridge Scheduler API	Via custom tool

4. Channel Integration Patterns

The logic of how to handle group chats, mentions, reply threading, chunking, etc. is reusable even if the transport changes:

Group chat rules (when to speak, when to stay silent)
Reply tag system
Message chunking for long responses
Typing indicators / presence
Platform-specific formatting (Discord markdown vs WhatsApp formatting)

5. Skills System (Concept)

The skill discovery pattern (scan descriptions → load SKILL.md → follow instructions) works in any agent framework. The actual skill files are just markdown instructions.

6. Heartbeat Logic

The prompt, the HEARTBEAT_OK ack contract, the "check inbox/calendar/weather" patterns — all reusable. Just the trigger mechanism changes (EventBridge instead of internal timer).

🔧 Needs Rebuilding (new code, same concept)

7. Agent Loop

OpenClaw: pi-mono TypeScript (LLM call → tool parse → execute → loop) AgentCore: Use Strands Agents (Python, AWS-native) or LangGraph or custom.

Strands is the path of least resistance on AgentCore — it's AWS-built, has native Bedrock integration, and the AgentCore SDK wraps it cleanly.

from strands import Agent, tool
from bedrock_agentcore.runtime import BedrockAgentCoreApp

app = BedrockAgentCoreApp()

@tool
def read_workspace_file(path: str) -> str:
    """Read a file from the agent workspace."""
    # Load from S3
    ...

agent = Agent(tools=[read_workspace_file, ...])

@app.entrypoint
def main(payload):
    prompt = payload.get("prompt")
    system_prompt = build_system_prompt()  # ← Reuse OpenClaw's logic
    return {"message": agent(prompt, system_prompt=system_prompt).message}

8. Session / Memory Management

OpenClaw: JSONL files on disk, compaction algorithm AgentCore:

Short-term: AgentCore Memory (per-session turn history)
Long-term: AgentCore Memory (extracted insights, preferences)
Workspace files: S3 (MEMORY.md, SOUL.md, etc.)
Daily notes: S3 or DynamoDB

The compaction algorithm could be reimplemented as a post-session hook that summarizes and stores to long-term memory.

9. Channel Relay Service

This is the biggest new piece. Options:

Option A: Lightweight Fargate relay (recommended)

Small ECS Fargate task running a stripped-down Node.js service
Maintains WS connections to WhatsApp/Discord/Telegram/Slack
On inbound message → InvokeAgentRuntime (AgentCore)
On agent response → route back to channel
~$10-15/mo for a tiny Fargate task

Option B: Webhook-only channels

Telegram (webhook mode), Slack (Events API), Discord (interactions endpoint)
API Gateway → Lambda → InvokeAgentRuntime
No always-on infra needed
But: no WhatsApp (Baileys needs persistent WS), no real-time Discord

Option C: AgentCore Gateway integrations

AgentCore Gateway has 1-click Slack integration
Could handle Slack as a tool (agent → Slack) but not as an inbound channel
Would still need a relay for inbound messages

Option D: SNS/SQS fan-out

Channels → SQS → Lambda → InvokeAgentRuntime
Good for decoupling, adds latency

10. Scheduling (Heartbeat + Cron)

EventBridge Scheduler replaces OpenClaw's internal cron:

EventBridge Rule (every 30m)
  → Lambda function
    → InvokeAgentRuntime(prompt="Read HEARTBEAT.md...")
    → Route response to last channel

For dynamic cron (agent creates its own schedules), the agent needs a tool that creates/deletes EventBridge rules via the SDK.

11. File Operations on Workspace

OpenClaw: Direct filesystem read/write/edit AgentCore: S3-backed workspace

@tool
def write_file(path: str, content: str) -> str:
    """Write content to a workspace file."""
    s3.put_object(Bucket=WORKSPACE_BUCKET, Key=f"{agent_id}/{path}", Body=content)
    return f"Written {len(content)} bytes to {path}"

The edit tool (find-and-replace) needs to download, modify, re-upload. Slightly more complex but straightforward.

❌ Must Drop or Significantly Redesign

12. Shell Exec (Full PTY)

AgentCore containers can run basic commands, but:

No persistent background processes (session dies)
No PTY for interactive CLIs
No host-level access
Coding agent sub-processes (Codex, Claude Code) don't fit the session model

Alternative: Use AgentCore's A2A protocol to spin up specialized coding agent sessions, or use a separate Fargate task for heavy compute.

13. Device Nodes (Camera, Screen, Location)

Physical device features can't run on AgentCore. But:

iOS/Android/macOS nodes could connect to the channel relay
The relay could expose node commands as tools via AgentCore Gateway
This is a stretch — likely better to keep nodes connecting to a local gateway

14. Browser Extension Relay

The Chrome extension relay requires a persistent WS connection to the gateway. Would need the relay service to proxy this.

15. Canvas / A2UI

Requires a client-side renderer (macOS app, browser). The AgentCore agent could generate canvas commands, but delivery depends on having a client.

Architecture: "OpenClaw Experience on AgentCore"

┌─────────────────────┐
│  Channel Relay      │ ECS Fargate (tiny, always-on, ~$10/mo)
│  WA/Discord/TG/Slack│ Inbound msgs → InvokeAgentRuntime
│  + webhook endpoints │ Agent responses → route to channel
└──────────┬──────────┘
           │
           ▼
┌─────────────────────┐
│  AgentCore Runtime  │ Serverless container (pay per use)
│  Strands Agent      │
│  ├─ System prompt   │ ← SOUL.md, AGENTS.md from S3
│  ├─ Tools           │ ← read/write (S3), web_search, browser, message
│  ├─ Memory          │ ← AgentCore Memory (short + long term)
│  └─ LLM (Bedrock)   │ ← Direct IAM role access
└──────────┬──────────┘
           │
     ┌─────┼─────┬──────────┐
     ▼     ▼     ▼          ▼
┌──────┐ ┌───┐ ┌─────────┐ ┌───────────────┐
│  S3  │ │DDB│ │AgentCore│ │  AgentCore    │
│ Work-│ │Cron│ │ Memory  │ │  Gateway      │
│ space│ │State│ │         │ │  (MCP tools)  │
└──────┘ └───┘ └─────────┘ └───────────────┘
                               │
                         ┌─────┼─────┐
                         ▼     ▼     ▼
                      Slack  Jira  Custom
                      Tool   Tool  Lambda
                                   Tools

EventBridge Triggers

┌─────────────────────┐
│  EventBridge        │
│  ├─ Heartbeat (30m) │ → Lambda → InvokeAgentRuntime
│  ├─ Cron jobs       │ → Lambda → InvokeAgentRuntime
│  └─ Webhook events  │ → Lambda → InvokeAgentRuntime
└─────────────────────┘

Effort Estimate (Ground-Up Build)

Component	Effort	Tech
Agent container (Strands + tools)	2-3 weeks	Python, bedrock-agentcore SDK
System prompt builder	3-5 days	Port from OpenClaw TS → Python
S3 workspace tools (read/write/edit)	2-3 days	boto3
Web search + fetch tools	2-3 days	Brave API, readability
AgentCore Memory integration	3-5 days	AgentCore Memory SDK
Channel relay (Fargate)	2-3 weeks	Node.js (reuse OpenClaw channel code)
EventBridge scheduling	2-3 days	CDK/Terraform
Webhook ingress (API GW)	2-3 days	CDK/Terraform
AgentCore Gateway tools	1 week	Slack, custom Lambda tools
IaC (CDK or Terraform)	1 week	Full stack deployment
Testing + integration	1-2 weeks	End-to-end
Total	~8-12 weeks	For one person, part-time

Cost Estimate (Monthly)

Service	Cost
AgentCore Runtime (agent compute)	~$5-15 (consumption-based, depends on usage)
Channel relay (Fargate 0.25 vCPU)	~$9
NAT Gateway	~$3
S3 (workspace files)	~$0.02
DynamoDB (cron state, metadata)	~$1
AgentCore Memory	TBD (managed service pricing)
EventBridge	~$0.01
Bedrock LLM calls	$20-100+ (model-dependent, same as today)
Infrastructure total (ex-LLM)	~$20-30/mo

Comparable to Fargate-only ($26/mo) but with better scaling characteristics and per-use billing for the agent compute.

What You Gain vs Fargate-Only

Benefit	Fargate-Only	AgentCore Rebuild
Effort to deploy	Days	Months
Full OpenClaw feature set	✅ Yes	~70% (no PTY, no nodes, no canvas)
Per-invocation billing	❌ Always-on	✅ Pay per use
Session isolation (security)	❌ Shared process	✅ Per-session microVM
Built-in observability	❌ DIY logging	✅ AgentCore tracing
Built-in auth (OAuth/SigV4)	❌ DIY	✅ AgentCore Identity
Multi-user scalability	❌ Single user	✅ Designed for it
AgentCore Memory	❌ File-based	✅ Managed, semantic
AgentCore Gateway tools	❌ N/A	✅ 1-click Slack/Jira/etc
Browser Tool	DIY Playwright	✅ Built-in
Future AWS integrations	Manual	✅ First-class

When the Rebuild Makes Sense

Do it if:

You want to offer this as a multi-user product/service (AgentCore's per-session isolation is purpose-built for this)
You want to go deep on AWS-native agent infra (Memory, Gateway, Identity, observability)
You're OK with a Python agent (Strands) instead of the pi-mono TypeScript stack
You want consumption-based billing instead of always-on compute
This is a learning/exploration project for AgentCore itself

Don't do it if:

You just want your personal assistant running on AWS (Fargate in a day)
You need the full OpenClaw feature set (nodes, canvas, PTY, coding agents)
You want to stay on the OpenClaw upgrade path (community updates, new channels, skills)

Added 2026-03-10

13 KiB Raw Blame History