Files
agent-claw/agentcore-rebuild.md
daniel 0369a74ac1 Initial research: OpenClaw on AgentCore architecture
- Architecture comparison (OpenClaw daemon vs AgentCore serverless)
- Component compatibility analysis
- Fargate analysis
- AgentCore rebuild plan (Telegram, zero always-on compute)
- Memory strategy: AgentCore Memory + factbase as structured KB
- Serverless relay patterns per channel
- All open questions resolved
- OpenClaw feature delta March→May 2026
- Build phases and cost estimates
2026-05-04 08:28:52 -05:00

13 KiB

AgentCore Rebuild: What's Reusable vs What's New

The Premise

Instead of porting OpenClaw's monolithic gateway to AgentCore, build an AgentCore-native personal assistant that reuses the best parts of OpenClaw's design. Think of it as "the OpenClaw experience, built on AWS primitives."


Directly Reusable (copy/adapt, no rewrite)

1. Personality & Workspace Files

All of these are just text that gets injected into the system prompt:

  • SOUL.md — persona, tone, boundaries
  • AGENTS.md — operating instructions
  • IDENTITY.md — name, emoji, vibe
  • USER.md — human's profile
  • TOOLS.md — tool notes
  • MEMORY.md — long-term memory
  • HEARTBEAT.md — periodic task checklist
  • BOOTSTRAP.md — first-run ritual
  • memory/YYYY-MM-DD.md — daily notes

Where they live: S3 bucket (one prefix per user/agent). Loaded on each invocation, written back after mutations.

Or: AgentCore Memory for the conversational parts, S3 for the static persona files.

2. System Prompt Construction Logic

OpenClaw's context engine builds a rich system prompt from workspace files + tool descriptions + channel context + runtime metadata. This logic is pure string templating — framework-independent. Could be extracted and reused in a Strands/LangGraph agent or a custom Python agent.

Key pieces:

  • Bootstrap file injection (with truncation markers for large files)
  • Runtime context block (timezone, channel, OS, model, capabilities)
  • Inbound message metadata (sender, group, timestamps)
  • Tool policy injection
  • Heartbeat/cron prompt variants
  • Reply tag system ([[reply_to_current]] etc.)

3. Tool Definitions & Schemas

OpenClaw defines ~20+ tools. The schemas (parameters, descriptions) can be translated to any framework's tool format:

OpenClaw Tool AgentCore Equivalent Notes
read Custom tool (S3 or container FS) Read files from workspace
write Custom tool (S3 or container FS) Write workspace files
edit Custom tool String replacement in files
exec Custom tool (container shell) Limited vs OpenClaw's full PTY
web_search Custom tool or AgentCore Gateway Brave API wrapper
web_fetch Custom tool HTTP fetch + readability extraction
browser AgentCore Browser Tool Built-in! Better than rolling your own
message (Discord/Slack) AgentCore Gateway → Slack/Discord tool 1-click integrations available
memory_search AgentCore Memory Semantic search over memory
tts Custom tool (ElevenLabs API call) Straightforward
sessions_spawn AgentCore Runtime (A2A) Agent-to-agent protocol
canvas Custom tool or drop Needs client-side renderer
nodes Drop or custom Requires physical devices
cron EventBridge Scheduler API Via custom tool

4. Channel Integration Patterns

The logic of how to handle group chats, mentions, reply threading, chunking, etc. is reusable even if the transport changes:

  • Group chat rules (when to speak, when to stay silent)
  • Reply tag system
  • Message chunking for long responses
  • Typing indicators / presence
  • Platform-specific formatting (Discord markdown vs WhatsApp formatting)

5. Skills System (Concept)

The skill discovery pattern (scan descriptions → load SKILL.md → follow instructions) works in any agent framework. The actual skill files are just markdown instructions.

6. Heartbeat Logic

The prompt, the HEARTBEAT_OK ack contract, the "check inbox/calendar/weather" patterns — all reusable. Just the trigger mechanism changes (EventBridge instead of internal timer).


🔧 Needs Rebuilding (new code, same concept)

7. Agent Loop

OpenClaw: pi-mono TypeScript (LLM call → tool parse → execute → loop) AgentCore: Use Strands Agents (Python, AWS-native) or LangGraph or custom.

Strands is the path of least resistance on AgentCore — it's AWS-built, has native Bedrock integration, and the AgentCore SDK wraps it cleanly.

from strands import Agent, tool
from bedrock_agentcore.runtime import BedrockAgentCoreApp

app = BedrockAgentCoreApp()

@tool
def read_workspace_file(path: str) -> str:
    """Read a file from the agent workspace."""
    # Load from S3
    ...

agent = Agent(tools=[read_workspace_file, ...])

@app.entrypoint
def main(payload):
    prompt = payload.get("prompt")
    system_prompt = build_system_prompt()  # ← Reuse OpenClaw's logic
    return {"message": agent(prompt, system_prompt=system_prompt).message}

8. Session / Memory Management

OpenClaw: JSONL files on disk, compaction algorithm AgentCore:

  • Short-term: AgentCore Memory (per-session turn history)
  • Long-term: AgentCore Memory (extracted insights, preferences)
  • Workspace files: S3 (MEMORY.md, SOUL.md, etc.)
  • Daily notes: S3 or DynamoDB

The compaction algorithm could be reimplemented as a post-session hook that summarizes and stores to long-term memory.

9. Channel Relay Service

This is the biggest new piece. Options:

Option A: Lightweight Fargate relay (recommended)

  • Small ECS Fargate task running a stripped-down Node.js service
  • Maintains WS connections to WhatsApp/Discord/Telegram/Slack
  • On inbound message → InvokeAgentRuntime (AgentCore)
  • On agent response → route back to channel
  • ~$10-15/mo for a tiny Fargate task

Option B: Webhook-only channels

  • Telegram (webhook mode), Slack (Events API), Discord (interactions endpoint)
  • API Gateway → Lambda → InvokeAgentRuntime
  • No always-on infra needed
  • But: no WhatsApp (Baileys needs persistent WS), no real-time Discord

Option C: AgentCore Gateway integrations

  • AgentCore Gateway has 1-click Slack integration
  • Could handle Slack as a tool (agent → Slack) but not as an inbound channel
  • Would still need a relay for inbound messages

Option D: SNS/SQS fan-out

  • Channels → SQS → Lambda → InvokeAgentRuntime
  • Good for decoupling, adds latency

10. Scheduling (Heartbeat + Cron)

EventBridge Scheduler replaces OpenClaw's internal cron:

EventBridge Rule (every 30m)
  → Lambda function
    → InvokeAgentRuntime(prompt="Read HEARTBEAT.md...")
    → Route response to last channel

For dynamic cron (agent creates its own schedules), the agent needs a tool that creates/deletes EventBridge rules via the SDK.

11. File Operations on Workspace

OpenClaw: Direct filesystem read/write/edit AgentCore: S3-backed workspace

@tool
def write_file(path: str, content: str) -> str:
    """Write content to a workspace file."""
    s3.put_object(Bucket=WORKSPACE_BUCKET, Key=f"{agent_id}/{path}", Body=content)
    return f"Written {len(content)} bytes to {path}"

The edit tool (find-and-replace) needs to download, modify, re-upload. Slightly more complex but straightforward.


Must Drop or Significantly Redesign

12. Shell Exec (Full PTY)

AgentCore containers can run basic commands, but:

  • No persistent background processes (session dies)
  • No PTY for interactive CLIs
  • No host-level access
  • Coding agent sub-processes (Codex, Claude Code) don't fit the session model

Alternative: Use AgentCore's A2A protocol to spin up specialized coding agent sessions, or use a separate Fargate task for heavy compute.

13. Device Nodes (Camera, Screen, Location)

Physical device features can't run on AgentCore. But:

  • iOS/Android/macOS nodes could connect to the channel relay
  • The relay could expose node commands as tools via AgentCore Gateway
  • This is a stretch — likely better to keep nodes connecting to a local gateway

14. Browser Extension Relay

The Chrome extension relay requires a persistent WS connection to the gateway. Would need the relay service to proxy this.

15. Canvas / A2UI

Requires a client-side renderer (macOS app, browser). The AgentCore agent could generate canvas commands, but delivery depends on having a client.


Architecture: "OpenClaw Experience on AgentCore"

┌─────────────────────┐
│  Channel Relay      │ ECS Fargate (tiny, always-on, ~$10/mo)
│  WA/Discord/TG/Slack│ Inbound msgs → InvokeAgentRuntime
│  + webhook endpoints │ Agent responses → route to channel
└──────────┬──────────┘
           │
           ▼
┌─────────────────────┐
│  AgentCore Runtime  │ Serverless container (pay per use)
│  Strands Agent      │
│  ├─ System prompt   │ ← SOUL.md, AGENTS.md from S3
│  ├─ Tools           │ ← read/write (S3), web_search, browser, message
│  ├─ Memory          │ ← AgentCore Memory (short + long term)
│  └─ LLM (Bedrock)   │ ← Direct IAM role access
└──────────┬──────────┘
           │
     ┌─────┼─────┬──────────┐
     ▼     ▼     ▼          ▼
┌──────┐ ┌───┐ ┌─────────┐ ┌───────────────┐
│  S3  │ │DDB│ │AgentCore│ │  AgentCore    │
│ Work-│ │Cron│ │ Memory  │ │  Gateway      │
│ space│ │State│ │         │ │  (MCP tools)  │
└──────┘ └───┘ └─────────┘ └───────────────┘
                               │
                         ┌─────┼─────┐
                         ▼     ▼     ▼
                      Slack  Jira  Custom
                      Tool   Tool  Lambda
                                   Tools

EventBridge Triggers

┌─────────────────────┐
│  EventBridge        │
│  ├─ Heartbeat (30m) │ → Lambda → InvokeAgentRuntime
│  ├─ Cron jobs       │ → Lambda → InvokeAgentRuntime
│  └─ Webhook events  │ → Lambda → InvokeAgentRuntime
└─────────────────────┘

Effort Estimate (Ground-Up Build)

Component Effort Tech
Agent container (Strands + tools) 2-3 weeks Python, bedrock-agentcore SDK
System prompt builder 3-5 days Port from OpenClaw TS → Python
S3 workspace tools (read/write/edit) 2-3 days boto3
Web search + fetch tools 2-3 days Brave API, readability
AgentCore Memory integration 3-5 days AgentCore Memory SDK
Channel relay (Fargate) 2-3 weeks Node.js (reuse OpenClaw channel code)
EventBridge scheduling 2-3 days CDK/Terraform
Webhook ingress (API GW) 2-3 days CDK/Terraform
AgentCore Gateway tools 1 week Slack, custom Lambda tools
IaC (CDK or Terraform) 1 week Full stack deployment
Testing + integration 1-2 weeks End-to-end
Total ~8-12 weeks For one person, part-time

Cost Estimate (Monthly)

Service Cost
AgentCore Runtime (agent compute) ~$5-15 (consumption-based, depends on usage)
Channel relay (Fargate 0.25 vCPU) ~$9
NAT Gateway ~$3
S3 (workspace files) ~$0.02
DynamoDB (cron state, metadata) ~$1
AgentCore Memory TBD (managed service pricing)
EventBridge ~$0.01
Bedrock LLM calls $20-100+ (model-dependent, same as today)
Infrastructure total (ex-LLM) ~$20-30/mo

Comparable to Fargate-only ($26/mo) but with better scaling characteristics and per-use billing for the agent compute.


What You Gain vs Fargate-Only

Benefit Fargate-Only AgentCore Rebuild
Effort to deploy Days Months
Full OpenClaw feature set Yes ~70% (no PTY, no nodes, no canvas)
Per-invocation billing Always-on Pay per use
Session isolation (security) Shared process Per-session microVM
Built-in observability DIY logging AgentCore tracing
Built-in auth (OAuth/SigV4) DIY AgentCore Identity
Multi-user scalability Single user Designed for it
AgentCore Memory File-based Managed, semantic
AgentCore Gateway tools N/A 1-click Slack/Jira/etc
Browser Tool DIY Playwright Built-in
Future AWS integrations Manual First-class

When the Rebuild Makes Sense

Do it if:

  • You want to offer this as a multi-user product/service (AgentCore's per-session isolation is purpose-built for this)
  • You want to go deep on AWS-native agent infra (Memory, Gateway, Identity, observability)
  • You're OK with a Python agent (Strands) instead of the pi-mono TypeScript stack
  • You want consumption-based billing instead of always-on compute
  • This is a learning/exploration project for AgentCore itself

Don't do it if:

  • You just want your personal assistant running on AWS (Fargate in a day)
  • You need the full OpenClaw feature set (nodes, canvas, PTY, coding agents)
  • You want to stay on the OpenClaw upgrade path (community updates, new channels, skills)

Added 2026-03-10