agent-claw/agentcore-rebuild.md

# AgentCore Rebuild: What's Reusable vs What's New

## The Premise

Instead of porting OpenClaw's monolithic gateway to AgentCore, build an **AgentCore-native personal assistant** that reuses the best parts of OpenClaw's design. Think of it as "the OpenClaw experience, built on AWS primitives."

---

## ✅ Directly Reusable (copy/adapt, no rewrite)

### 1. Personality & Workspace Files
All of these are just text that gets injected into the system prompt:
- `SOUL.md` — persona, tone, boundaries
- `AGENTS.md` — operating instructions
- `IDENTITY.md` — name, emoji, vibe
- `USER.md` — human's profile
- `TOOLS.md` — tool notes
- `MEMORY.md` — long-term memory
- `HEARTBEAT.md` — periodic task checklist
- `BOOTSTRAP.md` — first-run ritual
- `memory/YYYY-MM-DD.md` — daily notes

**Where they live**: S3 bucket (one prefix per user/agent). Loaded on each invocation, written back after mutations.

**Or**: AgentCore Memory for the conversational parts, S3 for the static persona files.

### 2. System Prompt Construction Logic
OpenClaw's context engine builds a rich system prompt from workspace files + tool descriptions + channel context + runtime metadata. This logic is pure string templating — framework-independent. Could be extracted and reused in a Strands/LangGraph agent or a custom Python agent.

Key pieces:
- Bootstrap file injection (with truncation markers for large files)
- Runtime context block (timezone, channel, OS, model, capabilities)
- Inbound message metadata (sender, group, timestamps)
- Tool policy injection
- Heartbeat/cron prompt variants
- Reply tag system (`[[reply_to_current]]` etc.)

### 3. Tool Definitions & Schemas
OpenClaw defines ~20+ tools. The **schemas** (parameters, descriptions) can be translated to any framework's tool format:

| OpenClaw Tool | AgentCore Equivalent | Notes |
|---|---|---|
| `read` | Custom tool (S3 or container FS) | Read files from workspace |
| `write` | Custom tool (S3 or container FS) | Write workspace files |
| `edit` | Custom tool | String replacement in files |
| `exec` | Custom tool (container shell) | Limited vs OpenClaw's full PTY |
| `web_search` | Custom tool or AgentCore Gateway | Brave API wrapper |
| `web_fetch` | Custom tool | HTTP fetch + readability extraction |
| `browser` | **AgentCore Browser Tool** | Built-in! Better than rolling your own |
| `message` (Discord/Slack) | **AgentCore Gateway** → Slack/Discord tool | 1-click integrations available |
| `memory_search` | **AgentCore Memory** | Semantic search over memory |
| `tts` | Custom tool (ElevenLabs API call) | Straightforward |
| `sessions_spawn` | AgentCore Runtime (A2A) | Agent-to-agent protocol |
| `canvas` | Custom tool or drop | Needs client-side renderer |
| `nodes` | Drop or custom | Requires physical devices |
| `cron` | EventBridge Scheduler API | Via custom tool |

### 4. Channel Integration Patterns
The **logic** of how to handle group chats, mentions, reply threading, chunking, etc. is reusable even if the transport changes:
- Group chat rules (when to speak, when to stay silent)
- Reply tag system
- Message chunking for long responses
- Typing indicators / presence
- Platform-specific formatting (Discord markdown vs WhatsApp formatting)

### 5. Skills System (Concept)
The skill discovery pattern (scan descriptions → load SKILL.md → follow instructions) works in any agent framework. The actual skill files are just markdown instructions.

### 6. Heartbeat Logic
The prompt, the HEARTBEAT_OK ack contract, the "check inbox/calendar/weather" patterns — all reusable. Just the **trigger mechanism** changes (EventBridge instead of internal timer).

---

## 🔧 Needs Rebuilding (new code, same concept)

### 7. Agent Loop
**OpenClaw**: pi-mono TypeScript (LLM call → tool parse → execute → loop)
**AgentCore**: Use **Strands Agents** (Python, AWS-native) or **LangGraph** or custom.

Strands is the path of least resistance on AgentCore — it's AWS-built, has native Bedrock integration, and the AgentCore SDK wraps it cleanly.

```python
from strands import Agent, tool
from bedrock_agentcore.runtime import BedrockAgentCoreApp

app = BedrockAgentCoreApp()

@tool
def read_workspace_file(path: str) -> str:
    """Read a file from the agent workspace."""
    # Load from S3
    ...

agent = Agent(tools=[read_workspace_file, ...])

@app.entrypoint
def main(payload):
    prompt = payload.get("prompt")
    system_prompt = build_system_prompt()  # ← Reuse OpenClaw's logic
    return {"message": agent(prompt, system_prompt=system_prompt).message}
```

### 8. Session / Memory Management
**OpenClaw**: JSONL files on disk, compaction algorithm
**AgentCore**:
- **Short-term**: AgentCore Memory (per-session turn history)
- **Long-term**: AgentCore Memory (extracted insights, preferences)
- **Workspace files**: S3 (MEMORY.md, SOUL.md, etc.)
- **Daily notes**: S3 or DynamoDB

The compaction algorithm could be reimplemented as a post-session hook that summarizes and stores to long-term memory.

### 9. Channel Relay Service
This is the **biggest new piece**. Options:

**Option A: Lightweight Fargate relay (recommended)**
- Small ECS Fargate task running a stripped-down Node.js service
- Maintains WS connections to WhatsApp/Discord/Telegram/Slack
- On inbound message → `InvokeAgentRuntime` (AgentCore)
- On agent response → route back to channel
- ~$10-15/mo for a tiny Fargate task

**Option B: Webhook-only channels**
- Telegram (webhook mode), Slack (Events API), Discord (interactions endpoint)
- API Gateway → Lambda → InvokeAgentRuntime
- No always-on infra needed
- But: no WhatsApp (Baileys needs persistent WS), no real-time Discord

**Option C: AgentCore Gateway integrations**
- AgentCore Gateway has 1-click Slack integration
- Could handle Slack as a tool (agent → Slack) but not as an inbound channel
- Would still need a relay for inbound messages

**Option D: SNS/SQS fan-out**
- Channels → SQS → Lambda → InvokeAgentRuntime
- Good for decoupling, adds latency

### 10. Scheduling (Heartbeat + Cron)
**EventBridge Scheduler** replaces OpenClaw's internal cron:

```
EventBridge Rule (every 30m)
  → Lambda function
    → InvokeAgentRuntime(prompt="Read HEARTBEAT.md...")
    → Route response to last channel
```

For dynamic cron (agent creates its own schedules), the agent needs a tool that creates/deletes EventBridge rules via the SDK.

### 11. File Operations on Workspace
**OpenClaw**: Direct filesystem read/write/edit
**AgentCore**: S3-backed workspace

```python
@tool
def write_file(path: str, content: str) -> str:
    """Write content to a workspace file."""
    s3.put_object(Bucket=WORKSPACE_BUCKET, Key=f"{agent_id}/{path}", Body=content)
    return f"Written {len(content)} bytes to {path}"
```

The `edit` tool (find-and-replace) needs to download, modify, re-upload. Slightly more complex but straightforward.

---

## ❌ Must Drop or Significantly Redesign

### 12. Shell Exec (Full PTY)
AgentCore containers can run basic commands, but:
- No persistent background processes (session dies)
- No PTY for interactive CLIs
- No host-level access
- **Coding agent sub-processes** (Codex, Claude Code) don't fit the session model

**Alternative**: Use AgentCore's A2A protocol to spin up specialized coding agent sessions, or use a separate Fargate task for heavy compute.

### 13. Device Nodes (Camera, Screen, Location)
Physical device features can't run on AgentCore. But:
- iOS/Android/macOS nodes could connect to the channel relay
- The relay could expose node commands as tools via AgentCore Gateway
- This is a stretch — likely better to keep nodes connecting to a local gateway

### 14. Browser Extension Relay
The Chrome extension relay requires a persistent WS connection to the gateway. Would need the relay service to proxy this.

### 15. Canvas / A2UI
Requires a client-side renderer (macOS app, browser). The AgentCore agent could generate canvas commands, but delivery depends on having a client.

---

## Architecture: "OpenClaw Experience on AgentCore"

```
┌─────────────────────┐
│  Channel Relay      │ ECS Fargate (tiny, always-on, ~$10/mo)
│  WA/Discord/TG/Slack│ Inbound msgs → InvokeAgentRuntime
│  + webhook endpoints │ Agent responses → route to channel
└──────────┬──────────┘
           │
           ▼
┌─────────────────────┐
│  AgentCore Runtime  │ Serverless container (pay per use)
│  Strands Agent      │
│  ├─ System prompt   │ ← SOUL.md, AGENTS.md from S3
│  ├─ Tools           │ ← read/write (S3), web_search, browser, message
│  ├─ Memory          │ ← AgentCore Memory (short + long term)
│  └─ LLM (Bedrock)   │ ← Direct IAM role access
└──────────┬──────────┘
           │
     ┌─────┼─────┬──────────┐
     ▼     ▼     ▼          ▼
┌──────┐ ┌───┐ ┌─────────┐ ┌───────────────┐
│  S3  │ │DDB│ │AgentCore│ │  AgentCore    │
│ Work-│ │Cron│ │ Memory  │ │  Gateway      │
│ space│ │State│ │         │ │  (MCP tools)  │
└──────┘ └───┘ └─────────┘ └───────────────┘
                               │
                         ┌─────┼─────┐
                         ▼     ▼     ▼
                      Slack  Jira  Custom
                      Tool   Tool  Lambda
                                   Tools
```

### EventBridge Triggers
```
┌─────────────────────┐
│  EventBridge        │
│  ├─ Heartbeat (30m) │ → Lambda → InvokeAgentRuntime
│  ├─ Cron jobs       │ → Lambda → InvokeAgentRuntime
│  └─ Webhook events  │ → Lambda → InvokeAgentRuntime
└─────────────────────┘
```

---

## Effort Estimate (Ground-Up Build)

| Component | Effort | Tech |
|---|---|---|
| Agent container (Strands + tools) | 2-3 weeks | Python, bedrock-agentcore SDK |
| System prompt builder | 3-5 days | Port from OpenClaw TS → Python |
| S3 workspace tools (read/write/edit) | 2-3 days | boto3 |
| Web search + fetch tools | 2-3 days | Brave API, readability |
| AgentCore Memory integration | 3-5 days | AgentCore Memory SDK |
| Channel relay (Fargate) | 2-3 weeks | Node.js (reuse OpenClaw channel code) |
| EventBridge scheduling | 2-3 days | CDK/Terraform |
| Webhook ingress (API GW) | 2-3 days | CDK/Terraform |
| AgentCore Gateway tools | 1 week | Slack, custom Lambda tools |
| IaC (CDK or Terraform) | 1 week | Full stack deployment |
| Testing + integration | 1-2 weeks | End-to-end |
| **Total** | **~8-12 weeks** | For one person, part-time |

---

## Cost Estimate (Monthly)

| Service | Cost |
|---|---|
| AgentCore Runtime (agent compute) | ~$5-15 (consumption-based, depends on usage) |
| Channel relay (Fargate 0.25 vCPU) | ~$9 |
| NAT Gateway | ~$3 |
| S3 (workspace files) | ~$0.02 |
| DynamoDB (cron state, metadata) | ~$1 |
| AgentCore Memory | TBD (managed service pricing) |
| EventBridge | ~$0.01 |
| Bedrock LLM calls | $20-100+ (model-dependent, same as today) |
| **Infrastructure total (ex-LLM)** | **~$20-30/mo** |

Comparable to Fargate-only ($26/mo) but with better scaling characteristics and per-use billing for the agent compute.

---

## What You Gain vs Fargate-Only

| Benefit | Fargate-Only | AgentCore Rebuild |
|---|---|---|
| Effort to deploy | Days | Months |
| Full OpenClaw feature set | ✅ Yes | ~70% (no PTY, no nodes, no canvas) |
| Per-invocation billing | ❌ Always-on | ✅ Pay per use |
| Session isolation (security) | ❌ Shared process | ✅ Per-session microVM |
| Built-in observability | ❌ DIY logging | ✅ AgentCore tracing |
| Built-in auth (OAuth/SigV4) | ❌ DIY | ✅ AgentCore Identity |
| Multi-user scalability | ❌ Single user | ✅ Designed for it |
| AgentCore Memory | ❌ File-based | ✅ Managed, semantic |
| AgentCore Gateway tools | ❌ N/A | ✅ 1-click Slack/Jira/etc |
| Browser Tool | DIY Playwright | ✅ Built-in |
| Future AWS integrations | Manual | ✅ First-class |

---

## When the Rebuild Makes Sense

**Do it if**:
- You want to offer this as a **multi-user product/service** (AgentCore's per-session isolation is purpose-built for this)
- You want to go deep on **AWS-native agent infra** (Memory, Gateway, Identity, observability)
- You're OK with a Python agent (Strands) instead of the pi-mono TypeScript stack
- You want consumption-based billing instead of always-on compute
- This is a learning/exploration project for AgentCore itself

**Don't do it if**:
- You just want your personal assistant running on AWS (Fargate in a day)
- You need the full OpenClaw feature set (nodes, canvas, PTY, coding agents)
- You want to stay on the OpenClaw upgrade path (community updates, new channels, skills)

---

*Added 2026-03-10*