Initial research: OpenClaw on AgentCore architecture
- Architecture comparison (OpenClaw daemon vs AgentCore serverless)
- Component compatibility analysis
- Fargate analysis
- AgentCore rebuild plan (Telegram, zero always-on compute)
- Memory strategy: AgentCore Memory + factbase as structured KB
- Serverless relay patterns per channel
- All open questions resolved
- OpenClaw feature delta March→May 2026
- Build phases and cost estimates
agentcore-memory-research.md
# AgentCore Memory: Deep Dive & MEMORY.md Replacement Analysis

## How AgentCore Memory Works

### Architecture

Memory is a managed service, completely separate from the Runtime container. It has two tiers:

```
Short-term (events)          Long-term (extracted records)
─────────────────────        ──────────────────────────────
actor → session → events     actor → namespace → records
                             (semantic search available)
                             (cross-session, persistent)
```

### Short-term Memory

- Stores raw conversation events (turns) via the `CreateEvent` API
- Keyed by `actor_id` + `session_id`
- Retrieval: `ListEvents(session_id)` returns the full conversation history
- **Survives microVM termination**: events are stored in the managed service, not the container
- This replaces JSONL session transcripts completely

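
The short-term write path described above can be sketched as a thin wrapper around `CreateEvent`. This is a minimal sketch, not the author's implementation: it assumes the boto3 `bedrock-agentcore` client and a `create_event` request shaped as `memoryId` / `actorId` / `sessionId` / `eventTimestamp` / `payload`. The helper names `build_event_request` and `store_turn` are hypothetical, and the field names should be checked against the current API reference before use.

```python
from datetime import datetime, timezone


def build_event_request(memory_id, actor_id, session_id, role, text, now=None):
    """Assemble keyword arguments for one CreateEvent call.

    Field names are an assumption based on the boto3 request shape.
    """
    return {
        "memoryId": memory_id,
        "actorId": actor_id,
        "sessionId": session_id,
        "eventTimestamp": now or datetime.now(timezone.utc),
        "payload": [{"conversational": {"role": role, "content": {"text": text}}}],
    }


def store_turn(client, memory_id, actor_id, session_id, role, text):
    """Store one conversation turn; `client` is any object exposing create_event()."""
    request = build_event_request(memory_id, actor_id, session_id, role, text)
    return client.create_event(**request)
```

Passing the client in as a parameter keeps the helper testable without AWS credentials; in a deployed container it would presumably be `boto3.client("bedrock-agentcore")`.
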
### Long-term Memory: Three Built-in Strategies

Strategies are configured when creating a Memory resource. Extraction runs **asynchronously in the background** after each `CreateEvent`. Model costs for built-in extraction are **included in AgentCore Memory pricing** (confirmed by AWS support).

| Strategy | What it extracts | Namespace pattern |
|---|---|---|
| `SUMMARIZATION` | Session summaries | `/summaries/{actorId}/{sessionId}/` |
| `USER_PREFERENCE` | Preferences, habits, recurring facts | `/preferences/{actorId}/` |
| `SEMANTIC` | Raw facts, entities, knowledge | `/facts/{actorId}/` |

All three can run on the same memory resource simultaneously.

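
To make the namespace patterns in the table concrete, here is a small sketch that expands them for a given actor and session. The patterns are copied from the table; `NAMESPACE_PATTERNS` and `resolve_namespace` are hypothetical names introduced only for illustration.

```python
# Patterns copied from the strategy table above.
NAMESPACE_PATTERNS = {
    "SUMMARIZATION": "/summaries/{actorId}/{sessionId}/",
    "USER_PREFERENCE": "/preferences/{actorId}/",
    "SEMANTIC": "/facts/{actorId}/",
}


def resolve_namespace(strategy, actor_id, session_id=None):
    """Expand a strategy's namespace pattern into a concrete namespace string."""
    pattern = NAMESPACE_PATTERNS[strategy]
    if "{sessionId}" in pattern:
        if session_id is None:
            raise ValueError(f"{strategy} namespaces are per-session; session_id is required")
        pattern = pattern.replace("{sessionId}", session_id)
    return pattern.replace("{actorId}", actor_id)
```

Only the summaries namespace is scoped to a session; preferences and facts are keyed by actor alone, which is what lets them accumulate across sessions.
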
### Self-managed Strategy

You control the entire extraction pipeline:

1. Configure triggers: message count (`messageCount: 6`), token count (`tokenCount: 1000`), or idle timeout (`idleSessionTimeout: 30`)
2. AgentCore writes the conversation payload to **your S3 bucket**
3. AgentCore publishes a notification to **your SNS topic**
4. Your Lambda picks it up and runs whatever extraction logic you want
5. You write the results back via `BatchCreateMemoryRecords`

This is the **MEMORY.md pattern, but managed in the cloud**: you decide what to write and how.

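
Steps 3 to 5 can be sketched as a Lambda-style handler. This is a rough sketch under stated assumptions: the outer `Records[].Sns.Message` envelope is the standard SNS-to-Lambda event shape, but the message fields (`bucket`, `key`) and the payload shape (`actorId`, `turns`) are hypothetical placeholders, since the real notification schema is not given here. The S3 read and the `BatchCreateMemoryRecords` write are injected as callables (`fetch_payload`, `write_records`) so the logic stays independent of any SDK.

```python
import json


def iter_notifications(event):
    """Yield decoded SNS message bodies from a Lambda event (standard SNS envelope)."""
    for record in event.get("Records", []):
        yield json.loads(record["Sns"]["Message"])


def extract_explicit_memories(turns):
    """Toy extraction rule: keep user messages that start with 'remember'."""
    kept = []
    for turn in turns:
        text = turn.get("text", "").strip()
        if turn.get("role") == "USER" and text.lower().startswith("remember"):
            kept.append(text)
    return kept


def handle(event, fetch_payload, write_records):
    """Process every notification in the event; return how many memories were written."""
    written = 0
    for note in iter_notifications(event):
        # 'bucket' and 'key' are hypothetical field names for the S3 payload location.
        payload = fetch_payload(note["bucket"], note["key"])
        # 'turns' and 'actorId' are a hypothetical payload shape.
        memories = extract_explicit_memories(payload["turns"])
        if memories:
            write_records(payload["actorId"], memories)
            written += len(memories)
    return written
```

The extraction rule is deliberately trivial; in practice this is where a model call or a curated rule set would go.
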
### Strands Integration

The Strands `AgentCoreMemorySessionManager` handles turn storage automatically:

```python
# Import paths follow the bedrock-agentcore SDK's Strands integration;
# verify them against your installed version.
from bedrock_agentcore.memory.integrations.strands.config import AgentCoreMemoryConfig
from bedrock_agentcore.memory.integrations.strands.session_manager import (
    AgentCoreMemorySessionManager,
)
from strands import Agent

# MEMORY_ID, SESSION_ID, ACTOR_ID, build_system_prompt() and user_message are defined elsewhere.
config = AgentCoreMemoryConfig(
    memory_id=MEMORY_ID,
    session_id=SESSION_ID,  # maps to Telegram chat_id + date
    actor_id=ACTOR_ID,      # = user identity
    batch_size=5,           # buffer 5 turns before flushing to save API calls
)

with AgentCoreMemorySessionManager(config) as session_manager:
    agent = Agent(
        system_prompt=build_system_prompt(),  # SOUL.md + AGENTS.md + retrieved memories
        session_manager=session_manager,
    )
    response = agent(user_message)
# on exit: buffers flushed, async long-term extraction kicks off
```

Every conversation turn is stored automatically. `batch_size` reduces API calls during rapid exchanges.

---

## MEMORY.md vs AgentCore Memory

### What MEMORY.md Does Today

- Curated long-term memory that the agent edits manually
- Loaded wholesale into the system prompt each session
- Agent writes specific things it wants to remember
- Human-readable markdown

### What AgentCore Memory Provides

- **Short-term**: full conversation history per session (replaces JSONL)
- **Long-term SUMMARIZATION**: session summaries auto-extracted
- **Long-term USER_PREFERENCE**: preferences auto-extracted and consolidated across sessions
- **Long-term SEMANTIC**: facts/entities auto-extracted
- **Semantic search**: `RetrieveMemoryRecords(query="...")` surfaces relevant memories for the system prompt
- **Self-managed strategy**: explicit "write this to memory" control, just like the agent writing MEMORY.md

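
The semantic search bullet can be illustrated with a small retrieval helper. This is a sketch, assuming the boto3 `bedrock-agentcore` client exposes `retrieve_memory_records` with a `searchCriteria` of `searchQuery` / `topK` and returns `memoryRecordSummaries` whose `content` carries a `text` field; those field names are assumptions to verify, and `search_memories` is a hypothetical helper name.

```python
def search_memories(client, memory_id, namespace, query, top_k=5):
    """Run one semantic search and return the matching record texts.

    Request and response field names are assumptions based on the boto3 shape.
    """
    response = client.retrieve_memory_records(
        memoryId=memory_id,
        namespace=namespace,
        searchCriteria={"searchQuery": query, "topK": top_k},
    )
    summaries = response.get("memoryRecordSummaries", [])
    # Skip records that carry no text content rather than failing on them.
    return [s["content"]["text"] for s in summaries if "text" in s.get("content", {})]
```

Returning plain strings keeps the prompt-building step independent of the service's response format.
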
### Verdict: Replace MEMORY.md with AgentCore Memory

AgentCore Memory is more capable on nearly every axis:

- Auto-extraction means the agent doesn't have to curate manually (though it can via the self-managed strategy)
- Semantic search means you don't inject ALL memories into the system prompt; you inject the RELEVANT ones
- No MEMORY.md bloat: today MEMORY.md grows unbounded; AgentCore Memory consolidates automatically
- Cross-session persistence without any file I/O

**The tradeoff**: less direct control over what gets written. This is mitigated by using the self-managed strategy for explicit writes.

---

## The S3 Round-Trip Concern, Addressed

Daniel's concern: an S3 round-trip on every interaction.

With AgentCore Memory + Strands:

| What | When | Round-trip? |
|---|---|---|
| Conversation turns (short-term) | Each turn, async/batched | Non-blocking, buffered by `batch_size` |
| Long-term extraction | Background, async after turns | Zero latency impact |
| Memory retrieval (session start) | Once per session | One `RetrieveMemoryRecords` call, ~50ms |
| Personality files (SOUL.md etc.) | Once per session start | See below |

**For personality files specifically**: load them once when the session starts and cache them in an in-memory dict inside the container. The same warm microVM handles all messages in a session (up to 8 hours), so SOUL.md loads once per session, not once per message. No per-message S3 calls.

In practice, the flow is:

```
Session start (once):
  1. Load SOUL.md, AGENTS.md, USER.md from S3 → cache in container memory
  2. RetrieveMemoryRecords(query="important context, preferences") → top-k memories
  3. Build system_prompt = static_files + retrieved_memories
  4. Pass to Strands agent

Each message (no extra round-trips):
  - Strands auto-stores turns to AgentCore Memory (async/batched)
  - Long-term extraction runs in background
```

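
The "load once, cache in container memory" step could look like the sketch below. It is an illustration only: `PersonaCache` is a hypothetical name, and the `loader` callable stands in for whatever reads a file from S3 (for example a thin wrapper around a `GetObject` call).

```python
class PersonaCache:
    """Process-local cache; it lives exactly as long as the warm container does."""

    def __init__(self, loader):
        self._loader = loader  # callable: file name -> file text
        self._files = {}

    def get(self, name):
        # Only the first request for a given name reaches the loader.
        if name not in self._files:
            self._files[name] = self._loader(name)
        return self._files[name]

    def load_all(self, names):
        """Return {name: text} for all requested files, loading any that are missing."""
        return {name: self.get(name) for name in names}
```
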
---

## Recommended Storage Architecture

```
┌────────────────────────────────────────────────────────────────┐
│  S3 (persona bucket)                                           │
│  SOUL.md, AGENTS.md, IDENTITY.md, USER.md, HEARTBEAT.md        │
│  → Loaded ONCE at session start, cached in container memory    │
│  → Updated rarely (when Daniel edits them)                     │
└────────────────────────────────────────────────────────────────┘

┌────────────────────────────────────────────────────────────────┐
│  AgentCore Memory (replaces MEMORY.md + JSONL transcripts)     │
│                                                                │
│  Short-term: conversation turns (per session)                  │
│    → Strands session_manager handles automatically             │
│                                                                │
│  Long-term strategies:                                         │
│    SUMMARIZATION   → /summaries/{actorId}/{sessionId}/         │
│    USER_PREFERENCE → /preferences/{actorId}/                   │
│    SEMANTIC        → /facts/{actorId}/                         │
│                                                                │
│  Self-managed strategy (for explicit "remember this"):         │
│    Trigger: idle timeout or message count                      │
│    SNS → Lambda → custom extraction → BatchCreateMemoryRecords │
│    → "/curated/{actorId}/" namespace                           │
│    → This is the MEMORY.md equivalent, automated               │
└────────────────────────────────────────────────────────────────┘

┌────────────────────────────────────────────────────────────────┐
│  DynamoDB                                                      │
│  telegram_chat_id → agentcore_session_id + actor_id            │
│  heartbeat state (last check timestamps)                       │
│  cron job definitions                                          │
└────────────────────────────────────────────────────────────────┘
```

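
The DynamoDB mapping row (`telegram_chat_id → agentcore_session_id + actor_id`) can be sketched as follows. This is a minimal sketch under assumptions: the session id format is invented here to match the "chat_id + date" idea, a plain dict stands in for the DynamoDB table, and `session_id_for` and `resolve_session` are hypothetical names. Any length or character constraints AgentCore places on session ids are not checked.

```python
from datetime import date


def session_id_for(chat_id: int, day: date) -> str:
    """Derive a per-day session id from a Telegram chat id.

    Telegram group chat ids are negative, so the sign is encoded as a letter
    to keep a minus sign out of the id.
    """
    kind = "g" if chat_id < 0 else "u"
    return f"tg-{kind}{abs(chat_id)}-{day.isoformat()}"


def resolve_session(table: dict, chat_id: int, user_id: int, day: date) -> dict:
    """Look up the mapping row for a chat, rolling it over when the day changes.

    `table` is a plain dict standing in for the DynamoDB table.
    """
    wanted = session_id_for(chat_id, day)
    row = table.get(chat_id)
    if row is None or row["agentcore_session_id"] != wanted:
        row = {"agentcore_session_id": wanted, "actor_id": str(user_id)}
        table[chat_id] = row
    return row
```

Keeping `actor_id` stable while the session id rolls over daily is what lets long-term records accumulate per user across sessions.
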
### Session Start Pattern

```python
# Import paths follow the bedrock-agentcore SDK; verify them against your installed version.
from bedrock_agentcore import BedrockAgentCoreApp
from bedrock_agentcore.memory.integrations.strands.config import AgentCoreMemoryConfig
from bedrock_agentcore.memory.integrations.strands.session_manager import (
    AgentCoreMemorySessionManager,
)
from strands import Agent

app = BedrockAgentCoreApp()

# MEMORY_ID, PERSONA_CACHE, load_from_s3, memory_session and build_prompt are defined elsewhere.


@app.entrypoint
async def main(payload, context):
    actor_id = payload["actor_id"]      # = Telegram user ID
    session_id = payload["session_id"]  # = from DynamoDB lookup

    # Load static files (once per warm session, cached)
    if not PERSONA_CACHE.loaded:
        PERSONA_CACHE.update(load_from_s3(["SOUL.md", "AGENTS.md", "USER.md"]))

    # Retrieve relevant long-term memories (semantic search)
    memories = memory_session.search_long_term_memories(
        query=payload["message"],
        namespace_prefix=f"/preferences/{actor_id}/",
        top_k=5,
    )

    # Build system prompt
    system_prompt = build_prompt(PERSONA_CACHE, memories)

    # Memory config for this actor and session
    config = AgentCoreMemoryConfig(memory_id=MEMORY_ID, session_id=session_id, actor_id=actor_id)

    # Run agent (session_manager handles turn storage automatically)
    with AgentCoreMemorySessionManager(config) as session_manager:
        agent = Agent(system_prompt=system_prompt, session_manager=session_manager)
        return {"response": agent(payload["message"]).message}
```

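
The `build_prompt` helper used in the entrypoint is not defined in the note. One possible shape is sketched below, assuming the persona files arrive as a `{name: text}` dict and the retrieved memories as plain strings; the section heading text and the `max_memories` cap are illustrative choices, not something the source specifies.

```python
def build_prompt(persona_files, memories, max_memories=5):
    """Join persona file texts and retrieved memories into one system prompt.

    persona_files: dict of file name -> text (insertion order is kept).
    memories: list of memory strings, most relevant first.
    """
    sections = [text.strip() for text in persona_files.values() if text.strip()]
    if memories:
        bullets = "\n".join(f"- {m}" for m in memories[:max_memories])
        sections.append("## Relevant memories\n" + bullets)
    return "\n\n".join(sections)
```
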
---

## What AgentCore Memory Pricing Covers

From the pricing page and an AWS re:Post confirmation:

- **Built-in strategies** (SUMMARIZATION, USER_PREFERENCE, SEMANTIC): model extraction costs are **included** in Memory pricing
- **Self-managed strategy**: you pay for your own Lambda + Bedrock calls
- Memory storage: billed per GB stored
- `RetrieveMemoryRecords` (semantic search): billed per search

Exact rates are not yet clearly published, but they are expected to be low at personal-assistant scale.

---

## Open Questions Remaining

1. **Pricing for AgentCore Memory**: exact rates for storage and retrieval are not clearly published yet. Check them when actually provisioning.
2. **S3 persona file cache invalidation**: when SOUL.md is updated in S3, the warm container won't know. This needs a mechanism: either a DynamoDB version flag checked at session start, or simply accepting ~8hr staleness (fine for persona files).
3. **Self-managed extraction timing**: confirm whether the idle-session trigger in the self-managed strategy fires reliably at session end or requires an explicit trigger. This determines whether the "write to memory" tool works reliably.

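
For question 2, the version-flag option could be sketched like this. It is only an illustration of the idea: `VersionedPersonaCache` is a hypothetical name, `read_version` stands in for a cheap lookup (for example a single DynamoDB item read), and `load_files` stands in for the more expensive S3 reads.

```python
class VersionedPersonaCache:
    """Reload persona files only when the externally stored version flag changes."""

    def __init__(self, read_version, load_files):
        self._read_version = read_version  # cheap call returning the current version token
        self._load_files = load_files      # expensive call returning {name: text}
        self._version = None
        self._files = None

    def files(self):
        current = self._read_version()
        # First use, or the flag moved since the last load: refresh the cached files.
        if self._files is None or current != self._version:
            self._files = self._load_files()
            self._version = current
        return self._files
```
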
---

*Research: 2026-05-04. Sources: AgentCore Memory docs (memory-types, memory-strategies, memory-organization, strands integration), AgentCore pricing page.*