Plan: AgentCore-Native OpenClaw (Telegram, Zero Always-On Compute)
Target Architecture
[Telegram User]
│ message
▼
[Telegram Servers]
│ POST (webhook)
▼
[API Gateway (HTTP API)]
│
▼
[Lambda: tg-ingest] ← verify webhook secret token, send typing action, enqueue
│ SQS message
▼
[SQS: agent-queue]
│ trigger
▼
[Lambda: agent-runner] ← load workspace from S3, build system prompt, map chat_id → session_id
│ InvokeAgentRuntime
▼
[AgentCore Runtime] ← Strands agent container (ARM64); tools: web_search, read/write S3, memory
│ streaming response
▼
[Lambda: agent-runner] ← stream reply back
│ Telegram Bot API
▼
[Telegram User] ← receives message
[EventBridge Scheduler] ← every 30m
│
▼
[Lambda: heartbeat-trigger] ← InvokeAgentRuntime (heartbeat prompt)
│ same response routing
▼
[Telegram Bot API]
No 24/7 compute anywhere. Everything is event-driven.
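As a rough illustration of the tg-ingest step, the sketch below validates the webhook call and hands the update to a queue before acknowledging. It assumes the webhook was registered with Telegram's `secret_token` option, so each call carries an `X-Telegram-Bot-Api-Secret-Token` header. The function name `handle_webhook` and the injected `enqueue` callable are hypothetical; in the real Lambda, `enqueue` would wrap an SQS send, and the typing action is left out here.

```python
import hmac
import json

# Header Telegram adds when setWebhook was called with secret_token.
SECRET_HEADER = "x-telegram-bot-api-secret-token"


def handle_webhook(event, expected_secret, enqueue):
    """Validate a Telegram webhook call and pass the update to a queue.

    `event` is an API Gateway HTTP API event; `enqueue` is any callable that
    accepts a JSON string (hypothetical; it would wrap sqs.send_message).
    """
    headers = {k.lower(): v for k, v in (event.get("headers") or {}).items()}
    supplied = headers.get(SECRET_HEADER, "")
    # Constant-time comparison avoids leaking the secret through timing.
    if not hmac.compare_digest(supplied.encode(), expected_secret.encode()):
        return {"statusCode": 403}

    update = json.loads(event.get("body") or "{}")
    if "message" in update:
        enqueue(json.dumps(update))
    # Acknowledge right away; the agent run happens off the queue.
    return {"statusCode": 204}
```

Passing `enqueue` in keeps the handler testable without AWS access; base64-encoded bodies are not handled in this sketch.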
What We've Answered
- ✅ AgentCore is the right runtime (stateless container, event-driven)
- ✅ Telegram supports full webhook mode (all message types)
- ✅ SQS decoupling handles the webhook ack requirement (respond 204 in <10s)
- ✅ OpenClaw workspace files (SOUL.md, AGENTS.md, MEMORY.md) reusable via S3
- ✅ System prompt construction logic is portable (pure string ops)
- ✅ Tool schemas (web_search, read, write, edit) translatable to Strands @tool
- ✅ EventBridge handles heartbeat and cron (no gateway process needed)
- ✅ AgentCore Memory SDK exists and supports conversation history + long-term extraction
- ✅ InvokeAgentRuntime supports streaming responses
- ✅ Lifecycle settings: idleRuntimeSessionTimeout is configurable (min 60s, default 900s)
- ✅ Cold start: Firecracker microVM ~2-5 seconds on first invocation
- ✅ Language/framework: Python + Strands + bedrock-agentcore SDK (ARM64 container)
- ✅ AgentCore Memory SDK: MemorySessionManager, actor_id + session_id model, search_long_term_memories()
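The claim that system prompt construction is pure string work can be illustrated with a minimal sketch. The function name, the file order, and the `## name` section headers are assumptions made for illustration, not OpenClaw's actual prompt layout.

```python
WORKSPACE_ORDER = ("SOUL.md", "AGENTS.md", "MEMORY.md")


def build_system_prompt(files, order=WORKSPACE_ORDER):
    """Join workspace files into one prompt, skipping missing or empty ones.

    `files` maps file name -> text (for example, fetched from S3 beforehand).
    """
    sections = []
    for name in order:
        text = (files.get(name) or "").strip()
        if text:
            sections.append(f"## {name}\n{text}")
    return "\n\n".join(sections)
```

Because the function takes already-loaded text, the same code can run in the Lambda or in the container without change.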
Open Questions (Not Yet Answered)
🔴 Critical — blocks architecture decisions
Q1: Response routing for async runs
When InvokeAgentRuntime is called from the agent-runner Lambda, does it block synchronously until the agent finishes? Lambda max timeout is 15 minutes. AgentCore sessions can run up to 8 hours. What's the maximum synchronous response wait? Is there a callback/webhook pattern for long agent runs, or do we always need to poll?
Why it matters: If an agent run takes 3 minutes (web browsing + LLM), the agent-runner Lambda needs to sit open for 3 minutes. That's fine up to ~15 minutes. But longer runs (coding tasks, deep research) need a different pattern.
Research needed: InvokeAgentRuntime streaming behavior + max Lambda concurrency implications.
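Until Q1 is answered, one defensive option is to stop reading the stream shortly before the Lambda deadline so a partial reply can still be sent. The sketch below assumes the response can be consumed as an iterable of text chunks; `collect_until_deadline` and the 10-second reserve are hypothetical, while `context.get_remaining_time_in_millis` is the standard Lambda context method that would be passed as `remaining_ms`.

```python
def collect_until_deadline(chunks, remaining_ms, reserve_ms=10_000):
    """Accumulate streamed text until the stream ends or time runs short.

    `chunks` is any iterable of text pieces; `remaining_ms` is a callable such
    as Lambda's context.get_remaining_time_in_millis. Returns (text, finished);
    finished is False whenever the time check tripped, even on the last chunk.
    """
    parts = []
    for chunk in chunks:
        parts.append(chunk)
        if remaining_ms() < reserve_ms:
            # Stop early so there is still time to send a partial reply.
            return "".join(parts), False
    return "".join(parts), True
```

This only bounds the wait; runs that genuinely exceed 15 minutes would still need the callback or polling pattern the question asks about.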
Q2: Session ID strategy and daily session lifecycle
idleRuntimeSessionTimeout is configurable (60s–8hr, default 15min). For a personal assistant, set it to 4-6 hours — the session stays warm all day. Max lifetime is 8 hours, after which a new session is created.
- Map Telegram chat_id → runtimeSessionId in DynamoDB (create a new session ID at the start of the day or when the previous session reaches its maximum lifetime)
- On new session creation, load MEMORY.md + SOUL.md from S3 into the system prompt; that is the context restoration
- The 8hr session boundary is a daily rhythm, not a UX problem
Simplified: One session per user per day. Session stays warm between messages. After 8hr, start a new one and reload workspace from S3.
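The reuse-or-renew rule can be sketched as a pure function over the stored mapping. The record shape, the `tg-` ID prefix, and the 4-hour idle setting are assumptions; the ID length check reflects a minimum-length constraint on runtimeSessionId that should be confirmed during the spike.

```python
import uuid
from dataclasses import dataclass

MAX_LIFETIME_S = 8 * 3600   # session lifetime cap quoted above
IDLE_TIMEOUT_S = 4 * 3600   # assumed idleRuntimeSessionTimeout setting


@dataclass
class SessionRecord:
    session_id: str
    created_at: float
    last_used_at: float


def new_session_id(chat_id):
    # Prefix plus 32 hex characters keeps the ID above a 33-character minimum
    # (an assumption about the runtimeSessionId constraint; verify it).
    return f"tg-{chat_id}-{uuid.uuid4().hex}"


def resolve_session(record, chat_id, now):
    """Return (record, is_new): reuse a live session or mint a fresh one."""
    if record is not None:
        alive = now - record.created_at < MAX_LIFETIME_S
        warm = now - record.last_used_at < IDLE_TIMEOUT_S
        if alive and warm:
            return SessionRecord(record.session_id, record.created_at, now), False
    return SessionRecord(new_session_id(chat_id), now, now), True
```

When `is_new` is true, the caller would reload the workspace from S3 before invoking the agent.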
Q3: AgentCore Memory — is long-term extraction automatic or manual?
The SDK docs mention "long-term memory automatically extracts and stores key insights." Is this extraction triggered on every add_turns() call, after a session ends, or does it require an explicit extraction call? Does it cost extra (separate LLM call)?
Why it matters: If extraction isn't automatic, MEMORY.md-equivalent content needs to be managed explicitly.
Q4: Workspace file mutations (MEMORY.md writes), S3 vs AgentCore Memory
When the agent wants to write to MEMORY.md (e.g., "remember this for next time"), there are two paths:
- Write to S3, reload on next invocation — simple but doesn't benefit from semantic search
- Write to AgentCore Memory — benefits from extraction + search but changes the access pattern
Which approach for MEMORY.md? Can we use BOTH — S3 for large curated memory, AgentCore Memory for semantic search over conversation history?
Q5: Cold start UX impact (first session only)
AgentCore keeps the microVM alive between requests (no cold start for warm sessions). The only startup cost is on the first invocation of a brand new session (container image pull + process start). Subsequent requests to the same warm session are instant.
- Does the Telegram "typing..." indicator cover the one-time startup gap on new session creation?
- What happens when the Lambda itself is cold (~500ms Lambda cold start, separate from the AgentCore session)?
Q6: Strands agent + bedrock-agentcore container (ARM64 build complexity)
AgentCore requires ARM64 containers. Strands is Python. The base image needs:
- Python 3.11+
- strands-agents, bedrock-agentcore pip packages
- AWS credentials via task role (IAM)
- Access to Bedrock models (need to check regional availability for the models we want)
What's the actual container build + push + deploy flow? Is there a starter template?
🟡 Important — needs answer before first spike
Q7: Which Bedrock model and region?
AgentCore Runtime is available in us-east-1, us-west-2, and several other regions. The Bedrock models we want (Claude Sonnet 4, etc.) need to be available in the same region. Cross-region inference adds latency.
Need to confirm: which model for the agent (Sonnet? Haiku for speed?), which region for AgentCore, does the region support the model?
Q8: Telegram → AgentCore payload structure
The Telegram Update object contains message.chat.id, message.from.id, message.text, etc. The InvokeAgentRuntime payload is arbitrary JSON. What does the agent container expect to receive? How do we thread Telegram context (group vs DM, sender info, reply_to) through the SQS → Lambda → AgentCore chain?
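One possible payload shape, offered as a starting point for the schema discussion rather than a settled contract: flatten the fields the agent needs and leave the raw Update behind. The Telegram field names (`chat.id`, `chat.type`, `from`, `reply_to_message`) come from the Bot API; every key of the returned dictionary is a hypothetical choice.

```python
def build_agent_payload(update):
    """Flatten a Telegram Update into a hypothetical agent payload."""
    msg = update["message"]
    chat = msg["chat"]
    sender = msg.get("from") or {}
    reply = msg.get("reply_to_message") or {}
    return {
        "prompt": msg.get("text", ""),
        "channel": "telegram",
        "chat_id": chat["id"],
        # Telegram chat types: private, group, supergroup, channel.
        "chat_type": chat.get("type", "private"),
        "sender_id": sender.get("id"),
        "sender_name": sender.get("first_name"),
        "reply_to_text": reply.get("text"),
    }
```

Keeping `chat_id` in the payload lets the same dictionary drive both session lookup and the reply call.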
Q9: Telegram response back to user — token management
The agent-runner Lambda needs to call api.telegram.org/bot{token}/sendMessage after the agent responds. The Bot Token must be available to the Lambda. Secrets Manager is the right answer — but it needs to be in the architecture from day one.
Q10: Heartbeat response delivery
The heartbeat EventBridge rule fires every 30 minutes. The heartbeat Lambda invokes AgentCore. The agent produces a response (either HEARTBEAT_OK to suppress, or an actual message to deliver).
Where does the heartbeat response go? The Lambda needs to know: "if the agent produces a non-HEARTBEAT_OK response, send it to Telegram chat_id X." This routing config (target Telegram chat ID for heartbeat delivery) needs to be stored somewhere (DynamoDB, Secrets Manager, or baked into the Lambda env).
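The suppression rule itself is small enough to sketch. This version assumes the agent signals "nothing to report" with a reply that equals HEARTBEAT_OK after trimming whitespace; the function name and the returned parameter dictionary are hypothetical.

```python
HEARTBEAT_OK = "HEARTBEAT_OK"


def heartbeat_delivery(agent_reply, target_chat_id):
    """Return sendMessage parameters, or None when the reply is suppressed."""
    text = (agent_reply or "").strip()
    if not text or text == HEARTBEAT_OK:
        return None
    return {"chat_id": target_chat_id, "text": text}
```

Wherever the target chat ID ends up being stored, the Lambda only needs to read it once and pass it in.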
Q11: Multi-turn within a single AgentCore session
If a user sends 3 rapid messages (before the session expires), do they all land in the same runtimeSessionId? The agent-runner Lambda needs to look up the current active session ID for a given Telegram chat_id from DynamoDB, or create a new one if expired.
Race condition: two messages arrive simultaneously → both Lambdas look up session → both see "no active session" → both create new sessions. Need a DynamoDB conditional write / lock.
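A conditional write is one way to let exactly one of the racing Lambdas win. The sketch below only builds the arguments for the low-level DynamoDB client's `put_item`; the table layout (`chat_id` as partition key, an `expires_at` epoch attribute) and the function name are assumptions.

```python
def claim_session_kwargs(table, chat_id, session_id, now, ttl_s):
    """Build put_item arguments that succeed for only one concurrent caller."""
    return {
        "TableName": table,
        "Item": {
            "chat_id": {"S": str(chat_id)},
            "session_id": {"S": session_id},
            "expires_at": {"N": str(int(now + ttl_s))},
        },
        # Write only if no row exists yet or the stored session has expired.
        "ConditionExpression": "attribute_not_exists(chat_id) OR expires_at < :now",
        "ExpressionAttributeValues": {":now": {"N": str(int(now))}},
    }
```

The caller would run `client.put_item(**kwargs)`; the Lambda that loses the race gets a ConditionalCheckFailedException and can reread the row to pick up the winner's session ID.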
Q12: Telegram send_chat_action ("typing") timing
Telegram's chat action expires in ~5 seconds. For a 30-second agent run, we need to refresh the typing indicator periodically. The agent-runner Lambda needs to refresh it while waiting for InvokeAgentRuntime to stream. Is this easy to do in a Lambda while streaming?
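One way this could work is a background thread that re-sends the action while the main thread reads the stream. In the sketch below, `typing_indicator` and the injected `send_action` callable are hypothetical (the callable would POST Telegram's sendChatAction), and the 4-second interval is an assumed margin under the roughly 5-second expiry.

```python
import threading
from contextlib import contextmanager


@contextmanager
def typing_indicator(send_action, interval_s=4.0):
    """Call send_action() now and every interval_s until the block exits."""
    stop = threading.Event()

    def loop():
        while True:
            send_action()
            # wait() returns True as soon as stop is set, ending the loop.
            if stop.wait(interval_s):
                break

    worker = threading.Thread(target=loop, daemon=True)
    worker.start()
    try:
        yield
    finally:
        stop.set()
        worker.join()
```

Usage would be `with typing_indicator(send): reply = read_stream()`, so the indicator stops as soon as the reply is ready or an error propagates.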
🟢 Lower priority — figure out during build
Q13: What tools does the container expose?
OpenClaw has ~20 tools. For an MVP, what's the minimum viable tool set?
- read_file(path): S3 workspace
- write_file(path, content): S3 workspace
- web_search(query): Brave API
- web_fetch(url): HTTP + readability
- memory_search(query): AgentCore Memory
- send_telegram_message(text): for multi-message replies, or just return the response?
Tools NOT in scope for v1: exec, browser, canvas, cron management, image generation.
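The two S3-backed tools could look roughly like the sketch below before being wrapped as Strands tools. The `Workspace` class, the `workspace/` prefix, and the path check are assumptions; `get_object` and `put_object` are called with the same keyword arguments the boto3 S3 client accepts, so a real client can be passed in.

```python
class Workspace:
    """Read and write workspace files under one S3 prefix.

    `s3` is anything with boto3-style get_object / put_object methods.
    """

    def __init__(self, s3, bucket, prefix="workspace/"):
        self.s3, self.bucket, self.prefix = s3, bucket, prefix

    def _key(self, path):
        # Reject absolute paths and parent traversal so tools stay in the prefix.
        if path.startswith("/") or ".." in path.split("/"):
            raise ValueError(f"path escapes workspace: {path}")
        return self.prefix + path

    def read_file(self, path):
        obj = self.s3.get_object(Bucket=self.bucket, Key=self._key(path))
        return obj["Body"].read().decode("utf-8")

    def write_file(self, path, content):
        self.s3.put_object(
            Bucket=self.bucket, Key=self._key(path), Body=content.encode("utf-8")
        )
```

Restricting keys to one prefix means the container's IAM policy can be scoped to that prefix as well.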
Q14: Cron job management from within the agent
OpenClaw lets the agent create/delete cron jobs dynamically. With EventBridge, a create_cron_job tool would need to call eventbridge.put_rule(). Doable but needs IAM permissions baked in. Scope for v2.
Q15: Secrets rotation
Bot token, Brave API key, etc. go in Secrets Manager. Need to decide: Lambda env vars (loaded on cold start) vs Secrets Manager SDK calls (per-invocation). For personal scale, env vars baked in at deploy time are fine. Secrets Manager adds ~50ms latency per call.
Q16: IaC choice
CDK (TypeScript) or Terraform or SAM. CDK is most AWS-native and has the highest-level constructs. SAM is simpler for Lambda-centric stacks. Terraform if portability matters.
Proposed Build Phases
Phase 0 — Spike (1-2 days)
Answer Q1, Q2, Q5 by actually running the thing:
- Deploy the smallest possible Strands container to AgentCore
- Send it a test InvokeAgentRuntime call
- Measure cold start latency in practice
- Test what happens when a session expires and you reinvoke with the same ID
Phase 1 — Telegram → Agent → Response (1 week)
- API Gateway + tg-ingest Lambda (verify the webhook secret token, SQS enqueue, return 204)
- SQS queue
- agent-runner Lambda (maps chat_id → session_id, invokes AgentCore, sends Telegram reply)
- AgentCore container: minimal Strands agent, system prompt from S3 workspace, web_search tool
- S3 workspace bucket with SOUL.md, AGENTS.md, USER.md
- DynamoDB: chat_id → session_id mapping
Done when: can send a Telegram message and get a reply from the agent, personality intact.
Phase 2 — Memory + Workspace (1 week)
- AgentCore Memory provisioned (memory_id per user)
- Conversation history stored after each turn
- Long-term memory extraction confirmed working
- MEMORY.md sync pattern: S3 for curated, AgentCore Memory for semantic search
- write_file / read_file tools pointing at S3 workspace
Done when: agent remembers things across sessions (>15min gaps).
Phase 3 — Heartbeat + Cron (3-4 days)
- EventBridge rule (every 30m)
- heartbeat-trigger Lambda
- HEARTBEAT_OK suppression logic
- Delivery to configurable Telegram chat ID
Done when: heartbeat fires, agent checks HEARTBEAT.md, delivers alerts to Telegram.
Phase 4 — Polish (ongoing)
- Typing indicator refresh during long runs
- Additional tools (image gen, TTS)
- Error handling + DLQ
- CDK/IaC for reproducible deploys
- Cost monitoring
Cost Estimate (Personal Scale, ~50 agent runs/day)
| Service | Est. Monthly Cost | Notes |
|---|---|---|
| API Gateway (HTTP) | ~$0.01 | <1M requests/mo |
| Lambda (ingest + runner + heartbeat) | ~$0.50 | ~2000 invocations/day, avg 30s |
| SQS | ~$0.00 | Free tier |
| AgentCore Runtime | ~$5-15 | 50 runs/day × 30s avg × ~$0.0x/compute-sec |
| AgentCore Memory | TBD | Pricing not fully public yet |
| S3 (workspace files) | ~$0.01 | <1 MB total |
| DynamoDB (session mapping) | ~$0.01 | On-demand, minimal reads/writes |
| Bedrock LLM calls | $20-80 | Same as today — model-dependent |
| EventBridge | ~$0.00 | ~1,440 scheduled invocations/mo (every 30m) |
| Secrets Manager | ~$0.40 | $0.40/secret/mo |
| Total infra (ex-LLM) | ~$6-20/mo | vs ~$26/mo for Fargate |
Zero always-on compute cost. Pay only when messages arrive.
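As a quick check on the total row, the line items from the table can be summed directly. AgentCore Memory is left out because its price is still TBD, which is presumably where the headroom up to ~$20 comes from; the dictionary below simply restates the table's figures.

```python
# (low, high) monthly USD per line item, copied from the table above.
# AgentCore Memory is omitted because its price is still TBD.
LINE_ITEMS = {
    "API Gateway": (0.01, 0.01),
    "Lambda": (0.50, 0.50),
    "SQS": (0.00, 0.00),
    "AgentCore Runtime": (5.00, 15.00),
    "S3": (0.01, 0.01),
    "DynamoDB": (0.01, 0.01),
    "EventBridge": (0.00, 0.00),
    "Secrets Manager": (0.40, 0.40),
}

low = round(sum(lo for lo, _ in LINE_ITEMS.values()), 2)
high = round(sum(hi for _, hi in LINE_ITEMS.values()), 2)
print(f"infra ex-LLM, ex-Memory: ${low:.2f} to ${high:.2f} per month")
```

The known items come to $5.93 to $15.93 per month, so the AgentCore Runtime estimate dominates the infrastructure bill.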
Immediate Next Steps
- Answer Q1 + Q2 with a spike — deploy toy Strands container, measure cold start, test session expiry behavior
- Clarify AgentCore Memory extraction (Q3) — read the full SDK docs + test
- Lock the Telegram payload schema (Q8) — define what goes in InvokeAgentRuntime payload
- Pick region + model (Q7) — confirm Sonnet availability in target region
- Start Phase 1 build
Updated 2026-05-04