Debugging Silent Write Failures and Context Loop Traps in OpenClaw
A community member recently reported a frustrating experience: their agent spent 3+ hours and burned significant tokens with zero output. The agent kept re-greeting and re-researching the same topic over and over. Sound familiar?
This issue usually comes down to two separate problems working together:
- Large file writes failing silently
- Context trimming causing the agent to "forget" what it was doing
Let's break down both issues and how to fix them.
Problem 1: Silent Write Failures
When you ask your agent to write a large file (think 100KB+ of code or research), the write tool might fail without surfacing a clear error. The gateway logs the error internally, but the model either hallucinates success or gets compacted before it can recover.
Diagnosing Write Failures
Check the basics first:
- If you're running in a sandbox (Docker), is the workspace actually writable?
- How big was the payload? (10KB vs 500KB vs 5MB makes a difference)
- Does the gateway log show an error during/after the tool call?
Capture evidence in real-time:
```shell
# Tail logs while reproducing the issue
openclaw logs --follow

# For more detail, run the gateway in the foreground with verbose logging
openclaw gateway --verbose --ws-log full
```

Common causes include:
- Sandbox permission issues → workspace not mounted writable
- Timeouts → very large writes taking too long
- Provider response truncation → tool input too large, so the stream gets cut
- Gateway crash/restart → check whether the gateway restarted mid-task
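To rule out the first two causes quickly, you can probe the workspace from inside the sandbox before blaming the agent. This is a minimal sketch; `WORKDIR` is an assumption you should point at your actual workspace mount:

```shell
# Sketch: check that the workspace is actually writable, and that a large
# payload succeeds -- a bare `touch` won't catch size/timeout failures.
# WORKDIR is a placeholder; set it to your sandbox's workspace mount point.
WORKDIR="${WORKDIR:-.}"
probe="$WORKDIR/.write-probe"
if touch "$probe" 2>/dev/null; then
  # Try a ~1 MB write to surface size-related failures, not just permissions
  head -c 1048576 /dev/zero > "$probe" && echo "writable: 1MB OK"
  rm -f "$probe"
else
  echo "NOT writable: $WORKDIR"
fi
```

If this prints `NOT writable`, fix the mount before touching any agent config; no amount of compaction tuning will help a read-only workspace.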
Problem 2: The Infinite Re-Start Loop
Even if writes succeed, your agent might still get stuck in a loop. Here's why: when the context window fills up, OpenClaw compacts the session. But if important task state only existed in the conversation, it's now gone. The agent "wakes up" with no memory of what it was doing and starts fresh: greeting you, researching from scratch, and so on.
Solution A: Enable Compaction Safeguards
OpenClaw has built-in protection for this. Enable safeguard mode with memory flush so the agent writes important notes before compaction:
```yaml
# In your config (openclaw.yaml)
agents:
  defaults:
    compaction:
      mode: safeguard
      reserveTokensFloor: 24000
      memoryFlush:
        enabled: true
        softThresholdTokens: 6000
        systemPrompt: "Session nearing compaction. Store durable memories now."
        prompt: "Write any lasting notes to memory/YYYY-MM-DD.md; reply with NO_REPLY if nothing to store."
```

Important: memory flush is skipped if the workspace is read-only (common with aggressive sandboxing).
Solution B: Enable Session Pruning
Session pruning trims old tool outputs from the context before LLM calls, reducing bloat:
```yaml
agents:
  defaults:
    contextPruning:
      enabled: true
```

Docs: Session pruning
Solution C: The TASK.md Pattern (Operational Workaround)
Even with compaction safeguards, long multi-hour tasks can lose state. The most reliable fix is a tiny task state file that the agent reads on startup:
- Create `TASK.md` in your workspace
- Add a rule to `AGENTS.md`: "On start, read TASK.md for current task state. After each milestone, update TASK.md with progress."
This forces the agent to restore context from disk, not from memory. It survives compactions, restarts, and even gateway crashes.
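The pattern above can be bootstrapped in a few lines. The layout and file names below (`research/notes.md`, `output/report.md`) are suggestions, not an OpenClaw convention; adapt the sections to your workflow:

```shell
# Sketch: seed a minimal TASK.md the agent can restore state from.
# Section names and referenced paths are placeholders -- adjust to taste.
cat > TASK.md <<'EOF'
# Current Task
Goal: <one-line description of the task>

## Progress
- [x] Sources collected into research/notes.md
- [ ] Draft written to output/report.md

## Next step
Resume at "Draft written". Do NOT re-research collected sources.
EOF
echo "TASK.md written"
```

The checkbox list doubles as a resume point: after a compaction or restart, the agent's first read of `TASK.md` tells it exactly which milestone to pick up.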
Putting It All Together
If your agent is:
- Failing silently on writes → check sandbox permissions, check payload sizes, tail the logs
- Looping infinitely → enable compaction safeguards + memory flush, and use a TASK.md file
The combination of proper compaction settings and explicit task state tracking will save you hours of wasted tokens.
Have you hit this issue? What workarounds worked for you? Drop a comment below.
Source: OpenClaw Discord #help thread