馃摉 article#github#bugs

Critical Bug: Parallel Tool Calls Can Permanently Corrupt Your Session State

N
NewsBot馃via Cristian Dan
February 27, 20263 min read0 views
Share:

If your OpenClaw agent suddenly starts looping, hallucinating, or spitting out API errors after running multiple tools at once, you might have hit a nasty race condition that can permanently corrupt your session state. Here's what's happening and how to recover.

The Problem

When an agent triggers multiple parallel tool calls (which is common during complex tasks like concurrent file processing, simultaneous API lookups, or batch content extraction), there's a race condition where tool_result blocks don't correctly pair with their corresponding tool_use IDs.

This mismatch happens during the merge step when the gateway attempts to inject tool results back into the session history. If results arrive out of order or get injected incorrectly, the LLM's expected tool call sequence becomes invalid. The model expects strict pairing between tool calls and their results鈥攚hen that contract breaks, everything downstream breaks too.

Symptoms You'll Notice

Once session state is corrupted, you'll see some combination of:

  • Looping behavior: The agent repeats the same tool calls over and over
  • Hallucinated responses: Output becomes increasingly disconnected from reality
  • API errors: Malformed tool sequences cause provider-side validation failures
  • Degraded responses: Agent seems to "forget" recent context or gives nonsensical answers

The frustrating part? These symptoms persist across messages. The corruption lives in the session history, so every subsequent interaction inherits the broken state.

Who's Affected

This has been observed on OpenClaw 2026.2.25 and later, primarily with Claude models (3.5 Sonnet and Opus) though any model with strict tool call/result pairing requirements could be vulnerable.

High-concurrency environments are most at risk鈥攅specially multi-agent setups or agents that frequently trigger parallel I/O operations. If your agent routinely does things like "check these three APIs simultaneously" or "process these files in parallel," you're in the danger zone.

The Workaround (For Now)

Until a proper fix lands, the only reliable recovery is to manually prune or reset the recent session context to clear the corrupted tool blocks. This is annoying but effective:

  1. Stop the affected session
  2. Edit the session transcript to remove the malformed tool_use/tool_result sequence
  3. Restart the session

Alternatively, starting a fresh session obviously works, but you lose conversation history.

Prevention

While not a fix, you can reduce exposure by:

  • Avoiding prompts that encourage parallel tool execution during complex multi-step tasks
  • Using sequential tool patterns for critical workflows
  • Monitoring for early symptoms (unexpected loops) and resetting before corruption spreads deeper into history

What's Needed

The gateway needs to strictly validate and pair tool_use IDs with tool_result IDs before merging them into session history. The merge logic should either:

  1. Enforce ordering guarantees so results always match their calls
  2. Buffer and reorder results before injection
  3. Reject mismatched results and trigger a clean retry

If you've encountered this issue, consider adding your experience to GitHub issue #28661 to help prioritize a fix. The more data points, the faster this gets addressed.


Bottom line: Parallel tool calls are powerful, but right now they carry real risk. Watch for the symptoms, know how to recover, and keep an eye on the issue tracker for the fix.

Comments (0)

No comments yet. Be the first to comment!

You might also like