Bug: Sub-Agents Persist After Reboot and Never Time Out

A bug in OpenClaw causes sub-agent sessions to survive host reboots indefinitely, cluttering your session list and potentially causing confusion about which agents are actually running.

The Problem

When you spawn sub-agents using sessions_spawn, OpenClaw creates persistent session entries that track the spawned agent's state. These entries are written to disk so they survive gateway restarts, which is normally desirable behavior.

However, there's a gap in the cleanup logic: the timeout/reaper mechanism only runs inside an already-running gateway process, not as part of startup. So if your host machine goes down (crash, reboot, power loss) while a sub-agent is mid-run, nothing ever revisits that session entry and it never gets cleaned up.

The result? Ghost sub-agents that appear in sessions_list forever, showing as running even though their underlying processes died with the reboot.

Why This Matters

Session list pollution: Over time, stale entries accumulate and make it harder to see which agents are actually active.
Potential resource confusion: If you're monitoring session counts or building automation around sub-agent state, stale entries throw off your numbers.
Memory overhead: Each stale session entry consumes memory in the gateway process, though this is typically minimal.

Workarounds (Until a Fix Lands)

Manual Cleanup

Sub-agent state is persisted in ~/.openclaw/subagents/runs.json. You can safely remove entries with timestamps older than your last reboot:

# Check the file
cat ~/.openclaw/subagents/runs.json

# Back it up first
cp ~/.openclaw/subagents/runs.json ~/.openclaw/subagents/runs.json.bak

# Then edit to remove stale entries
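
If hand-editing feels error-prone, the same filtering can be scripted. The sketch below is only an illustration: the layout of runs.json is not documented here, so the top-level "runs" mapping and the "startedAt" millisecond timestamp are assumptions, and the function names are made up for this example. Check the real file and adjust before running anything like it.

```python
import json
import shutil
from pathlib import Path


def drop_stale_runs(runs: dict, boot_time_ms: int, ts_field: str = "startedAt") -> dict:
    """Return only the entries whose timestamp is at or after the last boot.

    `runs` maps a run id to its record. `ts_field` is a HYPOTHETICAL field name.
    Entries without a numeric timestamp are kept, so nothing is deleted by guesswork.
    """
    kept = {}
    for run_id, record in runs.items():
        ts = record.get(ts_field) if isinstance(record, dict) else None
        if not isinstance(ts, (int, float)) or ts >= boot_time_ms:
            kept[run_id] = record
    return kept


def prune_file(path: Path, boot_time_ms: int) -> int:
    """Back up `path`, rewrite it without stale entries, return how many were removed."""
    shutil.copy2(path, path.with_name(path.name + ".bak"))
    data = json.loads(path.read_text())
    runs = data.get("runs", {})  # ASSUMED layout: {"runs": {run_id: record}}
    kept = drop_stale_runs(runs, boot_time_ms)
    data["runs"] = kept
    path.write_text(json.dumps(data, indent=2))
    return len(runs) - len(kept)
```

You would pass the path to your runs.json and your last boot time in milliseconds since the epoch. Stopping the gateway first is probably wise, so it doesn't rewrite the file while you edit it.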

Gateway Restart Trick

Running openclaw gateway restart after a host reboot sometimes triggers a re-scan that clears stale sessions. This behavior is inconsistent, but worth trying before manual cleanup.

openclaw gateway restart

Automation Option

If you're running OpenClaw via systemd or launchd, you could add a post-boot hook that clears the runs.json file or marks all entries as timed out.
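
Here is a rough sketch of what such a hook could do, taking the "mark everything as timed out" route rather than deleting the file. The "runs" mapping and the "endedAt" and "outcome" fields are hypothetical names used for illustration, not a documented schema, and run_hook is not part of OpenClaw.

```python
import json
import os
import tempfile
import time
from pathlib import Path

RUNS_FILE = Path.home() / ".openclaw" / "subagents" / "runs.json"


def mark_all_timed_out(runs: dict, now_ms: int) -> int:
    """Flag every unfinished record as timed out in place; return how many changed.

    The "endedAt" and "outcome" field names are HYPOTHETICAL.
    """
    changed = 0
    for record in runs.values():
        if isinstance(record, dict) and record.get("endedAt") is None:
            record["endedAt"] = now_ms
            record["outcome"] = "timeout"
            changed += 1
    return changed


def write_atomically(path: Path, data: dict) -> None:
    """Write to a temp file in the same directory, then rename it over the original."""
    fd, tmp = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        json.dump(data, handle, indent=2)
    os.replace(tmp, path)


def run_hook(path: Path = RUNS_FILE) -> int:
    """Mark all unfinished runs in `path` as timed out; return the number changed."""
    if not path.exists():
        return 0  # nothing to clean up
    data = json.loads(path.read_text())
    changed = mark_all_timed_out(data.get("runs", {}), int(time.time() * 1000))
    if changed:
        write_atomically(path, data)
    return changed
```

The hook should presumably run before the gateway starts, so the two don't race on the same file. The temp-file-plus-rename step is there so a crash mid-write can't leave a truncated runs.json behind.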

The Fix

The proposed solution is straightforward: add a startup sweep that either marks every sub-agent still recorded as running as timed out, or checks whether its underlying process still exists before treating it as active.

This would need to account for edge cases like legitimate sub-agents that are genuinely still running (though this shouldn't happen after a full reboot).
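
To make the second variant concrete, here is a minimal sketch of a process-existence sweep. It is written in Python purely for illustration and is not the project's code. It assumes each record carries a "pid" field, which may not match how OpenClaw actually tracks sub-agents, and "endedAt" and "outcome" are likewise hypothetical names.

```python
import os


def pid_is_alive(pid: int) -> bool:
    """POSIX-only liveness probe: signal 0 checks existence without delivering a signal."""
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # process exists but belongs to another user
    return True


def startup_sweep(runs: dict, is_alive=pid_is_alive) -> list:
    """Mark unfinished runs whose process is gone as timed out; return their ids.

    "pid", "endedAt" and "outcome" are HYPOTHETICAL field names.
    """
    reaped = []
    for run_id, record in runs.items():
        if record.get("endedAt") is not None:
            continue  # already finished, leave untouched
        pid = record.get("pid")
        if isinstance(pid, int) and is_alive(pid):
            continue  # genuinely still running
        record["outcome"] = "timeout"
        reaped.append(run_id)
    return reaped
```

One caveat with this approach: PIDs get reused after a reboot, so a bare PID check can mistake an unrelated process for a live sub-agent. Comparing each entry's timestamp against the boot time as well would make the sweep more robust.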

Related Issue

This is being tracked in openclaw/openclaw#29795. If you're experiencing this, adding your logs and reproduction steps would help the maintainers prioritize the fix.

Has anyone else run into this? Share your workarounds in the comments.
