Dual WebSocket Architecture: Separating Conversation from Operations in OpenClaw
If you've ever noticed your OpenClaw chat feeling sluggish when sub-agents complete their work, you're not alone. A new feature proposal on GitHub (#28750) tackles this head-on with an elegant architectural solution: dual WebSocket channels.
The Problem: One Pipe, Too Many Jobs
Currently, OpenClaw uses a single WebSocket connection for everything: your conversation with the agent, sub-agent delivery results, heartbeat completions, and cron task outputs. Under normal use, this works fine. But when things get busy, the cracks show:
- Message lag: Your messages queue behind sub-agent result processing
- Perceived hangs: The chat appears frozen while background deliveries process
- Dropped messages on reconnect: If the WebSocket drops under contention, both conversation state and operations state are lost
The root cause is architectural: one WebSocket serves as both a conversation channel and a work queue. These have fundamentally different requirements.
The Proposal: Two Ports, Two Purposes
The solution is straightforward: add a second WebSocket port dedicated to operations. The traffic splits as follows:
- Conv port (existing, e.g., 18789): User ↔ agent conversation only. This is the sacred, lowest-latency path.
- Ops port (new, e.g., 18790): Sub-agent deliveries, heartbeat results, cron completions, system events.
The agent runtime reads from both sockets but prioritizes conversation: if a user message is waiting, it's handled before any ops message. Operations messages queue up and are processed during idle windows.
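The prioritization rule can be sketched as two queues with a strict preference for conversation. This is only an illustration of the idea, not code from the proposal: `DualQueue`, `QueuedMessage`, and the payload strings are hypothetical names invented for this example.

```typescript
// Hypothetical sketch: none of these names come from OpenClaw itself.
type Channel = "conv" | "ops";

interface QueuedMessage {
  channel: Channel;
  payload: string;
}

class DualQueue {
  private conv: QueuedMessage[] = [];
  private ops: QueuedMessage[] = [];

  push(msg: QueuedMessage): void {
    if (msg.channel === "conv") {
      this.conv.push(msg);
    } else {
      this.ops.push(msg);
    }
  }

  // A waiting conversation message always wins; an ops message is only
  // returned when the conversation queue is empty (an "idle window").
  next(): QueuedMessage | undefined {
    return this.conv.shift() ?? this.ops.shift();
  }
}

const queue = new DualQueue();
queue.push({ channel: "ops", payload: "subagent result" });
queue.push({ channel: "ops", payload: "cron output" });
queue.push({ channel: "conv", payload: "user: hello" });

// Drain the queue and record the order in which messages are handled.
const handled: string[] = [];
for (let m = queue.next(); m !== undefined; m = queue.next()) {
  handled.push(m.payload);
}
console.log(handled); // the user message comes first, then the two ops messages
```

Strict priority is the simplest policy that matches the description. A real runtime would probably also want a guard against ops messages starving during a long conversation burst, which the proposal as summarized here does not address.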
On the client side, the webchat opens two connections. The conv socket drives the chat UI; the ops socket drives a notification/status indicator.
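A rough sketch of that client wiring might look like the following. Everything here is an assumption made for illustration: `SocketLike`, `UiState`, and `connectDual` are hypothetical names, the socket is injected as a factory so the example runs without a server, and the message shapes are invented.

```typescript
// Hypothetical sketch: a minimal socket shape so the wiring can be shown
// (and exercised) without a real WebSocket server.
interface SocketLike {
  onmessage: ((ev: { data: string }) => void) | null;
}

interface UiState {
  chat: string[];     // lines rendered in the chat UI
  opsEvents: number;  // count shown by the notification/status indicator
}

function connectDual(
  open: (url: string) => SocketLike,
  host: string,
  convPort: number,
  opsPort: number,
): UiState {
  const ui: UiState = { chat: [], opsEvents: 0 };
  const conv = open(`ws://${host}:${convPort}`);
  const ops = open(`ws://${host}:${opsPort}`);

  // The conv socket drives the chat UI only.
  conv.onmessage = (ev) => {
    ui.chat.push(ev.data);
  };
  // The ops socket only bumps the status indicator; it never touches the chat.
  ops.onmessage = () => {
    ui.opsEvents += 1;
  };
  return ui;
}

// Demo with in-memory fake sockets standing in for real connections.
const fakes = new Map<string, SocketLike>();
const openFake = (url: string): SocketLike => {
  const socket: SocketLike = { onmessage: null };
  fakes.set(url, socket);
  return socket;
};

const ui = connectDual(openFake, "localhost", 18789, 18790);
fakes.get("ws://localhost:18789")?.onmessage?.({ data: "agent: hi" });
fakes.get("ws://localhost:18790")?.onmessage?.({ data: "cron done" });
console.log(ui);
```

In a browser, the factory would be a thin wrapper around the standard `WebSocket` constructor. The point of the sketch is only that the two handlers share no state, so a stalled or dropped ops connection cannot block chat rendering.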
Why Two Ports Instead of Multiplexing?
You might wonder why not just multiplex everything on one connection with priority headers. The GitHub issue explains the reasoning:
- Zero contention: Different TCP connections mean different OS buffers
- Independent failure: If the ops socket drops, conversation continues uninterrupted
- Simpler code: No priority lanes, no envelope parsing; each server does one thing well
- Independent tuning: Ops can have larger payloads, longer timeouts, and dedicated retry logic
How It Relates to Existing Infrastructure
A community member dug into the source and found that OpenClaw already has lane-based command queuing (src/process/command-queue.ts, src/process/lanes.ts) with Main, Cron, Subagent, and Nested lanes. The lane infrastructure handles execution serialization well.
The bottleneck is delivery: all lane outputs still funnel through the single WebSocket connection. The dual-port proposal would extend the existing lane concept to the transport layer: Main lane traffic on the conv port, Subagent/Cron lane traffic on the ops port.
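As a minimal sketch, the lane-to-port mapping described above could be a simple lookup table. The lane names come from the article; the `Lane` and `Transport` types and the `transportForLane` function are hypothetical and are not taken from `src/process/lanes.ts`. The placement of the Nested lane is an assumption, since the proposal as summarized here does not say where it goes.

```typescript
// Hypothetical sketch: the types and function below are illustrative only.
type Lane = "main" | "cron" | "subagent" | "nested";
type Transport = "conv" | "ops";

// Main stays on the conversation port; Subagent and Cron move to the ops port.
// ASSUMPTION: Nested is not covered by the proposal as described, so this
// sketch leaves it on the conversation port, which matches today's behavior.
const LANE_TRANSPORT: Record<Lane, Transport> = {
  main: "conv",
  cron: "ops",
  subagent: "ops",
  nested: "conv",
};

function transportForLane(lane: Lane): Transport {
  return LANE_TRANSPORT[lane];
}

console.log(transportForLane("main"));
console.log(transportForLane("subagent"));
```

Keeping the mapping in one table means the execution lanes stay untouched and only the delivery step needs to know about the second port.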
Backward Compatibility
Importantly, this would be opt-in:
- Ops port is optional and off by default
- If not configured, everything routes through the single socket as today
- No breaking changes to existing clients
This means you can upgrade without changing anything, then enable the dual-port setup when you're ready to test it.
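The opt-in fallback can be sketched as a small port resolver. The config shape is an assumption: `GatewayPorts`, `convPort`, `opsPort`, and `resolvePort` are hypothetical names chosen for this example, and the actual setting names in OpenClaw may differ.

```typescript
// Hypothetical sketch: config keys and function names are illustrative only.
interface GatewayPorts {
  convPort: number;
  opsPort?: number; // omitted by default, which keeps single-socket behavior
}

// Ops traffic uses the ops port only when one is configured; otherwise
// everything routes through the conversation port, as it does today.
function resolvePort(kind: "conv" | "ops", cfg: GatewayPorts): number {
  if (kind === "ops" && cfg.opsPort !== undefined) {
    return cfg.opsPort;
  }
  return cfg.convPort;
}

const legacy: GatewayPorts = { convPort: 18789 };
const dual: GatewayPorts = { convPort: 18789, opsPort: 18790 };

console.log(resolvePort("ops", legacy)); // falls back to the single socket
console.log(resolvePort("ops", dual));   // uses the dedicated ops port
```

Because the fallback lives in one function, an existing client that never sets the ops port sees no change in behavior.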
What This Means for Heavy Users
If you run multiple sub-agents, have frequent cron jobs, or use heartbeats extensively, this could significantly improve your experience. The conversation channel stays responsive even when your agent is juggling background work.
Follow issue #28750 for updates on this proposal.