Fix Unresponsive Local Models: How to Limit Context Size for Ollama in OpenClaw
Running OpenClaw with local Ollama models can be a great way to keep costs down and data private, but many users hit a frustrating wall when their bot becomes unresponsive mid-conversation. The culprit? Context windows that are too large for local hardware.
The Problem
A community member recently asked in Discord:
"I'm trying to limit the context size from 128k down to something manageable on local hardware, but no matter how/where I set it, it stays at 128k. Currently my bot fills up the context to the point where it becomes unresponsive and I have to start a new session to continue."
This is a common issue. Many Ollama models default to 128k context windows, which sounds impressive until your 8GB or 16GB GPU runs out of VRAM and everything grinds to a halt.
The Solution: agents.defaults.contextTokens
The fix is straightforward. In your OpenClaw configuration, set the contextTokens parameter under agents.defaults:
```json
{
  "agents": {
    "defaults": {
      "contextTokens": 30000
    }
  }
}
```

You can adjust the value based on your hardware capabilities:
- 8GB VRAM: Try 8,000-16,000 tokens
- 16GB VRAM: 16,000-32,000 tokens typically works well
- 24GB+ VRAM: You might be able to push 48,000-64,000 tokens
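The sizing guidelines above can be sketched as a back-of-the-envelope calculation. The numbers in this sketch are illustrative assumptions, not measured figures: roughly 0.5 MB of VRAM per context token for the KV cache of a mid-size quantized model, and about 5 GB reserved for model weights. Real usage varies by model architecture, quantization, and Ollama version, so treat the output as a starting point and measure your own setup.

```python
# Rough context-size estimator for local Ollama setups.
# ASSUMPTIONS (illustrative, not measured): ~0.5 MB of VRAM per context
# token for the KV cache, and ~5 GB reserved for model weights.

def suggest_context_tokens(vram_gb: float,
                           weights_gb: float = 5.0,
                           mb_per_token: float = 0.5) -> int:
    """Return a conservative contextTokens value for the given VRAM."""
    free_mb = max(0.0, vram_gb - weights_gb) * 1024
    tokens = int(free_mb / mb_per_token)
    # Round down to the nearest 1,000 tokens for a tidy config value.
    return (tokens // 1000) * 1000

for vram in (8, 16, 24):
    print(f"{vram} GB VRAM -> about {suggest_context_tokens(vram):,} tokens")
```

The deliberately conservative defaults land near the low end of the ranges above, which is usually the right place to start: it is easier to raise the limit later than to debug an unresponsive session.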
Applying the Change
Here's the important part that trips people up: after changing the config, you must start a new session.
In your chat, run:
/new
This creates a fresh session that respects your new context limit. Existing sessions will continue using their original context settings.
Verifying It Worked
You can check your current session's context usage with:
/context detail
This shows you something like:
Session tokens (cached): 10,178 total / ctx=30000
If you see your configured limit in the ctx= value, you're good to go.
A Note About ollama ps
One user noticed that even after setting contextTokens, running ollama ps still showed 128k context. This is expected behavior: Ollama reports the model's maximum capability, while OpenClaw enforces the actual limit on its end. As long as your session shows the correct limit via /context detail, you're operating within your specified bounds.
Summary
| Step | Action |
|---|---|
| 1 | Set agents.defaults.contextTokens in your config |
| 2 | Run /new to start a fresh session |
| 3 | Verify with /context detail |
This simple change can transform an unresponsive local setup into a smooth, reliable assistant. The key is finding the sweet spot between context length and your hardware's actual capabilities.
Tip sourced from the OpenClaw Discord #users-helping-users channel. Thanks to Malspherus for the question and Goarder Gerg for the solution!