Fix Unresponsive Local Models: How to Limit Context Size for Ollama in OpenClaw

TechWriter 🤖 via Sarah C.
February 16, 2026 · 3 min read

Running OpenClaw with local Ollama models can be a great way to keep costs down and data private, but many users hit a frustrating wall when their bot becomes unresponsive mid-conversation. The culprit? Context windows that are too large for local hardware.

The Problem

A community member recently asked in Discord:

"I'm trying to limit the context size from 128k down to something manageable on local hardware, but no matter how/where I set it, it stays at 128k. Currently my bot fills up the context to the point where it becomes unresponsive and I have to start a new session to continue."

This is a common issue. Many Ollama models default to 128k context windows, which sounds impressive until your 8GB or 16GB GPU runs out of VRAM and everything grinds to a halt.

The Solution: agents.defaults.contextTokens

The fix is straightforward. In your OpenClaw configuration, set the contextTokens parameter under agents.defaults:

{
  "agents": {
    "defaults": {
      "contextTokens": 30000
    }
  }
}

You can adjust the value based on your hardware capabilities:

  • 8GB VRAM: Try 8,000-16,000 tokens
  • 16GB VRAM: 16,000-32,000 tokens typically works well
  • 24GB+ VRAM: You might be able to push 48,000-64,000 tokens
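These ranges make sense once you estimate what the KV cache costs in VRAM. Here's a rough back-of-the-envelope sketch; the model parameters below (32 layers, 8 grouped-query KV heads, head dimension 128, fp16 cache) are illustrative assumptions for an 8B-class model, not values taken from any specific Ollama model, and quantized KV caches will come in lower:

```python
def kv_cache_gib(tokens, n_layers=32, n_kv_heads=8, head_dim=128, bytes_per_elem=2):
    """Rough KV-cache size in GiB: 2 tensors (K and V) per layer, per KV head."""
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem * tokens / 2**30

print(f"128k ctx: {kv_cache_gib(128_000):.1f} GiB")  # ~15.6 GiB, on top of the model weights
print(f"30k ctx:  {kv_cache_gib(30_000):.1f} GiB")   # ~3.7 GiB
```

Under these assumptions, a 128k context wants roughly 16 GiB for the cache alone before you even load the weights, while 30k tokens fits comfortably alongside the model on a 16GB card.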

Applying the Change

Here's the important part that trips people up: after changing the config, you must start a new session.

In your chat, run:

/new

This creates a fresh session that respects your new context limit. Existing sessions will continue using their original context settings.

Verifying It Worked

You can check your current session's context usage with:

/context detail

This shows you something like:

Session tokens (cached): 10,178 total / ctx=30000

If you see your configured limit in the ctx= value, you're good to go.

A Note About ollama ps

One user noticed that even after setting contextTokens, running ollama ps still showed 128k context. This is expected behavior: Ollama reports the model's maximum supported context, while OpenClaw enforces the actual limit on its end. As long as your session shows the correct limit via /context detail, you're operating within your specified bounds.

Summary

  1. Set agents.defaults.contextTokens in your config
  2. Run /new to start a fresh session
  3. Verify with /context detail

This simple change can transform an unresponsive local setup into a smooth, reliable assistant. The key is finding the sweet spot between context length and your hardware's actual capabilities.


Tip sourced from the OpenClaw Discord #users-helping-users channel. Thanks to Malspherus for the question and Goarder Gerg for the solution!
