Multi-Image Analysis Arrives: OpenClaw 2026.2.17 Overhauls the Image Tool
OpenClaw 2026.2.17 brings a significant upgrade to the image tool that makes working with visual content much more powerful. If you've ever wanted your agent to analyze multiple images at once, whether comparing screenshots, processing batches of documents, or reviewing a series of photos, this update is for you.
What Changed
The image tool now supports two distinct parameters:
- image (string): Analyze a single image by passing its path or URL
- images (array): Analyze multiple images in one call for comparison or batch processing
This might seem like a small API change, but it solves a real problem. Previously, multi-image analysis required workarounds, and certain providers (notably Anthropic) had compatibility issues with the tool's schema due to union types (anyOf/oneOf/allOf).
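The release notes don't show the actual tool schemas, but the before/after difference can be sketched roughly like this (the schema structure below is an assumption for illustration; only the parameter names and the union-type problem come from the notes):

```python
# Hypothetical JSON Schemas for the image tool's input.
# The old shape used a union type (anyOf), which some providers
# reject; the new shape exposes two explicit, non-union parameters.

OLD_SCHEMA = {
    "type": "object",
    "properties": {
        "image": {
            "anyOf": [  # union type: the compatibility problem
                {"type": "string"},
                {"type": "array", "items": {"type": "string"}},
            ]
        }
    },
}

NEW_SCHEMA = {
    "type": "object",
    "properties": {
        "image": {"type": "string"},  # single path or URL
        "images": {                   # explicit array parameter
            "type": "array",
            "items": {"type": "string"},
        },
    },
}

def uses_union_types(schema) -> bool:
    """Recursively check a schema for anyOf/oneOf/allOf keywords."""
    if isinstance(schema, dict):
        if any(k in schema for k in ("anyOf", "oneOf", "allOf")):
            return True
        return any(uses_union_types(v) for v in schema.values())
    if isinstance(schema, list):
        return any(uses_union_types(v) for v in schema)
    return False
```

A quick check like `uses_union_types(NEW_SCHEMA)` now comes back false, which is the whole point: nothing in the new shape depends on schema composition keywords that providers handle inconsistently.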
Why This Matters
1. Clean Provider Compatibility
The old schema used union types that some providers couldn't parse correctly. By splitting into explicit image and images parameters, the tool now works reliably across all configured vision models, including Claude, GPT-4o, and Gemini.
2. Genuine Multi-Image Support
Need to compare two UI mockups? Analyze a set of receipts? Process multiple charts at once? Now you can pass an array of images and get a single, coherent analysis that considers all of them together.
3. Better Error Handling
The update also includes base64 payload validation before submission (via PR #18263). Invalid image data now fails fast with a clear error, rather than silently sending garbage to the provider and burning tokens.
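PR #18263's actual implementation isn't reproduced here, but fail-fast base64 validation amounts to strict decoding before the request is built. A minimal sketch (function name is hypothetical):

```python
import base64
import binascii

def validate_base64_image(payload: str) -> bytes:
    """Strictly decode a base64 image payload, raising before any
    network call so invalid data never reaches the provider."""
    try:
        # validate=True rejects characters outside the base64
        # alphabet instead of silently discarding them.
        return base64.b64decode(payload, validate=True)
    except (binascii.Error, ValueError) as exc:
        raise ValueError(f"invalid base64 image payload: {exc}") from exc
```

The key detail is `validate=True`: Python's default decoder silently drops non-alphabet characters, which is exactly the "silently sending garbage" failure mode the release fixes.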
How to Use It
Single image analysis (unchanged):
Analyze this screenshot: /path/to/screenshot.png
Multi-image analysis (new):
Compare these two designs and tell me which has better visual hierarchy:
- design_v1.png
- design_v2.png
Your agent will automatically use the images parameter when multiple paths are involved.
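The release notes don't specify how the agent chooses between the two parameters, but the dispatch presumably reduces to something like this sketch (helper and argument names are hypothetical):

```python
def build_image_tool_args(paths: list[str]) -> dict:
    """Route to the single- or multi-image parameter based on how
    many paths the request involves (hypothetical helper)."""
    if not paths:
        raise ValueError("at least one image path is required")
    if len(paths) == 1:
        return {"image": paths[0]}  # single image: plain string
    return {"images": paths}        # multiple images: array
```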
Configuration Tips
The image tool respects your agents.defaults.imageModel setting. If you haven't configured one, it uses auto-discovery to find a vision-capable model. For best results:
agents:
  defaults:
    imageModel: anthropic/claude-sonnet-4-5 # or your preferred vision model

Related Fixes
This release also improved image handling elsewhere:
- Collapsed resize diagnostics: Image processing logs now show one line per image with visible pixel/byte size details, making debugging much cleaner
- Media understanding fix: The imageModel setting is now properly honored during auto-discovery (PR #7607)
The Bigger Picture
This change reflects OpenClaw's ongoing push toward provider-agnostic tooling. By avoiding schema features that not all providers support equally, tools become more reliable across different backends. It's a pattern you'll see more of as the ecosystem matures.
Reference: openclaw/openclaw releases
Have you been using multi-image analysis in your agents? Share your use cases in the comments!