xAI has officially announced that its Grok model lineup is now accessible through Cloudflare's AI Gateway, giving developers a unified platform to deploy, manage, and bill for Grok-powered applications. The integration covers everything from text reasoning to image and video generation — here's a breakdown of what's actually available and what it costs.

Cloudflare's AI Gateway acts as a single control plane — unified logging, caching, rate limiting, and billing — across multiple AI providers. Adding Grok to that roster means developers no longer need a separate integration layer to swap between xAI and other models. According to Cloudflare's documentation, the partnership was first formalized in August 2025, with the current wave of Grok models reflecting xAI's expanded API catalog through mid-2026.
Every Grok Model Now on the Gateway
1. Grok 4.3 — The Workhorse
Grok 4.3 is xAI's primary text model, featuring a 1-million-token context window with support for text and image inputs, function calling, structured outputs, and configurable reasoning effort (none, low, medium, or high). It's the most versatile entry point for most developer use cases. Pricing via the xAI API: $1.25 per million input tokens, $0.20 per million cached input tokens, and $2.50 per million output tokens.
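To make those rates concrete, here is a minimal sketch of the per-request arithmetic. The function name and the billing split are assumptions for illustration: it treats cached tokens as a subset of the input tokens, billed at the cached rate instead of the full input rate, which should be checked against xAI's billing documentation.

```python
from decimal import Decimal

# Grok 4.3 rates quoted above, in USD per one million tokens.
INPUT_RATE = Decimal("1.25")
CACHED_INPUT_RATE = Decimal("0.20")
OUTPUT_RATE = Decimal("2.50")
MILLION = Decimal(1_000_000)


def grok_4_3_cost(input_tokens: int, cached_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request (hypothetical helper).

    Assumes cached_tokens is the portion of input_tokens served from the
    prompt cache and that it is billed at the cached rate rather than the
    full input rate. That split is an assumption of this sketch.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    fresh_tokens = input_tokens - cached_tokens
    return (
        fresh_tokens * INPUT_RATE
        + cached_tokens * CACHED_INPUT_RATE
        + output_tokens * OUTPUT_RATE
    ) / MILLION
```

Under these assumptions, a request with 200,000 input tokens (150,000 of them cached) and 4,000 output tokens comes to $0.1025, versus $0.26 with no cache hits.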
2. Grok 4.20 Multi-Agent — Deep Research at Scale
This variant doubles the context window to 2 million tokens and runs multiple agents in parallel, making it purpose-built for complex, multi-step research tasks. It supports function calling, structured outputs, and full reasoning capabilities. Input tokens run at $2.00 per million; output tokens at $6.00 per million — a premium that reflects its parallel compute overhead.
3. Grok 4.20 Reasoning — Extended Thinking Mode
When a problem genuinely requires chain-of-thought work, this model returns a full reasoning trace alongside the final answer. Same pricing tier as the Multi-Agent variant ($2.00 input / $6.00 output per million tokens). Useful for math-heavy, legal, or scientific queries where showing the work matters as much as the conclusion.
4. Grok 4.20 Non-Reasoning — Speed Without the Trace
Built on the same training as the Reasoning variant, this model skips the thinking trace entirely for faster, single-pass responses. Pricing is identical to the Reasoning model, so the choice comes down to latency versus transparency, not cost.
5. Grok Build 0.1 — The Coding Model
Released on May 20, 2026 and available via public API beta since May 28, Grok Build 0.1 is xAI's dedicated software engineering model. It features a 256K-token context window, always-on reasoning, tool calling, structured outputs, and text/image input. According to xAI, it's served at over 100 tokens per second. Its token pricing is the lowest among the text models listed here: $1.00 per million input tokens, $0.20 per million cached input tokens, and $2.00 per million output tokens.
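One rough way to compare the five text models is to price the same workload against each of the rates quoted in this article. The sketch below is illustrative only: the helper name is hypothetical, it ignores cached-input discounts, and it assumes every model consumes and emits the same number of tokens, which may not hold when a model produces a reasoning trace.

```python
# USD per million tokens as (input, output), taken from the figures in this article.
PRICES = {
    "Grok 4.3": (1.25, 2.50),
    "Grok 4.20 Multi-Agent": (2.00, 6.00),
    "Grok 4.20 Reasoning": (2.00, 6.00),
    "Grok 4.20 Non-Reasoning": (2.00, 6.00),
    "Grok Build 0.1": (1.00, 2.00),
}


def workload_costs(input_tokens: int, output_tokens: int) -> list[tuple[str, float]]:
    """Return (model, USD cost) pairs sorted from cheapest to most expensive."""
    costs = [
        (name, (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000)
        for name, (in_rate, out_rate) in PRICES.items()
    ]
    return sorted(costs, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Example workload: 10M input tokens and 2M output tokens per month.
    for model, cost in workload_costs(10_000_000, 2_000_000):
        print(f"{model}: ${cost:.2f}")
```

For that example workload, Grok Build 0.1 works out to $14.00, Grok 4.3 to $17.50, and each Grok 4.20 variant to $32.00.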
6. Grok Imagine — Image and Video Generation
Three generation models round out the catalog. Grok Imagine Image generates and edits images from text and reference-image inputs with configurable aspect ratio and resolution. Grok Imagine Image Quality is a higher-fidelity variant optimized for sharper detail, accurate composition, and stronger text rendering. Grok Imagine Video generates, edits, and extends video from text and image inputs with native synchronized audio. Specific per-generation pricing for these models is available through xAI's API documentation.
What Cloudflare AI Gateway Actually Adds
Cloudflare's platform itself doesn't charge a per-request fee for AI Gateway usage — costs are based on Cloudflare Workers compute, log volume, and your plan tier. What you get on top of raw API access: a single dashboard for observability across all your AI providers, built-in caching to reduce redundant token spend, rate limiting to prevent runaway costs, and unified billing so a team doesn't need separate invoices for every model provider they use.
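The effect of caching on token spend can be estimated with simple arithmetic. This is a back-of-the-envelope sketch, not a description of how Cloudflare meters anything: the function is hypothetical, and it assumes a cache hit incurs no upstream provider charge and that every request costs the same.

```python
def effective_spend(cost_per_request: float, requests: int, cache_hit_rate: float) -> float:
    """Estimate upstream provider spend when some requests are served from cache.

    Assumes a cache hit costs nothing upstream and that all requests have
    the same cost; both are simplifications made for this sketch.
    """
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be between 0 and 1")
    return cost_per_request * requests * (1.0 - cache_hit_rate)
```

With 100,000 requests at $0.01 each, a 25% cache hit rate would cut upstream spend from roughly $1,000 to roughly $750 under these assumptions. Real savings depend on how often identical requests recur.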
For teams already running workloads on Cloudflare's edge network, the Grok integration removes a meaningful maintenance burden. Rather than maintaining a custom proxy or middleware to route between model providers, teams can let the gateway handle routing natively. That is the real story here, not just another API endpoint going live.

Sarah focuses on Tesla Energy, SpaceX missions, and the broader Musk AI portfolio. Former data analyst in clean energy. Based in San Francisco.