Grok Models on Cloudflare AI Gateway: What Developers Get

📌 UPDATE — June 4, 2026

Elon Musk has officially confirmed the xAI–Cloudflare partnership with a direct post on X, signaling that Grok is now actively running on Cloudflare's global infrastructure — not just listed as a model provider on AI Gateway. This official acknowledgment suggests the integration goes deeper than a simple API listing, potentially encompassing Cloudflare's network for routing, DDoS protection, and edge performance at scale. The announcement has already drawn nearly 500K views, underscoring significant developer and public interest in the infrastructure move.

Elon Musk @elonmusk · Jun 4, 2026

Grok on Cloudflare

❤️ 3,534 | 🔁 409 | 👁 493,491

xAI has officially announced that its Grok model lineup is now accessible through Cloudflare's AI Gateway, giving developers a unified platform to deploy, manage, and bill for Grok-powered applications. The integration covers everything from text reasoning to image and video generation — here's a breakdown of what's actually available and what it costs.

xAI announces Grok models on Cloudflare AI Gateway — Source: @xai — June 3, 2026

▶ Watch Video on X

Cloudflare's AI Gateway acts as a single control plane — unified logging, caching, rate limiting, and billing — across multiple AI providers. Adding Grok to that roster means developers no longer need a separate integration layer to swap between xAI and other models. According to Cloudflare's documentation, the partnership was first formalized in August 2025, with the current wave of Grok models reflecting xAI's expanded API catalog through mid-2026.

Every Grok Model Now on the Gateway

1. Grok 4.3 — The Workhorse

Grok 4.3 is xAI's primary text model, featuring a 1-million-token context window with support for text and image inputs, function calling, structured outputs, and configurable reasoning effort (none, low, medium, or high). It's the most versatile entry point for most developer use cases. Pricing via the xAI API: $1.25 per million input tokens, $0.20 per million cached input tokens, and $2.50 per million output tokens.

2. Grok 4.20 Multi-Agent — Deep Research at Scale

This variant doubles the context window to 2 million tokens and runs multiple agents in parallel, making it purpose-built for complex, multi-step research tasks. It supports function calling, structured outputs, and full reasoning capabilities. Input tokens run at $2.00 per million; output tokens at $6.00 per million — a premium that reflects its parallel compute overhead.

3. Grok 4.20 Reasoning — Extended Thinking Mode

When a problem genuinely requires chain-of-thought work, this model returns a full reasoning trace alongside the final answer. Same pricing tier as the Multi-Agent variant ($2.00 input / $6.00 output per million tokens). Useful for math-heavy, legal, or scientific queries where showing the work matters as much as the conclusion.

4. Grok 4.20 Non-Reasoning — Speed Without the Trace

Built on the same training as the Reasoning variant but skips the thinking trace entirely for faster, single-pass responses. Identical pricing to the reasoning model, so the choice here is purely about latency vs. transparency — not cost.

5. Grok Build 0.1 — The Coding Model

Released on May 20, 2026 and available via public API beta since May 28, Grok Build 0.1 is xAI's dedicated software engineering model. It features a 256K-token context window, always-on reasoning, tool calling, structured outputs, and text/image input. According to xAI, it's served at over 100 tokens per second. Pricing is the most developer-friendly in the lineup: $1.00 per million input tokens, $0.20 per million cached, and $2.00 per million output tokens.

6. Grok Imagine — Image and Video Generation

Three generation models round out the catalog. Grok Imagine Image generates and edits images from text and reference-image inputs with configurable aspect ratio and resolution. Grok Imagine Image Quality is a higher-fidelity variant optimized for sharper detail, accurate composition, and stronger text rendering. Grok Imagine Video generates, edits, and extends video from text and image inputs with native synchronized audio. Specific per-generation pricing for these models is available through xAI's API documentation.

What Cloudflare AI Gateway Actually Adds

Cloudflare's platform itself doesn't charge a per-request fee for AI Gateway usage — costs are based on Cloudflare Workers compute, log volume, and your plan tier. What you get on top of raw API access: a single dashboard for observability across all your AI providers, built-in caching to reduce redundant token spend, rate limiting to prevent runaway costs, and unified billing so a team doesn't need separate invoices for every model provider they use.

For teams already running workloads on Cloudflare's edge network, the Grok integration removes a meaningful integration tax. Rather than maintaining a custom proxy or middleware to route between model providers, the gateway handles it natively — and that's the real story here, not just another API endpoint going live.

Sources & reporting notes

The links below identify the material source records used for this report.

@xai on X (2026-06-03T22:03:48.000Z) — Direct source
@elonmusk on X (2026-06-04T01:30:47.000Z) — Direct source

Source links are preserved as published or accessed. See our editorial standards and corrections policy.

BASENOR Editorial Desk

BASENOR Newsroom

The BASENOR Editorial Desk covers Tesla, SpaceX, and related technology, curating reporting from primary sources — official accounts, regulatory filings, and software release data. Every article passes source-record and fact-checking review before publication. About the newsroom.

This report was curated by the BASENOR Editorial Desk from the sources listed above. Read our editorial standards or email editorial@basenor.com to report an error.

Tags: Ai & robotics

Stay in the Loop

Join 27,000+ Tesla owners who get our tips first — plus 10% OFF

Shop Tesla Accessories — Free USA Shipping