Grok Imagine Video 1.5: 7 Things Developers Need to Know

📌 UPDATE — June 4, 2026

Elon Musk confirmed that Grok Imagine 1.5 has officially launched, sharing an AI-generated trailer for the Iliad (Troy) as a showcase of its capabilities. Additionally, Grok Imagine is now available on Vercel, making it significantly easier for developers to integrate xAI's image generation directly into their deployment workflows. Both announcements came within minutes of each other on June 4, 2026.

Elon Musk tweet: Iliad trailer made by Grok Imagine 1.5

xAI just dropped Grok Imagine Video 1.5 Preview into its API, and the numbers behind this release are hard to ignore. The new model debuts at the top of the Artificial Analysis Video Arena Image-to-Video leaderboard and brings a stack of technical upgrades that put it ahead of some well-funded rivals. Here's what developers and early adopters need to know.

Grok announcing Imagine Video 1.5 Preview available via API — Source: @grok — June 3, 2026

1. It Debuted at #1 on the Video Arena Leaderboard

Grok Imagine Video 1.5 entered the Artificial Analysis Video Arena Image-to-Video leaderboard in first place with an Elo rating of 1404 ±6. That's a 52-point Elo gain over its predecessor, Grok Imagine Video 1.0, and it puts the model above ByteDance's Seedance 2.0. For a preview-stage model, that's a meaningful opening statement.
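To put a 52-point gap in perspective, here is a rough sketch using the standard Elo expected-score formula. It assumes the arena uses the conventional 400-point logistic scale, which this article has not confirmed, and the function name is ours, not Artificial Analysis's.

```python
def elo_expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected score of A against B under the standard Elo logistic curve.

    The 400-point scale is the conventional default and an assumption here.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


NEW_RATING = 1404.0               # leaderboard rating quoted above
OLD_RATING = NEW_RATING - 52.0    # implied by the 52-point gain

p_new_preferred = elo_expected_score(NEW_RATING, OLD_RATING)
print(f"Expected preference rate for 1.5 over 1.0: {p_new_preferred:.1%}")
```

Under that assumption, a 52-point lead corresponds to the newer model being preferred in a bit more than half of head-to-head comparisons (roughly 57%), so the gap is real but not a blowout.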

2. Native Synchronized Audio — Generated in One Pass

This is the headline capability. Version 1.5 generates synchronized audio — dialogue, lip-sync, sound effects, and ambient music — jointly with video tokens in a single inference pass. Previous approaches typically bolted audio on after the fact. Doing it in one pass produces more natural-sounding dialogue and environmental sound, which matters a lot for anything approaching cinematic output.

3. Clips Now Run Up to 15 Seconds

The prior limit was 10 seconds. Version 1.5 extends that to 15 seconds, a 50% increase in maximum clip length. Users can request any length from 1 to 15 seconds, which leaves more room for storytelling before clips have to be chained.

4. Generation Speed Is 2–3x Faster Than Seedance 2.0

A 5-second 720p clip generates in approximately 20–30 seconds. According to available benchmarks, that's two to three times faster than ByteDance's Seedance 2.0 at comparable quality. For developers building production pipelines, inference speed is often the practical bottleneck — this gap is significant.
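For pipeline planning, the timing figures above can be turned into throughput numbers. This is back-of-the-envelope arithmetic, assuming the 24fps output rate from the access details applies to the quoted clip; the helper name is hypothetical, and the rival range is only what the "two to three times" claim implies, not a measurement.

```python
def generation_stats(clip_seconds: float, gen_seconds: float, fps: int = 24) -> dict:
    """Convert a clip length and its wall-clock generation time into throughput figures."""
    frames = clip_seconds * fps
    return {
        "frames": frames,
        "frames_per_wall_second": frames / gen_seconds,
        "wall_seconds_per_clip_second": gen_seconds / clip_seconds,
    }


best_case = generation_stats(clip_seconds=5, gen_seconds=20)
worst_case = generation_stats(clip_seconds=5, gen_seconds=30)
print(best_case)
print(worst_case)

# Implied (not measured) generation time for a model that is 2x to 3x slower,
# taking the fastest and slowest ends of the quoted range.
implied_rival_seconds = (2 * 20, 3 * 30)
print(implied_rival_seconds)
```

In other words, each second of finished video costs about 4 to 6 seconds of wall-clock time, which is the figure to budget against when sizing queues or timeouts.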

5. Measurable Physical Realism Improvements

xAI lists specific gains in cloth dynamics, water simulation, hair motion, and object interaction. High-motion scenes show reduced subject deformation, micro-expressions are sharper, and translucent or glass material rendering has improved. These aren't vague marketing claims — they're the kind of physics-layer improvements that show up clearly in side-by-side comparisons.

6. Video Chaining and Multi-Workflow Support

Imagine Video 1.5 is optimized for clip extension, letting users chain segments into longer multi-shot narratives with improved continuity between clips. Beyond image-to-video, the API also supports text-to-video, video editing, multi-image editing, and reference-to-video workflows — making it a broader creative toolkit rather than a single-function model.
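If you plan to chain clips into a longer sequence, the first step is deciding how to split the runtime. Below is a minimal sketch that divides a target length into near-equal segments within the 15-second cap. It assumes durations are whole seconds, which the article does not specify, and plan_segments is an illustrative helper, not part of any xAI SDK.

```python
import math

MAX_CLIP_SECONDS = 15  # per-clip ceiling stated in this article
MIN_CLIP_SECONDS = 1   # per-clip floor stated in this article


def plan_segments(total_seconds: int, max_len: int = MAX_CLIP_SECONDS) -> list[int]:
    """Split a target runtime into near-equal whole-second segments of at most max_len."""
    if total_seconds < MIN_CLIP_SECONDS:
        raise ValueError("total_seconds must be at least 1")
    count = math.ceil(total_seconds / max_len)
    base, extra = divmod(total_seconds, count)
    # The first `extra` segments get one additional second, so lengths differ by at most 1.
    return [base + 1 if i < extra else base for i in range(count)]


print(plan_segments(40))  # three segments that sum to 40
print(plan_segments(15))  # fits in a single clip
```

Even splits are a design choice here: they avoid ending on a very short final clip, which tends to be awkward for continuity. A real storyboard would more likely cut on shot boundaries instead.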

7. Built on Aurora + Colossus 2 Infrastructure

The model runs on xAI's Aurora engine, an autoregressive mixture-of-experts architecture that predicts tokens across interleaved text, image, video, and audio modalities. It was trained on Colossus 2 — xAI's supercomputer facility running approximately 555,000 NVIDIA GPUs. The video pipeline also integrates technology from Hotshot, a video generation startup xAI acquired in March 2025.

Access and Rollout

Grok Imagine Video 1.5 Preview is available now via api.x.ai under the model alias grok-imagine-video-1.5-2026-05-30. A broader consumer rollout to X Premium tiers is still in progress. Supported input formats include JPG/JPEG, PNG, WEBP, GIF, and AVIF. Output is H.264 MP4 at 24fps across seven aspect ratios, at 480p or 720p resolution.
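Those limits are easy to check client-side before spending an API call. The sketch below validates a job against the constraints listed in this article only. VideoJob and validate_job are hypothetical names, not xAI SDK types, and no real endpoint or request schema is shown because the article does not document one.

```python
from dataclasses import dataclass
from pathlib import PurePath

# Constraints as reported in this article; check xAI's documentation before relying on them.
ALLOWED_INPUT_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp", ".gif", ".avif"}
ALLOWED_RESOLUTIONS = {"480p", "720p"}
MODEL_ALIAS = "grok-imagine-video-1.5-2026-05-30"


@dataclass(frozen=True)
class VideoJob:
    """Hypothetical container for an image-to-video request."""
    image_path: str
    duration_seconds: int
    resolution: str = "720p"
    model: str = MODEL_ALIAS


def validate_job(job: VideoJob) -> list[str]:
    """Return a list of human-readable problems; an empty list means the job looks valid."""
    problems = []
    ext = PurePath(job.image_path).suffix.lower()
    if ext not in ALLOWED_INPUT_EXTENSIONS:
        problems.append(f"unsupported input format: {ext or '(none)'}")
    if not 1 <= job.duration_seconds <= 15:
        problems.append(f"duration must be 1 to 15 seconds, got {job.duration_seconds}")
    if job.resolution not in ALLOWED_RESOLUTIONS:
        problems.append(f"resolution must be 480p or 720p, got {job.resolution}")
    return problems


print(validate_job(VideoJob("hero.PNG", 10)))
print(validate_job(VideoJob("hero.bmp", 20, "1080p")))
```

Returning a list of problems rather than raising on the first one lets a UI or batch script report everything wrong with a job at once.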

The API-first release follows xAI's pattern of giving developers early access before consumer features land on X proper. If the leaderboard position holds once the model sees broader testing, this could shift how the image-to-video space is benchmarked heading into the second half of 2026.

Sources & reporting notes

The links below identify the material source records used for this report.

@grok on X (2026-06-03T17:29:08.000Z) — Direct source
@elonmusk on X (2026-06-04T00:54:09.000Z) — Direct source
@elonmusk on X (2026-06-04T00:36:35.000Z) — Direct source


BASENOR Editorial Desk


This report was curated by the BASENOR Editorial Desk from the sources listed above. Read our editorial standards or email editorial@basenor.com to report an error.
