Is Grok 4.5 currently in private beta?

According to a quoted statement attributed to Elon Musk and circulated by AshutoshShrivastava, Grok 4.5 is in private beta at SpaceX and Tesla. The original primary post by Musk is not included in the signal set, so the claim is sourced via community quoting.

What foundation model does Grok 4.5 reportedly use?

TestingCatalog reports that Grok 4.5 is based on a 1.5T V9 foundation model, with Cursor data added in supplemental training. This is the only quantified architectural claim about Grok 4.5 in the current leak trail, and it has not been confirmed by an official xAI model card.

Is Cursor involved in training Grok models?

Multiple community accounts, including Mark Kretschmann and TestingCatalog, describe a Grok or Cursor collaboration in which Cursor data is added during supplemental training. The exact data composition, license terms, and volume are not disclosed in any public document in this signal set.

Has Grok 4.5 been benchmarked against Claude Opus?

No formal benchmarks have been published. The Musk-attributed claim circulated by AshutoshShrivastava states Grok 4.5 performs close to or beyond Opus, and TestingCatalog repeats the framing. No SWE-bench, MMLU, or other standardized scores accompany the claim.

What is Grok 5 expected to look like?

Mark Kretschmann reports that Grok 5 is expected in two variants, at 6T and 10T parameters, putting it in roughly the same weight class as Fable 5, with Cursor data in the training mix. No release date, system card, or eval suite has been confirmed by xAI.

When will Grok 4.5 or the Cursor Composer 3 model be released?

Mark Kretschmann posted on June 28 that the version number has been removed from xAI menus, which he describes as a pattern that historically precedes a release. The post is the only evidence-tagged tweet in this signal set, and no official xAI release date has been published.

Does xAI host the Grok 4.5 private beta externally?

The community reports describe access being limited to SpaceX and Tesla. There is no mention of external partner access, API availability, or public preview in this signal set.

What evidence backs the imminent-release claim?

The only evidence-tagged tweet in the signal set is Mark Kretschmann's June 28 post showing a screenshot of xAI menus with the version number removed. All other architectural and performance claims are unsourced community posts.

TLDR Private beta at SpaceX and Tesla, a 1.5T V9 foundation, Cursor data: what the Grok 4.5 leak trail actually confirms and what it doesn't.

Grok 4.5 and the 1.5T Cursor Model: What the xAI Leak Trail Actually Shows

On the morning of June 28, 2026, a version number quietly disappeared from xAI's product menus. A few hours later, a quoted line attributed to Elon Musk surfaced on X: Grok 4.5 is in private beta at SpaceX and Tesla, "performing close to or beyond Opus." There is no system card. There is no benchmark sheet. There is no API note. There is a screenshot, a quote, and a 1.5T parameter number that no one at xAI has confirmed in writing.

TLDR The Grok 4.5 leak trail consists of one evidence-tagged screenshot of a missing version number, a quoted Musk statement circulated through two community accounts, and a single architectural claim about a 1.5T V9 foundation model with Cursor data added in supplemental training. The "close to or beyond Opus" performance line is community-sourced quoting, not a measured benchmark. The earlier Grok 5 thread — a 6T and 10T parameter pair in the Fable 5 weight class — is a separate, unverified roadmap claim. Builders should treat all of it as directional signal, not spec.

Key Takeaways

A quoted line attributed to Elon Musk says Grok 4.5 is in private beta at SpaceX and Tesla, circulated by AshutoshShrivastava.
TestingCatalog reports Grok 4.5 sits on a 1.5T V9 foundation model with Cursor data added in supplemental training.
The only evidence-tagged signal is Mark Kretschmann's June 28 screenshot showing the Grok / Cursor Composer 3 version number removed from xAI menus.
No benchmark scores, pricing, context window, or API timing have been disclosed in this signal set.
The Cursor training data story has been building since June 16 across at least three accounts, with Mark Kretschmann tying Grok 5 to a 6T and 10T parameter pair in the Fable 5 weight class.

What Was Actually Seen

The traceable signal trail is narrow. Three tweets carry most of the architectural and product weight.

The first is the screenshot. On June 28 at 06:33 UTC, Mark Kretschmann posted that "the release of the Grok 1.5T / Cursor Composer 3 model is imminent, as the version number has been removed from the menus. This always happens shortly before a release from @xai." The post is the only entry in the signal set tagged as evidence and the only one with an attached media asset.

Source: @mark_k

About four and a half hours later, two accounts surfaced a quoted line from Musk within minutes of each other. AshutoshShrivastava posted that "Grok 4.5 is now in private beta at SpaceX and Tesla, performing close to or beyond Opus." TestingCatalog amplified the same framing and added the architectural detail: "Grok 4.5 is based on 1.5T V9 foundation model, with Cursor data added in supplemental training." Both posts are flagged as quote-tweets in the signal, which means each is attaching commentary to a primary post that is not itself in the bundle.

That is the entirety of the verified surface. Everything beyond it — performance claims, release timing, model variant structure, training data composition — sits on top of those three signals.

A parallel thread, posted by Mark Kretschmann a week earlier, frames Grok 5 separately. He describes it as arriving in two variants at 6T and 10T parameters, in the same weight class as Fable 5, with Cursor data in the training mix and a stated emphasis on agentic coding. None of those numbers are sourced to an xAI document.

Why The Cursor Signal Matters

The most consistent thread across June is not a benchmark or a release date. It is the Cursor data story.

Mark Kretschmann first surfaced the "Grok / Cursor 1.5T model" framing on June 16. He returned to it on June 21 with more architectural detail, then again on June 28 with the screenshot. TestingCatalog independently used the same architectural language on June 28, citing 1.5T V9 plus Cursor supplemental data. Four posts across twelve days, two accounts, one consistent claim: Cursor's data (presumably IDE traces, agentic coding traces, or both) is being folded into xAI's training pipeline.

The reason this matters more than the parameter count is incentive alignment. Cursor is the busiest agentic coding surface in production today, and the data it generates is precisely the data that frontier coding models are otherwise expensive to collect. A vendor that trains directly on that distribution has, in principle, a structural advantage on long-horizon coding tasks that no synthetic eval set captures well. That is the thesis the community is rallying around. It is not yet a measurement.

On the limited evidence so far, the Cursor Training Loop — if it exists in the form described — would be a meaningful structural shift, but the data license, the volume, and the exact mixing strategy are all unverified.

What We Can Reasonably Expect

A few things are plausible given the signal trail, with appropriate hedging.

First, the menu removal pattern. Kretschmann claims this "always happens shortly before a release from @xai." It is plausible the Grok 1.5T / Cursor Composer 3 SKU enters wider preview within days, not weeks. The pattern is community-asserted and not corroborated by an xAI changelog, so the strength of the prediction depends on whether prior xAI releases followed the same menu-clearing tell.

Second, the Opus framing. The "close to or beyond Opus" quote is the kind of comparative line that vendors use ahead of public benchmarks. It is directional, not measured. Builders should expect xAI to publish at least a partial eval table when the model goes public — SWE-bench Verified and a coding-agent suite are the most likely candidates given the Cursor training emphasis. The exact baseline Opus version is not specified, which matters: Claude Opus 4.7 and Claude Opus 4.8 are both circulating, and the gap between them is non-trivial.

Third, the variant structure. Kretschmann's earlier Grok 5 post describes two variants at 6T and 10T. The Grok 4.5 line at 1.5T appears to be a tier below that, possibly a faster / cheaper SKU positioned closer to current Sonnet-class models. If the 1.5T V9 / Cursor Composer 3 framing holds, expect a coding-focused mid-tier SKU first, with the larger Grok 5 variants following on a longer timeline.

Fourth, distribution. The private beta is described as running inside SpaceX and Tesla. That is a narrow blast radius and consistent with xAI's prior pattern of internal dogfooding ahead of external preview. Whether external partners (Cursor itself being the obvious candidate) get access before a public API remains an open question.

Grok 4.5 vs Claude Opus: What the Signal Says

The Opus comparison is the most-cited claim in the signal trail, so it deserves direct scrutiny. The bundle does not contain measured comparison data, only community-relayed quoting. The honest read is dimension-by-dimension.

Stated performance band. Grok 4.5 is described as "close to or beyond Opus" via a quote attributed to Musk and circulated by AshutoshShrivastava and TestingCatalog. Claude Opus has shipped public model cards and benchmark suites for each major version. The asymmetry of evidence is the headline.
Parameter scale. Grok 4.5 reportedly uses a 1.5T V9 foundation model per TestingCatalog. Anthropic does not publish Opus parameter counts, so a direct size comparison cannot be made: this signal set carries an unconfirmed number for one lab and no number at all for the other.
Training data signal. Grok 4.5 reportedly includes Cursor data in supplemental training. Anthropic's Opus models do not have a publicly disclosed analogous IDE-trace ingest, though both labs presumably ingest large code corpora. The Cursor pipeline is the differentiator the community is anchoring on.
Distribution. Grok 4.5 is private-beta inside SpaceX and Tesla per the community quoting. Opus models ship to API, claude.ai, and partner platforms simultaneously. Grok 4.5 is at an earlier distribution stage than any current Opus SKU.
Coding specialization. The Grok 4.5 thesis is explicitly agentic coding, driven by the Cursor data story. Opus competes broadly across reasoning, coding, and long-context tasks without a single dominant specialization framing.

The Cursor Training Loop framing gives Grok 4.5 a clean story to tell. It does not, on its own, demonstrate that the model clears Opus on any specific eval. Anyone planning a migration on the strength of the "close to or beyond Opus" line is migrating on a quoted sentence, not a measurement.

What We Know vs. What We Can't Yet Verify

Confirmed by the signal bundle:

A quoted line attributed to Musk says Grok 4.5 is in private beta at SpaceX and Tesla, surfaced by AshutoshShrivastava.
TestingCatalog reports Grok 4.5 is based on a 1.5T V9 foundation model with Cursor data in supplemental training.
Multiple community accounts, including Mark Kretschmann, describe a Cursor training collaboration with Grok models in the broader pipeline.
A Musk-attributed claim states Grok 4.5 performs close to or beyond Opus, with no formal benchmarks accompanying the quote.
Mark Kretschmann reports that Grok 5 is expected in two variants at 6T and 10T parameters, in the Fable 5 weight class.
Per Mark Kretschmann's June 28 post, the version number for the Grok 1.5T / Cursor Composer 3 model was removed from xAI menus on the morning of June 28, which he describes as a pattern that precedes xAI releases.
The only evidence-tagged tweet in this set is Kretschmann's screenshot showing the menu state.
Grok's own product page describes xAI as offering Chat, Multi-agent, Search, and Imagine surfaces, the existing product surface area the Grok 4.5 model would slot into.

Open questions the bundle does not resolve:

No benchmark numbers have been published for Grok 4.5 — no SWE-bench, no MMLU, no coding-agent eval, nothing standardized.
The specific Opus version used as the comparison baseline is not stated.
The Cursor data license terms, volume, and exact mixing recipe are not disclosed.
No public timing has been given for either Grok 4.5 GA or an external API surface.
Pricing, context window, and per-token cost are unstated.
The relationship between Grok 4.5, Grok 5, and the reported 1.5T / 6T / 10T variant structure has not been laid out in any official xAI document.
It is unclear whether Cursor itself or external partners get access ahead of public release.
The Grok 1.5T / Cursor Composer 3 model and the Grok 4.5 referenced in the Musk quote may or may not be the same SKU under different labels — the bundle is ambiguous.

How To Evaluate It Yourself When Access Opens

When the model surfaces publicly, the signal-to-noise problem will get harder, not easier. The first 72 hours of any xAI release tend to be flooded with vibe-check posts and selective demos. A few practical evaluation moves keep you honest.

Run a coding-agent eval before trusting any agentic-coding claim. SWE-bench Verified is the obvious one, but pair it with a private internal task set the model has not seen — ideally something with multi-file edits and a real test harness. A model trained on Cursor traces will look exceptional on tasks that resemble Cursor traces and merely good on tasks that don't.
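One way to make that split visible is to tag each task by whether it resembles editor-style traces and report resolve rates per tag. The sketch below assumes a hypothetical `TaskResult` record produced by your own harness; the field names and the `editor_like` / `held_out` labels are illustrative and do not come from any published eval.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskResult:
    """One harness run. All field names are hypothetical, for illustration."""
    task_id: str
    category: str   # assumed labels, e.g. "editor_like" or "held_out"
    resolved: bool  # True if the task's own test suite passed


def resolve_rates(results):
    """Return {category: fraction of tasks resolved} for a list of TaskResult."""
    totals = defaultdict(int)
    passed = defaultdict(int)
    for r in results:
        totals[r.category] += 1
        if r.resolved:
            passed[r.category] += 1
    return {cat: passed[cat] / totals[cat] for cat in totals}


def category_gap(results, favoured="editor_like", other="held_out"):
    """Resolve-rate difference between two categories (favoured minus other)."""
    rates = resolve_rates(results)
    return rates[favoured] - rates[other]
```

A large positive gap on your own tasks would suggest the model is leaning on familiarity with the task distribution rather than general capability, though a small task set can produce the same pattern by chance.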

Probe long-context behavior at the full advertised length rather than taking the nominal capacity on trust. Many models that advertise large contexts degrade sharply past a certain depth. The bundle does not state a context window for Grok 4.5, so this is a known unknown to test first.
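A depth probe of this kind can be sketched as a simple retrieval check: hide one fact at different positions inside filler text of increasing size and record where the model stops finding it. In the sketch below, `ask_model` is a hypothetical callable that wraps whatever client eventually becomes available; no real Grok API is assumed. Sizes are counted in filler sentences rather than tokens to keep the example free of tokenizer dependencies.

```python
def build_probe(filler_sentence, needle, total_sentences, depth_fraction):
    """Place `needle` at a relative depth inside repeated filler text."""
    if not 0.0 <= depth_fraction <= 1.0:
        raise ValueError("depth_fraction must be between 0 and 1")
    position = int(total_sentences * depth_fraction)
    sentences = [filler_sentence] * total_sentences
    sentences.insert(position, needle)
    return " ".join(sentences)


def probe_depths(ask_model, question, expected, needle, sizes, depths,
                 filler_sentence="The sky was grey that day."):
    """Return {(size, depth): bool} recording whether the fact was recovered.

    `ask_model` is a hypothetical callable (prompt: str) -> str.
    """
    outcomes = {}
    for size in sizes:
        for depth in depths:
            context = build_probe(filler_sentence, needle, size, depth)
            reply = ask_model(f"{context}\n\nQuestion: {question}")
            outcomes[(size, depth)] = expected.lower() in reply.lower()
    return outcomes
```

Repeated filler is a deliberately crude haystack. A more realistic probe would use varied text and several distinct facts, but even this version shows whether recall falls off with size or with position.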

Compare cost per resolved task, not cost per token. The Cursor Training Loop story implies a possible efficiency edge on coding workflows. Cost per token is a misleading metric when one model resolves a multi-step task in two calls and another takes seven.
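The two-calls-versus-seven arithmetic can be sketched as follows. Every number here is a made-up placeholder: no Grok 4.5 pricing has been published, and the 20,000-tokens-per-call figure is an assumption chosen only to make the comparison easy to follow.

```python
def cost_per_resolved_task(runs, price_per_million_tokens):
    """Total spend divided by the number of tasks actually resolved.

    `runs` is a list of (tokens_used, resolved) pairs, one per task, where
    tokens_used sums every call made for that task. The price argument is
    a placeholder value supplied by the caller.
    """
    total_tokens = sum(tokens for tokens, _ in runs)
    resolved = sum(1 for _, ok in runs if ok)
    if resolved == 0:
        return float("inf")  # nothing resolved, so cost per resolved task is unbounded
    return total_tokens / 1_000_000 * price_per_million_tokens / resolved


# Illustrative inputs only: four tasks each, all resolved.
seven_call_model = [(7 * 20_000, True)] * 4  # cheaper per token, more calls
two_call_model = [(2 * 20_000, True)] * 4    # pricier per token, fewer calls

cheap_tokens = cost_per_resolved_task(seven_call_model, price_per_million_tokens=5.0)
fewer_calls = cost_per_resolved_task(two_call_model, price_per_million_tokens=10.0)
```

With these illustrative numbers, the model that costs twice as much per token comes out cheaper per resolved task (roughly $0.40 against $0.70), because it needs fewer calls to finish each one.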

Watch for the official model card. If xAI publishes one with the launch, the gap between the card's claims and community vibe checks is itself a signal about positioning.

Why This Matters For Builders

The Grok 4.5 leak trail is interesting less because of the model and more because of the data pipeline it implies. Several frontier labs are converging on the same problem: producing models that are strong at agentic, multi-step coding tasks where the bottleneck is no longer raw reasoning but environmental fluency — knowing how an IDE behaves, how a codebase is structured, how a developer iterates.

Cursor's training data, if folded into xAI's pipeline at meaningful volume, is one of the cleanest known approaches to that problem. It is also the kind of partnership that, if it generalizes, changes the procurement question for builders. Choosing a coding model becomes partly a question of which dev-tool data exhaust it was trained on. That is a different framing than "which model is smartest."

For teams currently committed to Claude or GPT for agentic coding, the practical action this week is not to switch. It is to design an evaluation harness that can quickly answer the migration question when Grok 4.5 actually opens. Building that harness against a known baseline now is cheaper than building it under release-week pressure.
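A minimal version of such a harness might look like the sketch below: run a fixed task set against the model you use today, save the pass/fail record, and diff any later candidate against it. The `model` argument is a hypothetical callable wrapping your client of choice, and the task tuple layout is an assumption made for illustration rather than a standard format.

```python
import json
from pathlib import Path
from typing import Callable, Dict, List, Tuple

# (task_id, prompt, checker): the checker decides whether an output is acceptable.
Task = Tuple[str, str, Callable[[str], bool]]


def run_suite(model: Callable[[str], str], tasks: List[Task]) -> Dict[str, bool]:
    """Run every task through `model` and record pass/fail per task id."""
    return {task_id: bool(check(model(prompt))) for task_id, prompt, check in tasks}


def save_baseline(results: Dict[str, bool], path: Path) -> None:
    """Persist baseline results so a later run can be compared against them."""
    path.write_text(json.dumps(results, indent=2, sort_keys=True))


def compare_to_baseline(candidate: Dict[str, bool], path: Path) -> Dict[str, List[str]]:
    """List tasks the candidate newly passes and tasks it regresses on."""
    baseline = json.loads(path.read_text())
    shared = sorted(set(baseline) & set(candidate))
    return {
        "gained": [t for t in shared if candidate[t] and not baseline[t]],
        "regressed": [t for t in shared if baseline[t] and not candidate[t]],
    }
```

Keeping the comparison at the task level, rather than as a single aggregate score, makes it easier to see whether a candidate trades away capabilities you rely on for gains you do not need.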

What To Watch Next

Three items are worth pinning over the next two weeks.

Watch for the official model card. xAI tends to publish at least a partial spec sheet at public release, and the gap between the Musk-attributed Opus framing and any actual benchmark table will be the first real data point.

Run your own coding eval before relying on the Cursor-trained advantage. The structural argument is plausible. The measurement is not yet in the public record.

Pin the variant relationship. Whether Grok 4.5 at 1.5T, the Grok 1.5T / Cursor Composer 3 SKU, and the eventual Grok 5 6T / 10T variants are the same family or distinct lines materially changes the roadmap reading. The first xAI release note should clarify it.

