GPT-5.6 Sol, Terra and Luna: OpenAI Locks Down Its Best AI

OpenAI’s newest models are so capable at cybersecurity that the company is not letting most people use them yet. On June 26, OpenAI began a limited preview of GPT-5.6, a three-model family named Sol, Terra, and Luna. Only about 20 government-vetted organizations can touch them, and that restriction is the whole story.

The models are strong. What makes this release unusual is the machinery OpenAI built around them, and what that machinery says about where frontier AI is heading.

Three models, one new naming system

GPT-5.6 splits into tiers that OpenAI says will now persist across generations. Sol is the flagship, built for the hardest problems in coding, science, and security research. Terra is the balanced middle option, priced at half the cost of GPT-5.5 while matching much of its performance. Luna is the fast, cheap workhorse for summarizing, drafting, and routine automation.

Why a national-security review is involved

OpenAI is holding back a broad release at the request of the U.S. government while a national-security review runs its course. The concern is cybersecurity. GPT-5.6 Sol is the company’s most capable model yet for finding and exploiting software vulnerabilities, and OpenAI is candid that benchmark thresholds cannot capture every way a model might be combined with other tools.

In testing against Chromium and Firefox, Sol identified bugs and the building blocks of an exploit but did not autonomously produce a working full-chain attack under the conditions tested. OpenAI says the model stays below the “Cyber Critical” threshold in its Preparedness Framework. Even so, the company paired the release with what it calls its most robust safety stack to date and with a phased rollout to a small group of trusted partners whose participation was disclosed to the government.

The safeguards are the product

OpenAI describes a layered defense: protections trained into the model, real-time classifiers that can pause generation mid-response for a larger reasoning model to review, account-level checks that look across multiple conversations, and differentiated access that keeps the most sensitive capabilities away from the general public.
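To make the layering concrete, here is a minimal toy sketch of how three of those layers could fit together: an account-level check, a real-time classifier that pauses a streamed response, and a reviewer whose decision depends on access tier. Every name, rule, and threshold in it is hypothetical and chosen only for illustration. It is not OpenAI's implementation, and it does not model the protections trained into the model itself.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Account:
    """Hypothetical account record; the fields are illustrative assumptions."""
    account_id: str
    trusted: bool = False           # differentiated access: vetted partner or not
    flagged_conversations: int = 0  # signal accumulated across conversations


def classifier_flags(chunk: str) -> bool:
    """Toy real-time classifier: flags a chunk containing a watched phrase."""
    return "exploit chain" in chunk.lower()


def reviewer_allows(partial_response: str, account: Account) -> bool:
    """Toy stand-in for the larger reviewing model.

    Assumption: only trusted accounts may continue once a chunk is flagged.
    """
    return account.trusted


def stream_with_safeguards(chunks: Iterable[str], account: Account,
                           max_flags: int = 3) -> str:
    # Layer 1: account-level check that looks across earlier conversations.
    if account.flagged_conversations >= max_flags:
        return "[blocked: account under review]"

    emitted = []
    already_counted = False
    for chunk in chunks:
        # Layer 2: the classifier inspects each chunk as it is generated.
        if classifier_flags(chunk):
            if not already_counted:
                account.flagged_conversations += 1
                already_counted = True
            # Layer 3: generation pauses while the reviewer decides.
            if not reviewer_allows("".join(emitted) + chunk, account):
                emitted.append("[generation stopped after review]")
                break
        emitted.append(chunk)
    return "".join(emitted)


if __name__ == "__main__":
    demo = ["Here is ", "an exploit chain ", "for the target."]
    print(stream_with_safeguards(demo, Account("public-user")))
    print(stream_with_safeguards(demo, Account("vetted-partner", trusted=True)))
```

In this sketch the same request is cut off for an ordinary account and completed for a trusted one, which mirrors the dual-use friction the preview is described as testing.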

The company also threw serious compute at breaking its own model. It dedicated more than 700,000 A100-equivalent GPU hours to automated red-teaming, hunting for universal jailbreaks that work across many prompts rather than one narrow case. That figure is a signal in itself. Safety testing at frontier labs now consumes the kind of compute that used to be reserved for training.
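For a sense of scale, the short calculation below converts the reported figure into GPU-years and into wall-clock time. It treats 700,000 as a lower bound, since the source says "more than", and the 1,000-GPU cluster size is a purely hypothetical assumption, not a number from OpenAI.

```python
GPU_HOURS = 700_000          # lower bound reported for automated red-teaming
HOURS_PER_YEAR = 24 * 365    # 8,760 hours in a non-leap year


def gpu_years(gpu_hours: float) -> float:
    """Equivalent years of continuous work for a single GPU."""
    return gpu_hours / HOURS_PER_YEAR


def wall_clock_days(gpu_hours: float, num_gpus: int) -> float:
    """Days needed if the work is spread evenly over num_gpus GPUs."""
    return gpu_hours / num_gpus / 24


if __name__ == "__main__":
    print(round(gpu_years(GPU_HOURS), 1))               # 79.9
    print(round(wall_clock_days(GPU_HOURS, 1_000), 1))  # 29.2 (hypothetical cluster)
```

That is roughly 80 years of work for one A100-equivalent GPU, or about a month on the assumed 1,000-GPU cluster.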

What defenders get, and what they wait for

OpenAI’s framing is that these capabilities should reach defenders first. Security teams can use Sol to find weaknesses, write patches, and harden systems, and the company expects substantial benefit for legitimate defensive work while making offensive misuse harder and more detectable. During the preview, some users will hit refusals or delays, especially in dual-use areas where defensive and offensive work look similar at first glance. OpenAI says that friction is part of what the preview is meant to test.

General availability is planned for the coming weeks across ChatGPT, Codex, and the API. OpenAI is also bringing Sol to Cerebras hardware at up to 750 tokens per second in July.

What this preview really marks is a shift in how the most capable models reach the world. A government-coordinated release process, even a temporary one, keeps powerful tools from developers and defenders who want them. OpenAI itself argues this should not become the default. Whether it does may shape the next several model launches more than any benchmark score.