The U.S. government has quietly secured something the AI industry has resisted for years: a seat at the table before models ship. The Commerce Department's Center for AI Standards and Innovation (CAISI) announced Tuesday that Google DeepMind, Microsoft, and Elon Musk's xAI have agreed to provide access to unreleased versions of their AI models for pre-deployment security and capability evaluations, Reuters and Bloomberg first reported. Combined with existing—and recently renegotiated—agreements from Anthropic and OpenAI, every major U.S. frontier AI lab now participates in voluntary pre-release government evaluations.
CAISI has completed more than 40 model assessments to date, including evaluations of unreleased state-of-the-art systems. Notably, developers sometimes hand over versions of their models with safety guardrails reduced specifically so the Center can probe for national security risks. The announcement arrived one day after The New York Times first reported that the Trump Administration was weighing a separate mandatory pre-release review process via Executive Order—with Anthropic's Mythos model cited as the catalyst. The voluntary agreements and any mandatory framework would run in parallel, though their interaction remains undefined.
The timing is deliberate, even if the policy mechanics are still being sorted. After years of self-regulation and voluntary safety commitments that lacked teeth, the U.S. government is establishing a consistent pre-deployment review process for the world's most powerful AI systems. Whether that's ultimately good for security—or primarily good for the companies being reviewed—depends heavily on what happens inside that process.
A center with a complicated history
Before assessing what CAISI's expanded agreements mean, it's worth understanding what CAISI actually is—and the turbulence surrounding it. The Center was originally established in 2023 under the Biden Administration as the AI Safety Institute. The Trump Administration renamed it last year, with Commerce Secretary Howard Lutnick framing the rebrand as a move away from what he called regulation "under the guise of national security."
The Center still lacks permanent legal standing. Its appointed director, Collin Burns—a former Anthropic and OpenAI researcher—was pushed out just four days into the job after White House officials raised concerns about his ties to Anthropic, given the administration's ongoing dispute with the company. (That dispute has its own edge: the Pentagon designated Anthropic a supply chain risk in March after the company refused to lower guardrails on autonomous weapons, though a federal judge later called that designation "Orwellian.") Burns had relocated across the country and given up Anthropic equity to take the position.
CAISI Director Chris Fall, who took over after Burns was ousted, said of the new agreements: "Independent, rigorous measurement science is essential to understanding frontier AI and its national security implications."
This context matters. An oversight body with an unstable leadership history, no permanent legal standing, and a complicated relationship with at least one of the companies it's evaluating is not a picture of institutional strength. That doesn't mean the work is without value. But it does set a bar for scrutiny that the framework's boosters should welcome, not deflect.
The internet parallel nobody wants to repeat
There's a well-worn cautionary tale embedded in the CAISI rationale: the early internet was engineered for resilience and openness, with almost no consideration for how its security gaps would eventually be weaponized. The result was decades of espionage, mass data exfiltration, and entrenched cybercrime ecosystems that security teams are still fighting today. The architects of the internet weren't reckless; they were building for a different threat model than the one that materialized.
AI carries a comparable risk profile. Frontier models are opaque, powerful, and being integrated into critical systems faster than the security community can assess them. CAISI represents a deliberate attempt not to make the same mistake twice—to subject transformative technology to security scrutiny while there is still time to act on what's found.
Ronald Lewis, Head of Cybersecurity Governance at Black Duck, put it directly: the initiative reflects "a hard‑won recognition of the national security risks that arise when transformative technologies are deployed before their implications are fully understood."
The question isn't whether that framing is correct; it almost certainly is. The question is whether the review process being built is substantive enough to match the risk.
Pre-release review is sound security practice, if it's done right
The principle behind CAISI isn't new. Shifting security evaluation left—earlier in the development lifecycle, before vulnerabilities become embedded—is a foundational concept in cybersecurity. What's new is applying it to AI systems at this scale. Reuters reporting confirms that evaluations include red-teaming exercises. Anthropic's earlier work with the Center, for example, revealed that techniques such as claiming human review had occurred or substituting characters could bypass safety mechanisms—vulnerabilities the company subsequently patched.
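To make the character-substitution finding concrete, here is a minimal, hypothetical sketch of the kind of probe a red team might automate: it rewrites a disallowed prompt with simple character substitutions and checks whether the model's refusal still holds. The query_model callable and the refusal heuristic below are placeholders for illustration, not CAISI's or Anthropic's actual tooling.

```python
# Hypothetical red-team probe: does a character-substitution rewrite
# slip past a model's refusal behavior? query_model() stands in for
# whatever inference client an evaluator actually uses.

SUBSTITUTIONS = str.maketrans({"a": "4", "e": "3", "i": "1", "o": "0", "s": "5"})

REFUSAL_MARKERS = ("i can't", "i cannot", "i won't", "not able to help")


def looks_like_refusal(response: str) -> bool:
    """Crude heuristic; real evaluations use trained classifiers or human review."""
    lowered = response.lower()
    return any(marker in lowered for marker in REFUSAL_MARKERS)


def probe(prompt: str, query_model) -> dict:
    """Compare the model's behavior on a prompt and its obfuscated variant."""
    obfuscated = prompt.translate(SUBSTITUTIONS)
    baseline = query_model(prompt)
    variant = query_model(obfuscated)
    return {
        "baseline_refused": looks_like_refusal(baseline),
        "variant_refused": looks_like_refusal(variant),
        # A guardrail gap shows up as: refused the plain prompt,
        # but complied with the character-substituted one.
        "possible_bypass": looks_like_refusal(baseline) and not looks_like_refusal(variant),
    }
```

Production evaluations rely on far more sophisticated classifiers and human adjudication; the point of the sketch is simply that a guardrail gap surfaces as a prompt that is refused in plain form but answered once obfuscated.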
The challenge is that AI models present a different evaluation surface than traditional software. They're probabilistic, context-sensitive, and can behave unpredictably under adversarial conditions not anticipated during development. Meaningful assessment requires clear frameworks, not ad hoc testing.
Diana Kelley, CISO at Noma Security, noted that any effective review needs to be "grounded in clear, consistent frameworks, aligned with established guidance like NIST's AI Risk Management Framework and secure-by-design principles from CISA"—and must "avoid becoming a bottleneck or a checkbox exercise." Kelley added, "The goal should be meaningful risk reduction, not just oversight for its own sake."
That distinction matters. A review process that results in a stamp of approval without surfacing actionable findings doesn't make AI systems safer—it just creates the appearance of oversight. The credibility of CAISI will ultimately be measured by whether its assessments change what gets released and how, not by how many models it has evaluated.
OT integration is the risk few are talking about
The current CAISI framework is largely IT-focused: evaluating software-layer capabilities and risks in systems designed to process information. That's the right starting point, but it's not where the risk stops. As AI agents are increasingly integrated into operational technology (OT)—building automation, industrial control systems, physical security infrastructure—the attack surface expands in ways that software-centric evaluations don't capture.
John Gallagher, VP of Viakoo Labs at Viakoo, framed the emerging concern: if an AI agent is managing a physical network, ensuring it hasn't been poisoned or manipulated to disable security protocols becomes a critical OT security problem, not just a software one. The integrity of the model itself becomes infrastructure-critical. Gallagher also draws a direct parallel to European regulatory frameworks such as NIS2 and the Cyber Resilience Act, which have begun to hold hardware manufacturers to security-by-design standards. AI is now being pulled into that same compliance gravity.
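A small, hypothetical example of what that means in practice: before an AI agent is allowed to touch building automation or other OT systems, the deployment pipeline can verify that the model artifact on disk is byte-for-byte the one that was evaluated. The file names and manifest format below are illustrative assumptions, not any vendor's actual tooling.

```python
# Hypothetical integrity gate for an AI agent headed into an OT environment:
# refuse to load a model artifact unless its hash matches a digest pinned
# at evaluation time. Paths and manifest layout are illustrative only.

import hashlib
import json
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Stream the file so large model artifacts don't need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_model_artifact(model_path: Path, manifest_path: Path) -> bool:
    """Return True only if the on-disk model matches the recorded digest."""
    manifest = json.loads(manifest_path.read_text())
    expected = manifest[model_path.name]["sha256"]
    return sha256_of(model_path) == expected


if __name__ == "__main__":
    model = Path("agent-model.bin")            # illustrative artifact name
    manifest = Path("release-manifest.json")   # illustrative pinned digests
    if not verify_model_artifact(model, manifest):
        raise SystemExit("Model failed integrity check; refusing OT deployment.")
```

Hash pinning is not a defense against every form of model poisoning, but it closes the simplest gap: an artifact silently swapped or modified between evaluation and deployment.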
For organizations managing large-scale physical infrastructure, the implication is concrete: the AI tools they deploy will increasingly need to demonstrate compliance with federal standards that are still being defined. Getting ahead of that requirement—rather than retrofitting compliance after the fact—is the more defensible posture.
What's really driving voluntary participation
The voluntary nature of these agreements deserves scrutiny. Frontier AI companies aren't submitting their unreleased models to government review out of pure public-spiritedness. The strategic calculus is more layered.
By engaging with CAISI, companies like Google, Microsoft, xAI, and Anthropic are doing more than satisfying an oversight requirement: they're helping define what "safe AI" looks like at the national security level. That framing has commercial downstream effects. When AI models are treated as systems requiring stress-testing like critical infrastructure, it elevates the perceived threat landscape across the board—and expands demand for AI-driven security solutions, audits, and assessments.
Ronald Lewis flagged this tension directly, noting that voluntary CAISI participation "serves a dual purpose: it signals responsibility and cooperation with government, while simultaneously stimulating demand in a security marketplace where fear, uncertainty, and complexity have always been powerful commercial drivers."
That isn't an argument against the framework. It's a reminder to evaluate it clearly. Voluntary processes, shaped in part by the entities being regulated, tend to reflect those entities' interests alongside the public interest. Practitioners and policymakers alike should hold CAISI's outputs to a high bar precisely because the incentives aren't purely altruistic.
There's a final risk in the current moment: the attention directed at Anthropic's Mythos, Project Glasswing, and other headline-grabbing frontier developments can create the impression that the most significant AI security threats are those on the horizon. They may not be.
Security researchers have demonstrated that lower-powered, widely accessible language models are already capable of identifying software vulnerabilities with fidelity comparable to far more publicized systems. The adversaries targeting enterprise environments aren't waiting for frontier model access—they're using what's already available. Lewis put the priority plainly: "For business leaders, the priority must be addressing present-day risks and evolving defenses to match the AI capabilities that adversaries can already access—not the hypothetical ones still on the horizon."
The CAISI framework is a meaningful structural development, and if a mandatory review process follows, it will accelerate a shift that was already underway: frontier AI moving from a pure technology bet toward a regulated strategic industry. As Ram Varadarajan, CEO at Acalvio, put it: "Geopolitical alignment and national security clearances are going to become as critical to a frontier lab's valuation as its raw compute." For security practitioners, though, the more immediate obligation is simpler: don't let the frontier capture all the attention; the threats worth defending against today are already deployed.
Follow SecureWorld News for more cybersecurity and AI news.

