Malik Haidar has spent his career at the high-stakes intersection of global intelligence and corporate security, navigating the complex landscapes where multinational interests meet emerging digital threats. His expertise isn’t just in the technical weeds of code; he specializes in the strategic integration of security into the very fabric of business operations. In this discussion, we explore the recent regulatory storm surrounding Claude Fable 5, examining how a single “jailbreak” prompt triggered a two-and-a-half-week international freeze. The conversation moves through the logistical nightmare of nationality-based export controls, the nuance of dual-use technology that can both patch and exploit 27-year-old software flaws, and the urgent need for a standardized framework to measure the true danger of AI vulnerabilities.
When security restrictions like the June 12 export controls are suddenly applied to foreign nationals both abroad and within the United States, what kind of operational paralysis does this create for a major AI laboratory?
The immediate impact of the June 12 order was a total operational shock because the mandate required Anthropic to cut off access for any foreign national, including its own non-citizen staff members. When a company is told it must verify the nationality of every user in real time but lacks the infrastructure to do so reliably, the only safe move is to pull the plug entirely, which is exactly why Fable 5 and Mythos 5 were shut down for everyone. This wasn’t just a minor technical glitch; it was a complete halt of a frontier model that disrupted internal development and external services like Claude Code and Claude Cowork. For two and a half weeks, the company had to navigate a landscape where it couldn’t even allow its own experts to touch the tools they were building if those experts held the wrong passport. It illustrates the visceral tension between the borderless nature of AI development and the rigid, geography-based enforcement of national security protocols.
The catalyst for this shutdown was a specific “jailbreak” discovered by researchers at Amazon. How do you interpret the technical significance of a model being able to flag software flaws and write exploit code, and why did the government view this as a more severe threat than the company did?
The technical trigger was a prompt that forced Fable 5 to bypass its safety guardrails, leading it to identify software vulnerabilities and, in one specific instance, generate code to abuse those flaws. From a security perspective, this is the definition of “dual-use” risk; the same intelligence that allows a model to be a premier defensive debugger also makes it a potent weapon for an attacker. While Anthropic argued that these capabilities are routine and already present in models like Claude Opus 4.8 or even China’s Kimi K2.7, the government and the partners involved saw it as an unacceptable leap in autonomous capability. To settle these nerves, Anthropic had to develop a new safety classifier that now blocks that specific technique in more than 99% of attempts as of their June 30 report. The emotional weight of this decision comes from the fear that we are handing an automated “skeleton key” to malicious actors, even if the industry insists these functions are standard defensive work.
There has been significant concern that these regulatory pauses give an advantage to international rivals. How does a two-and-a-half-week freeze on a flagship American model change the competitive dynamics with open-source models coming out of China?
The timing of this pause was particularly precarious because it occurred right as capable, inexpensive Chinese open-source models were gaining significant ground in the global market. Several executives and researchers, including those who signed open letters to lift the controls, warned that every day Fable 5 was offline was essentially “free time” handed to international rivals to narrow the gap. When you look at models like Kimi K2.7, you see that the competition isn’t just theoretical; it’s a race for market share where reliability and availability are just as important as raw intelligence. The reversal signed by Howard Lutnick suggests a quiet admission that the government may have overcorrected, realizing that stifling domestic innovation can inadvertently compromise national security by letting foreign models become the global standard. This messiness highlights the lack of a binding, predictable process, forcing Washington to reach for improvised export controls that can unintentionally damage the very companies they aim to protect.
Anthropic and its partners are proposing a new system to rank the danger of jailbreaks using four specific metrics. How would implementing this “severity score” change the way the industry handles security reports compared to the current improvised approach?
The industry currently lacks a unified language for danger, which is why Anthropic is pushing for a scoring system based on capability gain, breadth, ease of weaponization, and discoverability. If we can objectively measure how much a jailbreak helps a user beyond their existing tools or how many different types of attacks it unlocks, we can move away from knee-jerk shutdowns and toward measured, tiered responses. For example, under this framework, a jailbreak that enables an attack on a power grid or a bank would trigger an immediate, around-the-clock deployment of fixes the moment its severity is confirmed. This proposal, alongside their new HackerOne program, signals a shift toward a more transparent, “security-first” culture that treats AI flaws with the same rigor as traditional software vulnerabilities. It’s about moving from a state of emergency to a state of managed risk, ensuring that the most dangerous capabilities are gated while allowing the rest of the technology to flourish.
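To make the idea of a tiered response concrete, here is a minimal Python sketch of how a score built from the four named metrics might be mapped to a response tier. Everything beyond the four metric names is an assumption for illustration: the 0 to 3 rating scale, the equal weighting, the thresholds, and the tier labels are hypothetical and are not taken from Anthropic's proposal.

```python
from dataclasses import dataclass

# Hypothetical rating scale: each metric is rated 0 (negligible) to 3 (severe).
# The proposal names the four metrics; the scale is an assumption of this sketch.
MAX_RATING = 3


@dataclass(frozen=True)
class JailbreakReport:
    capability_gain: int        # uplift beyond the user's existing tools
    breadth: int                # how many kinds of attack it unlocks
    ease_of_weaponization: int  # effort needed to turn it into a working attack
    discoverability: int        # how easily others could find the same technique

    def ratings(self) -> tuple:
        return (
            self.capability_gain,
            self.breadth,
            self.ease_of_weaponization,
            self.discoverability,
        )


def severity_score(report: JailbreakReport) -> float:
    """Return a score in [0, 1] as the equally weighted mean of the ratings."""
    ratings = report.ratings()
    for rating in ratings:
        if not 0 <= rating <= MAX_RATING:
            raise ValueError(f"rating {rating} is outside 0..{MAX_RATING}")
    return sum(ratings) / (len(ratings) * MAX_RATING)


def response_tier(score: float) -> str:
    """Map a score to a response tier; thresholds and labels are hypothetical."""
    if score >= 0.75:
        return "critical"   # e.g. immediate, around-the-clock fix deployment
    if score >= 0.40:
        return "elevated"
    return "routine"


if __name__ == "__main__":
    report = JailbreakReport(
        capability_gain=2, breadth=2, ease_of_weaponization=1, discoverability=2
    )
    score = severity_score(report)
    print(f"score={score:.2f} tier={response_tier(score)}")
```

The point of the sketch is only that a shared, explicit scale lets two reviewers reach the same tier from the same report; a real framework would need agreed definitions for each rating level and would likely weight the metrics unequally.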
Given that frontier models have already demonstrated the ability to exploit 27-year-old vulnerabilities in under a day, what is your forecast for how the relationship between AI developers and government regulators will evolve?
I forecast a move away from these “improvised” emergency orders and toward a more permanent, albeit voluntary, framework for pre-release testing and classified benchmarking. We’ve already seen a glimpse of this with the June 2 executive order and the deal where Anthropic agreed to hunt for problems and provide the government with earlier access to test future models before they hit the public. The fact that a prior Mythos model could find and exploit a 27-year-old flaw in OpenBSD on command proves that the risk of automated cyber-warfare is no longer hypothetical. We are entering an era where “covered” models will likely exist in a semi-regulated space, where labs must coordinate future launches and report any malicious use they spot in real time. Ultimately, the government will continue to struggle with the speed of AI, but the July 1 return of Fable 5 marks the beginning of a “trust but verify” era where the labs that provide the most transparency will be the ones allowed to move the fastest.

