Sponsor content
Claude Mythos advances autonomous exploit development: What agencies can do to prepare

Presented by
Tanium Federal
Anthropic’s Claude Mythos Preview represents a seismic shift in the impact of AI on the cybersecurity landscape in the federal government. According to the AI company’s Claude Mythos Preview System Card, Anthropic will not publicly release its “most capable frontier model to date, [which] shows a striking leap in scores on many evaluation benchmarks.”
The leap is so extraordinary that, citing cybersecurity concerns, Anthropic will use Mythos only “as part of a defensive cybersecurity program with a limited set of partners.”
Through the program, dubbed Project Glasswing, Anthropic will equip partners with Mythos Preview for use in defensive security work. Glasswing partners include leading AI, software and cybersecurity organizations: Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, NVIDIA and Palo Alto Networks.
“No one organization can solve these cybersecurity problems alone,” Anthropic stated in its announcement. “Frontier AI developers, other software companies, security researchers, open-source maintainers, and governments across the world all have essential roles to play.”
Writing in Fortune, former national cyber director Kemba Walden offered a precise account of what makes Mythos particularly difficult to defend against.
“Not only does the model discover zero-day vulnerabilities, but it autonomously builds and chains exploits, and then covers its tracks,” Walden writes. “We need urgent investment, policy innovation, and public-private collaboration to ensure AI strengthens, rather than undermines, our national security.”
The federal response has moved quickly and, at times, unevenly. On April 7, Treasury Secretary Scott Bessent and Federal Reserve Chair Jerome Powell convened an urgent, closed-door meeting with the CEOs of several of the nation’s largest banks to brief them on the cybersecurity risks posed by Mythos. By mid-April, the White House Office of Management and Budget was negotiating terms to give federal civilian agencies controlled access to Mythos.
Yet even as those negotiations proceeded, the Cybersecurity and Infrastructure Security Agency — the office singularly responsible for defending federal civilian networks — remained locked out. The NSA and Commerce Department’s Center for AI Standards and Innovation were actively testing the model; CISA was not.
“Mythos is a wake-up call for our government leadership: this can upskill a novice hacker to a successful and lethal expert hacker. Government at all levels needs to use Mythos to create a sense of urgency to prepare for and safeguard against emerging threats,” says Tanium Executive Client Advisor Ed Debish. “While it’s easy for the announcement to spread fear, agency leaders must separate facts from fear based on data and context from the alarming headlines.”
To support a grounded assessment, a team of researchers and cybersecurity experts at Tanium combed through Anthropic’s nearly 250-page Claude Mythos Preview document to pull out key findings and takeaways and offer strategies for mitigating risk.
What can be exploited can also be patched
According to Anthropic’s own assessment, its Opus 4.6 model was largely unsuccessful at autonomous exploit development, producing only two working exploits in hundreds of attempts. Mythos, meanwhile, produced 181 in the same test. Mythos also reached full-control hijack — tier 5 out of 5 — in a crash severity ladder, while Opus topped out at tier 3.
According to Debish, these findings support a narrow but important claim: models have crossed from ‘sometimes finds bugs’ toward ‘more reliably chains exploitation in realistic codebases,’ at least under these evaluation conditions.
“Going from two working exploits to 181 in identical test conditions isn’t incremental progress; that’s an inflection point,” Debish says. “It confirms that AI-powered exploitation has crossed into a different tier of sophistication, and federal patch postures need to reflect this reality: speed is the most important aspect to staying ahead of those that want to do us harm.”
What does this mean for government cybersecurity leaders? “Under these evaluation conditions” is a critical piece of context. A testing environment is revelatory but not the real world. Moreover, if Mythos is capable of identifying and exploiting bugs for nefarious purposes, then it can be used to identify the same bugs as priorities for patching.
Anthropic predicts that open-weight models could reach Mythos-level capabilities in six to eighteen months. While this is a short timeline, agencies and industry experts can act now and spend the coming months hardening defenses in preparation.
The urgency is compounded by where much of the federal enterprise currently stands. In a recent GovExec Intelligence survey (in partnership with Tanium) of federal IT and security decision-makers, only 18% of respondents said their organization could identify and fully remediate an endpoint security risk in their enterprise IT environment within minutes, and nearly half reported remediation timelines measured in days or longer.
When open-weight models reach Mythos-level exploit capability and adversary dwell times compress accordingly, a multi-day remediation window is not a workflow inconvenience. It is an open attack surface.
Patch management for the AI era
Traditional patching cadences, whether monthly or even weekly, were established based on the speed of human exploit development. What was clear even before Project Glasswing is that critical cybersecurity functions in the AI era must operate at the velocity of AI model advancement, not of human capabilities.
The data suggests that federal agencies understand this imperative, even if many have yet to operationalize it. The same GovExec Intelligence survey found that 55% of federal IT and security leaders named exposure and vulnerability risk reduction as a top modernization priority for the next 12 to 24 months, with security operations automation and AI-assisted decision-making close behind at 51% and 48%, respectively. Yet nearly half of respondents said their organizations are using intelligent automation only in limited pilot programs or not at all. Awareness of the gap and the ability to close it at operational speed are two different things.
"The survey tells us federal leaders already know where they need to go," Debish says. "Exposure reduction and automation are clear priorities. The work now is closing the gap between that intention and operational execution — and the timeline for doing so is tighter than it was six months ago."
Debish also noted that the landscape requires a fundamental change toward precision patching and continuous threat exposure management, both of which depend on processes that are trusted, secure, and repeatable at the speed of machines.
If words like agile, targeted, rapid and iterative bring to mind CI/CD pipelines, that is by design: the goal is to bring the innovation and acceleration the software development world has seen in recent years to the security side of the house. Vulnerability remediation and patching are no longer scheduled activities but a continuous process. That is the premise of continuous threat exposure management: exposure reduction is ongoing and iterative, not a discrete task.
While Anthropic may have placed a temporary hold on the proliferation of Mythos, eventually Mythos and other models like it will be available to the general public, government and adversaries alike. This interim period is critical for agencies at all levels of government to shore up defenses before that day arrives.
Tanium helps government IT and security leaders leverage AI to their own advantage and defend against hackers’ advancing skillsets. With Tanium Autonomous Patch Management, agencies can enhance the agility, velocity and iterative capabilities of cybersecurity teams. Driven by AI and real-time endpoint intelligence, Tanium enables autonomous, integrated and comprehensive patching. Mythos-class models may have the potential to wreak havoc, but AI-powered cybersecurity solutions can close exposure windows before it’s too late.
Learn more about how Tanium can help narrow your exposure windows, even as sophisticated AI models continuously change the cybersecurity game.
This content is made possible by our sponsor Tanium; it is not written by and does not necessarily reflect the views of NextGov/FCW's editorial staff.