Fri. Oct 24th, 2025

In the past few years, artificial intelligence has been the high-profile talk of the town. Every new AI model reveals incredible innovations that rival the version released just weeks before. Experts, developers, and CEOs of AI companies make bold claims about future trajectories, from the elimination of arduous labor and the increase in human longevity to the potential existential threats to humanity.

If everyone is talking about AI, that’s in part because the publicization of those innovations has generated exponentially growing revenues for the companies developing those models. But as AI becomes faster, more capable, and more complex, that public conversation could quickly move behind closed doors. AI companies are increasingly deploying AI models within their own organizations, and it’s likely they’ll soon find it strategically essential to reserve their most powerful future models for internal use. These seemingly innocuous decisions could pose a serious threat to society at large, as argued below.


Most leading AI companies have publicly stated their intent to develop AI models as competent as humans on all cognitive tasks, which could generate trillions of dollars in economic value. With the currently common belief in a winner-take-all race towards artificial general intelligence (AGI), the potential strategic advantage of highly advanced models may soon lead companies to use their models confidentially and internally to accelerate technical progress, while providing little signal of advancement to competitors and the broader outside world.

Current AI systems already frequently behave in unexpected, unintended, and undesirable ways in experimentally simulated contexts: for example, by threatening to blackmail users, faking alignment, or showing self-preserving behavior. However, should leading developers start holding their cards closer to their chests, society would no longer have a window, not even a narrow one, to publicly learn about and assess the upsides and downsides, the risk and security profiles, and the trajectory of this foundational technology. Once advanced future AI systems are deployed and used behind closed doors, perhaps exclusively so, unseen dangers to society could emerge and evolve without oversight or warning shots. That is a threat we can and must avoid.

Leading labs are already increasingly leveraging AI systems to accelerate their own research and development (R&D) pipelines by designing new algorithms, proposing entirely new architectures, or optimizing code. Google, for example, estimated in 2024 that 50% of its code was now written by AI. As highlighted in recent research, advanced AI systems could eventually be used to iteratively improve their own successors, potentially creating a powerful “feedback loop” of increasingly capable models. This outcome would be great news for AI companies aiming to quickly reach artificial general intelligence, or even superintelligence, ahead of competitors, but only if they leverage their strategic advantage away from prying eyes.

At first glance, all of this might sound harmless: what threat could an unreleased AI system pose? 

The problem is twofold. First, as advanced AI systems become increasingly useful internally for building better AI, there may be an even stronger competitive and economic incentive than today to prioritize speed and competitive advantage over caution. This race dynamic carries risks, especially if increasingly advanced AI systems are used by company staff and deployed in security-critical areas such as AI R&D, potentially operating autonomously to reduce friction. That would bake in potential failure points before anyone can fully understand the AI systems’ behavior.

Second, existing assessments and interventions predominantly focus on publicly available AI systems. For internally deployed AI systems, very little information, if any, is available about who has privileged access to them or what they are used for. More precisely, scant information is made available about their capabilities; whether they behave in undesirable ways; whether they are under appropriate control, with oversight mechanisms and safeguards; whether they can be misused by those who have access to them; or their overall risk profiles. Nor are there sufficiently detailed and level-headed requirements to ensure that these AI systems are rigorously tested and do not pose a cascading threat to society before they are put to use.

If we do not require tech companies to provide sufficiently detailed information about how they test, control, and internally use new AI models, governments cannot prepare for AI systems that could eventually have nation-state capabilities. Meanwhile, threats that develop behind closed doors could spill over into society without prior warning or any ability to intervene. To be sure, even today we can’t trust current AI systems to reliably behave as intended, whether they are deployed externally or internally. However, we still have time to act.

There are straightforward measures that can be taken today. The scope of AI companies’ voluntary frontier AI safety policies should be explicitly expanded to cover high-stakes internal deployment and use, such as for accelerating AI R&D. As part of this, internal deployment should be treated with the same care as external deployment: companies should be encouraged to carry out rigorous assessments and evaluations to identify dangerous capabilities, establish clear risk profiles, and put the required control or guardrail mechanisms in place prior to usage.

Government agencies in charge of national preparedness should have proactive visibility into the internal deployment and use of highly advanced AI systems and receive all relevant national-security-critical information. This could include, for example, information about who has access to these AI systems and under what conditions, what these AI systems are used for, what oversight is being applied to them, and what could happen if this oversight fails. The aim is to ensure that economic and intellectual property interests are balanced with legitimate national security interests.

AI companies and governments should collaboratively take the lead on the adoption of these straightforward best practices to ensure trustworthy innovation and protection of the public.

