Christopher Olah, the co-founder and head of interpretability research at Anthropic, chose a historic stage to make a bold argument: the development of frontier artificial intelligence cannot be left to frontier AI labs alone. Speaking on Monday at the formal presentation of Pope Leo XIV's first encyclical, Magnifica humanitas, in the Vatican Synod Hall, Olah acknowledged that even well-intentioned researchers operate within powerful forces that can pull them away from ethical decisions.
This is the first time a leader of a major AI firm has publicly conceded such a point from a platform of this magnitude. Olah's presence at the Vatican, alongside the pope and an audience of cardinals and diplomats, signals a remarkable shift in how the AI industry is positioning itself in the global conversation about governance.
Olah has led Anthropic's interpretability team for years, a group that tries to reverse-engineer what large language models actually do inside their neural networks. The company treats this as its strongest claim to safety credibility. In his speech, he stated plainly: "Every frontier AI lab operates inside a set of incentives and constraints that can sometimes conflict with doing the right thing." He added that those forces include commercial, geopolitical, and personal pressures, and that the answer to them must sit outside the lab.
The speech also touched on labor. Olah warned of "a real possibility" that AI would displace human work "at very large scale" and called supporting those displaced "a moral imperative of historic proportions." This is the most specific public acknowledgment by any frontier-lab founder that the technology his company builds may, on its internal projections, disrupt the labor market faster than it can re-absorb displaced workers.
Anthropic's move to the Vatican did not happen overnight. Two weeks before the speech, the company announced a new office in Milan, signaling a deeper European presence. The relationship with the Catholic Church places Anthropic inside the church's most consequential statement on technology since Pope Leo XIII's Rerum novarum addressed industrial capital in 1891. Magnifica humanitas, the encyclical launched on Monday, does not name specific policies but frames the moral and theological boundaries for technological development. Olah's speech mirrored that framing, declining to outsource the next decade's regulatory architecture to the companies building the technology it will regulate.
The political backdrop, however, is the inverse of the moral message. Anthropic spent the spring at the center of two separate confrontations with the U.S. government. In April, the Pentagon ejected the company from its top classified AI work because of Anthropic's own usage restrictions. The Pentagon then signed deals with Nvidia, Microsoft, and AWS instead. Around the same time, the Trump administration blocked an expansion of Mythos, an autonomous vulnerability-discovery model that had shaken bank-cybersecurity governance worldwide. Olah's appearance on the same stage as the pope, calling for outside oversight, is widely seen as a direct response to those confrontations.
The timing also carries immense commercial weight. Anthropic is currently in talks to raise $30 billion at a $900 billion valuation. Such numbers highlight the vast resources at play and the tension between profit motives and safety commitments. Olah did not pretend the dissonance did not exist. He told the room: "Companies like ours operate under strong commercial, geopolitical and personal pressures that can be at odds with the broader interests of society." His argument was not that Anthropic stands outside those pressures, but that the solution requires multiple institutions — governments, religious bodies, civil society — to step in.
Christopher Olah's career has been deeply rooted in interpretable machine learning. Before co-founding Anthropic, he worked at Google Brain, where he led early research into understanding neural network primitives and feature visualization. His work on feature attribution and adversarial examples became foundational in the field. At Anthropic, he built a team dedicated to mechanistic interpretability, a subfield that tries to deconstruct model behaviors down to individual neurons and circuits.
The Vatican event also underscores a broader trend of tech leaders seeking moral authority from traditional institutions. In the past, such discussions often took place at academic conferences or corporate ethics panels. Placing them inside the Vatican elevates the stakes and gives the arguments a weight that resonates far beyond the tech bubble.
What remains unresolved is how this moral framing will translate into practical policy. Magnifica humanitas asks governments to protect human dignity in the face of rapid technological change, but it does not specify regulations or enforcement mechanisms. Olah's invitation to the Vatican may help Anthropic build relationships with European regulators, particularly as the EU's AI Act comes into effect. The company has already advocated for binding regulation of frontier models, including mandatory safety testing and transparency requirements.
The encyclical itself draws on Catholic social teaching, emphasizing that technology must serve the common good and not be driven solely by profit or power. Olah's speech echoed these themes, arguing that AI development must be guided by a broader set of values beyond efficiency and capability. He specifically mentioned the risk of AI entrenching existing inequalities and the need for "distributive justice" in the allocation of AI's benefits.
Analysts have noted that Anthropic's strategy of engaging with religious institutions is partly a hedge against regulatory backlash. By positioning itself as a responsible actor calling for oversight, the company may hope to shape the rules that will govern its own products. Critics, however, argue that such performances of good faith do not replace concrete actions like releasing model weights for independent auditing or committing to binding safety guarantees.
Inside the Vatican, Olah's presence also carried symbolic weight for the Catholic Church. The church has been grappling with the implications of AI, from autonomous weapons to algorithmic bias. Pope Leo XIV's encyclical represents a major effort to articulate a vision for technology that respects human dignity. By inviting an AI lab founder to speak, the church signals a willingness to engage directly with the creators of the most powerful technologies of our time.
The speech also touched on the environmental costs of AI, though less prominently. Olah noted that training large models consumes vast amounts of energy, and he called for greater investment in energy efficiency and green data centers. This aligns with the Vatican's focus on stewardship of creation and the ecological crisis.
Whether Olah's argument moves practical policy remains an open question. The U.S. government has so far taken a hands-off approach to AI regulation, preferring voluntary commitments. The Biden administration's executive order on AI, signed in 2023, focused on safety testing but lacked enforcement teeth. The Trump administration's subsequent actions, such as blocking Mythos, have been more reactive than proactive. In Europe, the AI Act imposes stricter rules, but its implementation is still years away.
Olah's appearance at the Vatican may also influence other tech leaders. If such engagements become common, they could shift the narrative around AI governance from one of purely technical challenges to one of moral and institutional coordination. Some major AI companies, including OpenAI and Google DeepMind, have also expressed support for external oversight, but they have not yet taken similar public steps to seek it from religious institutions.
The labor implications remain the most potent aspect of Olah's message. By acknowledging that AI could displace human work "at very large scale," he touched on a fear that many in the industry prefer to downplay. His call for a "historic" moral commitment to support the displaced resonates with proposals for universal basic income, retraining programs, and social safety nets. But he offered no concrete plans from Anthropic to fund or implement such measures.
Critics also point out that Anthropic's business model relies on selling access to its models to companies that may use them to automate jobs. The tension between making a profit and advocating for worker protection is not resolved by a single speech, no matter how grand the stage. Olah acknowledged this tension in his address, saying it is precisely why outside forces are necessary.
Source: TNW | Anthropic News