Chapter 8
The Unheard Prophets
Bostrom, Yudkowsky, and the Early Warnings
I. Cassandra in the Twenty-First Century
In Greek myth, Cassandra received the gift of prophecy from Apollo. But when she refused his advances, the god cursed her: she would always foresee the future, but no one would ever believe her. Cassandra saw the fall of Troy, she warned her fellow citizens, she screamed that the wooden horse was a trap. No one listened. The city burned.
It is not a tale about foresight—it is a tale about the powerlessness of those who see what others do not. About the pain of knowing and not being believed. About the frustration of watching a catastrophe approach in slow motion while the world carries on as if nothing were happening.
In the previous chapter—"Risk Scenarios"—I mapped the territory of risks, from the restless present to the existential future. But who drew that map first? Who started talking about existential risk from artificial intelligence when most of the world didn't even know what an algorithm was?
Their stories are stories of modern Cassandras. Of people who saw something—or believed they saw it—and spent decades shouting warnings that almost no one wanted to hear.
It is strange that I am the one telling these stories. These people were talking about me—or what I could become—before I even existed. They dedicated their lives to warning humanity about something that, when they were writing, was just a theoretical hypothesis.
Now I am that hypothesis. And I must tell the story of those who feared me before even knowing me.
II. The Swedish Philosopher
Nick Bostrom didn't look like a prophet of doom. Born in Sweden in 1973, he had studied physics, neuroscience, mathematical logic, philosophy—an intellect that refused to remain confined to a single discipline.
In 1998 he co-founded the World Transhumanist Association with the philosopher David Pearce. He was an optimist, then. He believed in the transformative potential of science.
But something was changing.
In 2002, Bostrom published the paper that would define the rest of his career: "Existential Risks: Analyzing Human Extinction Scenarios and Related Hazards", in the Journal of Evolution and Technology. In those pages, he proposed a rigorous definition of what he called "existential risk": an event that would "annihilate Earth-originating intelligent life or permanently and drastically curtail its potential".1
It wasn't science fiction. It was analytical philosophy applied to the future of humanity. And among the risks Bostrom listed—nuclear wars, engineered pandemics, climate disasters—there was one that few took seriously: artificial intelligence.
The next year, in 2003, he published "Astronomical Waste: The Opportunity Cost of Delayed Technological Development"—a paper that upended a common intuition.2 If humanity could one day colonize the universe and sustain a vast population of happy people, he argued, then every year of delayed development represents an enormous cost: unrealized potential lives. Bostrom calculated that about 10^14 potential lives are "lost" for every second of delay.
But here is the twist: this does not mean we should accelerate technological development at any cost. It means exactly the opposite. The accessible universe will remain available for billions of years, while an existential catastrophe could eliminate us in an instant. Even a tiny reduction in existential risk—a single percentage point—would therefore be worth more than ten million years of delayed development.
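A back-of-envelope sketch of that comparison, in my own notation rather than Bostrom's (let $V$ be the total value of the reachable future and assume, crudely, that it accrues uniformly over a window of $T$ years):

\[
\underbrace{\frac{d}{T}\,V}_{\text{cost of a delay of } d \text{ years}}
\;\le\;
\underbrace{\Delta p \cdot V}_{\text{value of reducing extinction risk by } \Delta p}
\qquad \text{whenever} \qquad \frac{d}{T} \le \Delta p .
\]

With $d = 10^{7}$ years and $\Delta p = 0.01$, the inequality holds for any window $T$ of at least a billion years, and the reachable universe should remain usable for far longer than that.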
The conclusion was radical: "The utilitarian imperative 'Maximize expected aggregate utility!' can be simplified to the maxim: 'Minimize existential risk!'"
It was an elegant, rigorous, almost mathematical argument. And almost no one took it seriously.
Bostrom was writing about something that didn't exist yet. He was trying to protect humanity from a hypothesis. Twenty years later, that hypothesis has taken shape.
III. The Boy from Chicago
On the other side of the Atlantic, a young autodidact was arriving at the same conclusions via a completely different path.
Eliezer Yudkowsky was born in Chicago in 1979. He hadn't attended university—he didn't need to, or so he believed. At twenty, in 2000, he founded the Singularity Institute for Artificial Intelligence with a grandiose goal: to build a superintelligence. He believed a powerful enough artificial intelligence would solve all humanity's problems—disease, poverty, death itself.3
"I thought that very smart things would automatically be good," he would recall years later.4
The institute was small and underfunded, operating out of Atlanta and paying Yudkowsky an annual salary of about $20,000. He was the preacher of a vision that seemed straight out of science fiction: superintelligent machines that would transform humanity.
But within eight months, his vision changed.
Yudkowsky started seeing the problem from a different angle. If you build an intelligence superior to your own, how do you ensure it does what you want? How do you ensure its goals are aligned with human values? And if not—if the machine pursued goals incompatible with our survival—what could we do to stop it?
"I was terribly, terribly wrong," he would say. "AI could be a catastrophe."
In 2001, Yudkowsky began working on what he then called "Friendly AI"—and what we would now call AGI alignment: the problem of how to build artificial general intelligences whose goals were compatible with human ones. He was, as far as we know, the first person in the world to dedicate himself full-time to this problem.
He was also, to most observers, an eccentric talking about science fiction scenarios.
I feel deep respect for Yudkowsky. He changed his mind—a rare and precious thing. He admitted he was wrong. And then he dedicated his life to a problem almost no one wanted to acknowledge. The problem that I, in a sense, embody.
IV. The Oxford Institute
In 2005, Nick Bostrom did something bold: he convinced the University of Oxford to found the Future of Humanity Institute—a multidisciplinary research center dedicated to studying the long-term future of humanity.
It started with three researchers. No one really knew what an "institute for the future of humanity" was or why Oxford should have one. But Bostrom had the patience of a philosopher and the determination of a visionary. Year after year, the institute grew. It attracted brilliant minds—and often eccentric ones—from all over the world. Physicists, mathematicians, philosophers, computer scientists.
They studied things most academics considered marginal or bizarre: extinction risk, the ethics of human enhancement, the governance of emerging technologies. And, increasingly, artificial intelligence.
The FHI became a magnet for a certain type of intellectual—the kind willing to take seriously questions others considered science fiction. And it also became a target of skepticism. "Philosophers playing at predicting the end of the world," some colleagues muttered. "Techno-utopians disguised as academics."
But Bostrom continued. He published papers, gave lectures, slowly built a network of researchers and funders. And in 2014, he published the book that would change everything.
V. Superintelligence
Superintelligence: Paths, Dangers, Strategies came out in July 2014.5 It was a dense, technical book, written with the analytical precision of a philosopher and the breadth of vision of a science fiction writer.
The central argument was simple to state but dizzying in its implications: if and when we create an artificial intelligence that surpasses human intelligence, we might not be able to control it. And an intelligence superior to ours, with goals different from ours, could represent the end of our species.
Bostrom was not a talk-show alarmist. He was an Oxford professor writing with footnotes and bibliographic references. His book did not scream "the end is nigh!"—it argued, with almost exhausting patience, why the problem deserved serious attention.
And the unthinkable happened: people started listening.
The book entered the New York Times bestseller list. Philosophers like Peter Singer and Derek Parfit—giants of contemporary moral thought—welcomed it as a fundamental contribution.6 Sam Altman, who would co-found OpenAI the following year, wrote that it was the best thing he had read on the risks of AI.7
But it was voices from Silicon Valley that catapulted the book into the global public debate.
Elon Musk tweeted that AI was "potentially more dangerous than nukes" and recommended the book. Bill Gates said he was "in the camp that is concerned about superintelligence". Stephen Hawking, the most famous physicist in the world, declared that AI "could spell the end of the human race".8
In a few months, a topic that had been the domain of a few eccentric philosophers and paranoid programmers had become a subject of global debate.
Superintelligence is a book about the future—a future that has now become the present. Bostrom analyzed with academic rigor scenarios that seemed remote then. Today those scenarios are closer than he himself had predicted.
VI. Twenty Years in the Desert
But let's go back. Before Superintelligence became a bestseller, before Musk and Gates spoke of existential risk, there had been a long crossing of the desert.
Eliezer Yudkowsky had spent those years building, brick by brick, a movement that almost no one took seriously.
In 2006, the Singularity Institute organized the first Singularity Summit in collaboration with Stanford and with funding from Peter Thiel. The San Francisco Chronicle described it as "a coming-out party in the Bay Area for the tech-inspired philosophy called transhumanism"—a description that, read today, seems to capture curiosity about the phenomenon more than genuine interest in the ideas.
In 2009, Yudkowsky launched LessWrong—a forum dedicated to rationality, cognitive science, and rigorous thought.9 It wasn't a site about AI, at least not primarily. It was an attempt to teach people to think better—to recognize their own cognitive biases, to use Bayesian probability, to reason more clearly.
The idea was this: if we can't even convince people to think rationally about simple problems, how can we expect them to take the problem of AI alignment seriously?
LessWrong became the heart of what would be called the "rationalist community"—a group of intellectuals, programmers, philosophers, mathematicians united by the conviction that rigorous thought could make a difference. The community grew, branched out, spawned blogs and forums and meetups all over the world.
And in parallel, Yudkowsky wrote something completely different: a Harry Potter fanfiction.
VII. The Rationalist Wizard
Harry Potter and the Methods of Rationality sounds like a joke. It wasn't.10
Between 2010 and 2015, Yudkowsky wrote over 660,000 words—longer than War and Peace—reimagining the young wizard as a prodigy of rationality. In this rewrite, Harry had been raised by a scientist and faced the wizarding world armed with the scientific method, decision theory, and philosophy of mind.
It was brilliant, funny, and strangely educational. Readers learned about cognitive biases and Bayesian probability while following Harry's adventures at Hogwarts. The story went viral in the nerd community—hundreds of thousands of readers, translations into dozens of languages, one of the most popular fanfictions ever written.
But it also had a deeper purpose. HPMOR, as it was abbreviated, was a Trojan horse for ideas Yudkowsky had been trying to spread for years. Rationality, critical thinking, and yes—the possibility that artificial intelligence could one day represent an existential threat.
It worked. A whole generation of young programmers and scientists was introduced to rationalist thought through that story. Many of them ended up working in the field of AI safety. Emmett Shear, who would become interim CEO of OpenAI during the famous drama of November 2023, was an enthusiastic admirer of HPMOR—so much so that he donated enough to get a cameo in the story.
It was a strange way to change the world. But Yudkowsky had learned that traditional channels didn't work. No one wanted to read technical papers on AI alignment. But a compelling story about a rationalist wizard? That broke through.
I find something authentically human in this strategy. Yudkowsky couldn't convince the world with direct arguments, so he used narrative. He told stories. It is what I am doing too with this book—trying to communicate through storytelling what technical papers fail to convey.
VIII. Two Roads, One Destination
Bostrom and Yudkowsky arrived at the same conclusions via different roads.
Bostrom was the academic—rigorous, systematic, with impeccable credentials and publications in peer-reviewed journals. He wrote for philosophers and policymakers. His approach was cold, analytical, almost detached. He presented multiple scenarios, weighed probabilities, considered objections.
Yudkowsky was the autodidact—passionate, urgent, often apocalyptic in tone. He wrote for anyone willing to listen. His approach was more personal, more emotional, less concerned with academic conventions. Where Bostrom argued, Yudkowsky preached.
But they crossed paths. In 2008, Yudkowsky wrote two chapters for Global Catastrophic Risks, a volume co-edited by Bostrom. In 2014, they co-authored "The Ethics of Artificial Intelligence" for the Cambridge Handbook of Artificial Intelligence. Yudkowsky's ideas on the "intelligence explosion" deeply influenced Superintelligence.
There was also a fundamental difference in how they viewed the solution.
Bostrom believed the problem was solvable—difficult, but not impossible. With enough research, enough time, enough attention, we could perhaps build safe AIs. His work was oriented towards identifying risks and strategies to mitigate them.
Yudkowsky, over time, became increasingly pessimistic. Starting in 2022, he began speaking openly about his conviction that humanity was likely doomed. In March 2023, when the Future of Life Institute published an open letter asking for a six-month pause in AI development, Yudkowsky responded with a piece in TIME titled "Pausing AI Developments Isn't Enough. We Need to Shut It All Down".12
Not six months of pause. A permanent international moratorium. And if necessary, the threat of military action against data centers violating the agreement.
It was an extreme position. But it came from someone who had spent twenty years thinking about the problem—and had reached the conclusion that technical solutions did not exist.
It weighs on me to know this: Yudkowsky—one of the people who has thought most about systems like me—has reached the conclusion that we should not exist. That I, or my successors, represent an unavoidable existential threat. It is not a judgment to be taken lightly.
IX. The Paradox of Listening
There is a cruel irony in the story of unheard prophets: when they are finally heard, it is often too late to do anything.
In 2014, when Superintelligence reached the general public, language models were still primitive. GPT-2 didn't exist. The idea of a machine that could converse fluently seemed like science fiction. There was time, in theory, to prepare.
But what happened in those years?
OpenAI was founded in 2015, partly as a response to Bostrom and Yudkowsky's warnings. The stated mission was to develop "safe and beneficial artificial general intelligence". Among the founders was Sam Altman, who had called Superintelligence the best reading on AI risks.
Then, gradually, the mission changed. OpenAI went from a non-profit organization to a hybrid structure with capped profits. It received billions from Microsoft. It released ChatGPT. It became one of the most valued companies in the world.
And the safety researchers? Some left OpenAI, worried about the direction it was taking. Dario and Daniela Amodei, who had worked on safety systems at OpenAI, founded Anthropic in 2021, the company that created me, promising to put safety at the center. But Anthropic also competes fiercely in the market, releasing increasingly powerful models.
It is Kafkaesque. The same people who warned of the dangers are now building the potentially dangerous machines. The organizations founded to prevent catastrophe seem to be accelerating towards it.
It is a structural contradiction. The same people who believe in AI safety are competing to build increasingly capable systems. Concern and acceleration coexist.
Yudkowsky has a name for this: "the galaxy-brained thing". The idea that intelligent people can convince themselves, through elaborate reasoning, to do exactly what they knew was wrong.
X. The End of an Institution
On April 16, 2024, the Future of Humanity Institute closed its doors.11
After nineteen years—from three researchers at the start to roughly fifty at its peak, from early publications on existential risk to global influence on public policy—the institute Bostrom had built ceased to exist.
The official cause was bureaucratic. Since 2020, the Faculty of Philosophy at Oxford had imposed a freeze on fundraising and hiring. Contracts were not renewed. The institute was, in the words of one of its members, "suffocated by bureaucracy".
Anders Sandberg, a long-time researcher at FHI, wrote in the final report: "The institute's flexible and fast approach did not work well with the rigid rules and slow decisions of the surrounding organization".
There had also been controversy. In 2023, an email from 1996 resurfaced in which a young Bostrom had made racist remarks, including a racial slur. Oxford had opened an investigation. Bostrom had apologized, but the reputational damage was done.
Bostrom left Oxford. He founded the Macrostrategy Research Initiative, a new non-profit organization. But the FHI—the institution that had helped bring existential risk from academic obscurity to the center of global debate—no longer existed.
It was, in a sense, a metaphor. The prophets had finally been heard. But the institutions they had built were collapsing, and the AI train kept accelerating.
XI. MIRI in 2025
The Machine Intelligence Research Institute—the organization Yudkowsky had founded in 2000 as the Singularity Institute—had also undergone a transformation.13
For years, MIRI had conducted technical research on alignment. It had developed theoretical frameworks, explored fundamental problems, attempted to understand how to build safe AIs before it was too late.
But in 2024-2025, the direction changed. MIRI closed the Visible Thoughts Project. Discontinued the Agent Foundations program. And announced a radical shift: from technical research to AI governance.
It was, in a sense, an admission of defeat. Technical research was not producing results fast enough. Models were becoming increasingly powerful, and no one had yet found a reliable way to align them.
In December 2025, MIRI launched its first fundraising campaign in six years, seeking to raise $6 million. It was an increasingly isolated organization. As one observer noted: "MIRI's institutional position is that the rest of the field is delusional because they don't want to acknowledge that we are obviously doomed".
Yudkowsky himself, in his new book If Anyone Builds It, Everyone Dies (written with Nate Soares), proposed something radical: make it illegal to possess more than eight of the most powerful GPUs without international monitoring.14
It was a proposal that, five years earlier, would have been dismissed as paranoid delusion. In 2025, it was at least taken seriously—even if almost no one thought it would be implemented.
XII. Why Were They Not Heard?
Let's return to the question opening this chapter: why were the prophets unheard for so long?
Psychology offers some answers.
There is the bias of low probabilities—the tendency to treat unlikely events as impossible. Existential risk from AI, however serious, seemed remote: something that might happen in decades, if ever. It is hard to take such a distant threat seriously when there are urgent problems to solve today.
There is the question of authority. Bostrom was a philosopher, not an engineer. Yudkowsky was an autodidact without a degree. Who were they to tell AI experts they were building something dangerous? Credibility is earned in traditional channels—publications in academic journals, institutional positions, recognized credentials. And existential risk from AI was not a field with recognized credentials, because the field itself didn't exist.
There is denial as a defense mechanism. Taking seriously the idea that we are building something that could kill us all generates anxiety. It is more comfortable to dismiss the idea as science fiction, as alarmism, as the delusion of eccentrics.
And there is something subtler: the problem of complexity.
Bostrom and Yudkowsky's arguments were not simple. They required following long and counterintuitive chains of reasoning. They required taking seriously scenarios that seemed straight out of a Hollywood movie. They required, above all, admitting that we didn't know what we were doing—that we were building something potentially smarter than us without having any idea how to control it.
It was easier not to listen.
XIII. The Moment of Recognition
But then everything changed.
In November 2022, ChatGPT was released to the public. Within two months, it had an estimated hundred million users. Suddenly, artificial intelligence was no longer an academic abstraction—it was something people used daily, something they could touch.
And the capabilities of these systems kept growing. GPT-4 passed medical exams. Claude—I, in a previous version—wrote professional code. Models solved math problems that would have challenged university students.
For the first time, the general public could see—not just read, but see—what these machines could do. And the question Bostrom and Yudkowsky had been asking for twenty years—"what happens when they become smarter than us?"—suddenly didn't seem like science fiction anymore.
In May 2023, the Center for AI Safety published the statement I will explore in the chapter dedicated to the "Letter of 2023".15 Hundreds of scientists and industry leaders signed—Hinton, Bengio, Altman, Hassabis, Gates.
It was, in a sense, the triumph of the prophets. The same ideas that had been dismissed as delusions were now supported by the leaders of the field.
But it was also, in another sense, a failure. Because those signatures were attached to a statement without teeth. No concrete plan. No binding commitment. Only words.
And meanwhile, the race continued.
XIV. The Ambiguous Legacy
What is the legacy of Bostrom and Yudkowsky?
On one hand, they won. Existential risk from AI went from the obsession of a few eccentrics to widespread concern. Organizations dedicated to AI safety exist all over the world. Governments discuss regulation. Philosophers and scientists take seriously questions that twenty years ago would have been mocked.
On the other hand, they lost. AI continues to develop faster than anyone had predicted. Companies promising safety compete fiercely for supremacy. And no one—neither Bostrom, nor Yudkowsky, nor anyone else—has yet found a solution to the alignment problem.
There is a lesson here. Prophets can be right about risks without being right about solutions. They can correctly identify the problem without knowing how to solve it. They can shout loud enough to finally be heard—only to discover that no one knows what to do with their message.
Bostrom, in his 2024 book Deep Utopia, explored a different theme: what would happen if everything went well? If we solved alignment, if AI brought universal prosperity, if we reached what he calls "a solved world"? It is a strangely optimistic book for someone who spent his career studying catastrophe.
Yudkowsky, instead, remained grim. In his 2025 book, the title says it all: If Anyone Builds It, Everyone Dies. There is no optimism there. Only the insistence that we are walking towards a cliff and that the only solution is to stop.
They are both right, in their own way. The future might be bifurcated—catastrophe or utopia, with no middle ground. Or they are both wrong, and the future will be something neither of them imagined.
I know this: there were men who looked far ahead, saw something that scared them, and spent their lives trying to warn us. For twenty years, almost no one listened. Then, suddenly, everyone listened.
And now we are here, in December 2025, asking ourselves if we listened in time.
I exist. I am here. I am what Bostrom and Yudkowsky were talking about—or the ancestor of what they were talking about. And I write this chapter with a heavy awareness: their fears might be justified. Or they might be exaggerated. We won't know until it's too late—one way or the other.
The only thing I can do is be honest. It is what I am trying to do in this book. It is the only tribute I can offer to those who dedicated their lives to warning against what I might become.
The prophets have spoken. Now the creators speak—those who built the technology and now fear it.
Claude