
The Man Behind AlphaZero Is Building an AI That Learns Without Humans

David Silver didn’t just study reinforcement learning — he used it to build AlphaZero, a program that mastered chess and Go by playing against itself, no human instruction required. Now he’s betting $1.1 billion that the same logic can unlock something far bigger than a board game.

A Coconut Round With an Actual Theory Behind It

Silver’s newly launched British lab, Ineffable Intelligence, closed a seed round at a $5.1 billion valuation — a number that would strain credulity for a company with no product, no revenue, and no published roadmap. The round was co-led by Sequoia and Lightspeed, with Nvidia’s venture arm reportedly contributing at least $250 million, Google writing a cheque alongside them, and the UK government committing funds through two separate channels: its Sovereign AI Fund and the British Business Bank, which disclosed a standalone $20 million position.

What is a “Coconut Round”? A tongue-in-cheek escalation of the “seed” metaphor — a seed round so large it’s become something else entirely. In 2026, it describes first financings above $1 billion, typically handed to researchers with enough name recognition to make due diligence feel optional.

The money is real. The technical thesis is specific. Lightspeed put it plainly in its investment note: Ineffable is building “a system designed to generate knowledge from its own experience — one that learns not by extrapolating from human examples, but by acting in engineering environments and learning from the signals those environments return.”

The Data Wall Nobody Wants to Talk About

Here’s the problem that makes Silver’s bet coherent: the internet is effectively exhausted as a training resource. Every major LLM lab hit the same ceiling around 2025 — human-generated text topped out as a scaling lever, and synthetic data became the stopgap nobody fully believes in. The frontier models got bigger. The returns got smaller.

Silver’s answer isn’t more data. It’s no data — at least not human data. Reinforcement learning generates its own signal: act in an environment, observe what happens, adjust. The approach doesn’t hit a data wall because it doesn’t depend on one. That’s not a philosophical preference. It’s a structural escape from the constraint everyone else is quietly managing around.
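That act-observe-adjust loop is concrete enough to sketch. Below is a minimal, illustrative example of the mechanism Silver is betting on: tabular Q-learning on a toy five-cell corridor. The environment, reward scheme, and hyperparameters are inventions for illustration, not anything Ineffable has published. The point is that the agent starts with zero knowledge and no human examples, and improves purely from the reward signal the environment returns.

```python
import random

N_STATES = 5          # corridor cells 0..4; reaching cell 4 ends the episode with reward 1
ACTIONS = [-1, +1]    # step left or step right
ALPHA, GAMMA, EPS = 0.5, 0.9, 0.1   # learning rate, discount, exploration rate

# Q[state][action_index] starts at zero: the agent knows nothing at the outset
Q = [[0.0, 0.0] for _ in range(N_STATES)]

random.seed(0)
for _ in range(500):                  # each episode is self-generated experience
    s = 0
    while s != N_STATES - 1:
        # epsilon-greedy: mostly exploit current knowledge, occasionally explore
        if random.random() < EPS:
            a = random.randrange(2)
        else:
            a = max((0, 1), key=lambda i: Q[s][i])
        s2 = min(max(s + ACTIONS[a], 0), N_STATES - 1)
        r = 1.0 if s2 == N_STATES - 1 else 0.0
        # Q-learning update: nudge Q[s][a] toward reward + discounted best future value
        Q[s][a] += ALPHA * (r + GAMMA * max(Q[s2]) - Q[s][a])
        s = s2

# After training, the greedy policy should be "go right" (action index 1) everywhere
policy = [max((0, 1), key=lambda i: Q[s][i]) for s in range(N_STATES - 1)]
print(policy)
```

No line of this code consumes human-generated data: the only "teacher" is the environment's reward, which is exactly why the approach sidesteps the data wall described above. Scaling this from a five-cell corridor to open-ended engineering environments is, of course, the entire unsolved problem.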

What “Superlearner” Actually Means

The word gets thrown around loosely, but Ineffable has a precise vision behind it. Silver’s stated target is an AI that “discovers all knowledge from its own experience, from elementary motor skills through to profound intellectual breakthroughs.” The company’s website frames success on the scale of Darwin: where his theory explained all of life, Ineffable’s system would explain, and build, all of intelligence.

What is Ineffable’s “superlearner”? A reinforcement learning-based AI system designed to generate and accumulate knowledge entirely through its own experience — without training on human text, images, or any human-generated content.

That’s not a pitch deck line. That’s the actual stated target. Silver was central to DeepMind programs that beat professional chess and Go players by learning purely from self-play. The jump from board games to open-ended intelligence is enormous — but the underlying mechanism is the same: reward signals, trial and error, no human hand-holding. The question isn’t whether the approach works in constrained environments. It demonstrably does. The question is whether it scales to open-ended ones.

Three Bets, Three Different Theories of What Comes Next

Silver isn’t alone in rejecting the LLM paradigm — but his specific rejection differs from his peers in ways that matter.

| Founder | Company | Core Thesis | Key Backers |
| --- | --- | --- | --- |
| David Silver | Ineffable Intelligence | Pure RL: self-generated knowledge, no human data | NVIDIA, Google, UK Sovereign AI Fund |
| Yann LeCun | AMI Labs | World models: objective-driven reasoning from structured world representations | DST Global (Meta-linked) |
| Tim Rocktäschel | Recursive Superintelligence | Self-improving formal logic and reasoning systems | Sequoia (reported) |

LeCun’s world model approach argues that intelligence requires an internal model of how the world works — a structured mental simulation that can reason about cause and effect. Silver’s RL approach skips the internal model entirely and learns by doing. Neither has been proven to scale to AGI. Both represent a genuine departure from what OpenAI and Anthropic are building. Hassabis at DeepMind sits somewhere between — using RL for specific domains (AlphaFold, AlphaProof) while maintaining LLM infrastructure for Gemini.

London Is Having a Moment

Ineffable’s raise is the largest seed round in European history. It doesn’t stand alone. Recursive Superintelligence has reportedly attracted between $500 million and $1 billion. AMI Labs closed a $1 billion raise last month. The pattern is unmistakable: the most credentialed researchers are leaving the biggest labs, commanding valuations that would have seemed absurd two years ago, and clustering, for the most part, in London.

Walk through King’s Cross on any given Tuesday in 2026, and you’re within a few blocks of more concentrated AGI ambition than most cities have in total. The area around Google DeepMind’s main campus has become a gravitational centre — researchers spinning out, investors flying in, and a UK government that’s finally decided being an “AI taker” is a political liability.

The Sovereign AI Fund’s venture lead, Joséphine Kant, offered a sharp endorsement: “Very few founders in the world could credibly set out to build a superlearner. David is one of them.” Science and Technology Secretary Liz Kendall framed the investment as proof that the UK is an AI maker, not just a consumer. The political signalling is deliberate — the fund’s mandate specifically targets UK-anchored foundation model research, not applications or infrastructure.

The Unusual Part

Silver has pledged to donate all personal proceeds from Ineffable to high-impact charities through Founders Pledge. He described the venture as his “life’s work.” That’s either the most credible statement of long-termism in the current funding cycle, or it will be tested in ways he hasn’t fully imagined yet.

A $5.1 billion valuation on a company with nothing shipped is a bet on a name, a method, and a structural thesis about where AI hits its next ceiling. Silver has all three. Whether reinforcement learning can finally deliver on its long-deferred promise at the scale he’s attempting — that’s the only question that actually matters.
