GPT-5 and the Race to AGI: Where AI Actually Stands in 2026

Something strange happened in early 2026. OpenAI released GPT-5 to widespread applause and genuine amazement, and then almost immediately, the conversation shifted. Not to what GPT-5 can do. But to what it still cannot. Because lurking behind every product demo and every benchmark score is a question that won't go away: are we actually close to artificial general intelligence?
I've been following this closely, and what I found is more nuanced than either the hype or the skepticism suggests. The truth is both more impressive and more humbling than most people realize.
What GPT-5 Actually Brought to the Table
Let's start with the concrete stuff. GPT-5 represents a genuine leap over its predecessor. Not an incremental improvement. A real, noticeable jump.
The most striking upgrade is in reasoning. GPT-5 can hold coherent chains of logic across much longer contexts. Where GPT-4 would sometimes lose the thread halfway through a complex argument, GPT-5 stays locked in. It can tackle multi-step mathematical proofs that would have tripped up earlier models. It can write functional code for systems involving dozens of interconnected components. It can analyze legal documents and spot contradictions that human reviewers missed.
On standard benchmarks, the numbers are staggering. GPT-5 scores above the 90th percentile on the bar exam, medical licensing exams, and graduate-level science assessments. On the MMLU benchmark, which tests knowledge across 57 subjects, it pushed past 92%. On coding evaluations like HumanEval and SWE-bench, it set records at levels that weren't considered achievable two years ago.
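To make a number like that concrete, here's a minimal sketch of how an MMLU-style multiple-choice score is computed. The sample question and the query_model function are placeholders invented for illustration, not OpenAI's actual evaluation harness.

```python
# Minimal sketch of MMLU-style scoring: each question is multiple
# choice, and the headline number is just the fraction answered
# correctly. query_model is a hypothetical stand-in for a real
# API call to the model under test.
questions = [
    {
        "prompt": "Which planet in the solar system is largest?",
        "choices": ["A. Mars", "B. Jupiter", "C. Venus", "D. Earth"],
        "answer": "B",
    },
    # ...one entry per exam question, drawn from 57 subjects
]

def query_model(prompt: str, choices: list[str]) -> str:
    """Placeholder: return the model's chosen letter (A-D)."""
    return "B"  # a real harness would query the model here

correct = sum(
    query_model(q["prompt"], q["choices"]) == q["answer"] for q in questions
)
print(f"accuracy: {correct / len(questions):.1%}")  # fraction answered correctly
```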
But benchmarks only tell part of the story. The real difference shows up in everyday use. GPT-5 is better at understanding what you actually want, not just what you literally typed. It asks better clarifying questions. It catches ambiguity that earlier versions would have just guessed at. It produces outputs that feel less like a very good autocomplete system and more like collaboration with someone who genuinely understands the subject.
The Competition Isn't Standing Still
OpenAI doesn't exist in a vacuum, and 2026 has made that clearer than ever.
Anthropic's Claude models have pushed the boundaries on safety and reliability, producing outputs that professionals increasingly trust for high-stakes decisions. Google DeepMind's Gemini Ultra 2.0 matched or exceeded GPT-5 on several scientific reasoning benchmarks. Meta's open-source Llama 4 models gave developers unprecedented access to frontier-level capabilities without subscription fees. And xAI's Grok continued to carve out territory with its real-time information integration.
The result is something nobody predicted five years ago. We don't have one dominant AI system. We have an ecosystem of competing frontier models, each with different strengths. GPT-5 might be the best at creative writing and nuanced conversation. Gemini might edge it out on scientific reasoning. Claude might be the safest choice for enterprise deployment. The landscape is genuinely competitive in ways that benefit everyone who uses these tools.
This matters because AGI isn't going to come from one company working in isolation. It's going to emerge from the collective pressure of dozens of labs pushing each other forward.
So What Is AGI, Really?
Here's where things get complicated. Everyone talks about AGI. Almost nobody agrees on what it means.
The original concept is deceptively simple: an artificial system that can perform any intellectual task that a human can. Not specialized excellence at one thing. General capability across everything. The kind of intelligence that can learn to cook dinner, write a symphony, debug a codebase, comfort a grieving friend, and navigate unfamiliar territory, all with roughly human-level competence.
By that definition, we are not close. Not because current AI systems aren't impressive. They are. But because the word "general" is doing an enormous amount of heavy lifting in that sentence.
GPT-5 can write beautiful poetry. It cannot smell a rose and describe how the scent makes it feel. It can analyze a business strategy. It cannot walk into a room and read the social dynamics. It can pass medical exams. It cannot perform a physical examination. Intelligence isn't just processing information. It's being embodied in the world, having experiences, building intuitions from sensory reality.
Some researchers have proposed narrower definitions. OpenAI itself has described AGI as "AI systems that are generally smarter than humans" at economically valuable work. By that more limited standard, we're closer. Maybe much closer. GPT-5 already outperforms most humans at many economically valuable tasks.
But that definition conveniently excludes all the things AI still can't do. It's like saying someone is a great athlete because they're excellent at chess. Technically true by one narrow reading. Deeply misleading by another.
What the Experts Actually Think
I went looking for consensus among AI researchers and found something fascinating: there isn't one. The field is genuinely split.
One camp, which includes several prominent figures at leading AI labs, believes we could see AGI (by some reasonable definition) within three to five years. Their argument rests on the pace of improvement. Each generation of frontier models has exceeded predictions. Scaling laws have held up remarkably well. The jump from GPT-4 to GPT-5 was larger than many expected. If that trajectory continues, we could reach systems that match or exceed human performance across most cognitive tasks by 2028 or 2029.
The other camp, which includes many academic researchers and some industry veterans, thinks that view is dangerously optimistic. They point out that we've been climbing the easiest slopes first. Pattern recognition, language processing, mathematical reasoning: these are all domains where throwing more compute at the problem yields returns. But the remaining challenges (common sense, physical understanding, genuine creativity, emotional intelligence) might require fundamentally different approaches. More data and bigger models might not be enough.
There's also a growing third group that thinks the question itself is wrong. They argue that intelligence isn't a single dimension you can measure and optimize. It's a constellation of capabilities that evolved together over millions of years, deeply intertwined with having a body, a social group, and survival pressures. Asking when AI will match human intelligence is like asking when a submarine will match a fish. The comparison doesn't quite work because they're different kinds of things, solving different problems in different ways.
The Benchmarks That Actually Matter
Standard benchmarks are starting to hit a ceiling. When your AI scores 92% on a test designed for humans, the test stops being informative. So researchers have developed new evaluation frameworks specifically designed to probe the gaps between current AI and genuine general intelligence.
The ARC-AGI benchmark, created by François Chollet, tests abstract reasoning through novel visual puzzles that require genuine generalization. No frontier model has cracked it convincingly. They can solve some puzzles, but they fail on others that any human child could figure out with a few minutes of thought.
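For a concrete sense of what these puzzles involve, here's a minimal sketch in the spirit of an ARC task. The grids and the transformation rule are invented for illustration; the train/test structure mirrors the public ARC dataset format, where a task shows a few example pairs and withholds the answer to a test input.

```python
# An ARC-style task: a few input/output grid pairs demonstrate a
# transformation, and the solver must apply it to a new input.
# Cell values are small ints (colors). This tiny task's rule is
# "mirror the grid horizontally" -- trivial to state, but it must
# be inferred from the examples alone.
task = {
    "train": [
        {"input": [[1, 0], [2, 0]], "output": [[0, 1], [0, 2]]},
        {"input": [[3, 0, 0], [0, 4, 0]], "output": [[0, 0, 3], [0, 4, 0]]},
    ],
    "test": [{"input": [[5, 0], [0, 6]]}],
}

def mirror(grid):
    """Reverse each row: the rule a solver must discover."""
    return [row[::-1] for row in grid]

# Check the inferred rule against every training pair before
# applying it to the held-out test input.
assert all(mirror(p["input"]) == p["output"] for p in task["train"])
print(mirror(task["test"][0]["input"]))  # [[0, 5], [6, 0]]
```

Humans infer rules like this almost instantly from two examples; the benchmark's difficulty for frontier models lies in that each task demands a fresh rule, so memorized patterns don't transfer.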
The BIG-Bench Hard collection includes tasks that specifically target weaknesses in language models: logical deduction, tracking state changes, understanding causal relationships. GPT-5 improved significantly on these, but still shows characteristic patterns of failure that humans don't exhibit.
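To illustrate the state-tracking category, here's a toy puzzle in the spirit of BIG-Bench Hard's shuffled-objects tasks; it's invented for illustration, not the benchmark's actual code. The ground truth takes three lines of bookkeeping, yet language models have historically lost the thread as the chain of swaps grows.

```python
# A toy state-tracking puzzle: three people start with one ball
# each, then swap in sequence. The question posed to a model is
# "who holds the red ball at the end?" -- answerable only by
# tracking every intermediate state.
holders = {"Alice": "red ball", "Bob": "blue ball", "Carol": "green ball"}
swaps = [("Alice", "Bob"), ("Bob", "Carol"), ("Alice", "Carol")]

for a, b in swaps:
    holders[a], holders[b] = holders[b], holders[a]  # exchange items

print(holders)  # {'Alice': 'red ball', 'Bob': 'green ball', 'Carol': 'blue ball'}
```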
Then there are real-world evaluations. Can an AI agent book a complex multi-leg international trip, handling unexpected cancellations and rebookings? Can it manage a small team's workflow over weeks, adapting to changing priorities? Can it learn a completely new skill from a YouTube tutorial? These practical tests reveal that current AI, including GPT-5, still struggles with the messy, unpredictable, multi-modal reality that humans navigate every day without thinking about it.
What This Means for the Next Few Years
Here's what I think is actually happening, stripped of both hype and doomerism.
We are building increasingly powerful cognitive tools. GPT-5 and its competitors are genuinely transforming knowledge work, creative production, scientific research, and software development. These tools are getting better fast, and they will continue to get better.
But calling them steps toward AGI might be misleading. They might be steps toward something entirely new: artificial narrow superintelligence. Systems that vastly exceed human capability in specific domains while remaining fundamentally limited in others. Not general intelligence, but specialized excellence at a previously unimaginable scale.
That's still game-changing. A world where AI can accelerate drug discovery, optimize energy grids, personalize education, and automate routine knowledge work is a profoundly different world from the one we lived in five years ago. You don't need AGI for that. You just need what we're already building, done better.
The honest answer to "how close are we to AGI" depends entirely on what you mean by AGI. If you mean a system that can outperform humans at most economically valuable cognitive tasks, we might be three to seven years away. If you mean a system that possesses the full breadth of human intellectual capability, including embodied understanding, social intelligence, and genuine creativity, we might be decades away. Or it might require approaches that nobody has invented yet.
The Part That Keeps Me Curious
What fascinates me most isn't the technology itself. It's the way GPT-5 and its peers are forcing us to think more carefully about what intelligence actually is.
For decades, we assumed that if you could do hard math and play chess and write essays, you were intelligent. AI has blown that assumption apart. It turns out those things, the things we thought were the pinnacle of intelligence, might be the easy parts. The hard parts might be the things we never even thought about because they come so naturally to us. Understanding a joke. Feeling embarrassed. Knowing when someone needs a hug.
That realization is itself a kind of progress. Not toward AGI, but toward understanding ourselves. And in the end, the race to AGI might teach us more about human intelligence than it does about artificial intelligence.
The next few years will be remarkable either way. Whether we reach AGI or not, the tools being built right now are already reshaping what's possible. The question isn't really whether machines will think like us. It's what we'll do with machines that think differently.