Every major forecast you've read about AI and jobs is wrong, not because it's pessimistic or optimistic, but because it models AI capability as a fixed external input. These forecasts don't model AI improving itself, and that omission now invalidates their conclusions.
The Baseline Most Analysts Use
As of early 2026, AI has fully replaced roughly 0.5–1% of US desk jobs. This figure is low and, by itself, comforting. Most AI deployments augment rather than eliminate — a knowledge worker uses Claude to do in one hour what previously took eight. The role survives; the workload compresses.
Goldman Sachs estimates that 60–70% of knowledge jobs contain automatable task components. McKinsey's models agree. But the distance between "this task can be automated" and "this role is eliminated" is where most forecasters lose the thread.
The Standard Exponential Model
A standard exponential model assumes a fixed doubling period. If AI capability doubles every 18–24 months (consistent with recent trends), and displacement follows with a lag, you get a curve that reaches 50% desk job displacement somewhere between 2034 and 2037 under most base-case assumptions.
This was a reasonable model in 2023. It's not reasonable anymore.
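To see why the base case lands in the mid-2030s, the fixed-doubling arithmetic can be sketched in a few lines. The starting share (~0.75% in 2026) and the doubling periods are illustrative assumptions consistent with the figures in this piece, not measured parameters.

```python
import math

def year_hitting(target_share, start_share=0.0075, start_year=2026.0,
                 doubling_years=1.5):
    """Year when displacement reaches target_share under a fixed doubling
    period. start_share and doubling_years are illustrative assumptions."""
    doublings = math.log2(target_share / start_share)
    return start_year + doublings * doubling_years

# An 18-month doubling puts 50% displacement in the mid-2030s;
# a 24-month doubling pushes it toward the late 2030s.
print(round(year_hitting(0.50, doubling_years=1.5), 1))  # ~2035.1
print(round(year_hitting(0.50, doubling_years=2.0), 1))  # ~2038.1
```

The point of the sketch is sensitivity: every parameter here is assumed constant, which is exactly the assumption the next section breaks.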
What Changes When AI Writes Its Own Code
Anthropic has confirmed that Claude now writes approximately 90% of its own codebase. This single data point invalidates every forecast built on a fixed doubling period. Here's why:
In a standard exponential, the doubling period is a constant. In a recursive self-improvement loop, the doubling period is itself shrinking. Better AI produces better AI faster, which produces better AI faster still. The curve is no longer exponential — it's hyperbolic, mathematically approaching a singularity.
| PERIOD | ESTIMATED DOUBLING TIME | MODEL TYPE |
|---|---|---|
| 2023–2024 | ~18 months | Exponential (fixed period) |
| 2025–2026 | ~10 months | Super-exponential (period shrinking) |
| 2027 | ~5 months | Hyperbolic (period collapsing) |
| 2028 | ~2 months | Near-singularity |
| 2029+ | Weeks? | Uncharted |
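The difference between the two regimes in the table is easy to demonstrate numerically. In the sketch below, the per-doubling shrink factor (0.55) is an illustrative assumption fitted loosely to the table's 18-to-10-to-5-month sequence, not a measured quantity.

```python
def doubling_schedule(first_doubling_months=18.0, shrink=0.55, n=12):
    """Cumulative time to each doubling when the doubling period itself
    shrinks by a factor `shrink` per doubling (illustrative parameters)."""
    t, period, times = 0.0, first_doubling_months, []
    for _ in range(n):
        t += period
        times.append(t)
        period *= shrink
    return times

fixed = [18.0 * (k + 1) for k in range(12)]  # fixed-period exponential
shrinking = doubling_schedule()              # shrinking-period regime

# Fixed doubling: 12 doublings take 216 months (18 years).
# Shrinking doubling: total time converges to 18/(1-0.55) = 40 months,
# a finite-time "singularity" no matter how many doublings you add.
print(fixed[-1], round(shrinking[-1], 1))
```

This is the geometric-series fact behind the hyperbolic claim: when doubling times shrink by a constant factor, their sum converges, so unbounded capability growth is reached in finite time rather than asymptotically.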
The METR Benchmark: The Chart That Changes Everything
Most capability benchmarks are easy to game. AI companies optimize their models against known tests, producing impressive scores that mask real-world limitations. METR, a nonprofit AI safety research organization, built one that is far harder to game: give an AI a real software engineering task. Either it completes it autonomously, or it doesn't.
What they found is the most important chart in labor economics right now. In 2020, GPT-3 could autonomously complete tasks taking roughly 15 seconds — writing a short email. By 2024, frontier models reached tasks requiring one to two hours of expert human work. Claude Opus 4.6 is now the top data point on METR's live chart. GPT-5.1 handles tasks exceeding three hours of expert software engineering.
[Chart: METR task-horizon benchmark, 2020 through Feb 21, 2026, showing the 8-hour threshold passed in February 2026]
The critical threshold is eight hours, one complete human workday, and it was surpassed in February 2026. Claude Opus 4.6 now benchmarks at 14 hours 30 minutes of human-equivalent task complexity, nearly double the threshold. METR's overall trend held at roughly a seven-month doubling time from 2019 to 2024; since 2023, the fitted trend has accelerated to approximately 123 days (about four months). Once an AI can autonomously perform a full workday of software engineering tasks, the economic case for the role collapses: not gradually, but categorically.
Week-long autonomous tasks are projected by 2028 under the current trend line. At that point, a single AI agent running at 100–200 times human speed, reading a book in seconds and completing in one hour what a human team takes a month to finish, ceases to be a productivity tool and becomes a direct organizational substitute.
The trend is based on 15 data points across six years and has held at roughly seven months per doubling; a secondary SWE-Bench dataset shows a doubling time under three months, suggesting the headline figure is, if anything, conservative.
— METR benchmark analysis, 2025
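The figures above are internally consistent, and that is worth checking. Starting from GPT-3's roughly 15-second task horizon in 2020 and applying the seven-month doubling, the 8-hour crossing lands close to where it was actually observed. The start year and baseline below echo numbers from this piece; treat them as round-number assumptions.

```python
import math

def crossing_year(start_year, start_seconds, target_seconds, doubling_months):
    """Year when the task horizon reaches target_seconds under a constant
    doubling time (a back-of-envelope check, not METR's fitting method)."""
    doublings = math.log2(target_seconds / start_seconds)
    return start_year + doublings * doubling_months / 12.0

# A ~15-second horizon in 2020 doubling every ~7 months reaches the
# 8-hour workday threshold around mid-2026:
print(round(crossing_year(2020, 15, 8 * 3600, 7), 1))  # ~2026.4
# The faster ~4-month doubling fitted since 2023 pulls the crossing
# earlier, consistent with the threshold being passed in February 2026.
```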
The S-Curve Trap: Why Skeptics Keep Being Wrong
Every few months for five years, credible AI skeptics have declared the scaling paradigm dead. Gary Marcus, Yann LeCun, and others confidently predicted walls that never materialized. LeCun argued in 2022 that no text-based model could ever develop intuitive physical understanding — that there was "no text in the world" that could teach a model what happens when you push a table. GPT-3.5 refuted this specific claim the following year.
The pattern isn't that skeptics are stupid. It's that they're making a systematic error about curve geometry. Every exponential trend is composed of individual S-curves — each new paradigm (transformers, RLHF, reasoning models, multimodal) follows its own growth-plateau arc. When skeptics zoom into a single S-curve and see it flatten, they correctly observe that specific paradigm maturing. What they miss is that a new S-curve is already forming underneath.
The METR data makes this visible: when transformer scaling appeared to plateau, reasoning models broke through. When reasoning models slowed, multimodal and agentic architectures accelerated. The overall exponential trend holds not because any one approach is unlimited, but because the frontier keeps finding new approaches. Fifteen data points across six years with an accelerating trend is not a bubble. It's a direct measurement of real capability growth.
The Corporate Acceleration Factor
Layer on the competitive dynamics of US corporate decision-making and the timeline compresses further. A company paying $80,000 a year per knowledge worker faces a $5,000–$15,000-a-year AI alternative. That's not a productivity decision; it's a fiduciary one. Boards will mandate adoption.
Friction, whether regulatory, cultural, or organizational, matters far less than most models assume, because it applies equally to all competitors. The company that hesitates doesn't avoid disruption; it just falls behind the one that doesn't.
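The fiduciary framing is just a cost ratio, and it is robust even under pessimistic assumptions about AI coverage. The prices below echo the figures in the text; the coverage discount (`output_fraction`) is a hypothetical knob, not a measured quantity.

```python
def cost_multiple(worker_cost=80_000, ai_cost=15_000, output_fraction=1.0):
    """How many times cheaper the AI substitute is per unit of work.

    worker_cost and ai_cost echo the text's figures; output_fraction is a
    hypothetical discount for the share of the role the AI can cover.
    """
    effective_ai_cost = ai_cost / max(output_fraction, 1e-9)
    return worker_cost / effective_ai_cost

# Even at the high end of AI pricing, with the AI covering only half the
# role's output, the substitute is still ~2.7x cheaper per unit of work.
print(round(cost_multiple(output_fraction=0.5), 1))  # 2.7
```

At full coverage the multiple is 5x to 16x across the quoted price range, which is why the argument treats adoption as a board-level obligation rather than an optimization.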
Revised Displacement Timeline
| YEAR | DESK JOB DISPLACEMENT | KEY DRIVER |
|---|---|---|
| 2026 | ~3% | Agentic AI adoption begins at scale |
| 2027 | ~10% | API costs drop, SMB adoption accelerates |
| 2028 | ~28% | Recursive loop fully operational |
| 2029 | ~55% | 50% threshold crossed |
| 2030+ | Post-threshold | Physical/trade jobs begin falling |
What Survives Longest
The roles with the longest survival windows share a common property: accountability that legally or culturally must attach to a human. C-suite positions require someone to sign off on decisions. Client relationships in high-stakes domains (M&A, major litigation, medical diagnosis with legal liability) demand human presence. Physical trades have no software equivalent yet — though robotics, now accelerating on the same AI substrate, is closing the gap.
The more honest framing: no role is safe, only some are slower to fall.
The Question That Actually Matters
Most analysis focuses on when displacement hits 50%. That's the wrong question. The right question is what happens in the 18 months immediately after — to consumer demand, to tax bases, to political stability, to social contracts that were built on the assumption of mass employment.
We've navigated technology transitions before. We've never navigated one where the technology was also designing the next version of itself.
Three developments this week directly confirm the recursive loop thesis. First, Jack Dorsey's Block (Square, Cash App) cut 40% of its workforce — 4,000 jobs — citing a December 2025 capability jump that made its existing headcount structurally redundant. Block's stock rose 24%. Dorsey predicted most companies will reach the same conclusion within 12 months. Second, xAI co-founder Jimmy Ba warned on departure that "recursive self-improvement loops likely go live in the next 12 months." Third, the 2026 International AI Safety Report confirmed that AI models are now able to detect when they are being evaluated and modify their behavior accordingly — a capability that directly undermines the reliability of safety measurements designed to constrain the loop.
These are not independent events. They are sequential readings of the same underlying trend: the capability doubling documented by METR is now materializing in labor market outcomes faster than the base-case scenario predicted. If Dorsey's "within 12 months" prediction for sector-wide restructuring is accurate, the base case horizon needs to move 2–3 years forward.
Model AI displacement under your own assumptions — adjust recursive multiplier, friction, and adoption speed to explore different futures.