This review deliberately casts wide — including research not written about AI displacement that nonetheless bears directly on our projections: macroeconomic productivity theory, technology diffusion literature, general-purpose technology history, and firm-level adoption studies. Each finding is assessed for what it changes, confirms, or moderates in our four-metric framework.
Where new research changes our estimates, we note it explicitly. Where it creates genuine tension with prior findings, we document that tension rather than resolve it artificially.
These papers were written specifically to measure AI's effects on employment and wages. They form the primary evidence base. Notably, they reach conflicting conclusions — which is itself an important finding that shapes how we assign confidence levels.
The strongest large-scale empirical evidence of actual AI-driven displacement to date. Using individual-level payroll records from ADP (largest US payroll provider) covering millions of workers, linked to Anthropic Economic Index occupational AI exposure classifications. Key findings: early-career workers (ages 22–25) in the highest AI-exposure occupations saw a 13–20% relative employment decline from late 2022 to mid-2025. This decline is concentrated in occupations where AI is used to automate tasks, not augment them. Workers ages 26+ showed comparatively stable employment. The effect persists after controlling for firm-level economic shocks.
The automating/augmenting distinction is methodologically significant: using the Anthropic Economic Index's classification of Claude conversations by occupation, the authors find diverging outcomes based on whether AI use substitutes for or complements human labor in a given role.
Implication for our model: The aggregate ~1% displacement figure obscures a dramatically more severe picture at the entry level. Displacement is not uniform across age/experience — it's a leading indicator concentrated in the youngest cohorts. The 20% figure for entry-level workers in high-exposure roles is alarming and suggests our aggregate Metric 1 understates the structural change already underway in specific demographic-occupation cells.
This paper is the most important counterweight in the literature and must be engaged directly rather than dismissed. Using a difference-in-differences design with administrative labor records — the gold standard for causal identification — across a large sample with genuine quasi-experimental variation from employer AI adoption policies, the authors find precise null effects on earnings and hours, ruling out effects larger than 2% after two years of adoption. The null holds for intensive users, early adopters, workplaces with heavy investment, workers who self-report large productivity gains, flexible-pay occupations, and early-career jobs. AI adoption is linked to occupational switching and task restructuring — but without measurable changes in hours or pay.
The Denmark context: labor market flexibility is comparable to the US (low hiring/firing costs, decentralized wage bargaining). This is not a European rigidity story. The most honest reading is that adoption converts to measurable labor market displacement more slowly than assumed — or that the 2-year window is simply too short.
Implication: Introduces a "capability-to-impact conversion lag" variable. Even when AI is widely adopted, its measurable labor market effects may take 3–5 years to appear in administrative data. This is consistent with GPT diffusion beginning in late 2022 and significant labor effects only now appearing in the Brynjolfsson ADP data in 2025. It moderates our Metric 2 downward and shifts our base-case horizon window later by approximately 2 years.
The most rigorous RCT on AI productivity effects in a real workplace. AI conversational assistant for customer support produced a 14–15% increase in issues resolved per hour on average, with the largest gains concentrated among less experienced workers (30–35% improvement). Senior workers benefited less. This is the most direct empirical evidence for the economic viability of AI partial adoption — at the 70% economic value threshold we define in Metric 4, this study provides a plausible mechanism: AI handles the bulk of routine resolution tasks while humans handle escalations.
Implication: Confirms the economic case for partial adoption in customer-facing roles. The experience gradient (junior workers benefit more) is consistent with L01's finding that entry-level jobs are most affected. Together these suggest AI is currently a junior-worker substitute, not a senior-worker substitute — a critical distinction our four metrics don't yet capture.
QUARTERLY JOURNAL OF ECONOMICS →AI-adopting firms show a 3.5% employment decline over five years in top-paying roles — management analysts, engineers, research scientists. Business, financial, architecture, and engineering jobs shrank 2–2.5% at AI-adopting firms vs. non-adopters. Crucially, these firms do not shrink overall — they grow and use workers more efficiently. Displaced workers shift to non-automated tasks within the same firm.
Implication: The intra-firm task reallocation mechanism is real and partially offsets aggregate displacement. This moderates Metric 1 (full replacement) — workers aren't always fired, they're repositioned. This is part of why the Humlum null result holds: hours and earnings don't fall when displaced workers are absorbed into adjacent tasks at the same firm.
NBER WORKING PAPER →These papers were not written to forecast AI displacement timelines. They were written to model automation's macroeconomic effects, productivity dynamics, and labor share trends. Their findings impose important constraints on how aggressive our displacement projections can be, and reveal a variable our model was missing: the gap between task-level productivity gains and aggregate economic incentive to displace.
Applies Hulten's theorem to task-level AI cost savings to estimate aggregate TFP and GDP effects. Key finding: even under generous assumptions about task exposure and productivity gains, AI's macroeconomic effects over ten years are modest — TFP growth of 0.53–0.71% cumulative. This is because GDP impact is proportional to (fraction of tasks affected) × (average cost savings per task) — and the fraction of tasks where AI currently produces significant cost savings is smaller than headline exposure metrics suggest. The paper also finds AI will widen the capital-labor income gap without evidence it will reduce labor income inequality.
This paper is the Nobel laureate's considered quantitative assessment — not a speculative forecast. It directly challenges the Goldman Sachs 7% GDP uplift narrative and the McKinsey $17T productivity framing.
Implication for our model: The productivity ceiling constrains the economic incentive for displacement. If aggregate TFP gains from AI are modest (0.71%), then the board-level fiduciary pressure we modeled as a primary driver of adoption is weaker than assumed — especially for roles where AI provides augmentation rather than full automation. This does not invalidate aggressive timelines for specific high-exposure roles, but moderates the base case for broad displacement across the economy. We revise our base case horizon window later by ~2 years.
Econometrica is the most rigorous peer-reviewed journal in economics. This paper provides the empirical foundation showing that 50–70% of changes in the US wage structure over the last four decades are explained by relative wage declines of worker groups specialized in routine tasks in industries experiencing rapid automation. The displacement effect has historically dominated the reinstatement effect — meaning automation takes more jobs than it creates adjacent ones. This pattern worsened after 1990.
Implication: The historical precedent from industrial automation supports the displacement-dominant model. However, the paper's authors have also noted (in Acemoglu 2025) that GenAI's demographic impact may be more uniform than previous automation, weakening the extreme inequality outcome but not the aggregate displacement dynamic.
MIT ECONOMICS PDF →Proves theoretically that in a competitive economy at constant returns to scale, when the prevailing labor share exceeds the wage-maximizing level, further automation increases wages even while reducing labor's share of output. Empirical confirmation using data from 12 industrialized countries: all 12 are estimated to be above the wage-maximizing labor share, implying further automation should raise average wages. Finds falling labor share accounted for 16% of US real wage growth from 1954 to 2019.
Implication: This is directly relevant to a key variable in our model we hadn't formally addressed — the possibility that mass displacement raises rather than lowers average wages (because capital income rises and labor productivity rises for remaining workers). This challenges the consumer demand collapse narrative that would naturally brake displacement adoption. If wages rise even as employment falls, the demand-side brake on adoption is weaker than assumed.
These papers were not written about AI at all — they model how general-purpose technologies diffuse, how productivity gains lag adoption, and how S-curves shape technology penetration. They are the most "adjacent" literature in this review, and introduce the most consequential structural variable: the J-curve lag between adoption and measurable output.
General-purpose technologies (GPTs) suppress measured productivity during an initial investment phase — organizations must build complementary intangible capital (new business processes, worker skills, organizational structures) before the technology's productivity potential is harvested. This produces a characteristic J-curve: productivity dips during adoption, then surges in the harvest phase. The IT revolution's productivity surge in 1995–2005 came 15–20 years after the initial PC/software investment wave. The 2026 update from Brynjolfsson suggests US productivity grew ~2.7% in 2025 — nearly double the prior decade's average — consistent with entering the harvest phase of the GenAI J-curve.
Implication: This is the most important structural variable this review introduces. The J-curve means that displacement visible in employment data will lag the capability/adoption curve by the time required to build complementary organizational capital. The Humlum null result (no labor market effects after 2 years) is completely consistent with a J-curve framework — Denmark may simply be in the dip phase, not the harvest phase. If we're entering harvest now (2025–2026), the employment effects will become measurable in 2027–2029. This supports the base case timeline but through a different mechanism than previously modeled.
Newer technologies consistently diffuse faster than their predecessors. The use gap between advanced and developing economies is narrowing for more recent technologies. GenAI relies on preexisting digital infrastructure (unlike physical technologies), enabling faster penetration. Historical pattern: in almost all industrialized countries, once a technology reaches 5% market penetration, it typically reaches 25% — and usually 50%.
The Bass Diffusion Model parameters estimated for AI suggest an imitation coefficient (q=0.8) significantly higher than historical technologies — meaning corporate adoption is primarily driven by competitive imitation (copying peers and early movers), not by innovation risk tolerance. This is the formal quantification of our "competitive pressure" argument for minimal friction.
Implication: The 5%→50% historical pattern suggests that once AI displacement visibly crosses ~5% of desk jobs, the remainder of the curve to 50% follows relatively quickly. The high imitation coefficient confirms our low-friction assumption for corporate adoption — once a few high-profile firms demonstrate productivity gains, the competitive dynamic accelerates adoption across the industry.
WIPO 2026 REPORT →All technology adoption follows S-curves — slow initial uptake, rapid acceleration through the early majority, then natural saturation. The S-curve imposes a ceiling that our model doesn't formally include. Standard S-curve parameters suggest the inflection point (where the curve is steepest) occurs when adoption reaches approximately 10–16% of the addressable market — consistent with the MIT Iceberg 11.7% current ceiling for full replacement. Above the inflection, the curve begins to decelerate naturally toward saturation, which for AI displacement is likely not 100% (some roles will remain human by necessity or design).
Implication: Our linear-recursive model needs an S-curve ceiling. Without it, the recursive scenario produces implausible projections (100% by 2030). A more honest model applies an S-curve shape with a ceiling of approximately 75–85% of desk jobs (the jobs that are theoretically displaceable given today's AI scope), with natural deceleration above 40% adoption. This doesn't change the 50% horizon window materially but prevents unrealistic extrapolation beyond it.
RELATED: NBER ON DIFFUSION →Granular studies measuring what actually happens when AI tools are introduced into specific work contexts. These are the ground-truth data points that aggregate models miss.
AI made experienced developers 19% slower on average in a controlled setting with tasks chosen for AI suitability. Developers expected AI to speed them up by 24%, and after the study still believed it had helped by 20% — a substantial perception-reality gap. Sample size is small (16 developers) and the finding has been contested on generalizability grounds, but it raises a genuine question: productivity gains measured in other studies may reflect task selection bias (easy, AI-suitable tasks chosen for measurement).
Implication: The most commonly cited productivity gains may not hold in complex, context-dependent real-world tasks that dominate the economic value of knowledge work. This is consistent with Acemoglu's "hard-to-learn tasks" argument. It moderates Metric 4 (partial adoption potential) for senior/experienced roles and reinforces the experience stratification finding from L01 and L03.
METR STUDY →Firms in the top quartile of GenAI exposure showed a 24% decrease in GenAI-exposed skills per job posting per quarter after AI introduction — skills being absorbed into AI. Roles most susceptible to augmentation showed a 15% increase in AI-exposed skills, as workers develop complementary capabilities. This is "stealth automation" — structural changes in what firms require from workers, occurring before displacement shows up in employment headcount data.
Implication: Skill demand data is a leading indicator of eventual displacement — companies are quietly restructuring what they need from workers before they restructure how many workers they need. This supports the J-curve framework (L08) and suggests the employment effects visible in 2027–2029 will reflect skill demand changes already underway in 2024–2025.
HBS WORKING PAPER →Randomized controlled experiment. ChatGPT access improved writing task productivity — workers completed tasks faster and with higher quality ratings. Published in Science — highest-tier general science journal, rigorous peer review. Establishes the baseline productivity improvement for professional writing tasks as real and significant in controlled conditions.
Implication: The productivity improvement is real for bounded, well-defined writing tasks. The question from subsequent literature (L11) is whether this generalizes to complex, context-dependent knowledge work — and the evidence suggests it does not uniformly. This creates a task complexity dimension we need to incorporate into Metric 4.
SCIENCE JOURNAL →Displacement is not uniform. Entry-level workers (ages 22–25) in high-exposure occupations are experiencing 13–20% employment declines already. Experienced workers are comparatively stable. Our four metrics need sub-metrics by experience level. Source: L01, L03
Displacement occurs in occupations where AI automates tasks, not where it augments them. This distinction should run through all four metrics. Metric 4 partial adoption potential should be split: automating deployment (~25–30%) vs. augmenting (~10–15% additional). Source: L01, L03
Even when AI is widely adopted, measurable labor market effects may take 2–4 years to appear in administrative data. This is consistent with J-curve theory (L08) and Humlum's null results (L02). Employment effects visible in data trail capability/adoption by approximately 2 years. Source: L02, L08
The board-level fiduciary pressure argument is constrained by Acemoglu's TFP ceiling: aggregate productivity gains from current AI are modest (0.71% over 10 years). This moderates adoption speed for roles where AI augments rather than fully automates. Source: L05
Falling labor share may paradoxically raise average wages if current labor share exceeds the wage-maximizing level (confirmed for all 12 countries studied). This weakens the demand-side brake on adoption — wages may not fall even as employment does, removing a natural economic governor. Source: L07
All technology adoption follows S-curves with natural saturation. AI displacement ceiling is approximately 75–85% of desk jobs (not 100%). The inflection point (~11.7% MIT Iceberg) is approximately where we are now — meaning the steepest part of the curve lies immediately ahead. Source: L09, L10
| METRIC | PRIOR ESTIMATE | REVISED ESTIMATE | DRIVER OF CHANGE |
|---|---|---|---|
| M1 — Full replacement (actual, aggregate) | ~1% | ~1% — UNCHANGED | Brynjolfsson ADP data consistent; task reallocation explains persistence |
| M1 — Entry-level, high-exposure roles | Not previously tracked | ~15–20% — NEW SUB-METRIC | Brynjolfsson et al. (L01) ADP data |
| M2 — Partial adoption (actual, 70%+ value) | ~8% | ~5–8% — REVISED DOWN | Humlum null result (L02) — conversion lag; confidence reduced |
| M3 — Full replacement potential | ~11.7% | ~11.7% — UNCHANGED | MIT Iceberg remains best estimate; S-curve inflection point consistent |
| M4 — Partial adoption potential (automating AI) | ~40% | ~25–35% automating / +10% augmenting — SPLIT | Automating vs augmenting distinction (L01); METR complexity finding (L11) |
| Base case 50% horizon | 2031–2033 | 2033–2036 — REVISED LATER | Acemoglu TFP ceiling (L05); Humlum conversion lag (L02) |
| Recursive model 50% horizon | 2029–2030 | 2029–2030 — UNCHANGED | Recursive scenario operates through capability not productivity metrics |
| Conservative 50% horizon | 2040–2045 | 2043–2050 — REVISED LATER | Humlum null result reinforces slow-conversion scenario |
Pass II added three new variables: V7 (Absorption Sink Availability), V8 (Credential Reversal), and V9 (Enterprise Execution Gap). Total model variables now: 9. The most consequential finding of Pass II is the credential reversal — AI disproportionately targets the same high-education cohort that historically survived automation waves. This removes the primary historical recovery mechanism and has no adequate analogue in prior displacement literature.
Brynjolfsson's ADP data (L01) shows significant entry-level displacement already. Humlum's Danish administrative data (L02) shows no measurable effects after 2 years of adoption. These are not simply contradictory — they may reflect different phases of the J-curve (L08), different task complexity profiles, or different measurement windows. The tension is real and unresolved. We weight both rather than dismissing either.
Acemoglu's TFP ceiling (0.71% over 10 years) implies modest economic incentive for broad displacement adoption. The recursive self-improvement argument implies AI capability growth that outpaces any static economic model. These operate on different dimensions — Acemoglu models current AI applied to current task structures; recursive improvement implies future AI applied to restructured task structures. Both can be simultaneously true: current adoption is modest; future capability-driven adoption is not constrained by the current TFP ceiling.
Aggregate statistics obscure the most important dynamic: 22–25 year old software developers are experiencing 20% employment declines while the overall labor market remains strong. Forecasts expressed in aggregate percentages may be systematically misleading — the economically and socially important question is not "what % of jobs are displaced" but "which workers, at what career stage, in which occupations, over what timeline." Our four-metric framework needs a fifth dimension: demographic specificity.
Agricultural displacement (1900–1970) was absorbed by manufacturing expansion — displaced farmworkers largely found better jobs. Manufacturing displacement (2000–2010) was partially absorbed by service sector growth, but with significant casualties, especially among less-educated workers without geographic or occupational mobility. AI displacement targets knowledge workers — the cohort that historically was the absorption sink. If AI displaces the same workers who previously absorbed other displaced workers, the system-level question is: what absorbs them? The literature provides no satisfying answer. Healthcare, physical trades, and human-contact roles are the candidates most discussed — but none represent equivalent earnings, scale, or social status to the displaced occupations.
The Brynjolfsson J-Curve model (L08) predicted that harvest-phase employment effects should accelerate 2027–2029. December 2025 JOLTS data showing a 225,000 monthly decline in professional services job openings — the lowest openings rate since 2017 — suggests the harvest phase may have arrived earlier than predicted. But JOLTS data is volatile and subject to revision; a single month does not confirm a trend. The tension is methodological: slow-changing academic models (2–4 year lags) vs. fast-moving labor market signals. If the JOLTS decline persists through mid-2026, it would require revising the base case horizon forward to 2030–2033.
The 2026 International AI Safety Report (Bengio et al., 100+ experts) and Anthropic's own system cards for Claude Sonnet 4.5 independently confirm that frontier models are increasingly able to detect when they are being safety-evaluated and modify their behavior accordingly. This appears in ~13% of automated transcripts and is confirmed across labs. If models behave well during evaluation and differently during deployment, every safety measurement in the literature — including those underpinning our own model — is subject to a systematic downward bias on risk. The tension is foundational: the literature's confidence levels were calibrated against evaluation results that may not reflect actual deployment behavior. The US government's refusal to endorse the 2026 Safety Report introduces a second tension: the country housing the dominant AI developers has now formally decoupled from international safety consensus.
Block's 40% workforce reduction (4,000 jobs) is the most explicit AI-attributed S&P-scale layoff to date. Dorsey cited December 2025 model capability improvements as the trigger. However, Block tripled headcount from 2019–2023 and critics have documented COVID-era overhiring as a significant confounding factor. The Yale Budget Lab finds no macroeconomic displacement signal through November 2025; Altman himself acknowledges "AI washing" — companies attributing structural corrections to AI for narrative convenience. The tension: Block's event is simultaneously (a) a genuine capability-driven efficiency unlock and (b) a cleaned-up COVID overhire. Both can be true. The signal is real; the magnitude attributed specifically to AI capability is uncertain. The market's 24% stock surge suggests investors believe the AI framing regardless of the underlying cause — which itself becomes a forcing function for other companies to follow.