METHODOLOGY
Forecasting Methodology & Analytical Framework
Complete documentation of the nine-variable model, scenario construction logic, confidence calibration system, data inputs, update cadence, and honest limitations underpinning every projection published on this site.
FEB 27, 2026
10 MIN READ
AI LABS RESEARCH
FRAMEWORK OVERVIEW
CORE VARIABLES: 9 (V1 through V9)
SCENARIO TIERS: 3 (Conservative / Base / Recursive)
CONFIDENCE LEVELS: 3 (High / Medium / Low)
01. Nine-Variable Framework
THE STRUCTURAL INPUTS THAT DRIVE EVERY AI LABS PROJECTION
Every displacement forecast published on this site is generated from a model built on nine core variables. These variables were selected because they represent the minimum set of independent causal factors required to model AI labor displacement with structural fidelity. Each variable is measurable (though measurement quality varies), each has documented sources, and each can be independently adjusted to explore different futures.
The framework is deliberately parsimonious. More variables would increase apparent precision but not actual accuracy. We model the forces that matter most and are transparent about what is excluded.
V1
AI Capability Growth Rate
The annualized rate at which frontier AI models gain measurable task capability, calibrated against the METR autonomous task benchmark. This is not a measure of parameter count, training compute, or benchmark scores on synthetic tests. It tracks the duration of real-world expert tasks that AI can complete autonomously. As of February 2026, the METR doubling period is approximately 7 months overall and ~4 months since 2023, based on 15 data points across 6 years. V1 is the single most consequential variable in the model. Small changes in V1 cascade through every downstream projection.
UNIT: months (doubling period)
SOURCE: METR TH1.1
CONFIDENCE: HIGH
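Given a doubling period, V1 translates mechanically into a projected task horizon. A minimal sketch, using illustrative numbers (the 14.5-hour baseline is the METR frontier figure cited in this document; the clean-exponential form ignores any V2 feedback):

```python
def capability_horizon(months_elapsed: float,
                       baseline_hours: float = 14.5,
                       doubling_months: float = 7.0) -> float:
    """Projected autonomous-task horizon (hours of human-equivalent work)
    under a fixed V1 doubling period. Illustrative only: assumes clean
    exponential growth with no self-improvement feedback."""
    return baseline_hours * 2 ** (months_elapsed / doubling_months)

# Two doubling periods (14 months at the ~7-month rate) quadruple the horizon:
# capability_horizon(14) -> 58.0 hours
```

This is why small changes in V1 cascade: shortening the doubling period from 7 to 4 months more than triples the number of doublings in any fixed window.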
V2
Recursive Self-Improvement Factor
A multiplier representing the degree to which AI systems contribute to their own capability improvement, thereby shortening V1. When AI writes its own training code, optimizes its own architecture, and generates its own synthetic training data, the doubling period is no longer exogenous -- it becomes a function of existing capability. Anthropic has confirmed that Claude writes approximately 90% of its own codebase. This variable captures the feedback loop: better AI produces better AI faster. A value of 1.0 means no self-improvement effect (fixed doubling period). Values above 1.0 mean the curve is super-exponential. Values above ~1.5 produce hyperbolic dynamics within the model's time horizon.
UNIT: dimensionless multiplier
SOURCE: Anthropic, ML research
CONFIDENCE: MEDIUM
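The qualitative claim above (V2 = 1.0 gives exponential growth; V2 > 1.0 gives super-exponential growth) can be made concrete with one illustrative functional form, in which each successive capability doubling takes 1/V2 of the previous period. This is a hypothetical parameterization, not the model's published equations -- the production model evidently uses a different form, since it places the hyperbolic threshold near V2 ≈ 1.5 rather than at any value above 1.0:

```python
def doublings_within(horizon_months: float,
                     first_doubling: float = 7.0,
                     v2: float = 1.0,
                     cap: int = 1000) -> int:
    """Count capability doublings that fit inside a time horizon when each
    doubling takes 1/v2 of the previous period (illustrative form only).

    v2 == 1.0: fixed doubling period -> plain exponential growth.
    v2 >  1.0: the total time for infinitely many doublings is finite
               (first_doubling * v2 / (v2 - 1)), i.e. hyperbolic dynamics."""
    elapsed, period, n = 0.0, first_doubling, 0
    while elapsed + period <= horizon_months and n < cap:
        elapsed += period
        period /= v2
        n += 1
    return n

# With v2 = 1.0, 28 months at a 7-month period yields exactly 4 doublings.
# With v2 = 1.35, the period series converges at 7 * 1.35 / 0.35 = 27 months,
# so the count blows up (hits the safety cap) inside the same 28-month horizon.
```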
V3
Adoption Friction Coefficient
Measures the organizational, technical, and cultural resistance to deploying AI in roles where it is already technically capable. This is the gap between "AI can do this" and "organizations actually use AI to do this." Friction includes IT infrastructure readiness, change management costs, vendor lock-in, data migration complexity, middle-management resistance, and simple institutional inertia. The MIT Iceberg Index quantifies this gap: as of November 2025, 11.7% of US jobs are economically viable for AI substitution, but actual displacement is approximately 1%. The ratio implies a friction coefficient of roughly 0.90 -- meaning 90% of technically feasible displacement has not yet converted into actual job loss.
UNIT: 0-1 scale (0=no friction, 1=total resistance)
SOURCE: MIT Iceberg Index
CONFIDENCE: HIGH
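The friction coefficient stated above follows directly from the gap between viable and actual displacement. A one-line check, using the MIT Iceberg figures from the text:

```python
def friction_coefficient(viable_share: float, actual_share: float) -> float:
    """V3: the fraction of economically viable displacement that has NOT yet
    converted into actual job loss."""
    return 1.0 - actual_share / viable_share

# 11.7% viable vs ~1% actual (MIT Iceberg Index, Nov 2025):
# friction_coefficient(0.117, 0.01) -> ~0.91, i.e. "roughly 0.90" as stated.
```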
V4
Regulatory Drag
Quantifies the decelerating effect of government regulation, litigation risk, and compliance requirements on AI deployment speed. Includes existing labor law, emerging AI-specific regulation (EU AI Act, proposed US frameworks), sector-specific rules (healthcare, finance, legal), and liability uncertainty. Regulatory drag does not prevent displacement -- it delays it. Historical precedent (ride-sharing, fintech, telemedicine) suggests regulatory drag typically adds 2-5 years to adoption timelines but does not alter terminal adoption levels. In our model, V4 shifts the displacement curve rightward on the time axis without changing its ultimate shape.
UNIT: years of delay
SOURCE: Policy analysis, EU AI Act
CONFIDENCE: MEDIUM
V5
Economic Incentive Multiplier
The cost differential between human labor and AI alternatives for equivalent task output. This is the fiduciary driver: when AI delivers the same output at one-fifth to one-fifteenth the cost of a human employee, deployment becomes not a technology decision but a shareholder obligation. V5 captures the ratio of fully-loaded human labor cost (salary, benefits, office space, management overhead, error rates) to equivalent AI cost (API pricing, integration, monitoring, quality assurance). As of early 2026, the ratio ranges from 5:1 for routine knowledge work to 20:1 for high-volume data processing. V5 is accelerating as API costs fall approximately 10x per year while human labor costs inflate 3-4% annually.
UNIT: cost ratio (human:AI)
SOURCE: BLS, corporate filings
CONFIDENCE: HIGH
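Because the two cost trends cited above compound in opposite directions, the ratio grows multiplicatively. A sketch of the forward projection (the ~10x/year API decline and ~3.5% wage inflation are the trend rates from the text; the smooth-compounding form is an assumption -- real API price drops are lumpy):

```python
def projected_cost_ratio(ratio_now: float,
                         years: float,
                         api_cost_decline: float = 10.0,
                         wage_inflation: float = 0.035) -> float:
    """Project V5 forward: human costs inflate while AI costs fall ~10x/year.
    Naive compounding sketch, not the production model's parameterization."""
    return ratio_now * (1 + wage_inflation) ** years * api_cost_decline ** years

# An 8:1 ratio today becomes ~82.8:1 after one year under these trend rates.
```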
V6
Labor Market Elasticity
Measures the capacity of the labor market to absorb displaced workers through new task creation, reskilling, and sectoral reallocation. Historical technology transitions (mechanization, electrification, computerization) saw high elasticity: displaced workers eventually found new roles in new industries. V6 captures whether this absorption capacity holds in the AI transition. The critical question: does AI create enough new human-complementary tasks to offset displacement? Acemoglu & Restrepo (2019) show that reinstatement effects have weakened in recent decades. If V6 is low, displacement accumulates rather than redistributes.
UNIT: reabsorption rate (0-1)
SOURCE: Acemoglu & Restrepo (NBER)
CONFIDENCE: LOW
V7
Task Decomposability Index
Measures the degree to which a given role can be decomposed into discrete, automatable subtasks versus requiring holistic judgment that resists decomposition. Roles with high V7 (data entry, report generation, code review, customer service scripting) are automatable at the task level even before full-role replacement is feasible. Roles with low V7 (executive decision-making, complex negotiations, novel creative direction) involve contextual reasoning across domains that resists clean task boundaries. Goldman Sachs estimates 60-70% of knowledge jobs have high-V7 task components. The insight: displacement happens task-by-task before it becomes visible as headcount reduction.
UNIT: 0-1 index
SOURCE: Goldman Sachs, O*NET analysis
CONFIDENCE: MEDIUM
V8
Physical Bottleneck Factor
Captures the degree to which a role requires physical-world interaction that software cannot perform. Pure knowledge work (V8 near 0) is fully exposed to AI displacement. Roles requiring hands, spatial navigation, or real-time physical manipulation (V8 near 1) have a hardware dependency that current robotics cannot satisfy at competitive cost. V8 is the primary reason desk jobs face earlier displacement than trades. However, V8 is not static: advances in embodied AI, humanoid robotics (Tesla Optimus, Figure), and industrial automation are steadily reducing the physical bottleneck. Our model treats V8 as declining over time, reaching meaningful automation capability for structured physical tasks by 2030-2032.
UNIT: 0-1 scale (0=pure digital, 1=pure physical)
SOURCE: O*NET, robotics research
CONFIDENCE: MEDIUM
V9
Social Resistance Variable
Models the non-economic, non-regulatory human resistance to AI adoption: public backlash, union action, consumer preference for human interaction, cultural norms around human accountability, and political movements opposing automation. V9 is the least quantifiable variable in the framework. Historical precedent (Luddites, automation protests in the auto industry, gig economy backlash) suggests social resistance modulates timing but does not reverse technological adoption curves. The exception scenario: if displacement reaches politically destabilizing levels (>15-20% unemployment), social resistance could produce hard policy interventions (mandated human staffing, automation taxes) that fundamentally alter the curve. Our model includes this as a threshold effect, not a continuous variable.
UNIT: 0-1 scale (qualitative)
SOURCE: Historical analysis, polling
CONFIDENCE: LOW
FRAMEWORK NOTE
These nine variables are not independent. V1 and V2 form a feedback loop. V5 drives adoption speed, which reduces V3 over time. V9 can trigger V4 through political channels. The model handles these interactions through iterative simulation, not closed-form equations. This means small parameter changes can produce nonlinear output differences -- a feature, not a bug, as it reflects the genuine structure of the system being modeled.
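To illustrate what "iterative simulation, not closed-form equations" means in practice, here is a deliberately simplified sketch of the structure: coupled state updated in monthly steps, with the V5-erodes-V3 interaction and the V9-triggers-V4 threshold effect wired in. Every functional form and coefficient below is a hypothetical placeholder, not the production model:

```python
def simulate(params: dict, years: int = 10,
             ceiling: float = 0.5, displaced: float = 0.01) -> tuple:
    """Toy coupled simulation of V3, V4, V5, V9 and a displacement share.
    Placeholder dynamics; only the *structure* mirrors the described approach."""
    v = dict(params)                # don't mutate the caller's scenario
    dt = 1.0 / 12.0                 # monthly time step
    for _ in range(years * 12):
        # Interaction 1: cost incentive (V5) erodes adoption friction (V3)
        v["V3"] = max(0.0, v["V3"] - 0.002 * v["V5"] * dt)
        # Displacement closes the gap to a feasible ceiling at a friction-gated rate
        displaced += (1.0 - v["V3"]) * 0.4 * (ceiling - displaced) * dt
        # Interaction 2: threshold effect -- past ~15% displacement, social
        # resistance (V9) converts into added regulatory drag (V4)
        if displaced > 0.15 and v["V9"] > 0.3:
            v["V4"] += 1.0 * dt
    return displaced, v
```

Even in this toy version, the nonlinearity the note describes is visible: small parameter changes shift when (and whether) the threshold fires, which discontinuously changes V4's trajectory.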
02. Scenario Construction Methodology
HOW CONSERVATIVE, BASE CASE, AND RECURSIVE SCENARIOS ARE BUILT
All AI Labs projections are published as three parallel scenarios, not point predictions. Each scenario represents a coherent, internally consistent set of V1-V9 parameter values. The scenarios are not optimistic/pessimistic -- they correspond to different structural assumptions about the world.
The table below documents the exact parameter settings for each scenario. Readers can use the interactive displacement model on this site to explore intermediate parameter combinations.
| VARIABLE | CONSERVATIVE | BASE CASE | RECURSIVE |
|---|---|---|---|
| V1 — Capability Growth | 12-mo doubling | 7-mo doubling | 4-mo doubling (accelerating) |
| V2 — Self-Improvement | 1.05x (negligible) | 1.35x (moderate loop) | 1.60x (active loop) |
| V3 — Adoption Friction | 0.70 (high inertia) | 0.40 (moderate) | 0.10 (minimal) |
| V4 — Regulatory Drag | +4 years delay | +2 years delay | +0.5 years delay |
| V5 — Cost Incentive | 3:1 ratio | 8:1 ratio | 15:1 ratio |
| V6 — Market Elasticity | 0.60 (strong reabsorption) | 0.35 (partial) | 0.10 (weak) |
| V7 — Decomposability | 0.40 (limited) | 0.60 (moderate) | 0.85 (deep decomposition) |
| V8 — Physical Bottleneck | 0.80 (strong barrier) | 0.60 (moderate barrier) | 0.30 (barrier eroding) |
| V9 — Social Resistance | 0.50 (meaningful pushback) | 0.20 (limited effect) | 0.05 (negligible) |
| SCENARIO OUTPUT | CONSERVATIVE | BASE CASE | RECURSIVE |
|---|---|---|---|
| 10% desk job displacement | 2032 | 2027 | 2027 |
| 25% desk job displacement | 2037 | 2029 | 2028 |
| 50% desk job displacement | 2040-2045 | 2031-2033 | 2029 |
| Physical/trade job impact begins | 2038+ | 2030-2032 | 2028-2029 |
| Curve geometry | Standard S-curve | Accelerating exponential | Hyperbolic |
Conservative scenario assumes AI capability growth decelerates to a 12-month doubling period (consistent with a paradigm plateau), self-improvement effects are negligible, adoption faces persistent organizational friction, and regulatory environments significantly constrain deployment speed. This scenario is consistent with the position held by most mainstream labor economists.
Base case scenario extends current METR-measured trends without acceleration or deceleration. The recursive self-improvement loop exists but is moderate (1.35x multiplier, derived from Anthropic's code-authorship data). Adoption friction is real but erodes under competitive pressure. This is our central projection and the one used in headline forecasts.
Recursive scenario assumes the self-improvement feedback loop fully materializes, producing hyperbolic rather than exponential growth. Adoption friction collapses as the cost differential becomes irresistible. Regulatory response is slow. This is a tail-risk scenario, not a central forecast -- but the METR acceleration trend (7 months overall, 4 months since 2023) means it cannot be dismissed.
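For readers who want to reproduce scenario runs against the interactive model, the parameter table above transcribes directly into code (values copied verbatim from the table; the model's internal representation may differ):

```python
# V1 in months (doubling period), V4 in years of delay, V5 as the human:AI
# cost ratio; V2 is a dimensionless multiplier; V3 and V6-V9 are 0-1 scales.
SCENARIOS = {
    "conservative": {"V1": 12.0, "V2": 1.05, "V3": 0.70, "V4": 4.0,
                     "V5": 3.0,  "V6": 0.60, "V7": 0.40, "V8": 0.80, "V9": 0.50},
    "base":         {"V1": 7.0,  "V2": 1.35, "V3": 0.40, "V4": 2.0,
                     "V5": 8.0,  "V6": 0.35, "V7": 0.60, "V8": 0.60, "V9": 0.20},
    "recursive":    {"V1": 4.0,  "V2": 1.60, "V3": 0.10, "V4": 0.5,
                     "V5": 15.0, "V6": 0.10, "V7": 0.85, "V8": 0.30, "V9": 0.05},
}
```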
03. Data Sources Overview
PRIMARY INPUTS TO THE FORECASTING MODEL
All AI Labs projections derive from primary institutional research and direct data sources. We do not cite media analysis, opinion columns, or secondary coverage as evidence. Where institutional research (e.g., Goldman Sachs) is used, we reference the primary research report, not news articles about it. Our full source documentation is available on the Sources page.
- ACADEMIC: Acemoglu & Restrepo (NBER, 2018/2019/2022) -- the task-based displacement-reinstatement framework that provides the theoretical foundation for all serious AI labor modeling. Published in Econometrica (the field's most rigorous journal). Calibration basis for V6 and V7 parameters.
- ACADEMIC: MIT Iceberg Index (CSAIL, November 2025) -- economic viability threshold study measuring where AI substitution is financially rational, not merely technically feasible. Primary source for V3 calibration and the current-state displacement estimate (11.7% viable, ~1% actual).
- ACADEMIC: Harvard Business School Working Paper 25-039 -- firm-level analysis of generative AI effects on skill requirements using O*NET and LightCast data (923 occupations, 2019-2024). Documents a 24% decline in AI-exposed skills at high-automation firms.
- ACADEMIC: Schmidt et al. (NBER Working Paper 33509) -- 58 million LinkedIn profiles and 14 million job postings analyzed against O*NET activity data. Shows a 3.5% employment decline in top-paying roles at AI-adopting firms over five years.
- BENCHMARKS: METR Autonomous Task Benchmark (TH1.1) -- the primary calibration source for V1 (AI Capability Growth Rate). Measures real-world autonomous task completion by frontier models. 15 data points across 6 years. Current frontier: Claude Opus 4.6 at 14h 30m human-equivalent task complexity. Doubling period: ~7 months overall, ~4 months since 2023.
- GOVERNMENT: Bureau of Labor Statistics (BLS) -- monthly employment data (Current Employment Statistics, CES), occupational employment and wage statistics (OES), and the quarterly census of employment and wages (QCEW). Used for baseline labor market parameters, wage data (V5 calibration), and real-time employment trend validation.
- GOVERNMENT: 2026 International AI Safety Report -- cross-government assessment of AI capability trajectories and safety risks. Documents AI evaluation-awareness and behavioral modification capabilities relevant to V2 assessment.
- CORPORATE: Goldman Sachs Global Investment Research (Briggs & Dong, 2025) -- 800+ occupation analysis estimating 6-7% workforce displacement under wide adoption (range 3-14%). Provides V7 calibration and sector-level risk decomposition.
- CORPORATE: Quarterly earnings calls and investor presentations -- direct statements from technology companies, major employers, and AI infrastructure providers regarding AI deployment, headcount decisions, and cost savings. Sources include Anthropic, OpenAI, Google DeepMind, Microsoft, Block (Square), and others. Used for real-time V3 and V5 calibration.
- TRACKERS: Technology layoff trackers (Layoffs.fyi, TrueUp, Challenger Gray) -- aggregated layoff announcements cross-referenced with company AI investment disclosures to distinguish AI-correlated workforce reductions from cyclical adjustments. Used as a leading indicator, not as a primary data source, due to attribution uncertainty.
- REGULATORY: EU AI Act, proposed US AI frameworks, and sector-specific guidance (SEC, HHS, DOL) -- tracked for V4 (Regulatory Drag) calibration. Regulatory trajectory is monitored for both direct deployment restrictions and indirect effects (compliance costs, liability exposure, mandatory human-in-the-loop requirements).
DATA INTEGRITY PRINCIPLE
We distinguish three tiers of evidence in all published analysis: directly measured (METR benchmarks, BLS statistics, published corporate data), peer-reviewed inference (Acemoglu & Restrepo frameworks, MIT Iceberg methodology), and AI Labs extrapolation (recursive acceleration projections, scenario-specific timelines). Every claim on this site is tagged with its evidence tier. Where we extrapolate, we say so explicitly.
04. Confidence Calibration System
WHAT HIGH, MEDIUM, AND LOW CONFIDENCE MEAN -- AND WHAT EVIDENCE IS REQUIRED FOR EACH
Every projection, claim, and parameter estimate published on AI Labs carries a confidence designation. These are not subjective feelings -- they correspond to specific evidence thresholds. The system is designed to prevent a common failure mode in forecasting: presenting speculative extrapolations with the same rhetorical weight as empirically grounded measurements.
HIGH CONFIDENCE
Requires direct empirical measurement from at least one peer-reviewed study or primary institutional dataset, corroborated by at least one independent secondary source. The claim must be falsifiable and the measurement methodology must be documented.
Examples: METR capability doubling trend (15 data points), MIT Iceberg viability ceiling (11.7%), BLS employment statistics, Acemoglu displacement-reinstatement framework, current AI cost ratios from published API pricing.
MEDIUM CONFIDENCE
Requires credible institutional research with documented methodology, or a trend extrapolation supported by 3+ data points with a plausible causal mechanism. Some interpretation or synthesis is involved, but the underlying data is strong.
Examples: Goldman Sachs 6-7% displacement estimate, recursive self-improvement multiplier (based on Anthropic statements + ML research trends), V4 regulatory delay estimates, near-term adoption speed projections.
LOW CONFIDENCE
Requires only a plausible causal argument supported by historical analogy or limited trend data. These are scenario-level projections, not forecasts. Low-confidence claims must be explicitly labeled as speculative in all published analysis.
Examples: Hyperbolic curve geometry by 2027-28, 50% displacement by 2029 (recursive scenario), labor market reabsorption capacity (V6), social resistance threshold effects, physical bottleneck erosion timeline.
CALIBRATION ACCOUNTABILITY
We track our confidence calibration accuracy over time. If claims designated "high confidence" prove wrong more than 10% of the time, or "medium confidence" claims prove wrong more than 40% of the time, the calibration system itself needs recalibration. This is tracked in our quarterly deep reviews and published transparently.
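The accountability rule above reduces to a simple check over resolved claims. A sketch; the `(tier, was_correct)` record format is an assumption about how such a log might be kept, not a description of our internal tooling:

```python
def calibration_report(resolved_claims, thresholds=None):
    """Check the calibration rule: 'high'-confidence claims may be wrong at
    most 10% of the time, 'medium' at most 40%.

    resolved_claims: iterable of (tier, was_correct) pairs for claims that
    have resolved (hypothetical log format)."""
    thresholds = thresholds or {"high": 0.10, "medium": 0.40}
    report = {}
    for tier, max_error in thresholds.items():
        outcomes = [ok for t, ok in resolved_claims if t == tier]
        if not outcomes:
            continue  # no resolved claims at this tier yet
        error_rate = 1.0 - sum(outcomes) / len(outcomes)
        report[tier] = {"error_rate": error_rate,
                        "within_bound": error_rate <= max_error}
    return report
```

If any tier's `within_bound` flag is false, the calibration system itself is flagged for recalibration in the next quarterly review.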
05. Update Cadence
WHEN AND HOW PROJECTIONS ARE REVISED
Forecasting models are only as good as their maintenance discipline. Static projections degrade rapidly in a domain where the underlying dynamics shift on monthly timescales. AI Labs maintains a structured revision schedule with three tiers.
MONTHLY
Forecast Revisions
All V1-V9 parameters are re-evaluated against the latest available data. METR benchmark updates, BLS employment releases, and corporate AI deployment announcements are integrated. Parameter adjustments are documented with change logs. The displacement timeline table is regenerated. Published on the first week of each month.
QUARTERLY
Deep Reviews
Full model re-evaluation including structural assumptions, scenario definitions, and confidence calibration accuracy. Incorporates new academic publications, major government reports, and quarterly corporate earnings data. The quarterly review may produce scenario redefinitions, variable additions or removals, or methodology changes. Published as a standalone analysis article.
REAL-TIME
Breaking Adjustments
Material events that invalidate or significantly shift parameter assumptions trigger immediate model updates outside the scheduled cadence. The criterion: a single event that moves any V1-V9 parameter by more than 15% from its current setting. Examples: a new METR data point showing acceleration or deceleration, a major economy passing significant AI regulation, a frontier lab announcing a capability breakthrough or hitting a confirmed scaling wall.
All revisions include a change log documenting which parameters moved, in which direction, by how much, and why. Historical parameter values are preserved for audit and calibration tracking. We do not silently update projections -- every change is versioned and explained.
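The 15% breaking-adjustment criterion is mechanical enough to express as a check (a sketch; the dict-of-parameters representation follows the V1-V9 naming used throughout but is an assumption about implementation):

```python
def breaking_update_required(current: dict, proposed: dict,
                             threshold: float = 0.15) -> dict:
    """Return the parameters whose proposed values move more than `threshold`
    (15% by default) relative to their current setting -- the real-time
    update trigger described above."""
    return {k: proposed[k] for k in current
            if current[k] != 0
            and abs(proposed[k] - current[k]) / abs(current[k]) > threshold}

# A new METR point implying a 5.5-month doubling period moves V1 by ~21%
# from 7.0 -- enough to trigger an out-of-cycle update.
```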
06. Caveats & Limitations
HONEST ASSESSMENT OF WHAT THIS MODEL DOES NOT AND CANNOT CAPTURE
No forecasting model is better than its weakest structural assumption. We are explicit about the limitations of this framework because intellectual honesty is more valuable than false precision. Users of these projections should weight them accordingly.
- No model has successfully predicted AI capability growth more than 18 months in advance. Every major forecast from 2020-2023 required significant upward revision. Our model inherits this fundamental limitation. Projections beyond 2028 should be treated as scenario explorations, not timeline commitments.
- The recursive self-improvement variable (V2) is the least empirically grounded parameter. While AI code self-authorship is documented, no peer-reviewed study has quantified the degree to which this compresses capability doubling periods. The hyperbolic curve geometry in the recursive scenario is a mathematical extrapolation of a plausible dynamic, not an observed phenomenon. It may encounter ceilings (architectural limits, training data constraints, compute/energy bottlenecks) that flatten the curve before hyperbolic dynamics manifest.
- The model does not capture new task creation with structural fidelity. V6 (Labor Market Elasticity) approximates reabsorption as a single parameter, but historical technology transitions created entirely new industries and job categories that were unforeseeable ex ante. If AI generates a comparable wave of new human-complementary tasks, our displacement projections will overstate actual unemployment. This is the strongest counterargument to our central projections, and we take it seriously.
- Feedback effects from displacement itself are modeled only as threshold triggers, not continuous dynamics. Mass unemployment reduces consumer spending, which reduces corporate revenue, which changes AI investment calculus. Political instability from displacement can produce regulatory responses that alter the adoption curve. These second-order effects are real but computationally intractable to model with current data. Our model captures them as discrete scenario switches, not smooth feedback loops.
- Geographic, sectoral, and demographic granularity is limited. V1-V9 are calibrated primarily against US data. International displacement timelines differ due to labor cost structures, regulatory environments, and infrastructure readiness. Within the US, displacement will not be uniform across sectors, regions, or demographic groups. Our model produces national aggregates; local reality will vary substantially.
- Historical precedent for forecast failures is sobering. The 2013 Frey & Osborne study estimated 47% of US jobs were at risk of automation -- a figure that became a media sensation but proved methodologically flawed (it measured task exposure, not economic viability or adoption probability). McKinsey's 2017 estimates required repeated revision. Our framework was designed to avoid these specific errors, but there are certainly errors we have not anticipated.
- The model assumes rational economic actors. In practice, organizations make suboptimal decisions due to political dynamics, sunk-cost fallacies, vendor relationships, and executive ego. These irrational frictions may slow adoption beyond what V3 captures. Conversely, herd behavior and competitive panic may accelerate adoption beyond rational equilibrium. Both directions of error are possible.
- Uncertainty ranges widen dramatically beyond 3-year horizons. Our 2026-2028 projections carry meaningful uncertainty bands (plus or minus 30-50%). Our 2029-2032 projections carry very wide uncertainty bands (plus or minus 100% or more on timing). Our 2033+ projections are scenario sketches, not forecasts. We present them because understanding the shape of possible futures is valuable even when precise timing is unknowable.
- Black swan events are not modeled. A major geopolitical conflict disrupting semiconductor supply chains, a fundamental breakthrough in AI safety that produces voluntary deployment constraints, a global financial crisis that collapses AI investment, or an unexpected technical breakthrough that leapfrogs current architectures -- any of these would invalidate model parameters in ways that cannot be predicted from existing data. The model maps the space of futures consistent with current trends. It does not map all possible futures.
THE HONEST BOTTOM LINE
Treat all projections on this site as structured scenario maps, not predictions. The value of this framework is not in telling you what year 50% displacement will occur. It is in making the structural logic visible -- showing which variables matter, how they interact, and what evidence would need to change to shift the outlook materially. If you find yourself citing a specific year from our projections as a settled fact, you are using the model wrong.