METHODOLOGY
Nine-variable framework (V1-V9) documented with full parameter definitions
Three scenario tiers: Conservative, Base Case, Recursive — each with distinct V1-V9 settings
Confidence calibration: HIGH requires peer-reviewed evidence; LOW requires only trend extrapolation
Monthly forecast revisions, quarterly deep reviews, real-time breaking event adjustments

Forecasting Methodology & Analytical Framework

Complete documentation of the nine-variable model, scenario construction logic, confidence calibration system, data inputs, update cadence, and honest limitations underpinning every projection published on this site.

FEB 27, 2026 · AI LABS RESEARCH
FRAMEWORK OVERVIEW
CORE VARIABLES: 9 (V1 through V9)
SCENARIO TIERS: 3 (Conservative / Base / Recursive)
CONFIDENCE LEVELS: 3 (High / Medium / Low)
CONTENTS
01 Nine-Variable Framework (V1-V9)
02 Scenario Construction Methodology
03 Data Sources Overview
04 Confidence Calibration System
05 Update Cadence
06 Caveats & Limitations

01. Nine-Variable Framework

THE STRUCTURAL INPUTS THAT DRIVE EVERY AI LABS PROJECTION

Every displacement forecast published on this site is generated from a model built on nine core variables. These variables were selected because they represent the minimum set of distinct causal factors required to model AI labor displacement with structural fidelity (they interact, as noted in the framework note below, but each captures a separate force). Each variable is measurable (though measurement quality varies), each has documented sources, and each can be independently adjusted to explore different futures.

The framework is deliberately parsimonious. More variables would increase apparent precision but not actual accuracy. We model the forces that matter most and are transparent about what is excluded.
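To make the structure concrete, here is a minimal Python sketch of the nine variables held as a single parameter set. The field names and defaults are illustrative shorthand (the defaults mirror the Base Case column in Section 02), not the model's internal representation.

```python
from dataclasses import dataclass

@dataclass
class ModelParameters:
    """Illustrative container for the nine structural inputs (V1-V9)."""
    v1_doubling_months: float = 7.0       # V1: capability doubling period (months)
    v2_self_improvement: float = 1.35     # V2: recursive self-improvement multiplier
    v3_adoption_friction: float = 0.40    # V3: adoption friction (0-1)
    v4_regulatory_drag_years: float = 2.0 # V4: regulatory delay (years)
    v5_cost_ratio: float = 8.0            # V5: human:AI cost ratio
    v6_elasticity: float = 0.35           # V6: labor market reabsorption rate (0-1)
    v7_decomposability: float = 0.60      # V7: task decomposability index (0-1)
    v8_physical_bottleneck: float = 0.60  # V8: physical bottleneck (0-1)
    v9_social_resistance: float = 0.20    # V9: social resistance (0-1)

base_case = ModelParameters()  # defaults correspond to the Base Case scenario
```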

V1
AI Capability Growth Rate
The annualized rate at which frontier AI models gain measurable task capability, calibrated against the METR autonomous task benchmark. This is not a measure of parameter count, training compute, or benchmark scores on synthetic tests. It tracks the duration of real-world expert tasks that AI can complete autonomously. As of February 2026, the METR doubling period is approximately 7 months overall and ~4 months since 2023, based on 15 data points across 6 years. V1 is the single most consequential variable in the model. Small changes in V1 cascade through every downstream projection.
UNIT: months (doubling period) SOURCE: METR TH1.1 CONFIDENCE: HIGH
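The arithmetic behind V1 is simple compounding. A small sketch (not the model itself) showing how a doubling period translates into a capability multiplier over a fixed horizon, using the ~7-month and ~4-month figures cited above:

```python
def capability_multiplier(months_elapsed: float, doubling_months: float = 7.0) -> float:
    """Capability gain relative to today, assuming a fixed doubling period."""
    return 2 ** (months_elapsed / doubling_months)

# With a 7-month doubling period, autonomous-task capability grows ~35x over 3 years;
# with a 4-month period, the same horizon yields 512x.
print(capability_multiplier(36, 7.0))  # ≈ 35.3
print(capability_multiplier(36, 4.0))  # = 512.0
```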
V2
Recursive Self-Improvement Factor
A multiplier representing the degree to which AI systems contribute to their own capability improvement, thereby shortening V1. When AI writes its own training code, optimizes its own architecture, and generates its own synthetic training data, the doubling period is no longer exogenous -- it becomes a function of existing capability. Anthropic has confirmed that Claude writes approximately 90% of its own codebase. This variable captures the feedback loop: better AI produces better AI faster. A value of 1.0 means no self-improvement effect (fixed doubling period). Values above 1.0 mean the curve is super-exponential. Values above ~1.5 produce hyperbolic dynamics within the model's time horizon.
UNIT: dimensionless multiplier SOURCE: Anthropic, ML research CONFIDENCE: MEDIUM
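One simple way to encode this feedback loop, shown below as an illustrative functional form rather than the model's actual equations, is to compress the doubling period by the V2 multiplier after each completed doubling:

```python
def doubling_schedule(base_months: float, v2: float, n_doublings: int) -> list[float]:
    """Length of each successive doubling period when V2 compresses the curve."""
    return [base_months / (v2 ** n) for n in range(n_doublings)]

print(doubling_schedule(7.0, 1.00, 5))  # fixed periods: [7.0, 7.0, 7.0, 7.0, 7.0]
print(doubling_schedule(7.0, 1.35, 5))  # shrinking: ~[7.0, 5.2, 3.8, 2.8, 2.1]
```

Under this toy form, any V2 above 1.0 makes the successive periods a convergent series, which is why sufficiently large multipliers produce hyperbolic dynamics within a finite horizon.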
V3
Adoption Friction Coefficient
Measures the organizational, technical, and cultural resistance to deploying AI in roles where it is already technically capable. This is the gap between "AI can do this" and "organizations actually use AI to do this." Friction includes IT infrastructure readiness, change management costs, vendor lock-in, data migration complexity, middle-management resistance, and simple institutional inertia. The MIT Iceberg Index quantifies this gap: as of November 2025, 11.7% of US jobs are economically viable for AI substitution, but actual displacement is approximately 1%. The ratio implies a friction coefficient of roughly 0.90 -- meaning 90% of technically feasible displacement has not yet converted into actual job loss.
UNIT: 0-1 scale (0=no friction, 1=total resistance) SOURCE: MIT Iceberg Index CONFIDENCE: HIGH
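The friction figure cited above follows directly from the two Iceberg Index numbers; a minimal sketch of the calculation:

```python
def friction_coefficient(viable_share: float, displaced_share: float) -> float:
    """V3: share of technically feasible displacement not yet realized."""
    return 1.0 - displaced_share / viable_share

# MIT Iceberg Index, November 2025: 11.7% economically viable, ~1% actually displaced.
print(round(friction_coefficient(0.117, 0.01), 2))  # ≈ 0.91, i.e. roughly 0.90
```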
V4
Regulatory Drag
Quantifies the decelerating effect of government regulation, litigation risk, and compliance requirements on AI deployment speed. Includes existing labor law, emerging AI-specific regulation (EU AI Act, proposed US frameworks), sector-specific rules (healthcare, finance, legal), and liability uncertainty. Regulatory drag does not prevent displacement -- it delays it. Historical precedent (ride-sharing, fintech, telemedicine) suggests regulatory drag typically adds 2-5 years to adoption timelines but does not alter terminal adoption levels. In our model, V4 shifts the displacement curve rightward on the time axis without changing its ultimate shape.
UNIT: years of delay SOURCE: Policy analysis, EU AI Act CONFIDENCE: MEDIUM
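In curve terms, V4 is applied as a pure time shift. A toy illustration follows; the logistic shape, midpoint, and steepness here are placeholders for exposition, not the model's actual displacement curve:

```python
import math

def displaced_share(year: float, midpoint: float = 2031.0, steepness: float = 0.8,
                    regulatory_drag_years: float = 0.0) -> float:
    """Toy S-curve; V4 shifts it rightward without changing its ultimate shape."""
    return 1.0 / (1.0 + math.exp(-steepness * (year - midpoint - regulatory_drag_years)))

print(round(displaced_share(2029), 2))                            # no drag: ≈ 0.17
print(round(displaced_share(2029, regulatory_drag_years=2.0), 2)) # same curve, 2 years later: ≈ 0.04
```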
V5
Economic Incentive Multiplier
The cost differential between human labor and AI alternatives for equivalent task output. This is the fiduciary driver: when AI delivers the same output at one-fifth to one-fifteenth the cost of a human employee, deployment becomes not a technology decision but a shareholder obligation. V5 captures the ratio of fully-loaded human labor cost (salary, benefits, office space, management overhead, error rates) to equivalent AI cost (API pricing, integration, monitoring, quality assurance). As of early 2026, the ratio ranges from 5:1 for routine knowledge work to 20:1 for high-volume data processing. The gap is widening as API costs drop approximately 10x per year while human labor costs inflate 3-4% annually.
UNIT: cost ratio (human:AI) SOURCE: BLS, corporate filings CONFIDENCE: HIGH
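A rough sketch of the comparison V5 captures; the dollar figures below are hypothetical placeholders, not published data:

```python
def cost_ratio(human_annual_cost: float, ai_annual_cost: float) -> float:
    """V5: fully-loaded human cost divided by equivalent AI cost for the same output."""
    return human_annual_cost / ai_annual_cost

# Hypothetical inputs: a $120k fully-loaded analyst vs. $15k/year of API usage,
# integration, monitoring, and quality assurance for equivalent routine output.
print(cost_ratio(120_000, 15_000))  # 8.0 -> an 8:1 ratio, the Base Case setting
```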
V6
Labor Market Elasticity
Measures the capacity of the labor market to absorb displaced workers through new task creation, reskilling, and sectoral reallocation. Historical technology transitions (mechanization, electrification, computerization) saw high elasticity: displaced workers eventually found new roles in new industries. V6 captures whether this absorption capacity holds in the AI transition. The critical question: does AI create enough new human-complementary tasks to offset displacement? Acemoglu & Restrepo (2019) show that reinstatement effects have weakened in recent decades. If V6 is low, displacement accumulates rather than redistributes.
UNIT: reabsorption rate (0-1) SOURCE: Acemoglu & Restrepo (NBER) CONFIDENCE: LOW
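How V6 enters the accounting, in sketch form: displaced workers are partially reabsorbed, and whatever is not reabsorbed accumulates as net displacement. The 10-million figure below is an arbitrary illustrative input.

```python
def net_displacement(gross_displaced: float, elasticity: float) -> float:
    """Net displacement after reabsorption (V6); the remainder accumulates."""
    return gross_displaced * (1.0 - elasticity)

# 10 million gross displaced workers under each scenario's V6 setting:
for label, v6 in [("Conservative", 0.60), ("Base Case", 0.35), ("Recursive", 0.10)]:
    print(label, net_displacement(10_000_000, v6))  # 4.0M, 6.5M, 9.0M remain displaced
```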
V7
Task Decomposability Index
Measures the degree to which a given role can be decomposed into discrete, automatable subtasks versus requiring holistic judgment that resists decomposition. Roles with high V7 (data entry, report generation, code review, customer service scripting) are automatable at the task level even before full-role replacement is feasible. Roles with low V7 (executive decision-making, complex negotiations, novel creative direction) involve contextual reasoning across domains that resists clean task boundaries. Goldman Sachs estimates 60-70% of knowledge jobs have high-V7 task components. The insight: displacement happens task-by-task before it becomes visible as headcount reduction.
UNIT: 0-1 index SOURCE: Goldman Sachs, O*NET analysis CONFIDENCE: MEDIUM
V8
Physical Bottleneck Factor
Captures the degree to which a role requires physical-world interaction that software cannot perform. Pure knowledge work (V8 near 0) is fully exposed to AI displacement. Roles requiring hands, spatial navigation, or real-time physical manipulation (V8 near 1) have a hardware dependency that current robotics cannot satisfy at competitive cost. V8 is the primary reason desk jobs face earlier displacement than trades. However, V8 is not static: advances in embodied AI, humanoid robotics (Tesla Optimus, Figure), and industrial automation are steadily reducing the physical bottleneck. Our model treats V8 as declining over time, reaching meaningful automation capability for structured physical tasks by 2030-2032.
UNIT: 0-1 scale (0=pure digital, 1=pure physical) SOURCE: O*NET, robotics research CONFIDENCE: MEDIUM
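The declining-bottleneck treatment can be sketched as a simple interpolation toward the 2030-2032 window mentioned above. The linear form and the endpoint values (the Base Case and Recursive settings from Section 02) are illustrative assumptions, not the model's calibrated trajectory:

```python
def physical_bottleneck(year: float, start_year: float = 2026, start_value: float = 0.60,
                        floor_year: float = 2031, floor_value: float = 0.30) -> float:
    """V8 declining linearly from its current setting toward an assumed floor."""
    if year <= start_year:
        return start_value
    if year >= floor_year:
        return floor_value
    frac = (year - start_year) / (floor_year - start_year)
    return start_value - frac * (start_value - floor_value)

print(round(physical_bottleneck(2028.5), 2))  # halfway between 0.60 and 0.30 -> 0.45
```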
V9
Social Resistance Variable
Models the non-economic, non-regulatory human resistance to AI adoption: public backlash, union action, consumer preference for human interaction, cultural norms around human accountability, and political movements opposing automation. V9 is the least quantifiable variable in the framework. Historical precedent (Luddites, automation protests in the auto industry, gig economy backlash) suggests social resistance modulates timing but does not reverse technological adoption curves. The exception scenario: if displacement reaches politically destabilizing levels (>15-20% unemployment), social resistance could produce hard policy interventions (mandated human staffing, automation taxes) that fundamentally alter the curve. Our model includes this as a threshold effect, not a continuous variable.
UNIT: 0-1 scale (qualitative) SOURCE: Historical analysis, polling CONFIDENCE: LOW
FRAMEWORK NOTE

These nine variables are not independent. V1 and V2 form a feedback loop. V5 drives adoption speed, which reduces V3 over time. V9 can trigger V4 through political channels. The model handles these interactions through iterative simulation, not closed-form equations. This means small parameter changes can produce nonlinear output differences -- a feature, not a bug, as it reflects the genuine structure of the system being modeled.
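A skeleton of what "iterative simulation, not closed-form equations" means in practice. This is a deliberately simplified sketch: the state updates, coupling terms, and constants below are invented for illustration and show only two of the interactions noted above (V1/V2 feedback and V5 eroding V3), not the production model.

```python
import math

def simulate(months: int, doubling: float = 7.0, v2: float = 1.35,
             friction: float = 0.40, v5_cost_ratio: float = 8.0) -> list[float]:
    """Toy monthly iteration: feedback compresses the doubling period,
    cost pressure erodes friction, and displacement closes toward the viable ceiling."""
    viable = 0.117      # share of jobs economically viable for substitution (initial)
    displaced = 0.01    # share actually displaced (initial)
    path = []
    for _ in range(months):
        doubling /= v2 ** (1.0 / doubling)                   # V2: each doubling compresses the next
        viable = min(1.0, viable * 2 ** (0.10 / doubling))   # V1: capability expands the viable ceiling (10% coupling, arbitrary)
        friction *= 1.0 - 0.005 * math.log(v5_cost_ratio)    # V5: cost pressure erodes friction (rate arbitrary)
        displaced += (viable * (1.0 - friction) - displaced) * 0.05  # adoption closes 5% of the gap per month (arbitrary)
        path.append(displaced)
    return path

print(round(simulate(24)[-1], 3))  # displaced share after two simulated years
```

Even in this toy version, nudging V2 or the friction-erosion rate produces disproportionate changes in the end state, which is the nonlinearity the note above describes.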

02. Scenario Construction Methodology

HOW CONSERVATIVE, BASE CASE, AND RECURSIVE SCENARIOS ARE BUILT

All AI Labs projections are published as three parallel scenarios, not point predictions. Each scenario represents a coherent, internally consistent set of V1-V9 parameter values. The scenarios are not optimistic/pessimistic -- they correspond to different structural assumptions about the world.

The table below documents the exact parameter settings for each scenario. Readers can use the interactive displacement model on this site to explore intermediate parameter combinations.

VARIABLE | CONSERVATIVE | BASE CASE | RECURSIVE
V1 — Capability Growth | 12-mo doubling | 7-mo doubling | 4-mo doubling (accelerating)
V2 — Self-Improvement | 1.05x (negligible) | 1.35x (moderate loop) | 1.60x (active loop)
V3 — Adoption Friction | 0.70 (high inertia) | 0.40 (moderate) | 0.10 (minimal)
V4 — Regulatory Drag | +4 years delay | +2 years delay | +0.5 years delay
V5 — Cost Incentive | 3:1 ratio | 8:1 ratio | 15:1 ratio
V6 — Market Elasticity | 0.60 (strong reabsorption) | 0.35 (partial) | 0.10 (weak)
V7 — Decomposability | 0.40 (limited) | 0.60 (moderate) | 0.85 (deep decomposition)
V8 — Physical Bottleneck | 0.80 (strong barrier) | 0.60 (moderate barrier) | 0.30 (barrier eroding)
V9 — Social Resistance | 0.50 (meaningful pushback) | 0.20 (limited effect) | 0.05 (negligible)

SCENARIO OUTPUT | CONSERVATIVE | BASE CASE | RECURSIVE
10% desk job displacement | 2032 | 2027 | 2027
25% desk job displacement | 2037 | 2029 | 2028
50% desk job displacement | 2040-2045 | 2031-2033 | 2029
Physical/trade job impact begins | 2038+ | 2030-2032 | 2028-2029
Curve geometry | Standard S-curve | Accelerating exponential | Hyperbolic

Conservative scenario assumes AI capability growth decelerates to a 12-month doubling period (consistent with a paradigm plateau), self-improvement effects are negligible, adoption faces persistent organizational friction, and regulatory environments significantly constrain deployment speed. This scenario is consistent with the position held by most mainstream labor economists.

Base case scenario extends current METR-measured trends without acceleration or deceleration. The recursive self-improvement loop exists but is moderate (1.35x multiplier, derived from Anthropic's code-authorship data). Adoption friction is real but erodes under competitive pressure. This is our central projection and the one used in headline forecasts.

Recursive scenario assumes the self-improvement feedback loop fully materializes, producing hyperbolic rather than exponential growth. Adoption friction collapses as the cost differential becomes irresistible. Regulatory response is slow. This is a tail-risk scenario, not a central forecast -- but the METR acceleration trend (7 months overall, 4 months since 2023) means it cannot be dismissed.
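For readers who want to drive their own runs, the three columns of the parameter table translate directly into machine-readable settings. A sketch follows; the keys are illustrative shorthand for V1-V9, and the values are taken verbatim from the table above:

```python
SCENARIOS = {
    "conservative": dict(v1_doubling_months=12, v2=1.05, v3_friction=0.70,
                         v4_drag_years=4.0, v5_cost_ratio=3, v6_elasticity=0.60,
                         v7_decomposability=0.40, v8_physical=0.80, v9_resistance=0.50),
    "base_case":    dict(v1_doubling_months=7,  v2=1.35, v3_friction=0.40,
                         v4_drag_years=2.0, v5_cost_ratio=8, v6_elasticity=0.35,
                         v7_decomposability=0.60, v8_physical=0.60, v9_resistance=0.20),
    "recursive":    dict(v1_doubling_months=4,  v2=1.60, v3_friction=0.10,
                         v4_drag_years=0.5, v5_cost_ratio=15, v6_elasticity=0.10,
                         v7_decomposability=0.85, v8_physical=0.30, v9_resistance=0.05),
}

# Intermediate futures are explored by interpolating between columns, e.g. a
# halfway point between the Base Case and Recursive settings for every variable:
blended = {k: (SCENARIOS["base_case"][k] + SCENARIOS["recursive"][k]) / 2
           for k in SCENARIOS["base_case"]}
```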

03. Data Sources Overview

PRIMARY INPUTS TO THE FORECASTING MODEL

All AI Labs projections derive from primary institutional research and direct data sources. We do not cite media analysis, opinion columns, or secondary coverage as evidence. Where institutional research (e.g., Goldman Sachs) is used, we reference the primary research report, not news articles about it. Our full source documentation is available on the Sources page.

DATA INTEGRITY PRINCIPLE

We distinguish three tiers of evidence in all published analysis: directly measured (METR benchmarks, BLS statistics, published corporate data), peer-reviewed inference (Acemoglu & Restrepo frameworks, MIT Iceberg methodology), and AI Labs extrapolation (recursive acceleration projections, scenario-specific timelines). Every claim on this site is tagged with its evidence tier. Where we extrapolate, we say so explicitly.

04. Confidence Calibration System

WHAT HIGH, MEDIUM, AND LOW CONFIDENCE MEAN -- AND WHAT EVIDENCE IS REQUIRED FOR EACH

Every projection, claim, and parameter estimate published on AI Labs carries a confidence designation. These are not subjective feelings -- they correspond to specific evidence thresholds. The system is designed to prevent a common failure mode in forecasting: presenting speculative extrapolations with the same rhetorical weight as empirically grounded measurements.

HIGH CONFIDENCE
Requires direct empirical measurement from at least one peer-reviewed study or primary institutional dataset, corroborated by at least one independent secondary source. The claim must be falsifiable and the measurement methodology must be documented.
Examples: METR capability doubling trend (15 data points), MIT Iceberg viability ceiling (11.7%), BLS employment statistics, Acemoglu displacement-reinstatement framework, current AI cost ratios from published API pricing.
MEDIUM CONFIDENCE
Requires credible institutional research with documented methodology, or a trend extrapolation supported by 3+ data points with a plausible causal mechanism. Some interpretation or synthesis is involved, but the underlying data is strong.
Examples: Goldman Sachs 6-7% displacement estimate, recursive self-improvement multiplier (based on Anthropic statements + ML research trends), V4 regulatory delay estimates, near-term adoption speed projections.
LOW CONFIDENCE
Requires only a plausible causal argument supported by historical analogy or limited trend data. These are scenario-level projections, not forecasts. Low-confidence claims must be explicitly labeled as speculative in all published analysis.
Examples: Hyperbolic curve geometry by 2027-28, 50% displacement by 2029 (recursive scenario), labor market reabsorption capacity (V6), social resistance threshold effects, physical bottleneck erosion timeline.
CALIBRATION ACCOUNTABILITY

We track our confidence calibration accuracy over time. If claims designated "high confidence" prove wrong more than 10% of the time, or "medium confidence" claims prove wrong more than 40% of the time, the calibration system itself needs recalibration. This is tracked in our quarterly deep reviews and published transparently.
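The accountability thresholds above are mechanical enough to check in code. A minimal sketch, with an invented record format of (confidence tier, was the claim correct) pairs:

```python
def calibration_report(resolved_claims: list[tuple[str, bool]]) -> dict[str, float]:
    """Error rate per confidence tier from (tier, was_correct) records."""
    error_rates = {}
    for tier in ("high", "medium", "low"):
        outcomes = [correct for t, correct in resolved_claims if t == tier]
        if outcomes:
            error_rates[tier] = 1.0 - sum(outcomes) / len(outcomes)
    return error_rates

THRESHOLDS = {"high": 0.10, "medium": 0.40}  # error rates that trigger recalibration

rates = calibration_report([("high", True), ("high", True), ("high", False),
                            ("medium", True), ("medium", False)])
needs_recalibration = any(rates.get(t, 0.0) > limit for t, limit in THRESHOLDS.items())
print(rates, needs_recalibration)  # high ≈ 0.33 and medium = 0.50, both over threshold -> True
```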

05. Update Cadence

WHEN AND HOW PROJECTIONS ARE REVISED

Forecasting models are only as good as their maintenance discipline. Static projections degrade rapidly in a domain where the underlying dynamics shift on monthly timescales. AI Labs maintains a structured revision schedule with three tiers.

MONTHLY
Forecast Revisions
All V1-V9 parameters are re-evaluated against the latest available data. METR benchmark updates, BLS employment releases, and corporate AI deployment announcements are integrated. Parameter adjustments are documented with change logs. The displacement timeline table is regenerated. Published in the first week of each month.
QUARTERLY
Deep Reviews
Full model re-evaluation including structural assumptions, scenario definitions, and confidence calibration accuracy. Incorporates new academic publications, major government reports, and quarterly corporate earnings data. The quarterly review may produce scenario redefinitions, variable additions or removals, or methodology changes. Published as a standalone analysis article.
REAL-TIME
Breaking Adjustments
Material events that invalidate or significantly shift parameter assumptions trigger immediate model updates outside the scheduled cadence. Criteria: a single event that moves any V1-V9 parameter by more than 15% from its current setting. Examples: a new METR data point showing acceleration or deceleration, a major economy passing significant AI regulation, a frontier lab announcing a capability breakthrough or hitting a confirmed scaling wall.

All revisions include a change log documenting which parameters moved, in which direction, by how much, and why. Historical parameter values are preserved for audit and calibration tracking. We do not silently update projections -- every change is versioned and explained.
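The real-time trigger criterion is likewise a simple relative-change test. A sketch of the check and the kind of change-log entry it would produce (field names are illustrative, not our actual log schema):

```python
from datetime import date

def breaking_adjustment_needed(old_value: float, new_value: float,
                               threshold: float = 0.15) -> bool:
    """True when a single event moves a V1-V9 parameter by more than 15%."""
    return abs(new_value - old_value) / abs(old_value) > threshold

def change_log_entry(parameter: str, old_value: float, new_value: float, reason: str) -> dict:
    """Versioned record of a parameter revision, preserved for audit."""
    return {"date": date.today().isoformat(), "parameter": parameter,
            "old": old_value, "new": new_value,
            "direction": "down" if new_value < old_value else "up", "reason": reason}

# Example: a new METR data point shortening the doubling period from 7 to 5.5 months
# is a ~21% move, which exceeds the 15% threshold and triggers an off-cycle update.
if breaking_adjustment_needed(7.0, 5.5):
    print(change_log_entry("V1 doubling period (months)", 7.0, 5.5,
                           "new METR data point showing acceleration"))
```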

06. Caveats & Limitations

HONEST ASSESSMENT OF WHAT THIS MODEL DOES NOT AND CANNOT CAPTURE

No forecasting model is better than its weakest structural assumption. We are explicit about the limitations of this framework because intellectual honesty is more valuable than false precision. Users of these projections should weight them accordingly.

  1. No model has successfully predicted AI capability growth more than 18 months in advance. Every major forecast from 2020-2023 required significant upward revision. Our model inherits this fundamental limitation. Projections beyond 2028 should be treated as scenario explorations, not timeline commitments.
  2. The recursive self-improvement variable (V2) is the least empirically grounded parameter. While AI code self-authorship is documented, no peer-reviewed study has quantified the degree to which this compresses capability doubling periods. The hyperbolic curve geometry in the recursive scenario is a mathematical extrapolation of a plausible dynamic, not an observed phenomenon. It may encounter ceilings (architectural limits, training data constraints, compute/energy bottlenecks) that flatten the curve before hyperbolic dynamics manifest.
  3. The model does not capture new task creation with structural fidelity. V6 (Labor Market Elasticity) approximates reabsorption as a single parameter, but historical technology transitions created entirely new industries and job categories that were unforeseeable ex ante. If AI generates a comparable wave of new human-complementary tasks, our displacement projections will overstate actual unemployment. This is the strongest counterargument to our central projections, and we take it seriously.
  4. Feedback effects from displacement itself are modeled only as threshold triggers, not continuous dynamics. Mass unemployment reduces consumer spending, which reduces corporate revenue, which changes AI investment calculus. Political instability from displacement can produce regulatory responses that alter the adoption curve. These second-order effects are real but computationally intractable to model with current data. Our model captures them as discrete scenario switches, not smooth feedback loops.
  5. Geographic, sectoral, and demographic granularity is limited. V1-V9 are calibrated primarily against US data. International displacement timelines differ due to labor cost structures, regulatory environments, and infrastructure readiness. Within the US, displacement will not be uniform across sectors, regions, or demographic groups. Our model produces national aggregates; local reality will vary substantially.
  6. Historical precedent for forecast failures is sobering. The 2013 Frey & Osborne study estimated 47% of US jobs were at risk of automation -- a figure that became a media sensation but proved methodologically flawed (it measured task exposure, not economic viability or adoption probability). McKinsey's 2017 estimates required repeated revision. Our framework was designed to avoid these specific errors, but there are certainly errors we have not anticipated.
  7. The model assumes rational economic actors. In practice, organizations make suboptimal decisions due to political dynamics, sunk-cost fallacies, vendor relationships, and executive ego. These irrational frictions may slow adoption beyond what V3 captures. Conversely, herd behavior and competitive panic may accelerate adoption beyond rational equilibrium. Both directions of error are possible.
  8. Uncertainty ranges widen dramatically beyond 3-year horizons. Our 2026-2028 projections carry meaningful uncertainty bands (plus or minus 30-50%). Our 2029-2032 projections carry very wide uncertainty bands (plus or minus 100% or more on timing). Our 2033+ projections are scenario sketches, not forecasts. We present them because understanding the shape of possible futures is valuable even when precise timing is unknowable.
  9. Black swan events are not modeled. A major geopolitical conflict disrupting semiconductor supply chains, a fundamental breakthrough in AI safety that produces voluntary deployment constraints, a global financial crisis that collapses AI investment, or an unexpected technical breakthrough that leapfrogs current architectures -- any of these would invalidate model parameters in ways that cannot be predicted from existing data. The model maps the space of futures consistent with current trends. It does not map all possible futures.
THE HONEST BOTTOM LINE

Treat all projections on this site as structured scenario maps, not predictions. The value of this framework is not in telling you what year 50% displacement will occur. It is in making the structural logic visible -- showing which variables matter, how they interact, and what evidence would need to change to shift the outlook materially. If you find yourself citing a specific year from our projections as a settled fact, you are using the model wrong.
