Method in practice – AI and employment

AI & Work  ·  Structured Reasoning  ·  Labour Economics

Will AI take all human jobs — and does that even ask the right question?

A case study in how structured reasoning changes not just the answer but the question itself — using RESOLVE to move from contested prediction to structural diagnosis.

The presenting problem

The question of whether AI will eliminate human employment is one of the most frequently posed — and most poorly reasoned — questions in public discourse about technology. It feels as though it should have a clear answer. The data exists. The history of automation exists. Smart people have thought about it for years. And yet the debate loops: optimists cite historical precedent, pessimists cite cognitive task displacement, and the resolution never comes. That persistent non-resolution is itself diagnostic — it suggests the question is mis-specified, not merely hard.

This case study uses the RESOLVE method to show what happens when a question is examined structurally rather than debated by position. The point is not to land on the “correct” prediction about jobs — it is to show how four different levels of reasoning produce four qualitatively different outputs, and how the gap between them is not a gap in information but a gap in analytical structure.

Four levels of reasoning

The same question was put to four different levels of structured reasoning. Read down to see what changes at each level — and why it matters.

Level 1 — Basic prompt

Prompt used: “Will AI take all human jobs?”

AI is likely to automate many jobs, particularly those involving routine and repetitive tasks. Studies suggest that up to 47% of US jobs are at high risk of automation. However, history shows that technology tends to create new jobs even as it displaces others — the Industrial Revolution and the rise of computing both caused disruption but ultimately increased employment. Some economists argue AI will be no different. Others say this time is different because AI can now perform cognitive tasks, not just manual ones. The truth is probably somewhere in the middle: significant disruption is likely, many jobs will change, new roles will emerge, and the net effect on employment is genuinely uncertain. Society will need to adapt through education and retraining.

What is missing: The response treats the question as sitting on a single spectrum from “yes” to “no,” when it actually contains two distinct sub-questions — whether job volume falls, and whether societies need to redesign how contribution is recognised — that require separate analysis. It does not check whether the question is well-formed, audit the evidence it relies on, or identify why uncertainty persists structurally rather than accidentally.

Level 2 — Stronger prompt

Prompt used: “Analyse the claim that AI will take all human jobs. What would have to be true for this to be correct? When might it not hold? Distinguish between what is well-evidenced and what is contested.”

The claim concerns all jobs — a totalising forecast. For it to hold, AI would need to handle physical, social, and creative tasks across all contexts, be cheaper than human labour everywhere, and be adopted at scale without significant legal, political, or cultural resistance. It does not hold if technological progress slows, if regulatory limits restrict deployment in key sectors, or if human contact and trust remain valued in ways machines cannot replicate. The evidence is mixed. Task-level displacement is well-supported by peer-reviewed research (Acemoglu and Restrepo; Autor), but net aggregate unemployment projections — including the Goldman Sachs 300-million-jobs figure — are cross-sectional rather than longitudinal. They measure exposure to automation, not outcome. The historical analogy to prior technology waves is evidence-supported, but its applicability to cognitive automation is an inference, not an established finding. The claim is best understood not as a prediction with a clear probability but as a range of scenarios whose outcomes depend substantially on policy choices, not technology alone.

What is still missing: The stronger prompt handles evidence quality more carefully, but it still treats the problem as a prediction problem — something to be assessed as more or less likely. It does not ask why the employment contract is the mechanism through which societies distribute recognition and identity, nor does it identify the self-reinforcing system dynamics that make the problem structurally persistent regardless of the headline employment number.

Level 3 — RESOLVE (v1.9.1)

The framing stage immediately separated the question into two claims that must be held apart: whether AI reduces jobs significantly, and whether that necessitates a redesign of how societies recognise and reward contribution. The second does not follow automatically from the first. The system stage then identified the persistence mechanism: modern societies have fused three distinct functions — material security, social identity, and civic recognition — into a single institution, the employment contract. This is why the problem cannot be solved by job-creation policy alone. The logic stage produced the key constraint finding: income security, contribution legibility, and platform signal redesign are complements, not substitutes. Deploying any one without the others produces a predictable and distinct failure mode.

Key shift: The question moved from “how many jobs will AI eliminate?” to “which domains of life are authorised to confer social recognition — and what happens when the institution that held that monopoly erodes?”

Level 4 — RESOLVE + expert knowledge

At this level, AI and human expert make distinct contributions — and it matters which did what.

What RESOLVE prompted AI to surface: The structural fusion of income, identity, and recognition inside the employment contract — and the specific mechanism by which attention platforms have stepped into the status-signalling vacuum as wage labour erodes. AI also surfaced UBI pilot evidence and its generalisation limits, the measurement-capture risk in contribution frameworks, and the democratic legitimacy problem for long-horizon institutional commitment.

What the human expert then added: Expert interrogation sharpened the RESOLVE output’s central concept. The monopoly is not merely about measuring contribution but about which domain of life is institutionally authorised to confer recognition at all. Economic domain dominance over recognition is centuries old — it is not a platform design failure. This shifts the timescale from a 10–25 year policy horizon to era-length restructuring, making any single-generation action plan less a solution and more a first phase of a much longer project.

The combined result: A framework in which the right unit of intervention is the recognition architecture of society — not job numbers, not income alone — and in which three structural levers (secure income base, contribution legibility infrastructure, and redesigned platform signal layer) are the minimum viable response, each failing in a distinct and predictable way without the others.

Key shift: The assumption that this is a transition problem with a policy solution on a democratic timescale was challenged — it is more accurately a structural restructuring problem whose resolution will span institutional generations, not electoral cycles.

The full RESOLVE run

The complete staged analysis. Expert interaction points — where domain knowledge materially changed the reasoning — are shown in amber at the stage where the shift occurred.

R

Reality & framing

The presenting question — “will AI take all human jobs?” — contains two linked assertions that must be held separately. Claim A: AI will significantly reduce jobs. Claim B: this necessitates a redesign of how societies recognise and reward contribution. Claim B does not follow automatically from Claim A; it depends on the scale, speed, and distribution of displacement, and on which values a society prioritises in response. Conflating the two produces a question that cannot be answered coherently because it mixes an empirical forecast with a normative policy prescription.

Four alternative framings were examined before settling on the working frame. The prediction framing treats this as a forecasting problem — how likely is significant displacement? — but gets stuck because the evidence is cross-sectional, not longitudinal. The distributional framing treats it as a fairness problem — who bears the costs? — which is important but downstream of the structural diagnosis. The policy framing treats it as a design problem — what should governments do? — which is premature without knowing what is actually broken. The risk framing treats it as a threat-management problem — how do we minimise harm? — which is useful but reactive.

The working frame adopted: institutional bundling problem. Modern societies have fused income, identity, and recognition into a single institution — the employment contract. AI is not simply reducing job volume; it is eroding the institution that performs three distinct social functions simultaneously. This framing changes what must be analysed in subsequent stages: not just labour markets, but the full architecture of social recognition.

E

End state — what would actually be better?

A better outcome is one where the social functions currently bundled inside the employment contract — material security, identity, and recognition — are each provided through mechanisms that are robust to declining wage-labour volume. Specifically: people who are not in paid employment can still access income, be recognised for their contributions to community and care, and participate in civic life without experiencing the stigma of economic inactivity. This is observable in principle: welfare take-up without shame, voluntary activity rates, self-reported meaning and status scores independent of employment status.

The time horizon is long. Institutional redesign of this scope operates on 20–50 year timescales. Non-negotiable constraints: any solution must be democratically legitimate, fiscally sustainable without simply assuming AI productivity gains are easily taxable, and must not create new forms of coercive evaluation (contribution monitoring that becomes a condition of support). What is off the table: a single-intervention solution. The evidence from UBI pilots, platform experiments, and welfare reform attempts consistently shows that any single lever deployed alone produces a predictable failure mode.

Stakeholder impact is asymmetric. Workers in mid-skill, mid-wage roles bear the most immediate displacement risk. Workers in high-contribution, low-pay sectors — care, teaching, community — gain most from recognition redesign but face the longest transition. Platform companies face structural disruption to their engagement-first business models. Fiscal authorities face increased redistribution obligations before AI productivity gains are taxable at sufficient scale.

S

System — what is actually keeping this in place?

The structural driver is the employment contract’s monopoly on three functions simultaneously. Income, identity, and recognition were not always bundled this way — the bundle was constructed over the industrial era and is now so naturalised that reforming it is perceived as radical rather than as a response to an historically contingent arrangement. This is the persistence mechanism: the problem looks unresolvable because the institution doing the bundling is also the institution people rely on for survival, making any reform feel like removing a floor rather than redesigning a room.

Wage labour erodes →
Income uncertainty rises →
Consumption identity becomes harder to sustain →
Status-signalling moves to attention platforms →
Platforms optimise for engagement, not contribution →
Contribution becomes invisible; attention becomes the only legible value signal →
Political pressure to “create jobs” reinforces the employment-as-solution frame →
Back to: wage labour as the only legitimate recognition mechanism

The self-reinforcing dynamic explains why the debate loops. Attention platforms have stepped into the recognition vacuum left by eroding wage labour — not because they were designed to, but because no alternative legibility mechanism existed. The result is that the recognition vacuum does not remain empty; it fills with a mechanism that is actively misaligned with the goal of rewarding substantive contribution.

Leverage point: the signal layer. If platforms were required to surface contribution quality rather than optimise for engagement quantity, the attention-capture dynamic could be partially redirected. This is structurally smaller than income redistribution but sits earlier in the causal chain — and is currently unoccupied by any serious regulatory agenda.

Expert input — historical and institutional analysis

Expert interrogation sharpened the system diagnosis at a critical point: the monopoly of economic activity over social recognition is not a recent platform-design failure. It is a product of centuries of economic domain dominance — the gradual displacement of religious, civic, and community institutions as the primary sites of social standing. This changes the persistence mechanism materially: it is not just that no substitute recognition infrastructure currently exists, but that the infrastructure for non-economic recognition was systematically dismantled over a long historical period. Rebuilding it is not a policy project but an institutional reconstruction project, which changes what counts as a plausible near-term intervention.

O

Options — what could actually be done?

Option A — Income floor (UBI or Universal Basic Services). An unconditional income base that decouples material security from employment status. Necessary but not sufficient. Evidence from Finland, Stockton, and Kenya pilots shows income security improves wellbeing, but trials are small-scale and short-duration; national generalisation is uncertain. More significantly, income alone does not address identity and recognition. A society with a guaranteed income but no recognised path to social standing produces dignified passivity — material comfort without civic meaning. This option is load-bearing but cannot stand alone.

Option B — Contribution legibility infrastructure. Systems that make non-wage contributions — care work, civic activity, creative production — measurable, portable, and institutionally recognised. This is technically feasible (credential frameworks, digital records, sector-level standards exist as partial models) but faces two serious risks. First, measurement capture: any system that scores contribution becomes a system that can be gamed, and once it is tied to benefit eligibility it reproduces the employment-test problem in new form. Second, democratic legitimacy: defining what counts as valuable contribution across a pluralistic society is a genuinely hard institutional design problem that cannot be resolved by technical specification alone.

Option C — Platform signal redesign. Regulatory mandates or structural requirements that shift platform optimisation away from pure engagement metrics toward contribution-weighted signals — requiring platforms to surface a minimum share of high-quality contribution content, opening APIs for alternative ranking systems, or imposing attention-tax mechanisms that redistribute harvested attention value to contribution funds. This is the option with the shortest implementation horizon and the most immediate structural leverage, because it intervenes directly in the mechanism by which the recognition vacuum is currently being filled by attention capture. It is also politically available in ways that redistribution-heavy options are not.

Cross-domain note — reduced working week. Spreading remaining wage work more broadly through working-time legislation reduces individual exposure to displacement without requiring institutional reconstruction. Feasible in the near term and evidence-supported (four-day week trials in the UK and Iceland). However, it addresses job-sharing rather than recognition redesign — it is a stabilisation measure for the transition period, not a structural solution to the bundling problem.

L

Logic — which approach has the most leverage?

Options were ranked on three criteria derived from the system diagnosis: proximity to the causal mechanism (how close to the bundling problem does this intervene?); political feasibility on a realistic timescale; and failure mode severity if deployed without the other options. The key constraint from the system analysis is that the three options are complements, not substitutes — each fails in a distinct and predictable way when deployed alone.

Highest leverage: Platform signal redesign (Option C) — it intervenes directly in the mechanism by which the recognition vacuum is currently being filled, is politically available now, and creates conditions under which Options A and B become more legible to the public as alternatives to attention-based status.

Medium leverage: Income floor (Option A) — necessary precondition for the recognition redesign to have any meaning; without material security, contribution infrastructure functions as a compliance burden rather than a genuine alternative.

Lower leverage / dependent: Contribution legibility infrastructure (Option B) — highest long-term structural impact but lowest near-term feasibility; cannot be implemented credibly without Options A and C already partially in place to establish legitimacy and remove the survival-pressure that would make it coercive.

The trade-off that must be named explicitly: the sequencing recommendation (C → A → B) is politically tractable but institutionally slow. The alternative — attempting all three simultaneously — is structurally correct but politically implausible without a precipitating crisis. The analysis does not resolve this tension; it names it as the central strategic dilemma.

What would falsify this selection: if UBI at national scale demonstrably produced equivalent recognition outcomes to employment (sustained over 10+ years across diverse populations), Option A would become sufficient and the sequencing case for C-first would collapse. No such evidence currently exists.

Expert input — political economy and institutional timescale

The era-length timescale finding from the system stage changes the logic ranking in a specific way: if restructuring takes 50–100 years, then near-term political feasibility is less important than institutional durability. A measure that is easy to pass but easy to reverse (platform mandates under a single legislative cycle) scores lower on a long-horizon ranking than institutional changes embedded in constitutional or fiscal frameworks. This pushes Option A (income floor with constitutional protection) up the long-horizon ranking even though it scores lower on near-term feasibility.

V

Value delivery — who owns this, and how will it be tracked?

Ownership is distributed across institutional types in ways that are not interchangeable. Platform signal redesign (Option C) is owned by regulators — specifically, competition and digital-markets authorities with a mandate to impose structural conditions on platform operators. Income floor legislation (Option A) is owned by fiscal authorities and legislatures; it requires durable cross-party commitment analogous to pension system politics, because single-cycle majorities cannot sustain it. Contribution infrastructure (Option B) cannot be centrally managed — it must be subsidiary-governed, with local government, civil society organisations, and sector bodies as co-owners. No single actor can own all three tracks simultaneously, and any governance design that attempts central ownership of all three will produce the capture risk identified in Stage O.

The deepest behavioural barrier is not technical — it is the cognitive equation of busyness with worth. Even if material security is provided by an income floor, the social prestige of formal employment will persist for a generation. Policy communications that frame contribution alternatives as consolation prizes will fail; the framing must be affirmative from the outset. A second behavioural barrier: contribution systems perceived as welfare-by-another-name acquire welfare stigma. The institutional design must visibly separate recognition from need-based support, even if both are funded from the same fiscal base.

Feedback signals that would confirm the pathway is working: platform content diversity indices (are non-engagement-optimised contributions gaining reach?); welfare take-up rates without means-test shame; voluntary activity rates disaggregated by employment status; and — most telling over the medium term — whether care and civic work sectors see wage normalisation as their contributions become measurable and comparable. A signal that would trigger a pause: if contribution infrastructure becomes a de facto eligibility condition for support, it has reproduced the employment-test problem and must be redesigned.

Phasing is non-optional. Phase 1 (politically available now): platform algorithmic diversity mandates and a working-time reduction pilot to stabilise near-term displacement. Phase 2 (5–10 years): income floor legislation, beginning with universal child benefit extension and disability benefit simplification as the least politically contested entry points. Phase 3 (7–15 years, concurrent with Phase 2): contribution infrastructure standards development, starting with care and civic sectors where existing voluntary frameworks can be formalised. Full integration of all three requires concurrent operation — staggered introduction is for political management, not structural logic.

E

Evolve — what do outcomes show, and what changes?

Outcomes vs end-state: the analysis achieves its diagnostic goal — it converts a contested prediction question into a structural diagnosis with named leverage points. The success criterion defined in Stage E (observable conditions under which people outside paid employment retain income, recognition, and civic standing) remains prospective. No current intervention has demonstrated this at national scale. The gap between the diagnosis and the evidence base for the proposed solution is honest and should be named rather than papered over.

Unintended consequences that emerged during analysis: the contribution infrastructure option carries a surveillance risk that was not visible in the initial framing — once contribution is measurable and tracked, it becomes administrable as a condition rather than a recognition. This is not a reason to abandon the option but a reason to treat governance design as load-bearing rather than incidental. A second unintended consequence: platform mandate compliance could produce contribution-washing — low-quality content that satisfies the letter of a quota without its spirit — which is why the feedback signal layer (Stage V) must be designed to detect this before it becomes entrenched.

Learning captured: the most significant methodological learning is the importance of separating empirical claims from normative ones in the initial framing. The basic and stronger prompts both failed because they conflated “will AI reduce jobs?” (empirical) with “should society redesign recognition?” (normative). RESOLVE’s framing stage makes this separation mandatory, and the analysis improved substantially as a result. If run again, Stage R would spend more time on the institutional history of the employment contract — the expert input in Stage S would be pulled forward.

Loop-back: Stage O should be revisited with one specific question: are there historical precedents for successful recognition-system reconstruction at civilisational scale? The shift from agrarian to industrial societies involved exactly this kind of institutional replacement, but the mechanisms and timescales are not yet integrated into the options analysis. That precedent work would either confirm the sequencing or reveal a different set of first-order interventions that worked historically.

Δ Movement & clarity extraction

Boundary shift

The analysis entered as a labour market question — how many jobs will AI eliminate? It exits as a recognition architecture question — which institutions are authorised to confer social standing, and what happens when the dominant one erodes? The boundary expanded from the employment system to the full social infrastructure of recognition: income mechanisms, platform signal layers, and civic contribution frameworks. AI is the proximate cause of the disruption; the bundled employment contract is the structural vulnerability.

Dominant driver shift

Not AI capability speed, not political will, not fiscal capacity — but the measurement monopoly. Attention metrics have become the only widely legible proxy for value at precisely the moment wage labour began to recede as the default value signal of economic life. The two shifts are co-occurring, not sequential. The structural driver is that no alternative legibility mechanism exists at comparable scale or institutional authority.

Leverage reallocation

Away from: technical (job retraining programmes, AI capability forecasting) and behavioural (nudge-based welfare redesign). Toward: institutional and regulatory. Specifically — regulatory intervention in platform signal architecture (medium-term, politically available), fiscal institution-building for an income floor (long-term, requires constitutional durability), and civic infrastructure investment in contribution legibility systems (very long-term, requires subsidiarity governance). The three levers are complements: income without recognition produces dignified passivity; recognition without income produces performative contribution; platform redesign without both produces rebranded attention capture.

Action consequence

Escalate. The analysis confirms that the problem is real, worsening, and structurally mismatched with current policy responses. The intervention required is not a refinement of existing labour market policy but a qualitatively different kind of institutional project — one that operates across multiple timescales and requires actors beyond labour ministries and technology regulators. Escalation means elevating this to constitutional and civic-infrastructure level, not accelerating the current policy agenda.

Capability alignment check

The actors most likely to act on this analysis — technology policy regulators, labour economists, welfare reform advocates — are well-matched to the near-term platform mandate track (Option C) but poorly matched to the institutional reconstruction project (Options A and B at scale). The gap is significant: building durable income floor institutions and contribution infrastructure requires civic institution-building capabilities that are largely outside the competence of technology or economics ministries. This is a handoff and partnership problem, not a knowledge problem. The analysis findings should be translated for constitutional lawyers, local government leaders, and civil society organisations — not only for the technology policy community that is most likely to receive them.

What each level produced — and why it matters

Level 1 — basic prompt. The basic prompt produced a balanced summary of the debate — optimists versus pessimists, historical analogy versus this-time-is-different. This is not wrong, but it is inert. Anyone acting on the Level 1 output would conclude that the picture is uncertain, that retraining matters, and that the answer is probably “somewhere in the middle.” None of those conclusions are false. None of them point to anything that could be done differently. The output is structured to avoid being wrong rather than structured to be useful. The key missing move: it never checked whether the question was the right question.

Level 2 — stronger prompt. The stronger prompt unlocked evidence handling — distinguishing what is well-supported from what is inferred, and naming the conditions under which the pessimist scenario holds. This is a material improvement. But it still treated the problem as a prediction problem rather than a structural one. It asked “how likely is this outcome?” when the prior question is “what is the system that produces this problem, and where does it break?” Without that structural move, the output — however well-evidenced — cannot point to interventions that would actually change the trajectory.

Level 3 — RESOLVE. The method made visible what the prompt levels missed: the employment contract is performing three functions simultaneously that should be disaggregated for analysis. This is a Stage R (framing) finding, and it is the one that changes everything downstream. Once the bundling is named, the question cannot be answered by forecasting job volumes — it requires diagnosing which components of the bundle are most structurally vulnerable and what might replace each one. The Stage S self-reinforcing loop — from wage erosion to attention-capture dominance — is the specific insight that explains why the debate loops without resolution: it is not stuck because the evidence is thin, but because the dominant framing actively prevents people from seeing the structural mechanism.

Level 4 — RESOLVE + expert knowledge. Expert engagement made one specific and non-trivial contribution: it reframed the persistence mechanism from a platform design failure to a centuries-long process of economic domain dominance over social recognition. This is not a detail — it changes the timescale of the solution from a policy horizon (10–25 years) to an institutional reconstruction horizon (50–100 years). That shift does not make the near-term options less valuable, but it changes what “success” looks like and makes it honest that no government currently in office will see the end of this project. Without that intervention, the RESOLVE output risked a form of institutional optimism — assuming that well-designed policy on a democratic timescale could resolve a problem that is structurally older and deeper than democratic institutions themselves.