The Intelligence Cost Curve
AI inference pricing is commoditizing faster than any technology in recorded economic history. The question is no longer whether intelligence becomes cheap—it is how enterprises position themselves before it does.
In March 2023, accessing GPT-4-class intelligence cost $60 per million output tokens. By February 2026, equivalent quality is available from DeepSeek V3 at $0.28 per million tokens—a 99.5% price collapse in under three years. Even within OpenAI’s own product line, the journey from GPT-3 to GPT-4o-mini represents a 400x reduction in per-token cost at roughly comparable quality. The Stanford HAI 2025 AI Index documents a 280x inference cost reduction across the broader market. No prior technology—not semiconductors, not bandwidth, not electricity—has deflated this rapidly at the same stage of adoption.
We call this dynamic the Densing Law: capability density per dollar of inference spend doubles approximately every 3.5 months. Unlike Moore’s Law, which operated on an 18–24 month cadence constrained by lithography physics, the Densing Law compounds across four simultaneous vectors—hardware efficiency, algorithmic optimization, competitive pricing pressure, and open-source commoditization. The result is a cost curve that bends downward faster than enterprise planning cycles can absorb.
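The compounding arithmetic behind the Densing Law can be sketched directly. The 3.5-month doubling period is the chapter's stated parameter; the 21-month Moore's Law cadence used for comparison is simply the midpoint of the 18–24 month range, an assumption for illustration.

```python
def density_multiplier(months: float, doubling_period_months: float) -> float:
    """Capability density per dollar after `months`, given a fixed doubling period."""
    return 2 ** (months / doubling_period_months)

three_years = 36

# Densing Law: doubling every 3.5 months.
densing = density_multiplier(three_years, 3.5)   # ~1,250x over three years

# Moore's-Law-style cadence: doubling every 21 months (midpoint of 18-24).
moore = density_multiplier(three_years, 21.0)    # ~3.3x over the same window
```

The ~1,250x three-year multiple this toy model produces is of the same order as the ~1,000x price collapse documented in Section 1, which is why a 3.5-month doubling period is a defensible fit to the observed data.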
This chapter tracks 22 models across six years, projects three probabilistic scenarios through 2029, dissects the emerging subscription crisis, and maps the historical analogies that make near-free intelligence a near-certainty. The data tell a single, unambiguous story: value is migrating from the model layer to the orchestration layer, and the organizations that grasp this shift earliest will define the next era of competitive advantage.
1. Historical Pricing — The 1,000x Collapse
Output price per 1M tokens from GPT-3 (2020) to February 2026. Logarithmic scale reveals the exponential decline.
AI Output Pricing Timeline (Log Scale)
The pricing timeline above reveals a pattern that should alarm any executive still budgeting AI as a premium line item. The 1,000x cost reduction from GPT-3 to today’s budget models is not a gradual decline—it is a series of step-function collapses, each triggered by a new competitive entrant or architectural breakthrough. GPT-3.5-turbo slashed prices 30x. GPT-4o cut them another 4x. Then GPT-4o-mini dropped the floor by 25x more. DeepSeek V3, trained for a reported $5.6 million, proved that frontier-class quality could be delivered at $0.28 per million tokens—undercutting every Western provider by an order of magnitude.
Yet the market has simultaneously bifurcated. While budget models race toward zero, premium reasoning models like o1-pro command $600 per million tokens—ten times more expensive than the original GPT-4. This creates a 2,143x price range within a single product category, a spread unprecedented in technology markets. The implication is strategic: enterprises must learn to route tasks to the appropriate cost tier, matching model capability to task complexity rather than defaulting to the most expensive option.
2. The Intelligence per Dollar Curve
Original projection vs February 2026 reality — the curve is 3–6x ahead of schedule.
Intelligence/$ Multiplier (vs 2023 GPT-4 Baseline)
The intelligence-per-dollar curve is the single most important metric in this report. It captures both sides of the equation—falling prices and rising capability—in a single trajectory. The original 2024 projection estimated a 2.5x multiplier by 2026 relative to the GPT-4 baseline. The February 2026 reality is 8–15x, placing us three to six times ahead of schedule. GPT-4-equivalent output cost has fallen from $60 per million tokens to $0.40. Meanwhile, the best available MMLU score has climbed from 86.4% to 93%+, a gain of nearly seven percentage points.
This acceleration is not a temporary anomaly. It reflects the compounding of multiple deflationary forces acting simultaneously: NVIDIA’s Blackwell architecture delivers roughly 10x cost reduction per token over Hopper; inference optimizations—speculative decoding, INT4 quantization, KV-cache sharing, continuous batching—have collectively reduced serving costs 3–5x from 2023 baselines; and open-source models from Meta, Mistral, and DeepSeek have compressed the time between a frontier release and a commodity-priced equivalent to fewer than six months. The curve is not merely doubling every 12–18 months. It is compounding faster than any prior technology cost curve on record.
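Why the curve outruns single-factor forecasts follows from simple multiplication: independent deflationary factors compound. The factor values below are midpoint estimates of the ranges stated above, not measurements.

```python
# Midpoint estimates for two of the deflationary factors named above
# (illustrative assumptions drawn from the stated ranges).
factors = {
    "blackwell_vs_hopper_hardware": 10.0,  # ~10x per-token cost reduction
    "inference_optimizations": 4.0,        # midpoint of the 3-5x range
}

combined = 1.0
for multiplier in factors.values():
    combined *= multiplier  # independent factors compound multiplicatively

# combined == 40.0: two factors alone yield ~40x before competitive
# pricing pressure and open-source commoditization are even counted.
```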
3. Cost Drivers — Five Forces Down, Five Forces Up
Reducers dominate for standard inference (80%+ of tasks). Increasers only hold for frontier reasoning.
▼ Price Reducers
▲ Price Increasers
Cost Driver Impact Weights
The five price reducers collectively account for a net annual decline of roughly 50% for standard inference tasks. Hardware efficiency and competitive price wars carry the heaviest weight at 25% each. Open-source pressure—from Llama, Mistral, DeepSeek, and Qwen—acts as a relentless floor-lowering mechanism: whenever a closed provider maintains premium pricing, an open alternative emerges within months offering 83–94% of the capability at a fraction of the cost. The price increasers—training compute, energy, talent, safety overhead, and reasoning token multiplication—are real but concentrated. They matter most for frontier reasoning tasks, which represent only about 5% of total inference volume.
The net assessment is decisive: for the 80% of tasks that are routine or standard professional work, the deflationary forces are overwhelming. For the 15% that require mid-tier intelligence, prices are falling at 40–60% per year. Only the top 5%—frontier reasoning, novel research, and breakthrough problem-solving—see prices hold or rise, driven primarily by the 10–100x reasoning token multiplier that chain-of-thought and tree-of-thought architectures demand. This creates the bifurcated market the scenario analysis below explores.
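A back-of-envelope blend of the three tiers makes the portfolio-level rate explicit, assuming the chapter's 80/15/5 volume split, ~50%/yr declines for the first two tiers (the midpoint of the 40–60% range for mid-tier), and flat prices at the frontier.

```python
# (share of inference volume, assumed annual price decline) per tier
tiers = [
    (0.80, 0.50),  # routine / standard professional work
    (0.15, 0.50),  # mid-tier intelligence (midpoint of 40-60%/yr)
    (0.05, 0.00),  # frontier reasoning: prices hold or rise (0% here)
]

blended_decline = sum(share * decline for share, decline in tiers)
# blended_decline == 0.475: a ~47.5%/yr decline across the whole
# portfolio, consistent with the ~50%/yr figure cited above.
```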
4. Three Scenarios for 2025–2029
Scenario A (aggressive, 60%) is on track or ahead. Scenario C (premium divergence, 10%) is happening faster than expected. They coexist.
Scenario A vs B: Mid-tier Price Trajectory
Scenario C: Premium Divergence
The critical insight from the scenario analysis is that Scenarios A and C are not mutually exclusive—they are playing out simultaneously across different market tiers. Standard inference is following the aggressive Scenario A trajectory, with budget models already at $0.28–$0.60 per million tokens—prices that the original 2024 analysis did not expect until 2028. Meanwhile, frontier reasoning is tracking Scenario C, with premium models commanding $25–$600 per million tokens, creating a 50–2,000x gap that already exceeds the original 2027 projections. The market is not following a single path; it is stratifying into distinct economic layers, each with its own cost dynamics.
For enterprise strategists, this bifurcation presents both an opportunity and a trap. The opportunity lies in the 80/15/5 routing pyramid: intelligently matching tasks to the correct cost tier can reduce inference budgets by 10–50x without sacrificing output quality. The trap is treating all AI spend as a single budget line, overpaying for routine tasks with premium models or, worse, under-investing in frontier capabilities where reasoning quality genuinely matters. The organizations that master this routing discipline will extract dramatically more value per dollar than those that do not.
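The routing arithmetic can be sketched in a few lines. The per-tier prices below are hypothetical placeholders chosen to sit inside the ranges discussed in this chapter; real deployments would substitute their own contracted rates.

```python
# Hypothetical per-1M-token output prices for each tier (illustrative only).
PRICES = {"budget": 0.50, "mid": 5.00, "premium": 100.00}

# The 80/15/5 routing pyramid: share of task volume sent to each tier.
MIX = {"budget": 0.80, "mid": 0.15, "premium": 0.05}

# Blended cost per 1M tokens when tasks are routed by complexity.
routed_cost = sum(MIX[tier] * PRICES[tier] for tier in MIX)   # $6.15

# The naive default: every task goes to the premium model.
all_premium_cost = PRICES["premium"]                          # $100.00

savings_factor = all_premium_cost / routed_cost               # ~16x
```

With these placeholder prices the routed portfolio is ~16x cheaper than defaulting to premium, squarely inside the 10–50x savings band claimed above; wider price spreads between tiers push the factor toward the top of that band.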
5. The $20 Subscription Margin Collapse
Provider cost drops from $5–15 (2025) to $0.10–0.50 (2028) while users still pay $20 — 97–99% margins are unsustainable.
$20 Subscription: User Pays vs Provider Cost
The subscription crisis is not hypothetical—it is already visible in the data. A typical casual subscriber generates 1–2 million tokens per month. At February 2026 pricing with efficient models and caching, that workload costs the provider $0.50–$4. Yet the user pays $20. Margins of 85–95% are already the norm for light users, and they will reach 97–99% by 2028. This is not a sustainable equilibrium. API pricing is public; BYOK tools like OpenRouter make the arithmetic transparent; and open-source alternatives from Meta, Mistral, and DeepSeek offer frontier-minus-one quality at zero marginal cost after a modest hardware investment. The $20 price point survives only if what it buys evolves from raw intelligence into a differentiated experience.
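The margin arithmetic is transparent enough to verify directly. The serving costs plugged in below are taken from the ranges stated above; the specific points within each range are illustrative.

```python
def gross_margin(price: float, provider_cost: float) -> float:
    """Fraction of the subscription price retained by the provider."""
    return (price - provider_cost) / price

SUBSCRIPTION = 20.0

# February 2026: a light user's 1-2M monthly tokens cost roughly
# $0.50-$4 to serve; mid-range serving costs give the 85-95% norm.
margin_low = gross_margin(SUBSCRIPTION, 3.00)    # 0.85
margin_high = gross_margin(SUBSCRIPTION, 1.00)   # 0.95

# Projected 2028 serving costs of $0.10-$0.50 push margins to 97.5-99.5%.
margin_2028_low = gross_margin(SUBSCRIPTION, 0.50)   # 0.975
margin_2028_high = gross_margin(SUBSCRIPTION, 0.10)  # 0.995
```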
6. Three Subscription Futures
Price drops (Netflix), tiered (Spotify), or feature differentiation (already happening). The $20 survives but transforms.
Price Drops (Netflix Model)
• 2026: $15/month → Better models
• 2027: $10/month → Much better models
• 2028: $5–8/month → Near-frontier models
• 2029: $3–5/month → Commodity AI
Tiered Model (Spotify Model)
• Good: $5/month → Good AI (today's Sonnet level)
• Pro: $20/month → Frontier AI + features
• Enterprise: $50/month → Enterprise + priority + agents
Feature Differentiation (Already Happening)
• Priority access (no rate limits)
• Advanced features (artifacts, projects, memory)
• Integrations (Claude Code, desktop apps)
• Support (faster response)
The three subscription futures outlined above converge on a single strategic truth: AI providers will follow the same evolutionary path as internet service providers, streaming platforms, and cloud computing vendors. Pure access to the resource gets commoditized. Premium pricing migrates to the experience layer—priority access, agents, integrations, memory, and workflow tools. By 2029, raw AI chat becomes a $3–$5 commodity. The $20 tier transforms into an “unlimited agents plus priority frontier plus verification tools” package. This is not speculation; it is the structural logic of every technology utility market in the past century.
7. Historical Analogies — Same Law, Same Outcome
Bandwidth, silicon, electricity — all followed the same commoditization pattern. AI is not the exception.
Internet Bandwidth
Microprocessor Computing / Moore's Law
Electricity
The Universal Commoditization Pattern
The historical parallels are not approximate—they are structurally identical. Internet bandwidth dropped 10,000–100,000x from dial-up to fiber. Transistor costs fell by a factor of roughly one trillion over sixty years. Electricity evolved from a luxury powering private generators for the wealthy to a metered utility so cheap that households scarcely think about it. In every case, the same five-phase pattern held: expensive and metered, then efficiency plus competition collapses unit cost, then flat-rate subscriptions emerge, then subscriptions face pressure and pivot to experience, and finally, value migrates permanently to the layer above the commodity resource. AI inference is currently transitioning from phase two to phase three. The analogies do not merely suggest what comes next; they make it economically inevitable.
The strategic implication is blunt: companies building moats around model access are building on a commodity. Microsoft, Apple, and Google did not win by selling transistors—they won by building software, ecosystems, and user experiences on top of near-free silicon. Netflix did not win by selling bandwidth—it won by building a content and recommendation platform that rode the bandwidth commodity to global scale. The AI equivalent is already taking shape: the orchestration layer—agents, workflow tools, memory systems, and multi-model routing—is where durable competitive advantage will accrue.
8. Task Cost Evolution by Complexity
Five tiers from trivial (already free) to frontier/breakthrough (premium persists). Complex tasks stay expensive 3–4 years longer.
| Category | Examples | 2026 Cost | 2028 Cost | 2030 Cost | Utility Pricing Reached |
|---|---|---|---|---|---|
| Trivial / Everyday | Casual chatting, Basic math (sqrt, calc), Fact check | $0.000005-$0.0005 | <$0.00005 | Free / embedded | Already here (2025-2026) |
| Routine Professional | Email writing, Simple code snippet, Customer support reply | $0.0005-$0.02 | $0.00005-$0.002 | Free / $0.0005 | 2027-early 2028 |
| Advanced Professional | Full scripts/apps, Market/legal drafts, Creative campaigns | $0.02-$1 | $0.001-$0.10 | $0.0001-$0.02 | Mid 2028-2029 |
| Expert / PhD-level | Novel research synthesis, Hypothesis generation + testing, Peer-review paper draft | $0.50-$30+ (often $2-15) | $0.05-$3 | $0.005-$0.50 | Late 2029-2030 |
| Frontier / Breakthrough | Original invention, Solving unsolved math/science problems, Multi-year corporate strategy | $10-$300+ | $1-$30 | $0.10-$5 | 2030+ (premium tier persists for true novelty) |
Task Cost by Tier (2026 vs 2028 vs 2030)
The task cost evolution table above encodes a profound insight: the cost gap between trivial and frontier tasks is 100–10,000x in 2026, but the gap compresses relentlessly over time. A casual 500-token chat costs approximately $0.0002 today. A PhD-level literature review involving 50,000 effective tokens with heavy reasoning runs $5–$30. The difference is driven almost entirely by the reasoning token multiplier—chain-of-thought and extended thinking architectures consume 10–100x more tokens than simple response generation. As reasoning optimization matures (reducing the multiplier from 20x toward 2–10x), even expert-level tasks will slide down the cost curve, reaching utility pricing by 2029–2030.
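The reasoning-multiplier effect described above reduces to one formula: cost equals effective tokens (base tokens times the reasoning multiplier) times the per-token price. The $200/1M frontier price in the second example is an illustrative assumption chosen so the result lands inside the $5–$30 band cited above.

```python
def task_cost(base_tokens: int, reasoning_multiplier: float,
              price_per_million: float) -> float:
    """Dollar cost of a task: effective tokens times the per-1M-token price."""
    effective_tokens = base_tokens * reasoning_multiplier
    return effective_tokens / 1_000_000 * price_per_million

# Casual 500-token chat, no extended reasoning, $0.40/1M commodity price.
chat = task_cost(500, 1, 0.40)           # ≈ $0.0002

# PhD-level review: 2,500 base tokens inflated 20x by chain-of-thought
# to 50,000 effective tokens, priced at a hypothetical $200/1M frontier rate.
review = task_cost(2_500, 20, 200.00)    # $10, inside the $5-$30 band
```

The same function shows why reasoning optimization matters so much: dropping the multiplier from 20x to 5x cuts the review's cost fourfold without touching the per-token price at all.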
9. Research Paper Cost Calculator
How much does it cost to AI-generate a 50-page research paper? From $0.25 (budget) to $120 (frontier PhD) in 2026.
| Year | Tier | Model Examples | Total Cost | Quality Level |
|---|---|---|---|---|
| 2026 | Budget | GPT-5 Nano, Grok 4.1 Fast, Gemini Flash | $0.25-$0.80 | Solid undergrad / quick draft (good enough for many internal reports) |
| 2026 | Balanced / Pro | GPT-5.2, Claude Sonnet 4.6 | $12.00-$25.00 | Strong master's / journal-ready first draft |
| 2026 | Frontier / PhD | Claude Opus 4.6, GPT-5.2 pro | $65.00-$120.00 | True PhD / top-tier journal submission quality (novel insights, rigorous) |
| 2028 | Budget | Next-gen cheap models | $0.08-$0.25 | Utility / near-free |
| 2028 | Balanced / Pro | — | $4.00-$9.00 | Excellent professional / conference paper |
| 2028 | Frontier / PhD | — | $22.00-$45.00 | PhD / top-journal ready |
| 2030 | Budget | — | $0.01-$0.05 | Completely free / embedded in tools |
| 2030 | Balanced / Pro | — | $1.00-$2.50 | Near-perfect for most users |
| 2030 | Frontier / PhD | — | $6.00-$15.00 | Still premium but affordable even for individuals |
50-Page Research Paper Cost (3 Tiers x 3 Years)
The research paper calculator makes the abstract concrete. In 2026, a solo researcher can generate a journal-ready first draft of a 50-page paper for the price of two coffees—$12–$25 using a balanced-tier model. A full PhD-quality paper with novel contributions costs $65–$120 at frontier tier, already 70–90% cheaper than hiring a research assistant for two weeks. By 2028, that same PhD-quality paper drops below the cost of a single lunch. By 2030, it costs less than printing the paper on physical pages. These are not marginal efficiency gains. They represent a structural reordering of the economics of knowledge production, with implications for every professional services industry from consulting to legal to scientific research.
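A rough version of the calculator itself fits in a few lines. Every figure below — tokens per page, the draft-and-reasoning multiplier, and the per-tier prices — is an illustrative assumption chosen to be consistent with the table's 2026 cost bands, not a measured value.

```python
# Hypothetical token budget for a 50-page paper (illustrative assumptions).
PAGES = 50
TOKENS_PER_PAGE = 700      # ~500 words per page in the final draft
DRAFT_MULTIPLIER = 25      # outlines, revisions, and reasoning tokens

def paper_cost(price_per_million: float) -> float:
    """Total generation cost at a given per-1M-token output price."""
    total_tokens = PAGES * TOKENS_PER_PAGE * DRAFT_MULTIPLIER  # 875,000
    return total_tokens / 1_000_000 * price_per_million

balanced_2026 = paper_cost(15.00)    # ≈ $13, inside the $12-$25 band
frontier_2026 = paper_cost(100.00)   # ≈ $88, inside the $65-$120 band
```

Holding the token budget fixed, the table's 2028 and 2030 rows then follow mechanically from the falling per-token prices — which is the point: the cost of knowledge production becomes a pure function of the commodity price curve.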
10. Commoditization Timeline — When Each Tier Reaches Utility Pricing
2026: trivial tasks free. 2027: routine commoditizes. 2028: advanced affordable. 2029: PhD-level utility-priced. 2030+: 95% near-free.
The commoditization timeline traces a remarkably predictable progression. In 2026, trivial and routine professional tasks are already utility-priced—no one marvels at an AI drafting an email or answering a factual question, just as no one marvels at a calculator computing a square root. By 2027, agents handle 80%+ of white-collar busywork at near-zero marginal cost. By 2028, full coding projects and legal drafts feel routine. By 2029, most PhD-level work enters the utility zone. The frontier—genuine invention, unsolved scientific problems, multi-year strategic synthesis—remains premium, much as custom supercomputing or dedicated fiber lines remain premium today. But the premium tier shrinks to 5% of total demand, and even it faces a 10x cost reduction per year.
11. The Value Shift — From Intelligence to Capabilities
The most important strategic insight: value migrates from the model layer (commoditizing) to the orchestration/experience layer (differentiating).
Today (2026)
Tomorrow (2029+)
The value shift visualized above distills the entire chapter into a single strategic directive. Today, enterprises pay for intelligence—access to a smart model is the product, and the best model is the moat. Tomorrow, intelligence is a commodity input, and enterprises pay for capabilities: agents that execute multi-step workflows, memory systems that retain institutional knowledge, routing layers that match tasks to optimal cost tiers, and integration frameworks that embed AI into existing business processes. The winning formula is already clear: BYOK (bring your own key) access to commodity models, combined with proprietary orchestration, persistent memory, and low-friction user experience. The companies that own the experience layer will capture the margin as the intelligence layer compresses to utility pricing.
12. Connections Across the Report
How the intelligence cost curve connects to every other chapter of this strategic analysis.
What Comes Next
The intelligence cost curve establishes the economic foundation upon which every subsequent chapter of this report builds. Prices are falling at 50% per year for standard inference. The intelligence-per-dollar multiplier has already reached 8–15x the 2023 baseline—three to six times ahead of projections. And the market is bifurcating into commodity and premium tiers that require fundamentally different strategic responses. But cost deflation alone does not explain how enterprises should deploy these rapidly cheapening capabilities. For that, we must examine what happens when models that were once separated by vast performance gaps begin to converge—when last year’s frontier becomes this year’s commodity, and the window of proprietary advantage shrinks from years to months. That convergence—its pace, its implications, and the strategic playbook it demands—is the subject of Chapter 3.