Let’s be brutally honest for a second.

When a cloud financial analyst says, “In 2025, costs moved from an afterthought to a product constraint”, they’re not being dramatic. They’re describing a structural shift that just got a turbocharged catalyst named DeepSeek.

On May 23, 2026, the Chinese startup made its 75% discount on the flagship V4 Pro model permanent. The original promotion was set to expire on May 31. Now, instead of a temporary fire sale, we’re looking at a permanent price reset that threatens the entire business model of Western AI incumbents.

Here’s the math that should keep Sam Altman and Dario Amodei up at night.

The Numbers That Broke the AI Market’s Priceline

DeepSeek V4 Pro now sits at $0.435 per million input tokens and $0.87 per million output tokens. Compare that to GPT-5.5 at $5 input and $30 output, or Claude Opus 4.7 at $5 input and $25 output.

The gap is obscene.

11.5x cheaper than GPT-5.5 on input
34.5x cheaper than GPT-5.5 on output
28.7x cheaper than Claude Opus on output

And that’s just the Pro model. The V4 Flash model, priced at $0.14 input and $0.28 output, is a jaw-dropping up to 99% cheaper than GPT-5.5 according to CostGoat’s analysis.

But here’s what the simple per-token comparison misses: total task cost. The Decoder pointed out that token consumption per task matters just as much as raw per-token pricing. Think of it like gas mileage, a low price per gallon doesn’t help if your engine guzzles fuel.

DeepSeek’s architecture, built on Mixture-of-Experts (MoE), Multi-head Latent Attention (MLA), and their proprietary Engram and mHC innovations, slashes KV cache and compute needs so dramatically that the same task genuinely costs less. It’s not a pricing trick. It’s an architectural advantage.

Why Permanent Changes Everything

Temporary discounts are marketing. Permanent price cuts are strategic.

The “DeepSeek permanently reduces the price of its flagship V4 model by 75 percent” announcement from Engadget captures the distinction perfectly. When a promotion becomes permanent pricing, it signals that the underlying cost structure can sustain it.

DeepSeek’s V4 models claimed to usher in the “era of cost-effective 1M context length.” With a 1M-token context window and up to 384K output tokens, they’re not just cheap, they’re functionally competitive for the agentic AI workloads that are consuming increasingly massive token budgets.

And here’s the kicker: both models support OpenAI and Anthropic API formats natively. Developers can switch by changing a base URL and API key. The migration friction is essentially zero.

The Revenue Reality Check

The most dangerous part of this for OpenAI and Anthropic isn’t the performance gap. On raw benchmarks, GPT-5.5 and Claude Opus 4.7 still lead. But for a huge swath of real-world applications, customer support, content generation, data extraction, even most coding tasks, “good enough” at 1/30th the cost is a compelling value proposition.

TechnoSports reported that “OpenAI, Google, and Anthropic have all reduced their prices throughout 2025” to keep enterprise clients. The market is already forcing price compression. DeepSeek’s permanent cut just slammed the accelerator.

Model	Input (per 1M tokens)	Output (per 1M tokens)	Cost Ratio vs DeepSeek V4 Pro
DeepSeek V4 Pro	$0.435	$0.87	1x
DeepSeek V4 Flash	$0.14	$0.28	0.3x
GPT-5.5	$5.00	$30.00	11.5x/34.5x
Claude Opus 4.7	$5.00	$25.00	11.5x/28.7x
Claude Sonnet 4.6	$3.00	$15.00	6.9x/17.2x

Think about what this means for the revenue models these companies are taking to their IPO roadshows. OpenAI and Anthropic are both heading toward public offerings, relying on the narrative that AI is a high-margin SaaS-like business. DeepSeek is arguing, with data, that it’s a commodity infrastructure business.

The Enterprise Trust Problem

The most common pushback I hear: “American enterprises won’t trust Chinese-origin DeepSeek with their data.”

It’s a legitimate concern. But it’s also more nuanced than the headline suggests.

DeepSeek is open-weights. Enterprises can download and host it themselves for maximum privacy. The counter-argument is that self-hosting eliminates the cost advantage, you won’t get the same electricity contracts or hardware amortization as DeepSeek’s datacenters. But as one Redditor pointed out, “soon there will be companies that do that” in regions like the EU, leveraging cheap Scandinavian hydropower.

The real question isn’t whether enterprises can use DeepSeek. It’s whether they can afford not to evaluate it when a competitor using DeepSeek is operating at 30x lower inference cost.

Airbnb CEO Brian Chesky already called Chinese AI “fast and cheap”, prompting Congressional scrutiny. The dam is cracking.

Beyond Token Prices: The Long Game

The most fascinating angle in this story isn’t the price cut itself, it’s what DeepSeek is building with the strategic cushion those prices create.

They’re not chasing quick money from coding plans or multimodal subscriptions. Instead, their radical architecture innovations are designed to slash KV cache and compute needs so dramatically that they can build an entire 10T Chinese AI hardware ecosystem, NAND, LPDDR, ASICs, and position themselves for a 1T valuation in the process.

This is a long game, masterfully played.

The semiconductor context matters here. China’s domestic chip production surge from 2025-2026, driven by U.S. export restrictions on NVIDIA A100 and H100 GPUs, has forced companies like DeepSeek to innovate on software efficiency. By refining their architecture to work brilliantly on available hardware, they’ve turned a constraint into a competitive moat.

What This Means for the AI Bubble

Let’s be clear: AI is not dead. But the fantasy of unlimited AI pricing power just got a reality check.

The AI bubble was never about AI technology being worthless. It was about the assumption that AI companies could maintain SaaS-like margins indefinitely. DeepSeek’s pricing proves that assumption is wrong.

For agentic AI workloads, which consume orders of magnitude more tokens than simple chatbots, the cost differential becomes existential. A company running AI agents at scale on GPT-5.5 could see costs drop by 97% by switching to DeepSeek V4 Flash.

That’s not a pricing pressure. That’s a margin implosion.

The incumbents have options. They can compete on safety, on enterprise features, on ecosystem lock-in, on brand trust. But they’ve already proven they can be undercut by an order of magnitude. The question isn’t whether prices will fall further, it’s how fast and how far.

The Bottom Line

DeepSeek’s permanent price cut isn’t just a competitive move. It’s a thesis statement about the future of AI economics.

The market is signaling that intelligence is becoming a commodity. The companies that win won’t be the ones with the most capable models. They’ll be the ones that can deliver “good enough” capability at the lowest cost, with the best developer experience, and the trust to handle sensitive data.

And if you’re building AI products today, you need to be asking yourself a hard question: are you paying 30x more than you should be for your inference tokens?

The answer might hurt.