# AI Models Are the New Rebar

**Author:** Philipp D. Dubach | **Published:** March 8, 2026 | **Updated:** March 11, 2026
**Categories:** AI, Investing
**Keywords:** AI model commoditization, AI commoditization, open-source AI models, open source AI vs proprietary, AI inference cost decline, OpenAI valuation, AI pricing war 2026, AI margin compression, Qwen 3.5, LLM switching costs, AI model pricing, disruptive innovation AI, model layer commodity, who wins AI value chain, OpenAI profitability, open source AI performance gap, Christensen AI disruption, Anthropic valuation

## Key Takeaways

- Qwen 3.5-35B matches Claude Sonnet 4.5 on select benchmarks at $0.10 per million input tokens versus $3.00, a 97 percent cost gap for comparable performance.
- The performance gap between open-source and proprietary AI models shrank from 8 percent to 1.7 percent in a single year, per the Stanford HAI 2025 AI Index.
- OpenAI's adjusted gross margin fell from 40 to 33 percent in 2025 as inference costs quadrupled to $8.4 billion, while the company lost $13.5 billion in H1 2025.
- AI inference prices decline at a median rate of 50x per year for equivalent performance, according to Epoch AI, a pace that dwarfs Moore's Law.

---


[Qwen 3.5-35B-A3B](https://huggingface.co/Qwen/Qwen3.5-35B-A3B), a model released by Alibaba in February 2026, runs on a single consumer GPU with 24 gigabytes of VRAM. A secondhand RTX 4090, available for around $2,000, runs it at 60 to 100 tokens per second. On select benchmarks, per Alibaba's own evaluations, it matches or beats Claude Sonnet 4.5. The Qwen 3.5 Flash tier costs [**$0.10 per million input tokens**](https://www.alibabacloud.com/help/en/model-studio/model-pricing) through Alibaba's API. [Claude Sonnet 4.5 costs **$3.00**](https://www.anthropic.com/news/claude-sonnet-4-5).

That's a 97 percent discount. For comparable performance.

I'm not cherry-picking. Zhipu AI's [GLM-5 scores 1,452 on the Chatbot Arena leaderboard](https://medium.com/@mlabonne/glm-5-chinas-first-public-ai-company-ships-a-frontier-model-a068cecb74e3), the highest Elo rating of any open-source model, and its developer's own figures put it at roughly 95 percent of closed-model performance at around 15 percent of the cost. Moonshot AI's [Kimi K2.5](https://www.kimi.com/blog/kimi-k2-5), a trillion-parameter model, scores 99.0 on HumanEval and 96.1 on AIME 2025, with a Chatbot Arena Elo of 1,447, at roughly 88 percent less than Claude Opus 4.5 per token. The [Stanford HAI 2025 AI Index](https://hai.stanford.edu/ai-index/2025-ai-index-report/technical-performance) found the performance gap between open-source and proprietary AI models on the Chatbot Arena leaderboard shrank from **8 percent to 1.7 percent in a single year**.

This is not an IP story. It is not a China story. It is an industrial economics story. And we know how those end. <a href="#lightbox-ai-performance-vs-price-png-0" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;">
  <picture class="img-lightbox">
    
    <source media="(max-width: 768px)" 
            srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/ai-performance-vs-price.png 320w,
                    https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/ai-performance-vs-price.png 480w,
                    https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/ai-performance-vs-price.png 640w,
                    https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/ai-performance-vs-price.png 960w,
                    https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/ai-performance-vs-price.png 1200w"
            sizes="80vw">
    
    
    <source media="(max-width: 1024px)" 
            srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/ai-performance-vs-price.png 768w,
                    https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/ai-performance-vs-price.png 1024w,
                    https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/ai-performance-vs-price.png 1440w"
            sizes="80vw">
    
    
    <source media="(min-width: 1025px)" 
            srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/ai-performance-vs-price.png 1200w,
                    https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/ai-performance-vs-price.png 1600w,
                    https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/ai-performance-vs-price.png 2000w"
            sizes="80vw">
    
    
    <img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/ai-performance-vs-price.png" 
         alt="Exhibit showing open-source AI models have crossed the performance threshold at a fraction of the price, with GLM-5, Kimi K2.5, DeepSeek V3, and Qwen 3.5 Flash all landing in the high-performance low-cost quadrant below $1 per million tokens while Claude Opus 4.5 sits at $15 and GPT-4o at $2.50" 
         class=""
         width="1200"
         height="630"
         loading="lazy"
         decoding="async"
         style="width: 100%; height: auto; display: block;">
  </picture>
</a>




## What the steel mills can tell us

In the mid-1960s, electric arc furnace mini-mills entered the steel market at the lowest-quality segment: rebar. Capital costs ran one-fifth to one-seventh of what an integrated plant required. Nucor, the most aggressive operator, built its first mill for $6 million when a comparable integrated facility cost $500 million or more. The response from companies like U.S. Steel was rational: retreat from low-margin rebar, harvest the better-margin products, improve average profitability in the short term. Sensible but wrong.

Each segment mini-mills conquered had higher margins than the last. From rebar to structural steel, from structural steel to sheet metal, the disruptors climbed the value chain until there was nowhere left to climb. The American steel industry [lost money for five consecutive years in the early 1980s](https://www.chicagotribune.com/news/ct-xpm-1990-06-04-9002150481-story.html), posting aggregate losses of **$3.38 billion in 1982 alone**. U.S. Steel shed more than half its workforce, pivoted to oil and gas, and by [June 2025 accepted a $14.9 billion acquisition by Nippon Steel](https://investors.ussteel.com/news-events/news-releases/detail/659/nippon-steel-corporation-nsc-to-acquire-u-s-steel), a fraction of its inflation-adjusted peak valuation. Nucor, the mini-mill, became the largest American steelmaker.

Clayton Christensen spent a career documenting this pattern of disruptive innovation. The incumbents never failed because they made bad decisions. They failed because they made good decisions for their existing customers while the market shifted beneath them. OpenAI is serving demanding enterprise customers with the most capable models available. Anthropic is building trust with regulated industries. These are the correct moves for their current customers. They may also be exactly the wrong moves for the next five years.



## The cost decline eats strategy

[Epoch AI's research](https://epoch.ai/data-insights/llm-inference-price-trends), published in 2025, found that AI inference prices are declining at a **median rate of 50x per year** for equivalent performance levels, with a range spanning 9x to 900x depending on the task. Achieving GPT-4's original performance on PhD-level science questions cost $30 per million input tokens when GPT-4 launched in early 2023. Through open-source alternatives today, the same performance costs under $0.10. A roughly 300-fold reduction in three years, at a pace that dwarfs Moore's Law.
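The arithmetic is worth making explicit. A quick sanity check, using only the price points already cited above:

```python
# Sanity checks on the price figures cited in this piece.
# All prices are USD per million input tokens.
qwen_flash = 0.10       # Qwen 3.5 Flash via Alibaba's API
claude_sonnet = 3.00    # Claude Sonnet 4.5

discount = 1 - qwen_flash / claude_sonnet
print(f"Qwen discount vs. Sonnet: {discount:.0%}")  # → 97%

gpt4_2023 = 30.00       # GPT-4-level performance at launch, early 2023
open_2026 = 0.10        # same performance via open source today

print(f"Cost reduction: {gpt4_2023 / open_2026:.0f}x")  # → 300x
```

Nothing hidden in the numbers: the headline figures fall straight out of the published prices.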

David Cahn at Sequoia Capital put the structural problem plainly in his ["$600 Billion Question"](https://sequoiacap.com/article/ais-600b-question/) analysis: "GPU computing is increasingly turning into a commodity, metered per hour. Without a monopoly or oligopoly, high fixed cost plus low marginal cost businesses almost always see prices competed down to marginal cost, like airlines." The airline analogy is grimmer than it sounds. The global airline industry generated cumulative net profits of $36 billion between 1945 and 2000, a net margin of 0.8 percent across 55 years. In the 2000s, the industry lost more than it had earned in the prior half-century combined. Even today, [IATA projects airlines' return on invested capital at 6.8 percent](https://www.iata.org/en/pressroom/2025-releases/2025-12-09-01), below their weighted average cost of capital of 8.2 percent.

The difference between AI and airlines is that switching a flight carrier requires rebooking. Switching an AI model requires changing two lines of code. <a href="#lightbox-inference-cost-collapse-png-2" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;">
  <picture class="img-lightbox">
    
    <source media="(max-width: 768px)" 
            srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/inference-cost-collapse.png 320w,
                    https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/inference-cost-collapse.png 480w,
                    https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/inference-cost-collapse.png 640w,
                    https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/inference-cost-collapse.png 960w,
                    https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/inference-cost-collapse.png 1200w"
            sizes="80vw">
    
    
    <source media="(max-width: 1024px)" 
            srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/inference-cost-collapse.png 768w,
                    https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/inference-cost-collapse.png 1024w,
                    https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/inference-cost-collapse.png 1440w"
            sizes="80vw">
    
    
    <source media="(min-width: 1025px)" 
            srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/inference-cost-collapse.png 1200w,
                    https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/inference-cost-collapse.png 1600w,
                    https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/inference-cost-collapse.png 2000w"
            sizes="80vw">
    
    
    <img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/inference-cost-collapse.png" 
         alt="Exhibit showing GPT-4 level performance went from $30 to $0.10 per million tokens in three years, with closed proprietary models shown alongside open-source alternatives that now match frontier performance at a fraction of the cost, representing a 300x cost reduction" 
         class=""
         width="1200"
         height="630"
         loading="lazy"
         decoding="async"
         style="width: 100%; height: auto; display: block;">
  </picture>
</a>




## Switching costs that approach zero

The OpenAI API format has become the de facto industry standard, supported by virtually every major model provider and open-source inference engine. [LiteLLM](https://github.com/BerriAI/litellm), an open-source gateway with approximately 37,000 GitHub stars, provides a unified interface to over 100 providers through a single configuration change. OpenRouter offers managed access to more than 400 models. Setup time: under five minutes.
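In practice, "two lines of code" means a base URL and a model name, because nearly every provider exposes an OpenAI-compatible endpoint. A minimal sketch of what switching actually touches (the endpoint URLs and model identifiers below are illustrative placeholders, not verified values; check each provider's documentation):

```python
# Switching providers behind an OpenAI-compatible API is a config change:
# every client call stays the same; only base_url and model differ.
PROVIDERS = {
    # URLs and model names are illustrative placeholders.
    "openai":     {"base_url": "https://api.openai.com/v1",        "model": "gpt-4o"},
    "alibaba":    {"base_url": "https://dashscope.example.com/v1", "model": "qwen3.5-flash"},
    "openrouter": {"base_url": "https://openrouter.ai/api/v1",     "model": "glm-5"},
}

def client_config(provider: str) -> dict:
    """Return the two values that actually change when you switch models."""
    cfg = PROVIDERS[provider]
    return {"base_url": cfg["base_url"], "model": cfg["model"]}

# Moving all traffic from a closed frontier model to an open-weight one:
before = client_config("openai")
after = client_config("alibaba")
changed = [k for k in before if before[k] != after[k]]
print(changed)  # the two lines that change; everything else is untouched
```

Gateways like LiteLLM and OpenRouter collapse even this into a single routing rule, which is why the five-minute setup figure is plausible.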

Enterprise behavior already reflects this. Perplexity's own data shows 92 percent of Fortune 500 employees use multi-model AI platforms, and their top enterprise accounts access an average of 30 different models. These are Perplexity's internal figures, not independent market research: treat them as directional. The one meaningful source of lock-in is custom fine-tuned models, which are provider-specific and cannot be directly ported. That affects a small fraction of deployments. For the vast majority of inference calls, the model is interchangeable, and the customer buys on price.

## What OpenAI's numbers actually require

On February 27, 2026, [OpenAI closed a $110 billion funding round](https://openai.com/index/scaling-ai-for-everyone/), the largest private capital raise in history, at a post-money valuation of **$840 billion**. Amazon committed $50 billion. SoftBank $30 billion. Nvidia $30 billion. The valuation implies extraordinary confidence in OpenAI's ability to maintain pricing power and grow revenue to somewhere between $200 and $280 billion by 2030. At 42x trailing revenue, it is priced not for today's market but for a specific version of the future.

OpenAI reported [**$20 billion in annualized recurring revenue**](https://openai.com/index/scaling-ai-for-everyone/) as of January 2026, up 233 percent year over year. Impressive. But the adjusted gross margin fell to 33 percent in 2025, down from 40 percent the prior year, as [inference costs quadrupled to $8.4 billion](https://the-decoder.com/openai-adds-111-billion-to-its-cash-burn-forecast-as-ai-costs-spiral-beyond-projections/). In the first half of 2025 alone, OpenAI lost $13.5 billion. Compute and technical talent costs consume approximately 75 percent of total revenue, and Microsoft takes another 20 percent through 2032. That leaves very little room for the margin expansion the valuation demands.

[Anthropic](https://www.anthropic.com/news/anthropic-raises-30-billion-series-g-funding-380-billion-post-money-valuation) tells a similar story at a smaller scale. At a **$380 billion valuation** on $14 billion in run-rate revenue, 27x, the company is also unprofitable, projecting positive cash flow around 2027 or 2028. Both companies are betting they can simultaneously grow revenue and expand margins. In commoditized markets, that is the bet that fails.
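The multiples and cost shares above can be checked with the figures already cited. One caveat: treating the compute-and-talent share and the Microsoft revenue share as simply additive is a simplification for illustration, since the two claims come from different disclosures.

```python
# Valuation multiples, from the figures cited above (USD billions).
openai_valuation, openai_arr = 840, 20
anthropic_valuation, anthropic_run_rate = 380, 14

print(openai_valuation / openai_arr)              # 42x trailing revenue
print(anthropic_valuation / anthropic_run_rate)   # roughly 27x

# Margin squeeze: compute and talent consume ~75% of revenue, and
# Microsoft takes ~20% through 2032. Treated as additive (a
# simplifying assumption), that leaves:
residual = 1 - 0.75 - 0.20
print(f"Revenue left over: {residual:.0%}")       # before everything else
```

Five cents on the revenue dollar, before sales, marketing, and the next training run, is the starting point from which the valuation demands margin expansion.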

Part of the financing is also circular. Amazon invests $50 billion in OpenAI; a portion flows back to AWS as compute spending. Nvidia invests $30 billion; the same money returns as GPU purchases. This inflates revenue figures while obscuring how much of the demand is genuinely independent. <a href="#lightbox-openai-margin-squeeze-png-3" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;">
  <picture class="img-lightbox">
    
    <source media="(max-width: 768px)" 
            srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/openai-margin-squeeze.png 320w,
                    https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/openai-margin-squeeze.png 480w,
                    https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/openai-margin-squeeze.png 640w,
                    https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/openai-margin-squeeze.png 960w,
                    https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/openai-margin-squeeze.png 1200w"
            sizes="80vw">
    
    
    <source media="(max-width: 1024px)" 
            srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/openai-margin-squeeze.png 768w,
                    https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/openai-margin-squeeze.png 1024w,
                    https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/openai-margin-squeeze.png 1440w"
            sizes="80vw">
    
    
    <source media="(min-width: 1025px)" 
            srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/openai-margin-squeeze.png 1200w,
                    https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/openai-margin-squeeze.png 1600w,
                    https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/openai-margin-squeeze.png 2000w"
            sizes="80vw">
    
    
    <img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/openai-margin-squeeze.png" 
         alt="Exhibit showing OpenAI financials: $20B ARR up 233% but gross margin fell from 40% to 33% as inference costs quadrupled to $8.4B, net loss of $13.5B in H1 2025, with the $840B valuation requiring 43% revenue CAGR to 2030 while expanding margins against open-source price pressure" 
         class=""
         width="1200"
         height="630"
         loading="lazy"
         decoding="async"
         style="width: 100%; height: auto; display: block;">
  </picture>
</a>




## Who actually wins when the model layer is a commodity

Before writing off the incumbents, two historical cases are worth sitting with.

Amazon Web Services has cut prices [134 times since 2006](https://docs.aws.amazon.com/wellarchitected/latest/cost-optimization-pillar/cost_cloud_financial_management_scheduled.html), yet its operating margins expanded to a record [39.5 percent in Q1 2025](https://www.cnbc.com/2025/05/01/aws-q1-earnings-report-2025.html). Apple captures roughly 80 to 85 percent of global smartphone operating profits with around 18 to 21 percent of unit shipments, while commodity Android manufacturers earn negligible margins. Both got there the same way: years of accumulated switching costs, vertical integration, ecosystems that cost real money to leave. The question is whether AI model providers can build any of that. I don't think they can, not at the model layer. An API endpoint returning text is not an iPhone. You change it in a config file on a Tuesday afternoon.

So who does benefit? Nvidia and cloud providers collect rent regardless of which model runs on their hardware. That position is durable. The application layer looks better still: companies embedding AI into domain-specific workflows with proprietary data, where the model is an input rather than the product. As [Andrew Lewis at EQT](https://eqtgroup.com/thinq/technology/why-ai-value-wont-just-accrue-to-foundational-models) put it, "Over time, the value is likely to accrue to the application layer and the product companies." And then there are the platforms with distribution so large they can integrate AI at near-zero marginal cost: Meta embedding Llama into Instagram and WhatsApp, Google weaving Gemini into Search and Workspace. When Mark Zuckerberg open-sources Llama, he is deliberately commoditizing the model layer to prevent any single player from owning the stack above his distribution. When a $1.6 trillion company is your most committed price-cutter, that tells you something about where the margins are going.

---

## Frequently Asked Questions

### How fast are AI inference costs declining?

At a median rate of 50x per year for equivalent performance, according to Epoch AI. GPT-4-level performance on PhD-level science questions cost $30 per million input tokens in early 2023 and under $0.10 through open-source alternatives today, roughly a 300-fold reduction in three years.

### Are open-source AI models as good as proprietary ones?

Nearly. The Stanford HAI 2025 AI Index found the gap shrank from 8 percent to 1.7 percent in a single year. Qwen 3.5-35B matches Claude Sonnet 4.5 on select benchmarks at roughly 3 percent of the cost, and GLM-5 achieves the highest Chatbot Arena Elo of any open-source model.

### Is OpenAI's $840 billion valuation justified?

At 42x trailing revenue, the valuation requires revenue growth to $200-280 billion by 2030 while expanding margins. But adjusted gross margins fell from 40 to 33 percent in 2025 as inference costs quadrupled, and the company lost $13.5 billion in the first half of 2025 alone.

### Who benefits from AI model commoditization?

Infrastructure providers like Nvidia and cloud platforms collect rent regardless of which model runs. Application-layer companies embedding AI into domain-specific workflows with proprietary data also benefit. Platforms with massive distribution like Meta and Google deliberately accelerate commoditization to prevent anyone from owning the model layer.

### What are the switching costs for AI models?

Near zero. The OpenAI API format is the de facto standard supported by virtually every provider. LiteLLM, an open-source gateway with 37,000 GitHub stars, provides a unified interface to over 100 providers through a single configuration change. OpenRouter offers managed access to more than 400 models. The only meaningful lock-in is custom fine-tuned models, which affect a small fraction of deployments.

### Can OpenAI and Anthropic become profitable?

Both face significant challenges. OpenAI lost $13.5 billion in the first half of 2025, with compute and talent consuming 75 percent of revenue and Microsoft taking another 20 percent through 2032. Anthropic at a $380 billion valuation on $14 billion in run-rate revenue projects positive cash flow around 2027-2028. Both are betting they can simultaneously grow revenue and expand margins in a market where open-source alternatives offer comparable performance at 3-15 percent of the cost.


---

*Philipp D. Dubach — [http://philippdubach.com/](http://philippdubach.com/) — 2026*