Start by reading part one of this series: Is AI Really Eating the World? What We’ve Learned [1/2]
Today’s dominant recommendation systems work by capturing and analyzing user behavior at scale. Netflix needs millions of users watching millions of hours to train its recommendation algorithm. Amazon needs billions of purchases. The network effect comes from data scale. What if LLMs can bypass this? What if an LLM can provide useful recommendations by reasoning about conceptual relationships rather than requiring massive behavioral datasets? If I ask for “books like Pirsig’s Zen and the Art of Motorcycle Maintenance but more focused on Eastern philosophy,” a sufficiently capable LLM might answer well without needing to observe 100 million readers. It understands (or appears to understand) the conceptual space. I’m uncertain whether LLMs can do this reliably by the end of 2025. The fundamental question is whether they genuinely reason or merely pattern-match at a very sophisticated level. Recent research suggests LLMs may rely more on statistical correlations than true reasoning. If it’s mostly pattern-matching, the massive datasets still matter and we’re back to conventional network effects. If they can actually reason over conceptual spaces, that’s different. It would unbundle data network effects from recommendation quality: quality would depend on model capability, not data scale. And if model capability is commoditizing, then the value in recommendations flows to whoever owns customer relationships and distribution, not to whoever has the most data or the best model. I lean toward thinking LLMs are sophisticated pattern-matchers rather than reasoners, which means traditional network effects still apply. But this is one area where I’m genuinely waiting to see more evidence.
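To make the contrast concrete, here is a minimal sketch in Python. The collaborative filter only works once you have a large user-item interaction matrix; the LLM path needs none of that, provided the model can actually reason about the request. The call_llm placeholder, the toy similarity scoring, and the parameter names are my own illustrative assumptions, not anything from Evans’ presentation.

```python
import numpy as np

# Approach 1: collaborative filtering. Quality depends on how much user
# behavior you have observed; with a sparse matrix the similarity
# estimates are mostly noise, which is where the data network effect
# comes from.
def recommend_collaborative(interactions: np.ndarray, user: int, k: int = 3):
    """interactions: users x items matrix of ratings (0 = unseen)."""
    norms = np.linalg.norm(interactions, axis=1, keepdims=True) + 1e-9
    normalized = interactions / norms
    similarity = normalized @ normalized[user]   # cosine similarity to every user
    similarity[user] = 0.0                       # ignore self-similarity
    scores = similarity @ interactions           # similarity-weighted item scores
    scores[interactions[user] > 0] = -np.inf     # don't re-recommend seen items
    return np.argsort(scores)[::-1][:k]

# Approach 2: ask an LLM to reason over the conceptual request directly.
# `call_llm` is a placeholder for whichever model API you actually use.
def recommend_llm(call_llm, request: str, k: int = 3) -> str:
    prompt = (
        f"Recommend {k} books for this request, with one sentence of "
        f"reasoning for each:\n{request}"
    )
    return call_llm(prompt)

# The first approach needs millions of rows before it beats chance; the
# second needs none, provided the model genuinely reasons about the
# conceptual space rather than parroting co-occurrence statistics.
```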
Now, on AGI. The Silicon Valley consensus, articulated by Sutskever, Altman, Musk, and others, is that we’re on a clear path to artificial general intelligence in the next few years, possibly by 2027 or 2028. The argument goes: scaling laws continue to hold, we’re seeing emergent capabilities at each scale jump, and there’s no obvious wall before we reach human-level performance across all cognitive domains. I remain unconvinced. Not because I think AGI is impossible, but because the path from “really good at pattern completion and probabilistic next-token prediction” to “general reasoning and planning capabilities” seems less straightforward than the AI CEOs suggest. Current LLMs still fail in characteristic ways on tasks requiring actual causal reasoning, spatial reasoning, or planning over extended horizons. They’re getting better, but the improvement curve on these specific capabilities looks different from the improvement curve on language modeling perplexity. That suggests to me that we might need architectural innovations beyond just scaling, and those are harder to predict.
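For readers who want the concrete form behind “scaling laws continue to hold”: the scaling-law literature fits pretraining loss as a smooth power law in parameters and data, which is why the perplexity curve looks so predictable. The functional form below is the Chinchilla-style fit from Hoffmann et al. (2022), quoted here for illustration rather than taken from Evans’ presentation.

```latex
% Empirical pretraining-loss fit (Hoffmann et al., 2022), with
% N = parameter count, D = training tokens, and E, A, B, \alpha, \beta
% fitted constants:
\[
  L(N, D) \;=\; E \;+\; \frac{A}{N^{\alpha}} \;+\; \frac{B}{D^{\beta}} .
\]
```

The claim in the paragraph above is that loss tracks this curve smoothly, while causal reasoning, spatial reasoning, and long-horizon planning do not obviously do so.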
But let’s say I’m wrong. Let’s say AGI arrives by 2028. Even then, I find it hard to see why the economic benefits would flow disproportionately to the companies that control the models. Here’s why: we already have multiple competing frontier models (ChatGPT, Claude, Gemini, Microsoft’s offerings, and now DeepSeek). If AGI arrives, it likely arrives for multiple players at roughly the same time, given how quickly capabilities diffuse in this space. Multiple competing AGIs means price competition. Price competition in a product with near-zero marginal cost means prices collapse toward marginal cost. Where does economic value flow in that scenario? It flows to the users of AI, not the providers. Engineering firms using AGI for materials development capture value through better materials. Pharmaceutical companies using AGI for drug discovery capture value through better drugs. Retailers using AGI for inventory management capture value through better margins. The AGI providers compete with each other to offer the capability at the lowest price. This is basic microeconomics. You capture value when you have market power, either through monopoly, through differentiation, or through control of a scarce input. If models are commodities or near-commodities, model providers have none of these.
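To spell that microeconomics out in one line, here is the textbook Bertrand-competition result (my illustration, not something from Evans’ slides): with n providers selling an effectively undifferentiated capability at the same constant marginal cost c, any price above c gets undercut, so in equilibrium

```latex
\[
  p_1^{\ast} = p_2^{\ast} = \cdots = p_n^{\ast} = c ,
  \qquad
  \pi_i^{\ast} = (p_i^{\ast} - c)\, q_i = 0 .
\]
```

With marginal cost per query near zero, price and per-provider profit collapse together and the surplus lands with the buyers, which is exactly the point above.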
The counterargument is that one provider achieves escape velocity and reaches AGI first with enough of a lead that they establish dominance before others catch up. This is the OpenAI/Microsoft theory of the case. Maybe. But the evidence so far suggests capability leads are measured in months, not years. GPT-4 launched in March 2023 with a substantial lead. Within six months, Claude 2 was comparable. Within a year, multiple models clustered around similar capability. The diffusion is fast. Another counterargument is vertical integration. Maybe the hyperscalers that control cloud infrastructure plus model development plus customer relationships plus application distribution can capture value even if models themselves commoditize. This is more plausible, essentially the AWS playbook. Amazon didn’t make money by having the best database. They made money by owning the infrastructure, the customer relationships, and the entire stack from hardware to application platform. Microsoft is clearly pursuing this strategy with Azure plus OpenAI plus Copilot plus Office integration. Google has Search plus Cloud plus Gemini plus Workspace. This could work, but it’s a different thesis than “we have the best model.” It’s “we control the distribution and can bundle.”
Evans shows a scatter plot (Slide 34) of model benchmark scores from standard evaluations like MMLU and HumanEval. Leaders change weekly. The gaps are small. Meanwhile, consumer awareness doesn’t track model quality. ChatGPT dominates with over 700 million weekly active users not because it has the best model anymore, but because it got there first and built brand. If models are commodities, value moves up the stack to product design, distribution, vertical integration, and customer relationships. This is exactly what happened with databases. Oracle didn’t win because they had the best database engine. They won through enterprise sales, support contracts, and ecosystem lock-in. Microsoft didn’t beat them with a better database. They won by bundling SQL Server with Windows Server and offering acceptable performance at a lower price. The SaaS pattern suggests something similar happens here. The model becomes an input. The applications built on top, the customer relationships, the distribution, those become the valuable assets.

Why do I think this pattern applies rather than, say, the search pattern where Google maintained dominance despite no fundamental technical moat? Two reasons: (1) Search had massive data network effects. Every search improved the algorithm, and Google’s scale meant they improved faster. LLMs have weaker data network effects because the pretraining data is largely static and publicly available, and fine-tuning data requirements are smaller. (2) Search had winner-take-all dynamics through defaults and single-answer demand. You pick one search engine and use it for everything. AI applications look more diverse. You might use different models for different tasks, or your applications might switch between models transparently based on price and performance. The switching costs are lower.
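One concrete reason switching costs stay low: the model can sit behind a thin routing layer, so swapping vendors is a configuration change rather than an application rewrite. Here is a minimal sketch in Python, assuming hypothetical provider adapters that share a common call(prompt) interface and made-up price and quality numbers; none of this comes from Evans’ presentation.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Provider:
    name: str
    price_per_1k_tokens: float   # illustrative numbers only
    quality_score: float         # e.g. an internal eval score in [0, 1]
    call: Callable[[str], str]   # adapter wrapping the vendor's API

def route(providers: List[Provider], prompt: str,
          min_quality: float = 0.8) -> str:
    """Send the prompt to the cheapest provider that clears the quality bar.

    Because every vendor hides behind the same call(prompt) interface,
    switching is an edit to this list, not an application rewrite,
    which is why model-level lock-in is weak.
    """
    eligible = [p for p in providers if p.quality_score >= min_quality]
    if not eligible:
        raise RuntimeError("no provider meets the quality bar")
    cheapest = min(eligible, key=lambda p: p.price_per_1k_tokens)
    return cheapest.call(prompt)
```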
So where does this leave us? The technology exists and the underlying capabilities are real. But I think the current evidence points toward a world where value flows to applications and customer relationships, and where the $400 billion the hyperscalers are spending buys them competitive positioning rather than monopoly. The integrators are making money now by helping enterprises navigate uncertainty. Some of that will produce real productivity gains. Much of it is expensive signaling and competitive positioning. The startups unbundling existing software will see mixed results; the ones that succeed will do so by owning distribution or solving really specific problems where switching costs are high, not by having better access to AI. The biggest uncertainty is whether the hyperscalers can use vertical integration to capture value anyway, or whether the applications layer fragments and value flows to thousands of specialized companies. That depends less on AI capabilities and more on competitive dynamics, regulation, and whether enterprises prefer integrated platforms or best-of-breed solutions. My guess is we end up somewhere in between. The hyperscalers maintain strong positions through bundling and infrastructure control. A long tail of specialized applications captures value in specific verticals. The model providers themselves, unless they’re also infrastructure providers, struggle to capture value proportional to the capability they’re creating. But I’m genuinely uncertain, and that uncertainty is where the interesting bets are.
What makes Evans’ presentation valuable is precisely what frustrated me about it initially: his refusal to collapse uncertainty prematurely. I’ve spent this entire post arguing for a specific view of how value will flow in AI markets, but Evans is right that we’re pattern-matching from incomplete data. Every previous platform shift looked obvious in retrospect and uncertain in real time. The PC revolution, the internet boom, mobile: they all had credible skeptics who turned out to be wrong and credible bulls who were right for the wrong reasons. Evans’ discipline in laying out the full range of possibilities, from commodity to monopoly to something entirely new, is the intellectually honest position. I’ve made specific bets here because that’s useful for readers trying to navigate the space, but I’m more confident in my framework than in my conclusions. His presentation remains the best map of the territory. Go watch it, even if you end up disagreeing with how much certainty is warranted.
Original presentation linked in this post’s title.