Is AI Really Eating the World?

In August 2011, Marc Andreessen wrote “Why Software Is Eating the World”, an essay about how software was transforming industries, disrupting traditional businesses, and revolutionizing the global economy. Recently, Benedict Evans, a former a16z partner, gave a presentation on generative AI three years after ChatGPT’s launch. His argument in short:

we know this matters, but we don’t know how.

In this article I will try to explain why I find his framing fascinating but incomplete. Evans structures technology history in cycles. Every 10-15 years, the industry reorganizes around a new platform: mainframes (1960s-70s), PCs (1980s), web (1990s), smartphones (2000s-2010s). Each shift pulls all innovation, investment, and company creation into its orbit. Generative AI appears to be the next platform shift, or it could break the cycle entirely. The range of outcomes spans from “just more software” to a single unified intelligence that handles everything. The pattern recognition is smart, but I think the current evidence points more clearly toward commoditization than Evans suggests, with value flowing up the stack rather than to model providers.

The hyperscalers are spending historic amounts. In 2025, Microsoft, Google, Amazon, and Meta will invest roughly $400 billion in AI infrastructure, more than global telecommunications capex. Microsoft now spends over 30% of revenue on capex, double what Verizon spends. What has this produced? Models that are simultaneously more capable and less defensible. When ChatGPT launched in November 2022, OpenAI had a massive quality advantage. Today, dozens of models cluster around similar performance. DeepSeek proved that anyone with $500 million can build a frontier model. Costs have collapsed. OpenAI’s API pricing has dropped by 97% since GPT-3’s launch, and every year brings an order of magnitude decline in the price of a given output.

Now, $500 million is still an enormous barrier. Only a few dozen entities globally can deploy that capital with acceptable risk. GPT-4’s performance on complex reasoning tasks, Claude’s extended context windows of up to 200,000 tokens, Gemini’s multimodal capabilities, these represent genuine breakthroughs. But the economic moat isn’t obvious to me (yet).

Evans uses an extended metaphor: automation that works disappears. In the 1950s, automatic elevators were AI. Today they’re just elevators. As Larry Tesler observed around 1970,

AI is whatever machines can’t do yet. Once it works, it’s just software.

The question: will LLMs follow this pattern, or is this different?

Current deployment shows clear winners but also real constraints. Software development has seen massive adoption, with GitHub reporting that 92% of developers now use AI coding tools. Marketing has found immediate uses generating ad assets at scale. Customer support has attracted investment, though with the caveat that LLMs produce plausible answers, not necessarily correct ones. Beyond these areas, adoption looks scattered. Deloitte surveys from June 2025 show that roughly 20% of U.S. consumers use generative AI chatbots daily, with another 34% using them weekly or monthly. Enterprise deployment is further behind. McKinsey data shows most AI “agents” remain in pilot or experimental stages. A quarter of CIOs have launched something. Forty percent don’t expect production deployment until 2026 or later.

But here, I think, is where Evans’ “we don’t know” approach misses something important. Consulting firms are booking billions in AI contracts right now. Accenture alone expects $3 billion in GenAI bookings for fiscal 2025. The revenue isn’t coming from the models. It’s coming from integration projects, change management, and process redesign. The pitch is simple: your competitors are moving on this, so you can’t afford to wait. If your competitors are investing and you’re not, you risk being left behind. If everyone invests and AI delivers modest gains, you’ve maintained relative position. If everyone invests and AI delivers nothing, you’ve wasted money but haven’t lost competitive ground. Evans notes that cloud adoption took 20 years to reach 30% of enterprise workloads and is still growing. New technology always takes longer than advocates expect. His most useful analogy is spreadsheets. VisiCalc in the late 1970s transformed accounting. If you were an accountant, you had to have it. If you were a lawyer, you thought “that’s nice for my accountant.” ChatGPT today has the same dynamic. Certain people with certain jobs find it immediately essential. Everyone else sees a demo and doesn’t know what to do with the blank prompt. This is right, and it suggests we’re early. But it doesn’t tell us where value will accumulate.

The standard pattern for deploying technology goes in stages: (1) Absorb it (make it a feature, automate obvious tasks). (2) Innovate (create new products, unbundle incumbents). (3) Disrupt (redefine what the market is). We’re mostly in stage one. Stage two is happening in pockets. Y Combinator’s recent batches are overwhelmingly AI-focused, betting on thousands of new companies unbundling existing software (startups are attacking specific enterprise problems like converting COBOL to Java or reconfiguring telco billing systems). Stage three remains speculative. From an economic perspective, there’s the automation question: do you do the same work with fewer people, or more work with the same people? Companies whose competitive advantage was “we can afford to hire enough people to do this” face real pressure. Companies whose advantage was unique data, customer relationships, or distribution may get stronger. This is standard economic analysis of labor-augmenting technical change, and it probably holds here too.
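
The textbook object I have in mind is labor-augmenting (Harrod-neutral) technical change; here is a minimal sketch in standard notation (my shorthand, not a claim about any specific study of AI):

```latex
% Labor-augmenting technical change: output Y_t depends on capital K_t and
% "effective labor" A_t L_t, where A_t is labor productivity (here, AI tools).
Y_t = F\left(K_t,\; A_t L_t\right)
% Holding output Y_t fixed, a higher A_t means the same work gets done with
% fewer people; holding employment L_t fixed, it means more work gets done
% by the same people. Which margin firms choose is the automation question.
```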

All current recommendation systems work by capturing and analyzing user behavior at scale. Netflix needs millions of users watching millions of hours to train its recommendation algorithm. Amazon needs billions of purchases. The network effect comes from data scale. What if LLMs can bypass this? What if an LLM can provide useful recommendations by reasoning about conceptual relationships rather than requiring massive behavioral datasets? If I ask for “books like Pirsig’s Zen and the Art of Motorcycle Maintenance but more focused on Eastern philosophy,” a sufficiently capable LLM might answer well without needing to observe 100 million readers. It understands (or appears to understand) the conceptual space. I’m uncertain whether LLMs can do this reliably by the end of 2025. The fundamental question is whether they reason or pattern-match at a very sophisticated level. Recent research suggests LLMs may rely more on statistical correlations than true reasoning. If it’s mostly pattern-matching, they still need the massive datasets and we’re back to conventional network effects. If they can actually reason over conceptual spaces, that’s different. That would unbundle data network effects from recommendation quality. Recommendation quality would depend on model capability, not data scale. And if model capability is commoditizing, then the value in recommendations flows to whoever owns customer relationships and distribution, not to whoever has the most data or the best model. I lean toward thinking LLMs are sophisticated pattern-matchers rather than reasoners, which means traditional network effects still apply. But this is one area where I’m genuinely waiting to see more evidence.
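
To make the contrast concrete, here is a purely illustrative sketch; the toy interaction matrix, the `item_similarity` helper, and the `call_llm` placeholder are my own assumptions, not any production system:

```python
# Illustrative sketch: a data-hungry behavioral recommender vs. a prompt-based
# "conceptual" recommendation. All names and data here are hypothetical.
import numpy as np

# --- Behavioral approach: needs a (users x items) interaction matrix. ---
# Real systems need millions of rows for this to be useful; 4 users is a toy.
interactions = np.array([
    [1, 1, 0, 0],   # user 0 interacted with items 0 and 1
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 1, 1, 0],
])

def item_similarity(matrix: np.ndarray) -> np.ndarray:
    """Item-item cosine similarity derived purely from observed behavior."""
    norms = np.linalg.norm(matrix, axis=0, keepdims=True)
    normalized = matrix / np.maximum(norms, 1e-9)
    return normalized.T @ normalized

print(item_similarity(interactions))  # quality scales with data volume

# --- Conceptual approach: no behavioral data, just a reasoning request. ---
# `call_llm` is a stand-in for whatever model API you use.
def recommend_conceptually(call_llm, query: str) -> str:
    prompt = (
        "Recommend three books similar to the following request, "
        "explaining the conceptual connection for each:\n" + query
    )
    return call_llm(prompt)
```

The first approach gets better only as the interaction matrix grows; the second is only as good as the model’s grasp of the conceptual space, which is exactly the open question above.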

Now, on AGI. The Silicon Valley consensus, articulated by Sutskever, Altman, Musk, and others, is that we’re on a clear path to artificial general intelligence in the next few years, possibly by 2027 or 2028. The argument goes: scaling laws continue to hold, we’re seeing emergent capabilities at each scale jump, and there’s no obvious wall before we reach human-level performance across all cognitive domains. I remain unconvinced. Not because I think AGI is impossible, but because the path from “really good at pattern completion and probabilistic next-token prediction” to “general reasoning and planning capabilities” seems less straightforward than the AI CEOs suggest. Current LLMs still fail in characteristic ways on tasks requiring actual causal reasoning, spatial reasoning, or planning over extended horizons. They’re getting better, but the improvement curve on these specific capabilities looks different from the improvement curve on language modeling perplexity. That suggests to me that we might need architectural innovations beyond just scaling, and those are harder to predict.

But let’s say I’m wrong. Let’s say AGI arrives by 2028. Even then, I find it hard to model why this would be tremendously economically beneficial specifically to the companies that control the models. Here’s why: we already have multiple competing frontier models (ChatGPT, Claude, Gemini, Microsoft’s offerings, and now DeepSeek). If AGI arrives, it likely arrives for multiple players at roughly the same time, given how quickly capabilities diffuse in this space. Multiple competing AGIs means price competition. Price competition in a product with near-zero marginal cost means prices collapse toward marginal cost. Where does economic value flow in that scenario? It flows to the users of AI, not the providers. Engineering firms using AGI for materials development capture value through better materials. Pharmaceutical companies using AGI for drug discovery capture value through better drugs. Retailers using AGI for inventory management capture value through better margins. The AGI providers compete with each other to offer the capability at the lowest price. This is basic microeconomics. You capture value when you have market power, either through monopoly, through differentiation, or through control of a scarce input. If models are commodities or near-commodities, model providers have none of these.

The counterargument is that one provider achieves escape velocity and reaches AGI first with enough of a lead that they establish dominance before others catch up. This is the OpenAI/Microsoft theory of the case. Maybe. But the evidence so far suggests capability leads are measured in months, not years. GPT-4 launched in March 2023 with a substantial lead. Within six months, Claude 2 was comparable. Within a year, multiple models clustered around similar capability. The diffusion is fast. Another counterargument is vertical integration. Maybe the hyperscalers that control cloud infrastructure plus model development plus customer relationships plus application distribution can capture value even if models themselves commoditize. This is more plausible, essentially the AWS playbook. Amazon didn’t make money by having the best database. They made money by owning the infrastructure, the customer relationships, and the entire stack from hardware to application platform. Microsoft is clearly pursuing this strategy with Azure plus OpenAI plus Copilot plus Office integration. Google has Search plus Cloud plus Gemini plus Workspace. This could work, but it’s a different thesis than “we have the best model.” It’s “we control the distribution and can bundle.”

Evans shows a scatter plot (Slide 34) of model benchmark scores from standard evaluations like MMLU and HumanEval. Leaders change weekly. The gaps are small. Meanwhile, consumer awareness doesn’t track model quality. ChatGPT dominates with over 700 million weekly active users not because it has the best model anymore, but because it got there first and built brand. If models are commodities, value moves up the stack to product design, distribution, vertical integration, and customer relationships. This is exactly what happened with databases. Oracle didn’t win because they had the best database engine. They won through enterprise sales, support contracts, and ecosystem lock-in. Microsoft didn’t beat them with a better database. They won by bundling SQL Server with Windows Server and offering acceptable performance at a lower price. The SaaS pattern suggests something similar happens here. The model becomes an input. The applications built on top, the customer relationships, the distribution, those become the valuable assets. Why do I think this pattern applies rather than, say, the search pattern where Google maintained dominance despite no fundamental technical moat? Two reasons: (1) Search had massive data network effects. Every search improved the algorithm, and Google’s scale meant they improved faster. LLMs have weaker data network effects because the pretraining data is largely static and publicly available, and fine-tuning data requirements are smaller. (2) Search had winner-take-all dynamics through defaults and single-answer demand. You pick one search engine and use it for everything. AI applications look more diverse. You might use different models for different tasks, or your applications might switch between models transparently based on price and performance. The switching costs are lower.

So where does this leave us? The technology exists and the underlying capabilities are real. But I think the current evidence points toward a world where value flows to applications and customer relationships, and where the $400 billion the hyperscalers are spending buys them competitive positioning rather than monopoly. The integrators are making money now by helping enterprises navigate uncertainty. Some of that will produce real productivity gains. Much of it is expensive signaling and competitive positioning. The startups unbundling existing software will see mixed results; the ones that succeed will do so by owning distribution or solving really specific problems where switching costs are high, not by having better access to AI. The biggest uncertainty is whether the hyperscalers can use vertical integration to capture value anyway, or whether the applications layer fragments and value flows to thousands of specialized companies. That depends less on AI capabilities and more on competitive dynamics, regulation, and whether enterprises prefer integrated platforms or best-of-breed solutions. My guess is we end up somewhere in between. The hyperscalers maintain strong positions through bundling and infrastructure control. A long tail of specialized applications captures value in specific verticals. The model providers themselves, unless they’re also infrastructure providers, struggle to capture value proportional to the capability they’re creating. But I’m genuinely uncertain, and that uncertainty is where the interesting bets are.

What makes Evans’ presentation valuable is precisely what frustrated me about it initially: his refusal to collapse uncertainty prematurely. I’ve spent this entire post arguing for a specific view of how value will flow in AI markets, but Evans is right that we’re pattern-matching from incomplete data. Every previous platform shift looked obvious in retrospect and uncertain in real time. The PC revolution, the internet boom, mobile, they all had credible skeptics who turned out wrong and credible bulls who were right for the wrong reasons. Evans’ discipline in laying out the full range of possibilities, from commodity to monopoly to something entirely new, is the intellectually honest position. I’ve made specific bets here because that’s useful for readers trying to navigate the space, but I’m more confident in my framework than in my conclusions. His presentation remains the best map of the territory. Go watch it, even if you end up disagreeing with how much certainty is warranted.

Original presentation linked in this post’s title.

Weather Forecasts Have Improved a Lot

Reading the press release for Google DeepMind’s WeatherNext 2, I wondered: have weather forecasts actually improved in recent decades?

Turns out they have, dramatically. A four-day forecast today matches the accuracy of a one-day forecast from 30 years ago. Hurricane track errors that once exceeded 400 nautical miles for 72-hour forecasts now sit below 80 miles. The European Centre for Medium-Range Weather Forecasts reports three-day forecasts now reach 97% accuracy, with seven-day forecasts approaching that threshold.

Google’s new model accelerates this trend. The hurricane model performed remarkably well this season when its predicted tracks were compared against actual storm paths. WeatherNext 2 generates forecasts 8 times faster than its predecessor with resolution down to one hour. Each prediction takes under a minute on a single TPU compared to hours on a supercomputer using physics-based models. The speed comes from a smarter training approach. WeatherNext 2 (along with NeuralGCM) uses a continuous ranked probability score (CRPS) objective rather than the L2 losses common in earlier neural weather models. The method adds random noise to parameters and trains the model to minimize L1 loss while maximizing differences between ensemble members with different noise initializations.

This matters because L2 losses blur predictions when models roll out autoregressively over multiple time steps. Spatial features degrade and the model truncates extremes. Models trained with L2 losses struggle to forecast high-impact extreme weather at moderate lead times. The CRPS objective preserves the sharp spatial features and extreme values needed for cyclone tracking and heat wave prediction. The longer-term improvements stem from better satellite and ground station data, faster computers running higher-resolution models, and improved communication through apps and online services. AI systems like WeatherNext 2 and Pangu-Weather (which produces forecasts up to 10,000 times faster than traditional methods) are accelerating progress that has been building for decades.
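
To make the training objective concrete, here is a rough sketch of my own of what an ensemble CRPS loss looks like (a simplification, not DeepMind’s actual code): you penalize absolute error against the observation while rewarding spread between ensemble members.

```python
# Simplified ensemble CRPS estimator (not DeepMind's implementation).
# For ensemble members x_1..x_m and observation y:
#   CRPS ~= mean_i |x_i - y|  -  0.5 * mean_{i,j} |x_i - x_j|
# The first term is an L1 fit to the truth; the second rewards ensemble
# diversity, which counteracts the blurring a plain mean forecast produces.
import numpy as np

def ensemble_crps(members: np.ndarray, observation: np.ndarray) -> float:
    """members: (m, ...) ensemble forecasts; observation: (...) ground truth."""
    skill = np.mean(np.abs(members - observation))                  # fit term
    spread = np.mean(np.abs(members[:, None] - members[None, :]))   # spread term
    return float(skill - 0.5 * spread)

# Toy check: two ensemble members forecasting a single grid point.
members = np.array([2.0, 4.0])
print(ensemble_crps(members, np.array(3.0)))  # 1.0 - 0.5 * 1.0 = 0.5
```

Without the spread term you would effectively be fitting the ensemble mean, which is exactly the blurring problem described above.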

GLP-1 Receptor Agonists in ASUD Treatment

Alcohol and other substance use disorders (ASUDs) are complex, multifaceted, but treatable medical conditions with widespread medical, psychological, and societal consequences. However, treatment options remain limited, therefore the discovery and development of new treatments for ASUDs is critical. Glucagon-like peptide-1 receptor agonists (GLP-1RAs), currently approved for the treatment of type 2 diabetes mellitus, obesity, and obstructive sleep apnea, have recently emerged as potential new pharmacotherapies for ASUDs.

This development matters most for people struggling with substance use disorders who have few effective treatment options. It also matters for manufacturers like Novo Nordisk facing patent expiration pressures on Ozempic. The research into GLP-1RAs for addiction treatment is early but notable given the limited pharmacotherapy options currently available for ASUDs. In February 2025, researchers at UNC published results from the first randomized controlled trial of semaglutide in alcohol use disorder (AUD). The phase 2 trial enrolled 48 non-treatment-seeking adults with AUD and administered low-dose semaglutide (0.25 mg/week for 4 weeks, then 0.5 mg/week for 4 weeks; standard dosing for weight loss reaches 2.4 mg per week) over 9 weeks. Participants on semaglutide consumed less alcohol in controlled laboratory settings and reported fewer drinks per drinking day in their normal lives. They also reported less craving for alcohol. Heavy drinking episodes declined more sharply in the semaglutide group compared to placebo over the nine-week trial. Despite the low doses, effect sizes for some drinking outcomes exceeded those typically seen with naltrexone, one of the few FDA-approved medications for alcohol use disorder. While larger trials are needed to confirm these results, the early evidence suggests GLP-1RAs may offer a meaningful treatment option for a condition where new therapies have been approved at a rate of roughly one every 25 years.

Original paper linked in this post’s title.

Damodaran on Gold's 2025 Surge

Aswath Damodaran’s latest analysis of gold’s 2025 surge walks through gold’s contradictory nature as a collectible rather than an asset with cash flows, showing why it’s impossible to “value” gold in the traditional sense, yet entirely possible to understand what drives its pricing.

Even though gold is outperforming almost all other assets in my portfolio this year, I fundamentally don’t like holding it. I’m a Buffett disciple: gold is an unproductive asset that generates no earnings and pays no dividends.

But Damodaran’s framework helps explain why tolerating it anyway might be worth it. It’s less an investment than insurance against the tail risks of hyperinflation and catastrophic market dislocations, scenarios where correlations go to one and traditional diversification fails. The dissonance between what I believe intellectually (productive assets compound wealth) and what I’m actually doing (holding some gold anyway) probably says more about 2025’s macro uncertainty than any principled investment thesis.

Damodaran’s blog linked in this post’s title.

The Bicycle Needs Riding to be Understood

Some concepts are easy to grasp in the abstract. Boiling water: apply heat and wait. Others you really need to try. You only think you understand how a bicycle works, until you learn to ride one.

You should write an LLM agent—not because agents are revolutionary, but because the bicycle needs riding to be understood. Having built agents myself, I find Ptacek’s central insight resonates: the behavior surprises in specific ways, particularly around how models scale effort with complexity before inexplicably retreating.

Ptacek walks through building a functioning agent in roughly 50 lines of Python, demonstrating how an LLM with ping access autonomously chose multiple Google endpoints without explicit instruction, a moment that crystallizes both promise and unpredictability. His broader point matches my experience: context engineering isn’t mystical but straightforward programming—managing token budgets, orchestrating sub-agents, balancing explicit loops against emergent behavior. The open problems in agent design—titrating nondeterminism, connecting to ground truth, allocating tokens—remain remarkably accessible to individual experimentation, each iteration taking minutes rather than requiring institutional resources.
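
If you want a feel for what “roughly 50 lines” means, here is a stripped-down toy of my own, not Ptacek’s code: `call_llm` stands in for whatever model API you use, and the only tool is a shell runner you would want to sandbox in practice.

```python
# Toy agent loop (not Ptacek's implementation): the LLM either answers
# directly or asks to run a shell command, sees the output, and iterates.
import json
import subprocess

SYSTEM = (
    "You can either answer the user directly, or reply with JSON of the form "
    '{"tool": "shell", "command": "..."} to run a command and see its output.'
)

def run_agent(call_llm, task: str, max_steps: int = 10) -> str:
    """call_llm(messages) -> str is a placeholder for your model API."""
    messages = [{"role": "system", "content": SYSTEM},
                {"role": "user", "content": task}]
    for _ in range(max_steps):
        reply = call_llm(messages)
        messages.append({"role": "assistant", "content": reply})
        try:
            tool_call = json.loads(reply)
        except json.JSONDecodeError:
            return reply  # plain text means the agent considers itself done
        # DANGER: run tool calls in a sandbox; this toy version does not.
        result = subprocess.run(tool_call["command"], shell=True,
                                capture_output=True, text=True, timeout=30)
        messages.append({"role": "user",
                         "content": f"tool output:\n{result.stdout}{result.stderr}"})
    return "step limit reached"
```

The surprising behavior Ptacek describes lives almost entirely inside that loop: what the model decides to do with the tool, and when it decides it is done.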

Blog by Thomas Ptacek linked in this post’s title.

AI Models as Standalone P&Ls

Microsoft reported earnings for the quarter ended Sept. […] buried in its financial filings were a couple of passages suggesting that OpenAI suffered a net loss of $11.5 billion or more during the quarter.

For every dollar of revenue, they’re allegedly spending roughly $5 to deliver the product. What initially sounds like a joke about “making it up on volume” points to a more fundamental problem facing OpenAI and its competitors. AI companies are locked into continuously releasing more powerful (and expensive) models. If they stop, open-source alternatives will catch up and offer equivalent capabilities at substantially lower costs. This creates an uncomfortable dynamic. If your current model requires spending more than you earn just to fund the next generation, the path to profitability becomes unclear—perhaps impossible.

Anthropic CEO Dario Amodei (everybody’s favorite AI CEO) recently offered a different perspective in a conversation with Stripe co-founder John Collison. He argues that treating each model as an independent business unit reveals a different picture than conventional accounting suggests.

Let’s say in 2023, you train a model that costs $100 million, and then you deploy it in 2024 and it makes $200 million of revenue.

So far, this looks profitable, a solid 2x return on the training investment. But here’s where it gets complicated.

Meanwhile, because of the scaling laws, in 2024, you also train a model that costs $1 billion. If you look in a conventional way at the profit and loss of the company you’ve lost $100 million the first year, you’ve lost $800 million the second year, and you’ve lost $8 billion in the third year, so it looks like it’s getting worse and worse.

The pattern continues:

In 2025, you get $2 billion of revenue from that $1 billion model trained the previous year.

Again, viewed in isolation, this model returned 2x its training cost.

And you spend $10 billion to train the model for the following year.

The losses appear to accelerate dramatically, from $100 million to $800 million to $8 billion.

This is where Amodei’s reframing becomes interesting.

If you consider each model to be a company, the model that was trained in 2023 was profitable. You paid $100 million and then it made $200 million of revenue.

He also acknowledges there are inference costs (the actual computing expenses of running the model for users) but suggests these don’t fundamentally change the picture in his simplified example. His core argument:

If every model was a company, the model in this example is actually profitable. What’s going on is that at the same time as you’re reaping the benefits from one company, you’re founding another company that’s much more expensive and requires much more upfront R&D investment.

This is essentially an argument that AI companies are building a portfolio of profitable products, but the accounting makes it look terrible because each successive “product” costs 10x more than the last to develop. The losses stem from overlapping these profitable cycles while exponentially increasing investment scale. But this framework only works if two critical assumptions hold: (1) Each model consistently returns roughly 2x its training cost in revenue, and (2) The improvements from spending 10x more justify that investment—meaning customers will pay enough more for the better model to maintain that 2x return.
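
A quick sketch of the arithmetic, just to make the two views of the same numbers explicit (the figures are Amodei’s stylized ones; the script is only my bookkeeping):

```python
# Amodei's stylized example: each model costs 10x the last to train and
# returns 2x its training cost in revenue the following year.
train_cost = {2023: 0.1, 2024: 1.0, 2025: 10.0}   # $bn spent on training that year
revenue    = {2024: 0.2, 2025: 2.0}                # $bn earned by last year's model

# Conventional P&L: revenue earned in a year minus training spend that year.
for year in (2023, 2024, 2025):
    pnl = revenue.get(year, 0.0) - train_cost.get(year, 0.0)
    print(f"{year}: company-level P&L = {pnl:+.1f} $bn")
# -> -0.1, -0.8, -8.0: losses look like they are exploding.

# Per-model view: each model judged against its own training cost.
for trained_in in (2023, 2024):
    lifetime_return = revenue[trained_in + 1] - train_cost[trained_in]
    print(f"model trained in {trained_in}: lifetime return = {lifetime_return:+.1f} $bn")
# -> +0.1 and +1.0: each individual model is profitable on its own.
```

Both loops run over the same cash flows; only the grouping changes.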

Amodei outlines two ways this resolves:

So the way that it’s going to shake out is this will keep going up until the numbers go very large and the models can’t get larger, and, you know, then it’ll be a large, very profitable business.

In this first scenario, scaling hits physical or practical limits. You’ve maxed out available compute, data, or capability improvements. Training costs plateau because you literally can’t build a meaningfully larger model. At that point, companies stop needing exponentially larger investments and begin harvesting profits from their final-generation models. The second scenario is less optimistic:

Or at some point the models will stop getting better, right? The march to AGI will be halted for some reason.

If the improvements stop delivering proportional returns before reaching natural limits, companies face what Amodei calls overhang.

And then perhaps there’ll be some overhang, so there’ll be a one-time, ‘Oh man, we spent a lot of money and we didn’t get anything for it,’ and then the business returns to whatever scale it was at.

What Amodei’s framework doesn’t directly address is the open-source problem. If training Model C costs $10 billion but open-source alternatives reach comparable performance six months later, that 2x return window might not materialize. The entire argument depends on maintaining a significant capability lead that customers will pay premium prices for. There’s also the question of whether the 2x return assumption holds as models become more expensive. The jump from $100 million to $1 billion to $10 billion in training costs assumes that customers will consistently value the improvements enough to double revenue.

Working with Models

There was this “I work with Models” joke which I first heard years ago from an analyst working on a valuation model (see my previous post). I guess it has become more relevant than ever:

This monograph presents the core principles that have guided the development of diffusion models, tracing their origins and showing how diverse formulations arise from shared mathematical ideas. Diffusion modeling starts by defining a forward process that gradually corrupts data into noise, linking the data distribution to a simple prior through a continuum of intermediate distributions.
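
If the vocabulary is new, the forward process described in that passage is usually written as a chain of Gaussian corruptions; here is the standard DDPM-style statement (generic notation, not necessarily the monograph’s):

```latex
% Forward (noising) process: data x_0 is corrupted step by step toward noise.
q(x_t \mid x_{t-1}) = \mathcal{N}\left(x_t;\ \sqrt{1-\beta_t}\,x_{t-1},\ \beta_t I\right)
% Marginalizing gives a closed form linking x_t directly to the data x_0:
q(x_t \mid x_0) = \mathcal{N}\left(x_t;\ \sqrt{\bar{\alpha}_t}\,x_0,\ (1-\bar{\alpha}_t) I\right),
\qquad \bar{\alpha}_t = \prod_{s=1}^{t}(1-\beta_s)
```

The generative model is whatever you learn in order to run this corruption in reverse; the “diverse formulations” the quote alludes to are, broadly, different ways of parameterizing that reversal.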

If you want to get into this topic from the ground up, be sure to check out Stefano Ermon’s CS236 Deep Generative Models course. Lecture recordings of the full course can also be found on YouTube.

Original paper linked in this post’s title.

Pozsar's Bretton Woods III: Three Years Later [2/2]

Start by reading Pozsar’s Bretton Woods III: The Framework [1/2]

Now, what actually happened in the three years since Pozsar published this framework? (1) Dollar reserve diversification is happening, but gradually: Foreign central bank Treasury holdings declined from peaks exceeding $7.5 trillion to levels below $7 trillion. This represents steady diversification away from dollar-denominated assets, though not a dramatic collapse. (2) Gold has performed strongly: From roughly $1,900/oz when Pozsar published his dispatches to peaks above $4,000/oz today, gold has appreciated substantially, consistent with increased demand for “outside money.” (3) Alternative payment systems are developing: Various nations continue building infrastructure for non-dollar trade settlement. While these systems remain in preliminary stages rather than fully operational alternatives to SWIFT, development timelines could speed up following specific triggering events. (4) The dollar itself stayed strong longer than predicted: Perhaps surprisingly given predictions of dollar weakness, the dollar achieved its best performance against a basket of major currencies since 2015 in 2024. The DXY index (which tracks the dollar against major trading partners) then fell about 11% this year, marking the end of a decade-long rally. (5) Commodity collateral is increasingly important: Research on commodities as collateral shows that under capital controls and collateral constraints, investors import commodities and pledge them as collateral. Higher collateral demands increase commodity prices and affect the inventory-convenience yield relationship.

One of Pozsar’s more provocative arguments concerns China’s strategic options. With approximately $3 trillion in foreign exchange reserves heavily weighted toward dollars and Treasuries, China faces the same calculus as any holder of large dollar reserves: what is the risk these could be frozen? Pozsar outlined two theoretical paths for China: (1) Sell Treasuries to purchase commodities directly (especially discounted Russian commodities), thereby converting financial claims into physical resources. (2) Print renminbi to purchase commodities, creating a “eurorenminbi” market parallel to the eurodollar system.

The first option provides inflation control for China (securing physical resources) while potentially raising yields in Treasury markets. The second option represents a more fundamental challenge to dollar dominance, the birth of an alternative offshore currency market backed by commodity reserves rather than financial reserves. In practice, we’ve seen elements of both. China has increased commodity imports from Russia substantially. The internationalization of the renminbi has progressed, though more slowly than some expected, constrained by China’s capital controls and the relative underdevelopment of its financial markets compared to dollar markets.

Regardless of whether “Bretton Woods III” emerges exactly as described, several insights from Pozsar’s framework appear durable. (1) Central banks control the nominal domain, not the real domain: Monetary policy can influence demand, manage liquidity, and stabilize financial markets. It cannot conjure physical resources, build supply chains, or speed up energy transitions. (2) Physical infrastructure matters for financial markets: The number of VLCCs, the capacity of the Suez Canal, the efficiency of port facilities, these real-world constraints bind financial flows. Understanding the infrastructure underlying commodity movements provides insight into funding market dynamics. (3) Collateralization is changing: The trend toward commodity-backed finance, warehouse receipt systems, and physical collateral reflects both technological improvements (better monitoring and verification) and strategic shifts (diversification away from pure financial claims). As the FSB noted in 2023, banks play a vital role in the commodities ecosystem, providing not just credit but clearing services and intermediation between commodity firms and central counterparties. (4) Geopolitical risk affects monetary arrangements: The weaponization of reserve assets, however justified in specific circumstances, changes the risk calculation for all reserve holders. This doesn’t mean immediate de-dollarization, but it does mean persistent, gradual diversification.

So what can we take from this for today: (1) Funding market stresses may be more persistent: If commodity traders require more financing for longer durations due to less efficient trade routes, and if banks face balance sheet constraints from regulatory requirements or QT, term funding premia may remain elevated relative to overnight rates. The FRA-OIS spread, the spread between forward rate agreements and overnight indexed swaps, becomes a window into these dynamics. (2) Cross-currency basis swaps signal more than rate differentials: Persistent deviations from covered interest parity reflect structural factors: global trade reconfiguration, reserve diversification, and the changing geography of dollar funding demand. These aren’t temporary anomalies to be arbitraged away but potentially persistent features of the new system. (3) Commodity volatility has monetary policy implications that are difficult to manage: When commodity prices surge due to supply disruptions rather than demand strength, central banks face an ugly tradeoff: tighten policy to control inflation headlines while risking recession, or accommodate the price shock and accept higher inflation. Unlike demand-driven inflation, supply-driven commodity inflation doesn’t respond well to rate hikes. (4) Infrastructure bottlenecks matter: Just as G-SIB constraints around year-end affect money market functioning, shipping capacity constraints and logistical bottlenecks affect commodity prices and, through them, inflation. Monitoring the “real plumbing,” freight rates, port congestion, pipeline capacity, provides early warning signals for inflation pressures.

Perhaps the most valuable way to engage with Bretton Woods III is not as a prediction to be validated or refuted, but as a framework for thinking about the intersection of geopolitics, commodities, and money. It forces attention to questions that are easy to overlook: (a) How do physical constraints on commodity flows affect financial market plumbing? (b) What risks do reserve holders face that aren’t captured in traditional financial risk metrics? (c) Where do central bank powers end and other forms of power, military, diplomatic, infrastructural, begin? (d) How do the “real” and “nominal” domains interact during periods of stress?

The current environment shows elements consistent with the framework: gradual reserve diversification, persistent commodity volatility, funding market stresses related to term commodity financing, and increasing focus on supply chain resilience over pure efficiency. It also shows elements inconsistent with it: dollar strength, the slow pace of alternative systems, and the resilience of dollar-based financial infrastructure. What seems clear is that the assumptions underlying Bretton Woods II, that dollar reserves are nearly risk-free, that globalized supply chains should be optimized for cost above all else, that central banks can manage most monetary disturbances, are being questioned in ways they weren’t five years ago. Whether that questioning leads to a new monetary order or simply a modified version of the current one remains to be seen. But Pozsar’s framework provides a useful lens for watching the process unfold, connecting developments in commodity markets, funding markets, and geopolitical arrangements into a coherent story about how the global financial system actually works.

Pozsar’s full Money Notes series is available through his website, and Perry Mehrling’s course Economics of Money and Banking provides excellent background on the “money view” that underpins this analysis.

Pozsar's Bretton Woods III: The Framework [1/2]

In March 2022, as Western nations imposed unprecedented sanctions following Russia’s invasion of Ukraine, Zoltan Pozsar published a series of dispatches that would become some of the most discussed pieces in financial markets that year. The core thesis was stark: we were witnessing the birth of “Bretton Woods III,” a fundamental shift in how the global monetary system operates. Nearly three years later, with more data on de-dollarization trends, commodity market dynamics, and structural changes in global trade, it’s worth revisiting this framework.

I first heard of Pozsar at Credit Suisse during the 2019 repo market disruptions and the March 2020 funding crisis, when his framework explained market dynamics in a way I had never seen before. Before joining Credit Suisse as a short-term rate strategist, Pozsar spent years at the Federal Reserve (where he created the map of the shadow banking system, which prompted the G20 to initiate regulatory measures in this area) and the U.S. Treasury. His work focuses on what he calls the “plumbing” of financial markets, the often-overlooked mechanisms through which money actually flows through the system. His intellectual approach draws heavily from Perry Mehrling’s “money view,” which treats money as having four distinct prices rather than being a simple unit of account.

Pozsar’s Bretton Woods III framework rests on a straightforward distinction. “Inside money” refers to claims on institutions: Treasury securities, bank deposits, central bank reserves. “Outside money” refers to commodities such as gold, oil, wheat, and metals, which have intrinsic value independent of any institution’s promise.

Bretton Woods I (1944-1971) was backed by gold, outside money. The U.S. dollar was convertible to gold at a fixed rate, and other currencies were pegged to the dollar. When this system collapsed in 1971, Bretton Woods II emerged: a system where dollars were backed by U.S. Treasury securities, inside money. Countries accumulated dollar reserves, primarily in the form of Treasuries, to support their currencies and facilitate international trade.

Pozsar’s argument: the moment Western nations froze Russian foreign exchange reserves, the assumed risk-free nature of these dollar holdings changed fundamentally. What had been viewed as having negligible credit risk suddenly carried confiscation risk. For any country potentially facing future sanctions, the calculus of holding large dollar reserve positions shifted. Hence Bretton Woods III: a system where countries increasingly prefer holding reserves in the form of commodities and gold, outside money that cannot be frozen by another government’s decision.

To understand Pozsar’s analysis, we need to understand his analytical framework. Perry Mehrling teaches that money has four prices: (1) Par: The one-for-one exchangeability of different types of money. Your bank deposit should convert to cash at par. Money market fund shares should trade at $1. When par breaks, as it did in 2008 when money market funds “broke the buck,” the payments system itself is threatened. (2) Interest: The price of future money versus money today. This is the domain of overnight rates, term funding rates, and the various “bases” (spreads) between different funding markets. When covered interest parity breaks down and cross-currency basis swaps widen, it signals stress in the ability to transform one currency into another over time. (3) Exchange rate: The price of foreign money. How many yen or euros does a dollar buy? Fixed exchange rate regimes can collapse when countries lack sufficient reserves, as happened across Southeast Asia in 1997. (4) Price level: The price of commodities in terms of money. How much does oil, wheat, or copper cost? This determines not just headline inflation but feeds through into the price of virtually everything in the economy.

Central banks have powerful tools for managing the first three prices. They can provide liquidity to preserve par, influence interest rates through policy, and intervene in foreign exchange markets. But the fourth price, the price level, particularly when driven by commodity supply shocks, is far harder to control. As Pozsar puts it: “You can print money, but not oil to heat or wheat to eat.”

Pozsar’s contribution was to extend Mehrling’s framework into what he calls the “real domain,” the physical infrastructure underlying commodity flows. For each of the three non-commodity prices of money, there’s a parallel in commodity markets: (1) Foreign exchange ↔ Foreign cargo: Just as you exchange currencies, you exchange dollars for foreign-sourced commodities. (2) Interest (time value of money) ↔ Shipping: Just as lending has a time dimension, moving commodities from port A to port B takes time and requires financing. (3) Par (stability) ↔ Protection: Just as central banks protect the convertibility of different money forms, military and diplomatic power protects commodity shipping routes.

This mapping reveals something important: commodity markets have their own “plumbing” that runs parallel to financial plumbing. And when this real infrastructure gets disrupted, it creates stresses that monetary policy alone cannot resolve.

One of the most concrete examples in Pozsar’s March 2022 dispatches illustrates this intersection between finance and physical reality. Consider what happens when Russian oil exports to Europe are disrupted and must be rerouted to Asia. Previously, Russian oil traveled roughly 1-2 weeks from Baltic ports to European refineries on Aframax carriers (ships carrying about 600,000 barrels). The financing required was relatively short-term, a week or two. Post-sanctions, the same oil must travel to Asian buyers. But the Baltic ports can’t accommodate Very Large Crude Carriers (VLCCs), which carry 2 million barrels. So the oil must first be loaded onto Aframax vessels, sailed to a transfer point, transferred ship-to-ship to VLCCs, then shipped to Asia, a journey of roughly four months.

The same volume of oil, moved the same distance globally, now requires: (a) More ships (Aframax vessels for initial transport plus VLCCs for long-haul). (b) More time (4 months instead of 1-2 weeks). (c) More financing (commodity traders must borrow for much longer terms). (d) More capital tied up by banks (longer-duration loans against volatile commodities).

Pozsar estimated this rerouting alone would encumber approximately 80 VLCCs, roughly 10% of global VLCC capacity, in permanent use. The financial implication: banks’ liquidity coverage ratio (LCR) needs increase because they’re extending more term credit to finance these longer shipping durations. When commodity trading requires more financing for longer durations, it competes with other demands for bank balance sheet. If this happens simultaneously with quantitative tightening (QT), when the central bank is draining reserves from the system, funding stresses become more likely. As Pozsar noted: “In 2019, o/n repo rates popped because banks got to LCR and they stopped lending reserves. In 2022, term credit to commodity traders may dry up because QT will soon begin in an environment where banks’ LCR needs are going up, not down.”
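
Back-of-the-envelope, with illustrative numbers of my own (not Pozsar’s published figures), the balance sheet effect comes from the duration extension: the same daily export volume times a much longer financing period means far more credit outstanding at any point in time.

```python
# Illustrative only: rough credit-outstanding arithmetic for rerouted oil.
# All inputs are assumptions for the sketch, not Pozsar's figures.
barrels_per_day = 1_500_000        # assumed rerouted seaborne export volume
oil_price = 80                     # assumed $/bbl

def credit_outstanding(financing_days: float) -> float:
    """$ of trade credit tied up at any moment = daily flow * price * duration."""
    return barrels_per_day * oil_price * financing_days

short_route = credit_outstanding(10)    # ~1-2 week Baltic-to-Europe voyage
long_route = credit_outstanding(120)    # ~4 month multi-leg voyage to Asia

print(f"short route: ${short_route / 1e9:.1f}bn outstanding")   # ~$1.2bn
print(f"long route:  ${long_route / 1e9:.1f}bn outstanding")    # ~$14.4bn
# Same barrels, same price: the longer duration alone multiplies the financing
# need roughly 12x, which is the pressure on bank balance sheets and LCRs.
```

Scale the assumed volumes however you like; the point is the multiplier on duration, not the particular dollar figure.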

One aspect of the framework that deserves more attention relates to dollar funding for non-U.S. banks. According to recent Dallas Fed research, banks headquartered outside the United States hold approximately $16 trillion in U.S. dollar assets, comparable in magnitude to the $22 trillion held by U.S.-based institutions. The critical difference: U.S. banks have access to the Federal Reserve’s emergency liquidity facilities during periods of stress. Foreign banks do not have a U.S. dollar lender of last resort. During the COVID-19 crisis, the Fed expanded dollar swap lines to foreign central banks precisely to address this vulnerability, about $450 billion, roughly one-sixth of the Fed’s balance sheet expansion in early 2020. The structural dependency on dollar funding creates ongoing vulnerabilities. When dollars become scarce globally, whether due to Fed policy tightening, shifts in risk sentiment, or disruptions in commodity financing, foreign banks face balance sheet pressures that can amplify stress. The covered interest parity violations that Pozsar frequently discusses reflect these frictions: direct dollar borrowing and synthetic dollar borrowing through FX swaps theoretically should cost the same, but in practice, significant basis spreads persist.
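
For reference, the relationship being violated is covered interest parity; here is a minimal statement in my own generic notation (not Pozsar’s):

```latex
% Covered interest parity (CIP): with S and F the spot and forward dollar
% price of one unit of foreign currency, and r_$ and r_f the dollar and
% foreign interest rates, funding dollars directly or synthetically
% (borrow foreign currency and swap it into dollars via FX forwards)
% should cost the same:
S\,(1 + r_{\$}) = F\,(1 + r_{f})
% The cross-currency basis is the gap between the synthetic and direct
% dollar funding cost, which CIP says should be zero:
\text{basis} \;\approx\; \left[\tfrac{F}{S}\,(1 + r_{f}) - 1\right] - r_{\$}
```

When that gap is persistently non-zero, the balance sheet needed to close it is scarce, which is exactly the friction the Dallas Fed numbers above point at.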

Continue reading Pozsar’s Bretton Woods III: Three Years Later [2/2]

Everything is a DCF Model

A brilliant piece of writing from Michael Mauboussin and Dan Callahan at Morgan Stanley that shaped what I personally believe about valuation.

[…] we want to suggest the mantra “everything is a DCF model.” The point is that whenever investors value a stake in a cash-generating asset, they should recognize that they are using a discounted cash flow (DCF) model. […] The value of those businesses is the present value of the cash they can distribute to their owners. This suggests a mindset that is very different from that of a speculator, who buys a stock in anticipation that it will go up without reference to its value. Investors and speculators have always coexisted in markets, and the behavior of many market participants is a blend of the two.
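
The mantra fits in one line; here is a minimal statement in standard notation (my summary, not the paper’s exact exposition):

```latex
% "Everything is a DCF model": the value of a cash-generating asset today is
% the present value of the expected cash flows CF_t it can distribute to its
% owners, discounted at a rate r that reflects the riskiness of those flows.
V_0 = \sum_{t=1}^{\infty} \frac{\mathbb{E}\left[CF_t\right]}{(1 + r)^{t}}
```

Multiples, comps, and other shortcuts are just compressed assumptions about the terms of this sum.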

Original paper linked in this post’s title.