<?xml version="1.0" encoding="utf-8" standalone="yes"?><?xml-stylesheet type="text/xsl" href="/rss.xsl"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Philipp D. Dubach | Quantitative Finance &amp; AI Strategy</title><link>http://philippdubach.com/</link><description>Recent content on Philipp D. Dubach | Quantitative Finance &amp; AI Strategy</description><image><url>https://static.philippdubach.com/ograph/ograph-post.jpg</url><title>Philipp D. Dubach | Quantitative Finance &amp; AI Strategy</title><link>http://philippdubach.com/</link></image><generator>Hugo -- gohugo.io</generator><language>en-us</language><managingEditor>me@philippdubach.com (Philipp D. Dubach)</managingEditor><webMaster>me@philippdubach.com (Philipp D. Dubach)</webMaster><lastBuildDate>Mon, 13 Apr 2026 00:00:00 +0000</lastBuildDate><atom:link href="http://philippdubach.com/index.xml" rel="self" type="application/rss+xml"/><item><title>Do Not Disturb My Circles</title><link>http://philippdubach.com/posts/do-not-disturb-my-circles/</link><pubDate>Mon, 13 Apr 2026 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/do-not-disturb-my-circles/</guid><description>&lt;blockquote&gt;
&lt;p&gt;If I&amp;rsquo;d had my way, we would have left it in the lab for longer and done more things like AlphaFold, maybe cured cancer or something like that.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;That&amp;rsquo;s &lt;a href="https://en.wikipedia.org/wiki/Demis_Hassabis"&gt;Demis Hassabis&lt;/a&gt; (I cannot recomend watching
&lt;a href="https://www.youtube.com/watch?v=d95J8yzvjbQ"&gt;The Thinking Game&lt;/a&gt; enough and or read
&lt;a href="https://www.penguinrandomhouse.com/books/752231/the-infinity-machine-by-sebastian-mallaby/"&gt;The Infinity Machine&lt;/a&gt;), the CEO of Google DeepMind and a Nobel Prize winner, describing the future he didn&amp;rsquo;t get.&lt;/p&gt;
&lt;p&gt;He wanted a CERN for artificial intelligence. A decade or two of careful, methodical work. The world&amp;rsquo;s best scientists collaborating on each step toward general intelligence, understanding what they built before building the next thing. In the meantime, AI for science, narrow tools like AlphaFold, would ship real benefits: cures, new materials, maybe a crack at fusion. Not chatbots. He didn&amp;rsquo;t get that future. None of us did. Instead we got a commercial arms race, a $690 billion annual infrastructure buildout, and the greatest concentration of technical talent in human history pointed at making autocomplete better.&lt;/p&gt;
&lt;p&gt;This is a story about capital misallocation. But it&amp;rsquo;s also a very old story.&lt;/p&gt;
&lt;h2 id="geometry-in-the-sand"&gt;Geometry in the sand&lt;/h2&gt;
&lt;p&gt;In 214 BC, the Roman general Marcellus brought a fleet to Syracuse. Standing between Rome and the richest city in Sicily was one man: &lt;a href="https://en.wikipedia.org/wiki/Archimedes"&gt;Archimedes&lt;/a&gt;, the greatest scientist of the ancient world, a mathematician whose work on the lever, the screw, and the principles of buoyancy would outlast every empire he lived under.&lt;/p&gt;
&lt;p&gt;Archimedes did not want to build weapons. &lt;a href="https://en.wikipedia.org/wiki/Parallel_Lives"&gt;Plutarch&lt;/a&gt;, writing in the &lt;em&gt;Life of Marcellus&lt;/em&gt;, says Archimedes designed and contrived his machines &amp;ldquo;not as matters of any importance, but as mere amusements in geometry.&amp;rdquo; He regarded the whole business as ignoble, beneath the dignity of pure mathematics. But his patron King Hiero II needed defenses, and Archimedes was the only man who could provide them. So he built them. Catapults that could sink a ship at range. The &lt;a href="https://en.wikipedia.org/wiki/Claw_of_Archimedes"&gt;Claw of Archimedes&lt;/a&gt;, an iron grappling device that could lift a Roman galley out of the water and drop it. Possibly parabolic mirrors that focused sunlight to set ships on fire, though historians still debate that one.&lt;/p&gt;
&lt;p&gt;The machines worked. Plutarch writes that the Romans became so terrified that &amp;ldquo;whenever they saw a bit of rope or a stick of timber projecting over the wall, they cried &amp;lsquo;Archimedes is training some engine upon us,&amp;rsquo; and turned their backs and fled.&amp;rdquo; They held off Rome for two years.&lt;/p&gt;
&lt;p&gt;Then Syracuse fell anyway. In 212 BC, Roman soldiers breached the walls during a festival. A soldier found Archimedes drawing geometric figures in the sand. According to the tradition passed down through &lt;a href="https://en.wikipedia.org/wiki/Valerius_Maximus"&gt;Valerius Maximus&lt;/a&gt; and others, his last words were &lt;em&gt;&amp;ldquo;Noli turbare circulos meos&amp;rdquo;&lt;/em&gt;: do not disturb my circles.&lt;/p&gt;
&lt;p&gt;Marcellus had ordered Archimedes taken alive. The order didn&amp;rsquo;t matter. The soldier killed him. The geometry died with him. The war machines, the things Archimedes considered beneath his real work, survived in military engineering textbooks for centuries. His mathematical treatises survived only by accident, through a single Byzantine manuscript &lt;a href="https://en.wikipedia.org/wiki/Archimedes_Palimpsest"&gt;scraped and overwritten with prayer texts&lt;/a&gt; in the 13th century.&lt;/p&gt;
&lt;p&gt;I thought about this when I watched Demis Hassabis in a &lt;a href="https://www.youtube.com/watch?v=C0gErQtnNFE"&gt;recent interview with Cleo Abram&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id="the-conscription"&gt;The conscription&lt;/h2&gt;
&lt;p&gt;He had been building learning systems at DeepMind for years. The work was pointed at science. AlphaFold was the first proof that AI could crack fundamental problems in biology. Move 37, AlphaGo&amp;rsquo;s famous creative play against Lee Sedol in 2016, was the proof that AI systems could discover things no human had considered.&lt;/p&gt;
&lt;p&gt;Then ChatGPT happened. Google went code red. Hassabis, the man who wanted to solve protein folding and maybe crack fusion, became the man who runs all of Google&amp;rsquo;s AI, including the consumer products he&amp;rsquo;d never wanted to focus on.&lt;/p&gt;
&lt;p&gt;He&amp;rsquo;s candid about what was lost:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;My ideal was to approach the latter stages of building AGI using the scientific method, very carefully, very precisely, very thoughtfully, in a CERN-like way. That might take a decade, even two decades longer. But I think that would make sense given the enormity of what we&amp;rsquo;re dealing with.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;And about the irony:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Language was a lot easier than we were all expecting. Even those of us who were obviously optimists about the whole technology. We thought maybe there would be one or two or three more breakthroughs needed. But it turned out transformers and some reinforcement learning on top was enough.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The ease of the advance was the thing that derailed the deeper work. Language models turned out to be good enough for consumer products, and consumer products generate revenue, and revenue attracts competition, and competition creates the arms race that now consumes everything. DeepMind had &amp;ldquo;fairly equivalent systems&amp;rdquo; to ChatGPT at the time, Hassabis says. They chose not to release them. That choice was taken from him.&lt;/p&gt;
&lt;h2 id="what-a-dollar-buys"&gt;What a dollar buys&lt;/h2&gt;
&lt;p&gt;The resource allocation case is simple enough to state in one line, though the implications are not.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://www.nature.com/articles/s41586-021-03819-2"&gt;AlphaFold 2&lt;/a&gt; trained on 128 Google TPUv3 chips for approximately 11 days. At &lt;a href="https://cloud.google.com/tpu/pricing"&gt;Google Cloud&amp;rsquo;s public pricing&lt;/a&gt; of roughly $32 per hour per TPU, the estimated training cost is somewhere under &lt;strong&gt;$1 million&lt;/strong&gt;. It predicted the three-dimensional structures of 200 million proteins. Over 3 million scientists now use it. A pharma executive told Hassabis that &amp;ldquo;almost every drug developed from now on will have probably used AlphaFold in its process.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;Now the other side of the ledger. &lt;a href="https://epoch.ai/blog/how-much-does-it-cost-to-train-frontier-ai-models/"&gt;GPT-4&amp;rsquo;s training cost&lt;/a&gt; an estimated &lt;strong&gt;$78 million&lt;/strong&gt;. &lt;a href="https://fortune.com/2024/04/18/google-gemini-cost-191-million-to-train-stanford-university-report-estimates/"&gt;Gemini Ultra ran to roughly &lt;strong&gt;$191 million&lt;/strong&gt;&lt;/a&gt;. OpenAI&amp;rsquo;s Orion &lt;a href="https://fortune.com/2025/02/25/what-happened-gpt-5-openai-orion-pivot-scaling-pre-training-llm-agi-reasoning/"&gt;exceeded &lt;strong&gt;$500 million&lt;/strong&gt;&lt;/a&gt; for a single training run, and the model was so disappointing they downgraded it from GPT-5 to GPT-4.5. OpenAI&amp;rsquo;s inference spending alone, just the cost of running the models after training, &lt;a href="https://aibusiness.com/language-models/ai-model-scaling-isn-t-over-it-s-entering-a-new-era"&gt;hit &lt;strong&gt;$2.3 billion in 2024&lt;/strong&gt;&lt;/a&gt;. That is 15 times what they spent training GPT-4.5.&lt;/p&gt;
&lt;p&gt;AlphaFold cost less to train than OpenAI spends on inference in a single day.&lt;/p&gt;
&lt;p&gt;Zoom out further. The Big 4 hyperscalers, Amazon, Alphabet, Meta, Microsoft, are guiding to &lt;a href="https://www.goldmansachs.com/insights/articles/why-ai-companies-may-invest-more-than-500-billion-in-2026"&gt;&lt;strong&gt;$610-665 billion&lt;/strong&gt;&lt;/a&gt; in capital expenditure for 2026. &lt;a href="https://www.goldmansachs.com/insights/articles/why-ai-companies-may-invest-more-than-500-billion-in-2026"&gt;Goldman Sachs projects&lt;/a&gt; cumulative 2025-2027 spending at $1.15 trillion. As I noted in &lt;a href="http://philippdubach.com/posts/peter-thiels-physics-department/"&gt;Peter Thiel&amp;rsquo;s Physics Department&lt;/a&gt;, Big Tech spends &lt;strong&gt;75 times&lt;/strong&gt; more on AI than the entire US federal science budget: $250 billion versus $3.3 billion per year. The DOE Genesis Mission, the flagship US government program for AI-driven scientific discovery, &lt;a href="https://www.energy.gov/science/articles/doe-announces-genesis-mission-advance-ai-science"&gt;received &lt;strong&gt;$320 million&lt;/strong&gt; in its first round&lt;/a&gt;. That is less than Meta spends on AI infrastructure in a single week.&lt;/p&gt;
&lt;p&gt;The infrastructure being built is not for protein folding. It is not for materials science or fusion plasma control or genomics. It is for chatbots, image generators, and coding assistants. &lt;a href="https://sequoiacap.com/article/ais-600b-question/"&gt;Sequoia&amp;rsquo;s David Cahn calculated&lt;/a&gt; the AI ecosystem needs &lt;strong&gt;$600 billion in annual revenue&lt;/strong&gt; to justify current infrastructure spending. It generates perhaps $80-120 billion. And nearly all of that revenue comes from commercial applications: subscriptions, API access, enterprise contracts for systems that summarize emails and draft marketing copy.&lt;/p&gt;
&lt;p&gt;The bottleneck for AI for science was never money. AlphaFold proved that. It was always about who works on what, and the chatbot economy answered that question for an entire generation of researchers.&lt;/p&gt;
&lt;h2 id="what-the-circles-produced"&gt;What the circles produced&lt;/h2&gt;
&lt;p&gt;When Hassabis&amp;rsquo;s teams were allowed to focus on science, when the circles were left undisturbed, this is what happened.&lt;/p&gt;
&lt;p&gt;In The Thinking Game there&amp;rsquo;s a moment that captures it perfectly. The original plan for AlphaFold was conventional: build a server, let scientists submit protein sequences one at a time, email back the predicted structures. Standard approach, used by the whole field for 40 years. Then Hassabis started doing arithmetic on his phone in the middle of the meeting. Two hundred million known proteins. One fold every ten seconds. How many TPUs do we have? He looked up and said something like, &amp;ldquo;&lt;a href="https://youtu.be/d95J8yzvjbQ?si=1VVejCeVhn_1_3m6&amp;amp;t=4495"&gt;Why don&amp;rsquo;t we just fold everything?&lt;/a&gt;&amp;rdquo;&lt;/p&gt;
&lt;p&gt;It would be, he realized, actually less work than building the server.&lt;/p&gt;
&lt;p&gt;So they folded everything. AlphaFold predicted the structures of &lt;strong&gt;200 million proteins&lt;/strong&gt; and put them in a &lt;a href="https://alphafold.ebi.ac.uk/"&gt;free database&lt;/a&gt;. The nuclear pore complex, one of the largest and most important proteins in the body, a donut-shaped gateway that controls nutrient flow in and out of the cell nucleus, was &lt;a href="https://www.science.org/doi/10.1126/science.abm9326"&gt;solved within months&lt;/a&gt; of AlphaFold&amp;rsquo;s release. Researchers working on neglected diseases, malaria, Chagas, leishmaniasis, diseases that affect hundreds of millions of people but attract little pharma funding, now get protein structures for free. Plant scientists working on climate-resilient crops can skip years of crystallography and go straight to the biology.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://www.isomorphiclabs.com/"&gt;Isomorphic Labs&lt;/a&gt;, the DeepMind spinoff, is running 18-19 drug programs across cardiovascular disease, cancer, and immunology. &lt;a href="http://philippdubach.com/posts/ai-can-now-design-drugs-in-seconds-we-still-cant-tell-you-if-they-work./"&gt;IsoDDE, its drug design engine&lt;/a&gt;, hits 50% on the hardest protein-ligand benchmarks versus 23% for AlphaFold 3. &lt;a href="https://deepmind.google/discover/blog/alphagenome-predicts-the-effects-of-dna-variation-on-gene-regulation/"&gt;AlphaGenome&lt;/a&gt; is decoding the 98% of the human genome that doesn&amp;rsquo;t code for proteins, the part where most disease-causing mutations hide. Jennifer Doudna, the CRISPR pioneer, asked Hassabis directly about combining AlphaGenome with CRISPR to identify and fix the exact genetic changes causing disease. His answer: &amp;ldquo;Still not probably good enough yet. But you can imagine a future version.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;&lt;a href="http://philippdubach.com/posts/the-last-architecture-designed-by-hand/"&gt;AlphaEvolve&lt;/a&gt; found a 23% speedup inside Gemini&amp;rsquo;s own architecture, recovering 0.7% of Google&amp;rsquo;s total compute. DeepMind&amp;rsquo;s fusion work &lt;a href="https://deepmind.google/blog/bringing-ai-to-the-next-generation-of-fusion-energy/"&gt;controlled plasma autonomously&lt;/a&gt; in a real tokamak. &lt;a href="https://deepmind.google/discover/blog/millions-of-new-materials-discovered-with-deep-learning/"&gt;GNoME&lt;/a&gt; identified 2.2 million new crystal structures, equivalent to roughly 800 years of prior human discovery in materials science.&lt;/p&gt;
&lt;p&gt;All of this on a fraction of the compute that powers the chatbot economy. I keep coming back to this: the entire portfolio of DeepMind&amp;rsquo;s scientific work, the Nobel Prize, the drug programs, the materials, the fusion experiments, consumed less compute than a single frontier chatbot burns through in inference costs per quarter.&lt;/p&gt;
&lt;h2 id="the-case-for-the-war-machines"&gt;The case for the war machines&lt;/h2&gt;
&lt;p&gt;I want to present the counterargument honestly, because it&amp;rsquo;s not trivial.&lt;/p&gt;
&lt;p&gt;The commercial race funded a compute buildout that wouldn&amp;rsquo;t exist without chatbot demand. $690 billion in 2026 capex built data centers that can, in principle, be repurposed for scientific workloads. The talent pipeline expanded: a generation of ML engineers entered the field because consumer AI products made it exciting and lucrative. Millions of users stress-tested these models in ways internal testing never could, revealing failure modes and edge cases that improve the underlying systems. Hassabis himself acknowledges this. In the HUGE* interview he listed the benefits: &amp;ldquo;lightning speed&amp;rdquo; progress, democratized access to cutting-edge AI &amp;ldquo;perhaps only 3 to 6 months behind what is actually in the labs,&amp;rdquo; and societal normalization that prepares people for bigger changes ahead.&lt;/p&gt;
&lt;p&gt;And there&amp;rsquo;s the funding argument. Google&amp;rsquo;s $132 billion in net income funds DeepMind. Gemini&amp;rsquo;s commercial revenue helps justify the research budget. Without the chatbot economy, would Alphabet spend billions on AI research at all?&lt;/p&gt;
&lt;p&gt;The strongest version of this argument goes: you can&amp;rsquo;t have the cathedral without the wool merchants. Bell Labs needed AT&amp;amp;T&amp;rsquo;s monopoly revenue. The Apollo program needed Cold War spending. Scientific breakthroughs don&amp;rsquo;t fund themselves. The commercial race, ugly as it is, is the mechanism that makes the science possible.&lt;/p&gt;
&lt;h2 id="why-the-steelman-breaks"&gt;Why the steelman breaks&lt;/h2&gt;
&lt;p&gt;I&amp;rsquo;ve thought about this for a while, and I think it&amp;rsquo;s wrong.&lt;/p&gt;
&lt;p&gt;Start with the compute argument. The infrastructure being built is overwhelmingly inference infrastructure: data centers optimized for running chatbot queries at scale, not for training scientific models. AlphaFold trains on 128 TPUs. It doesn&amp;rsquo;t need a $75 billion annual capex program. The buildout serves commercial demand. Calling it a foundation for scientific AI is like calling a shopping mall a foundation for particle physics because they both use electricity.&lt;/p&gt;
&lt;p&gt;The talent argument has the same problem. The pipeline filled, but it filled with the wrong skills and pointed in the wrong direction. &lt;a href="https://hai.stanford.edu/ai-index/2025-ai-index-report/research-and-development"&gt;Stanford HAI&amp;rsquo;s 2025 AI Index&lt;/a&gt; found that &lt;strong&gt;70%&lt;/strong&gt; of AI PhDs took private sector jobs in 2023, up from roughly 20% two decades ago. &lt;a href="https://www.nature.com/articles/d41586-026-00474-3"&gt;Bruce Schneier wrote in &lt;em&gt;Nature&lt;/em&gt;&lt;/a&gt; that the exodus threatens &amp;ldquo;innovation driven by curiosity rather than profit.&amp;rdquo; The ML engineers entering the field are optimizing RLHF, fine-tuning chat models, building prompt engineering toolchains, and competing on Chatbot Arena leaderboards. These are not the skills that fold proteins or control plasma. The talent that cracks drug discovery needs computational chemistry, molecular dynamics, quantum mechanics. The talent attracted by the chatbot boom is, for the most part, not that talent.&lt;/p&gt;
&lt;p&gt;The stress-testing argument is real but narrow. Millions of users proved that language models can summarize documents and brainstorm ideas. That tells you nothing about whether they can predict which genetic mutations cause disease. The applications share a model architecture but almost nothing else.&lt;/p&gt;
&lt;p&gt;And the funding argument, the one that seems hardest to dismiss, actually argues the opposite of what its proponents think. The best historical parallel is &lt;a href="https://en.wikipedia.org/wiki/Bell_Labs"&gt;Bell Labs&lt;/a&gt;. Founded in 1925 as the research arm of AT&amp;amp;T&amp;rsquo;s regulated telephone monopoly, Bell Labs produced the &lt;a href="https://en.wikipedia.org/wiki/Transistor"&gt;transistor&lt;/a&gt;, the &lt;a href="https://en.wikipedia.org/wiki/Laser"&gt;laser&lt;/a&gt;, &lt;a href="https://en.wikipedia.org/wiki/Unix"&gt;Unix&lt;/a&gt;, the &lt;a href="https://en.wikipedia.org/wiki/C_(programming_language)"&gt;C programming language&lt;/a&gt;, &lt;a href="https://en.wikipedia.org/wiki/Information_theory"&gt;information theory&lt;/a&gt;, and the discovery of &lt;a href="https://en.wikipedia.org/wiki/Cosmic_microwave_background"&gt;cosmic microwave background radiation&lt;/a&gt;. Ten Nobel Prizes. Five Turing Awards. &lt;a href="https://www.construction-physics.com/p/what-would-it-take-to-recreate-bell"&gt;Brian Potter in &lt;em&gt;Construction Physics&lt;/em&gt;&lt;/a&gt; calls the conditions &amp;ldquo;unrepeatable&amp;rdquo;: a vertically integrated monopoly that could afford to fund research with no immediate commercial return.&lt;/p&gt;
&lt;p&gt;Then AT&amp;amp;T was broken up in 1984. Commercial competition arrived. What happened next is instructive: the research workforce &lt;a href="https://en.wikipedia.org/wiki/Bell_Labs"&gt;dropped from roughly 1,300 to 500 by 2002&lt;/a&gt;. Only one post-divestiture employee won a Nobel Prize. Bell Labs was passed from AT&amp;amp;T to Lucent to Alcatel to Nokia, each owner less interested in fundamental research than the last. By 2008, &lt;a href="https://pmc.ncbi.nlm.nih.gov/articles/PMC8792522/"&gt;four physicists remained&lt;/a&gt; in basic research. By 2016, what had been the most productive research institution in human history was a division of a Finnish telecom company.&lt;/p&gt;
&lt;p&gt;The irony is precise: the people who argue that commercial pressure funds great science are citing a lab that produced its greatest work under monopoly protection &lt;em&gt;from&lt;/em&gt; commercial pressure, and died the moment that protection was removed.&lt;/p&gt;
&lt;p&gt;Hassabis&amp;rsquo;s vision, the CERN model, is the Bell Labs model. Let fundamental research breathe. Shield it from quarterly earnings. Fund it with patient capital. He had that at DeepMind, funded by Google&amp;rsquo;s search advertising monopoly, insulated from product deadlines, free to spend six years building AlphaGo before it produced a single dollar of revenue. Then the commercial race consumed the insulation.&lt;/p&gt;
&lt;p&gt;The funding was already there. What he lost was the institutional focus.&lt;/p&gt;
&lt;h2 id="the-circles"&gt;The circles&lt;/h2&gt;
&lt;p&gt;Archimedes held off Rome for two years. Then the soldier came. The war machines didn&amp;rsquo;t save Syracuse. They bought time, and that time ran out.&lt;/p&gt;
&lt;p&gt;I don&amp;rsquo;t think the chatbot era saved AI for science. I think it ate the oxygen. The talent went to RLHF optimization. The compute went to inference farms. The institutional attention went to quarterly product launches. Hassabis is now simultaneously building the war machines and drawing the circles: running Gemini and funding Isomorphic, shipping chatbots and folding proteins. That he manages both is remarkable. But it&amp;rsquo;s a compromise, and the compromise has a cost measured in drug programs that don&amp;rsquo;t exist, diseases that aren&amp;rsquo;t being studied, materials that haven&amp;rsquo;t been found.&lt;/p&gt;
&lt;p&gt;The question is not whether chatbots are useful. They are. I use them constantly. The question is whether future historians will look at 2023-2026 and see a period when the most capable scientific tool in human history was mostly pointed at drafting emails and generating stock photos, and wonder what we were thinking. The way we look at that Roman soldier: someone who destroyed something more valuable than he could understand.&lt;/p&gt;
&lt;p&gt;In the interview, Hassabis is asked what he would want said at his funeral. His answer was immediate:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;I would hope that they would say that my life was of benefit and service to humanity.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The circles are still there, drawn in the sand between product launches.&lt;/p&gt;</description></item><item><title>The Geometry of Who Knows What</title><link>http://philippdubach.com/posts/the-geometry-of-who-knows-what/</link><pubDate>Mon, 13 Apr 2026 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/the-geometry-of-who-knows-what/</guid><description>&lt;p&gt;&lt;em&gt;Investing at the Edge of Knowledge, Part 3 · &lt;a href="http://philippdubach.com/posts/three-kinds-of-not-knowing/"&gt;Start with Part 1&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&amp;ldquo;One of these days in your travels, a guy is going to show you a brand-new deck of cards on which the seal is not yet broken. Then this guy is going to offer to bet you that he can make the jack of spades jump out of this brand-new deck of cards and squirt cider in your ear. But, son, you do not accept this bet, because as sure as you stand there, you&amp;rsquo;re going to wind up with an ear full of cider.&amp;rdquo;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Zeckhauser opens his &lt;a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2205821"&gt;2006 paper&lt;/a&gt; with this advice from Sky Masterson&amp;rsquo;s father in &lt;em&gt;Guys and Dolls&lt;/em&gt;. The lesson is as old as markets: if someone offers you a bet where they seem to know something you don&amp;rsquo;t, they probably do. Don&amp;rsquo;t take that bet.&lt;/p&gt;
&lt;p&gt;But Zeckhauser&amp;rsquo;s point isn&amp;rsquo;t the lesson. It&amp;rsquo;s the exception. What happens when nobody has the marked deck? When the ambiguity is shared, when neither side can enumerate the states of the world, the Sky Masterson rule stops applying, and the investors who keep following it anyway leave money on the table.&lt;/p&gt;
&lt;p&gt;In &lt;a href="http://philippdubach.com/posts/three-kinds-of-not-knowing/"&gt;Part 1&lt;/a&gt; I laid out Zeckhauser&amp;rsquo;s taxonomy: risk, uncertainty, and ignorance as three distinct problems. In &lt;a href="http://philippdubach.com/posts/ambiguity-by-design/"&gt;Part 2&lt;/a&gt; I examined why investors flee the third box, the mechanism of ambiguity aversion. This piece asks the question that follows: when you&amp;rsquo;re facing someone on the other side of a trade, how do you figure out whether they know something you don&amp;rsquo;t?&lt;/p&gt;
&lt;h2 id="the-two-matrices"&gt;The two matrices&lt;/h2&gt;
&lt;p&gt;Zeckhauser draws two matrices that I think are the most underappreciated diagrams in the paper.&lt;/p&gt;
&lt;p&gt;The first covers investing under uncertainty, where the possible states are known but probabilities are hard. It&amp;rsquo;s a 2x2: Easy or Hard for You to Estimate Value crossed with Easy or Hard for Others. Box A (easy for both) is the standard competitive market: lots of participants, tight spreads, no edge for anyone. Box B (easy for you, hard for others) is where you&amp;rsquo;re the informed party: think a biotech scientist evaluating a drug trial readout. Box C (hard for you, easy for others) is the danger zone, the other side has the marked deck, and the Sky Masterson rule applies in full. Box D (hard for both) is where it gets interesting. Neither side has an information advantage. Both are operating under genuine uncertainty. Buffett&amp;rsquo;s earthquake reinsurance sits here.&lt;/p&gt;
&lt;p&gt;The second matrix covers investing under ignorance, where even the possible states are unknown. It&amp;rsquo;s simpler: a 2x1. Unknown to You and Known to Others (Box E) versus Unknown to You and Unknown to Others (Box F). Box E is dangerous. Box F is opportunity.&lt;/p&gt;
&lt;p&gt;The point most people miss is about misidentification. Most investors assume they&amp;rsquo;re in Box C or Box E: the other side knows more. This assumption is the legacy of standard information asymmetry models in finance, where &lt;a href="https://www.sfu.ca/~wainwrig/Econ400/akerlof.pdf"&gt;Akerlof&amp;rsquo;s lemons problem (1970)&lt;/a&gt; and the Glosten-Milgrom bid-ask spread model (1985) trained a generation to worry about adverse selection. Those worries are justified in Boxes A through C and in Box E. But in Box D and Box F, you&amp;rsquo;re not facing an informed counterparty. You&amp;rsquo;re facing someone equally confused, or someone who has left the market entirely because they can&amp;rsquo;t tolerate the confusion.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://www.cs.princeton.edu/courses/archive/spr09/cos444/papers/BazermanSamuelson83.pdf"&gt;Bazerman and Samuelson (1983)&lt;/a&gt; showed that even in clean experimental settings, people are terrible at accounting for why the other side is willing to trade. Their winner&amp;rsquo;s curse experiments found that bidders consistently failed to discount for the fact that winning an auction is bad news about your estimate&amp;rsquo;s accuracy. In a UU world, this failure compounds. You can&amp;rsquo;t compute the conditional expectation of the asset&amp;rsquo;s value given that the other side is selling, because neither of you can define the state space over which that expectation would be calculated.&lt;/p&gt;
&lt;p&gt;The practical question is always: am I in Box C or Box D? Am I in Box E or Box F? And the answer is almost never available from the data. It&amp;rsquo;s a judgment call, informed by what you know about the seller&amp;rsquo;s constraints, their institutional context, and whether the source of the ambiguity is private or shared.&lt;/p&gt;
&lt;h2 id="institutional-blindness-as-structural-opportunity"&gt;Institutional blindness as structural opportunity&lt;/h2&gt;
&lt;p&gt;The California Earthquake Authority story is Zeckhauser&amp;rsquo;s best illustration, and it deserves the full telling.&lt;/p&gt;
&lt;p&gt;In the late 1990s, California needed reinsurance for earthquake risk. The authority offered a &lt;strong&gt;$1 billion&lt;/strong&gt; slice at premiums that worked out to roughly five times actuarial value. Wall Street said no. Not because investment banks thought the Earthquake Authority possessed secret seismological knowledge. Nobody has an informational edge over the reinsurer when it comes to tectonic plate movement. The ambiguity was shared: Box F.&lt;/p&gt;
&lt;p&gt;Wall Street said no because their internal processes required probability estimates that didn&amp;rsquo;t exist. Compliance teams required distributional assumptions about tail risk that nobody could provide. Risk models required defined scenarios, and &amp;ldquo;catastrophic earthquake in the next 12 months&amp;rdquo; didn&amp;rsquo;t fit neatly into any existing framework. The honest assessment, &amp;ldquo;we have no idea about the probability, but the price is very high,&amp;rdquo; didn&amp;rsquo;t fit the form. Buffett took the entire slice.&lt;/p&gt;
&lt;p&gt;This is Zeckhauser&amp;rsquo;s Maxim H: &amp;ldquo;Do not engage in the heuristic reasoning that just because you do not know the risk, others do.&amp;rdquo; The Wall Street banks weren&amp;rsquo;t outcompeted by someone with better information. They were outcompeted by someone with fewer institutional constraints. Buffett could hold a position that was impossible to model because he answered to shareholders who trusted his judgment, not to compliance officers who required his models.&lt;/p&gt;
&lt;p&gt;Generalize this, and you get a structural feature of UU markets that doesn&amp;rsquo;t go away. Fiduciary duty requires estimable risk. Compliance models require defined scenarios. Career risk creates what Zeckhauser calls Monday Morning Quarterback (MMQ) risk: the danger that a bad outcome on a good decision destroys your reputation. Professional investors face a permanent bias toward the risk box (known probabilities) and away from the ignorance box (unknown states). This isn&amp;rsquo;t a market inefficiency waiting to be arbitraged. It&amp;rsquo;s an institutional constant. And it creates a permanent supply of mispriced assets for those without the same constraints.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://projecteuclid.org/journals/annals-of-statistics/volume-4/issue-6/Agreeing-to-Disagree/10.1214/aos/1176343654.full"&gt;Aumann (1976)&lt;/a&gt; proved that rational agents with common priors who share their posterior beliefs must converge: they cannot &amp;ldquo;agree to disagree.&amp;rdquo; The theorem is elegant and, in UU markets, irrelevant. Aumann assumes common priors and known state spaces. In Box 3, both assumptions fail. The state space is undefined, so there are no common priors to start from. Disagreement in UU isn&amp;rsquo;t a puzzle to be resolved by more information exchange. It&amp;rsquo;s the default condition. Two equally rational investors can look at the same situation and reach opposite conclusions without either one being wrong, because they&amp;rsquo;re not disagreeing about probabilities. They&amp;rsquo;re disagreeing about what world they&amp;rsquo;re in.&lt;/p&gt;
&lt;h2 id="the-advantage-versus-selection-formula"&gt;The advantage-versus-selection formula&lt;/h2&gt;
&lt;p&gt;Zeckhauser offers a framework for deciding when to invest despite potential adverse selection. Your expected return depends on three things: your absolute advantage (&lt;em&gt;a&lt;/em&gt;), the probability the other side is better informed (&lt;em&gt;p&lt;/em&gt;), and the selection factor (&lt;em&gt;s&lt;/em&gt;), how much their information hurts you. Invest when the combination exceeds the cost of entry.&lt;/p&gt;
&lt;p&gt;The formula matters less than the logic behind it. A large absolute advantage provides insurance against adverse selection. Zeckhauser&amp;rsquo;s Maxim E: &amp;ldquo;A significant absolute advantage offers some protection against potential selection.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;What counts as absolute advantage? Complementary skills are the classic answer: the real estate developer who creates value a passive investor cannot, the venture capitalist whose operational expertise and network make the company worth more than the sum of its capital. Their return isn&amp;rsquo;t compensation for bearing risk. It&amp;rsquo;s a share of value they helped create. Sidecar investors, Zeckhauser&amp;rsquo;s term for those who invest alongside skilled operators, earn excess returns because access to these deals is limited and the value creation is real.&lt;/p&gt;
&lt;p&gt;But complementary skills aren&amp;rsquo;t the only form of advantage. In the &lt;a href="http://philippdubach.com/posts/the-saaspocalypse-paradox/"&gt;SaaSpocalypse&lt;/a&gt;, the &amp;ldquo;absolute advantage&amp;rdquo; for a buyer at IGV $80 was time horizon. If you could hold for three to five years, tolerate the MMQ risk of further drawdowns, and ignore the career consequences of looking wrong for a few quarters, you had a structural edge over institutional sellers who couldn&amp;rsquo;t do the same. That&amp;rsquo;s not analytical skill. It&amp;rsquo;s constraint arbitrage. And constraint arbitrage is a legitimate form of absolute advantage, because fiduciary requirements and career incentives are structural features that won&amp;rsquo;t disappear next quarter.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2205848"&gt;Larry Summers (2006)&lt;/a&gt; raises the obvious objection to the sidecar concept: &amp;ldquo;identifying skilled UU managers may be no easier than picking market-beating investments directly.&amp;rdquo; The sidecar doesn&amp;rsquo;t solve the epistemological problem. It relocates it from asset selection to manager selection. How do you know the driver is skilled rather than lucky?&lt;/p&gt;
&lt;p&gt;&lt;a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2205858"&gt;Richard Robb (2006)&lt;/a&gt; pushes further. He argues that UU knowledge is &amp;ldquo;uncommunicable.&amp;rdquo; If a mechanism for generating excess returns could be expressed as a process, someone would have arbitraged it away. Ricardo, on the eve of Waterloo, might have said &amp;ldquo;British Government bonds offer a high reward for the risk.&amp;rdquo; But what would it look like for that statement to be proven false? The claim is unfalsifiable because it lives in the ignorance box where probability statements don&amp;rsquo;t have clear empirical content. If the sidecar driver can&amp;rsquo;t explain their edge in terms you can evaluate, how do you distinguish skill from survivorship bias?&lt;/p&gt;
&lt;p&gt;I think both objections are correct and both miss something. They&amp;rsquo;re correct that sidecar investing doesn&amp;rsquo;t eliminate the evaluation problem. But they miss that the evaluation problem has different difficulty levels depending on context. Evaluating whether a real estate developer can build and lease a building is easier than evaluating whether a macro hedge fund can predict interest rates. Evaluating whether Buffett&amp;rsquo;s insurance math is sound is easier than evaluating whether a biotech startup&amp;rsquo;s drug candidate works. The sidecar concept isn&amp;rsquo;t &amp;ldquo;trust someone blindly.&amp;rdquo; It&amp;rsquo;s &amp;ldquo;invest alongside someone whose edge you can partly verify, in situations where your own analytical advantage is zero.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;Knowing the geometry of who-knows-what is necessary but not sufficient. You&amp;rsquo;ve identified a Box D or Box F opportunity. You&amp;rsquo;ve assessed your absolute advantage. You&amp;rsquo;ve decided the other side isn&amp;rsquo;t better informed. Now you need to decide how much to bet. In a UU world, the most famous formula for position sizing, the Kelly Criterion, breaks down in the ways you&amp;rsquo;d expect. That&amp;rsquo;s Part 4 &lt;em&gt;(coming soon)&lt;/em&gt;.&lt;/p&gt;</description></item><item><title>Why Lilly's Weight Loss Pill Isn't a Peptide</title><link>http://philippdubach.com/posts/why-lillys-weight-loss-pill-isnt-a-peptide/</link><pubDate>Thu, 09 Apr 2026 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/why-lillys-weight-loss-pill-isnt-a-peptide/</guid><description>&lt;blockquote&gt;
&lt;p&gt;Novo Nordisk spent decades and $1.8 billion learning how to get a peptide past the gut. Eli Lilly looked at the same problem and decided to skip it entirely.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Your gastrointestinal tract is a 30-foot disassembly line for proteins. Acid denatures them, pepsin cleaves them, trypsin finishes the job, and the mucus layer blocks whatever survives. Sean Geiger&amp;rsquo;s excellent &lt;a href="https://seangeiger.substack.com/p/a-brief-history-of-oral-peptides"&gt;history of oral peptides&lt;/a&gt; traces the full arc: the first attempt at oral insulin was in 1922. Over a hundred years and thirteen companies later, no oral insulin exists.&lt;/p&gt;
&lt;p&gt;Novo Nordisk spent decades and $1.8 billion acquiring the technology to get around this problem. The result, approved in December 2025 as &lt;a href="https://www.endocrinologyadvisor.com/news/fda-approves-oral-wegovy-for-weight-management/"&gt;oral Wegovy for obesity&lt;/a&gt;, is a pill that destroys 99% of its own active ingredient before the remaining fraction reaches the bloodstream. The oral 25mg daily dose uses roughly 280x more semaglutide than the equivalent weekly injection. This is the best that peptide oral delivery can do. Eli Lilly decided to skip it entirely, building Foundayo, a small molecule oral obesity drug that isn&amp;rsquo;t a peptide at all. That divergence in approach will determine who captures the majority of a market that &lt;a href="https://www.goldmansachs.com/insights/articles/anti-obesity-drug-market"&gt;Goldman Sachs projects&lt;/a&gt; at $100+ billion by 2030 and that &lt;a href="https://www.jpmorgan.com/insights/global-research/current-events/obesity-drugs"&gt;J.P. Morgan estimates&lt;/a&gt; will reach 30 million US users within five years.&lt;/p&gt;
&lt;h2 id="oral-semaglutide"&gt;Oral semaglutide&lt;/h2&gt;
&lt;p&gt;Sean Geiger&amp;rsquo;s &lt;a href="https://seangeiger.substack.com/p/a-brief-history-of-oral-peptides"&gt;history of oral peptides&lt;/a&gt; traces the science well. The technology that makes oral semaglutide possible is SNAC (salcaprozate sodium), a permeation enhancer developed by Emisphere Technologies starting in the 1990s. Novo partnered with Emisphere in 2007 and &lt;a href="https://www.novonordisk.com/content/nncorp/global/en/news-and-media/news-and-ir-materials/news-details.html?id=916472"&gt;acquired the company outright in 2020&lt;/a&gt;. SNAC does three things simultaneously: it buffers local stomach pH to suppress pepsin, prevents semaglutide from clumping into inactive oligomers, and temporarily fluidizes gastric cell membranes so the drug can cross. The &lt;a href="https://www.ema.europa.eu/en/documents/assessment-report/rybelsus-epar-public-assessment-report_en.pdf"&gt;EMA&amp;rsquo;s public assessment report&lt;/a&gt; puts the resulting bioavailability at roughly 0.4 to 1%. The &lt;a href="https://www.accessdata.fda.gov/drugsatfda_docs/label/2024/213051s018lbl.pdf"&gt;FDA label&lt;/a&gt; confirms: the vast majority of each dose is destroyed.&lt;/p&gt;
&lt;p&gt;This creates a problem that&amp;rsquo;s easy to state and hard to solve. If you need 280x more API per equivalent dose, your manufacturing cost structure looks nothing like the injectable. A &lt;a href="https://www.fastcompany.com/91071415/your-1000-per-month-ozempic-costs-5-to-make-says-study"&gt;Yale/King&amp;rsquo;s College study published in JAMA&lt;/a&gt; found injectable semaglutide costs $0.89 to $4.73 per month to manufacture at the API level. Scale that by 280x and you get oral API costs somewhere in the range of $770 to $1,460 per year, according to &lt;a href="https://themedicinemaker.com/issues/2026/articles/january/oral-glp-1s-won-t-win-on-convenience-they-ll-win-on-cmc/"&gt;The Medicine Maker&amp;rsquo;s January 2026 analysis&lt;/a&gt;. Still below the selling price. But the margin compression is real, and SNAC itself is a costly excipient.&lt;/p&gt;
&lt;a href="#lightbox-oral-bioavailability-trap-png-0" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/oral-bioavailability-trap.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/oral-bioavailability-trap.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/oral-bioavailability-trap.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/oral-bioavailability-trap.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/oral-bioavailability-trap.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/oral-bioavailability-trap.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/oral-bioavailability-trap.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/oral-bioavailability-trap.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/oral-bioavailability-trap.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/oral-bioavailability-trap.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/oral-bioavailability-trap.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/oral-bioavailability-trap.png"
alt="Oral semaglutide bioavailability trap: 280x more API per dose than injectable Wegovy, SNAC achieves only 1% absorption, while Eli Lilly Foundayo bypasses the peptide oral delivery problem"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;p&gt;SNAC is also oddly specific. &lt;a href="https://seangeiger.substack.com/p/a-brief-history-of-oral-peptides"&gt;Geiger notes&lt;/a&gt; that Novo tried it with liraglutide, a closely related GLP-1 analog, and it failed because liraglutide forms oligomers that SNAC can&amp;rsquo;t break apart. After over three decades of work, exactly two FDA-approved oral peptide drugs using permeation enhancers exist: Rybelsus/oral Wegovy (SNAC) and Mycapssa (oral octreotide for acromegaly, a different enhancer called TPE). That&amp;rsquo;s the entire commercial output of the field.&lt;/p&gt;
&lt;h2 id="foundayo-lillys-structural-advantage"&gt;Foundayo: Lilly&amp;rsquo;s structural advantage&lt;/h2&gt;
&lt;p&gt;Eli Lilly&amp;rsquo;s &lt;a href="https://investor.lilly.com/news-releases/news-release-details/lillys-oral-glp-1-orforglipron-superior-oral-semaglutide-head"&gt;orforglipron&lt;/a&gt;, approved by the FDA on April 1, 2026 under the brand name &lt;a href="https://investor.lilly.com/news-releases/news-release-details/fda-approves-lillys-foundayotm-orforglipron-only-glp-1-pill"&gt;Foundayo&lt;/a&gt;, is not an oral peptide. It&amp;rsquo;s a non-peptide small molecule GLP-1 receptor agonist that activates the same receptor through a different mechanism. Discovered by Chugai Pharmaceutical and licensed by Lilly in 2018, orforglipron requires no SNAC, no fasting window, no cold chain storage, and is manufactured through standard chemical synthesis rather than solid-phase peptide synthesis. The bioavailability problem doesn&amp;rsquo;t apply because the molecule was designed from the ground up to survive the gut.&lt;/p&gt;
&lt;p&gt;The clinical data backs this up. In &lt;a href="https://investor.lilly.com/news-releases/news-release-details/lillys-oral-glp-1-orforglipron-superior-oral-semaglutide-head"&gt;ACHIEVE-3&lt;/a&gt; (1,698 patients with type 2 diabetes, 52 weeks), orforglipron at 12mg and 36mg was superior to oral semaglutide on both HbA1c reduction and weight loss: the first head-to-head victory over Novo&amp;rsquo;s oral product. In &lt;a href="https://www.appliedclinicaltrialsonline.com/view/eli-lilly-oral-glp1-orforglipron-efficacy-safety-injectable-phaseiii-trial"&gt;ATTAIN-2&lt;/a&gt; (obesity with type 2 diabetes), orforglipron delivered 10.5% weight loss at 72 weeks versus 2.2% on placebo. And in &lt;a href="https://investor.lilly.com/news-releases/news-release-details/lillys-orforglipron-helped-people-maintain-weight-loss-after"&gt;ATTAIN-MAINTAIN&lt;/a&gt;, patients who switched from injectable Wegovy or Mounjaro to oral orforglipron maintained their weight within 0.9 kg over 52 weeks. A pill that holds the gains of an injection.&lt;/p&gt;
&lt;p&gt;Lilly &lt;a href="https://investor.lilly.com/news-releases/news-release-details/lillys-oral-glp-1-orforglipron-successful-third-phase-3-trial"&gt;submitted the NDA&lt;/a&gt; with a priority review voucher and received &lt;a href="https://investor.lilly.com/news-releases/news-release-details/fda-approves-lillys-foundayotm-orforglipron-only-glp-1-pill"&gt;FDA approval on April 1, 2026&lt;/a&gt;, the fastest approval of a new molecular entity since 2002. Foundayo is available starting at $149 per month for self-pay patients, with savings card prices as low as $25 per month. The company is investing &lt;a href="https://cen.acs.org/pharmaceuticals/pharmaceutical-chemicals/Lilly-pour-65-billion-GLP/103/web/2025/09"&gt;$6.5 billion in a dedicated oral manufacturing facility&lt;/a&gt; and $27 billion total in US manufacturing capacity.&lt;/p&gt;
&lt;a href="#lightbox-peptide-vs-small-molecule-png-2" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/peptide-vs-small-molecule.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/peptide-vs-small-molecule.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/peptide-vs-small-molecule.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/peptide-vs-small-molecule.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/peptide-vs-small-molecule.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/peptide-vs-small-molecule.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/peptide-vs-small-molecule.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/peptide-vs-small-molecule.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/peptide-vs-small-molecule.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/peptide-vs-small-molecule.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/peptide-vs-small-molecule.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/peptide-vs-small-molecule.png"
alt="Foundayo orforglipron vs oral Wegovy semaglutide comparison: peptide plus SNAC approach versus small molecule across bioavailability, manufacturing cost, fasting requirements, and ACHIEVE-3 clinical results"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;h2 id="70-billion-duopoly-and-its-widening-crack"&gt;$70 billion duopoly and its widening crack&lt;/h2&gt;
&lt;p&gt;The gap is widening. Combined GLP-1 revenue from Novo and Lilly hit roughly $70 billion in 2025. But the composition shifted. Lilly&amp;rsquo;s tirzepatide franchise (Mounjaro plus Zepbound) &lt;a href="https://www.fiercepharma.com/pharma/even-pricing-headwinds-eli-lilly-expects-sales-continue-surge-2026"&gt;generated $36.5 billion&lt;/a&gt;, with Zepbound alone growing 175% year-over-year. Novo&amp;rsquo;s semaglutide franchise came in around $33 billion, with growth decelerating to roughly 10% in constant exchange rates. &lt;a href="https://www.cnbc.com/2026/02/04/eli-lilly-novo-nordisk-earnings-glp1-market.html"&gt;Lilly&amp;rsquo;s US market share hit 57%&lt;/a&gt; by mid-2025, up from 41% a year earlier. Novo&amp;rsquo;s share fell to 43%.&lt;/p&gt;
&lt;p&gt;The stock market has been ruthless in pricing this shift. Novo trades at roughly $48 per ADR share, down 65% from its June 2024 peak of $142, a loss exceeding $350 billion in market cap. The company &lt;a href="https://www.cnbc.com/2026/02/04/eli-lilly-novo-nordisk-earnings-glp1-market.html"&gt;guided for a 5 to 13% revenue decline in 2026&lt;/a&gt;, driven by patent expirations in Canada, Brazil, and China, plus pricing pressure from the Trump administration&amp;rsquo;s drug pricing framework. CagriSema, Novo&amp;rsquo;s most important pipeline asset, &lt;a href="https://www.biopharmadive.com/news/novo-nordisk-cagrisema-obesity-drug-study-results/735854/"&gt;disappointed twice&lt;/a&gt;: 22.7% weight loss in REDEFINE 1 (below the company&amp;rsquo;s own 25% guidance) and 15.7% in REDEFINE 2. &lt;a href="https://www.cnbc.com/2024/12/20/novo-nordisk-shares-plunge-22percent-after-cagrisema-obesity-drug-trial-results.html"&gt;Novo&amp;rsquo;s stock plunged 20% on the first readout alone&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Lilly, by contrast, &lt;a href="https://www.fiercepharma.com/pharma/even-pricing-headwinds-eli-lilly-expects-sales-continue-surge-2026"&gt;guided 2026 revenue at $80 to $83 billion&lt;/a&gt;, a 25% increase, and &lt;a href="https://finance.yahoo.com/quote/LLY/"&gt;trades near $1,044&lt;/a&gt; with a market cap around $1 trillion, the first pharma company to reach that level. Forward P/E: roughly 30x versus Novo&amp;rsquo;s 12.5x. That 2.4x valuation premium reflects a simple thesis: Lilly has the better drug (Zepbound &lt;a href="https://www.nejm.org/doi/full/10.1056/NEJMoa2416394"&gt;showed 47% greater weight loss&lt;/a&gt; than Wegovy in the SURMOUNT-5 head-to-head), the better oral pipeline, and the longer patent runway (tirzepatide patents extend into the mid-2030s versus &lt;a href="https://www.trademarkia.com/news/patents/when-does-the-ozempic-patent-expire"&gt;semaglutide&amp;rsquo;s core US patent expiring December 2031&lt;/a&gt;, with biosimilar competition likely following shortly after).&lt;/p&gt;
&lt;a href="#lightbox-novo-vs-lilly-duopoly-png-3" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/novo-vs-lilly-duopoly.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/novo-vs-lilly-duopoly.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/novo-vs-lilly-duopoly.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/novo-vs-lilly-duopoly.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/novo-vs-lilly-duopoly.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/novo-vs-lilly-duopoly.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/novo-vs-lilly-duopoly.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/novo-vs-lilly-duopoly.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/novo-vs-lilly-duopoly.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/novo-vs-lilly-duopoly.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/novo-vs-lilly-duopoly.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/novo-vs-lilly-duopoly.png"
alt="Novo Nordisk vs Eli Lilly GLP-1 duopoly: Lilly at 2.4x Novo forward PE, 57% US market share, revenue and patent runway comparison"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;p&gt;The Hims &amp;amp; Hers saga sits at the chaotic edge of all this. HIMS &lt;a href="https://finance.yahoo.com/news/nvo-lly-stocks-slide-hims-142700745.html"&gt;launched a $49 per month compounded oral semaglutide pill&lt;/a&gt; on February 5, 2026, using unproven liposomal technology with no published bioavailability data. Within four days, &lt;a href="https://markets.financialcontent.com/stocks/article/marketminute-2026-2-9-the-glp-1-gold-rush-hits-a-wall-novo-nordisk-sues-hims-and-hers-as-fda-crackdown-triggers-20-stock-crash"&gt;HHS had referred the company to the DOJ&lt;/a&gt;, Novo had &lt;a href="https://www.gurufocus.com/news/8587678/novo-nordisk-nvo-shares-plunge-amid-competition-from-hims-hers-hims"&gt;filed a patent infringement lawsuit&lt;/a&gt;, and HIMS had suspended the product. Novo&amp;rsquo;s CEO alleged independent testing of compounded samples showed impurity levels as high as 86%. What happens when the incentive to undercut $1,000-per-month pricing collides with the actual difficulty of making peptide drugs work orally.&lt;/p&gt;
&lt;h2 id="does-oral-delivery-commoditize-glp-1"&gt;Does Oral Delivery Commoditize GLP-1&lt;/h2&gt;
&lt;p&gt;Does oral delivery commoditize GLP-1s, or does it expand the market so dramatically that even with pricing pressure, the opportunity grows Early evidence already supports the expansion thesis: &lt;a href="https://www.cnbc.com/2026/04/07/novo-nordisks-wegovy-pill-launch-draws-new-wave-of-patients-to-glp-1s.html"&gt;Novo&amp;rsquo;s oral Wegovy pill uptake is running roughly 10x higher&lt;/a&gt; than the original injectable Wegovy launch, drawing in new patients rather than converting existing injection users.&lt;/p&gt;
&lt;p&gt;The statin precedent is the strongest data point we have. After generic atorvastatin launched in 2011, total statin use &lt;a href="https://pmc.ncbi.nlm.nih.gov/articles/PMC10203693/"&gt;expanded from 31 million to 92 million Americans&lt;/a&gt; by 2019, a &lt;strong&gt;197% increase&lt;/strong&gt;. Total prescription volume grew 77%. The per-unit price collapsed, but total market volume more than compensated. Updated clinical guidelines, lower copays, and reduced patient resistance combined to pull in millions of people who would never have started therapy at the original price and delivery format.&lt;/p&gt;
&lt;p&gt;Current penetration is absurdly low: &lt;a href="https://icer.org/wp-content/uploads/2025/04/Affordable-Access-to-GLP-1-Obesity-Medications-_-ICER-White-Paper-_-04.09.2025.pdf"&gt;fewer than 5% of eligible US adults&lt;/a&gt; are on anti-obesity medication therapy, against 104 million with obesity. At statin-like penetration rates of 35% or higher, that&amp;rsquo;s a 5 to 10x expansion. Persistence data reinforces the point: only &lt;a href="https://www.primetherapeutics.com/documents/d/primetherapeutics/prime-therapeutics_glp-1-therapy-to-treat-obesity-among-members-without-diabetes_three-year-persistence"&gt;32% of obesity patients persist at one year and 15% at two years&lt;/a&gt;. Side effects account for 43.7% of discontinuation, financial barriers for 30.9%. Adherence collapses when the friction is high. An oral weight loss pill that&amp;rsquo;s cheaper, eliminates the injection barrier, and has no fasting restrictions (orforglipron) attacks all three.&lt;/p&gt;
&lt;h2 id="oral-glp-1-pipeline"&gt;Oral GLP-1 pipeline&lt;/h2&gt;
&lt;p&gt;The rest of the oral GLP-1 pipeline is worth tracking but the outcomes are uncertain. &lt;a href="https://www.prnewswire.com/news-releases/viking-therapeutics-announces-positive-top-line-results-from-phase-2-venture-oral-dosing-trial-of-vk2735-tablet-formulation-in-patients-with-obesity-302533355.html"&gt;Viking&amp;rsquo;s oral VK2735&lt;/a&gt; showed rapid weight loss in Phase 2 (up to 12.2% at 13 weeks) but a &lt;a href="https://www.biopharmadive.com/news/viking-oral-obesity-drug-results-study-discontinuationsdata-dropout/758019/"&gt;38% discontinuation rate&lt;/a&gt; at the highest dose sent the stock down 37%. &lt;a href="https://ir.structuretx.com/news-releases/news-release-details/structure-therapeutics-reports-positive-topline-data-access"&gt;Structure Therapeutics&amp;rsquo; aleniglipron&lt;/a&gt; posted 15.3% placebo-adjusted weight loss at 36 weeks in Phase 2b, competitive numbers with no plateau, and has $786 million in cash to fund Phase 3. &lt;a href="https://www.statnews.com/2025/04/14/pfizer-discontinue-danuglipron-glp-1-obesity-liver-toxicity/"&gt;Pfizer&amp;rsquo;s danuglipron was killed&lt;/a&gt; by liver toxicity in April 2025, the second Pfizer oral GLP-1 failure. &lt;a href="https://ir.ternspharma.com/news-releases/news-release-details/terns-pharmaceuticals-reports-topline-12-week-data-its-phase-2"&gt;Terns Pharmaceuticals also exited&lt;/a&gt; after weak Phase 2 data and liver enzyme elevations. Behind them, Novo&amp;rsquo;s oral amycretin, a GLP-1/amylin dual agonist, enters Phase 3 in 2026 and could offer best-in-class weight loss if the oral formulation holds up. Oral small molecule GLP-1 development has a meaningful failure rate, and Foundayo&amp;rsquo;s clean safety profile across multiple Phase 3 trials is not something I&amp;rsquo;d assume the next entrant can replicate.&lt;/p&gt;
&lt;a href="#lightbox-oral-glp1-pipeline-png-5" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/oral-glp1-pipeline.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/oral-glp1-pipeline.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/oral-glp1-pipeline.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/oral-glp1-pipeline.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/oral-glp1-pipeline.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/oral-glp1-pipeline.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/oral-glp1-pipeline.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/oral-glp1-pipeline.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/oral-glp1-pipeline.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/oral-glp1-pipeline.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/oral-glp1-pipeline.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/oral-glp1-pipeline.png"
alt="Oral GLP-1 pipeline 2026: Foundayo approved, aleniglipron Phase 3, VK2735 Phase 2, oral amycretin Phase 3, with Pfizer and Terns programs killed by liver toxicity"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;p&gt;The thing that makes this market so interesting is that almost every important variable is in motion at the same time: form factor (injection to pill), pricing structure ($1,000 per month to $149 to potentially lower), patent protection (expiring internationally, holding domestically), competitive dynamics (Novo decelerating, Lilly sprinting with Foundayo, Hims imploding), and the macro question of Medicare coverage. I&amp;rsquo;m more confident in the structural thesis, that oral GLP-1s expand the market through a Jevons-like dynamic, than I am in picking the right entry point for any individual stock. But if forced to bet on which company is best positioned for that expansion, the answer seems clear. Lilly built the molecule that doesn&amp;rsquo;t need to fight the gut. Novo built one that fights and mostly loses.&lt;/p&gt;
&lt;p&gt;At 30x forward earnings for Lilly and 12.5x for Novo, there&amp;rsquo;s a version of this where Novo is the contrarian value play and Lilly is priced for perfection. I don&amp;rsquo;t think that&amp;rsquo;s the right framing. I think Novo is cheap because it has structural problems, the worst kind of cheap, and Lilly is expensive because it has structural advantages, the best kind of expensive.&lt;/p&gt;
&lt;aside class="disclaimer" role="note" aria-label="Disclaimer"&gt;
&lt;div class="disclaimer-content"&gt;&lt;p&gt;&lt;strong&gt;Disclaimer:&lt;/strong&gt; All opinions expressed are my own. This is not investment, financial, tax, or legal advice. Past performance does not indicate future results. Do your own research and consult qualified professionals before making financial decisions. No liability accepted for any losses.&lt;/p&gt;&lt;/div&gt;
&lt;/aside&gt;</description></item><item><title>Ambiguity by Design</title><link>http://philippdubach.com/posts/ambiguity-by-design/</link><pubDate>Wed, 08 Apr 2026 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/ambiguity-by-design/</guid><description>&lt;p&gt;&lt;em&gt;Investing at the Edge of Knowledge, Part 2 · &lt;a href="http://philippdubach.com/posts/three-kinds-of-not-knowing/"&gt;Start with Part 1&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;Ellsberg&amp;rsquo;s urn experiment is one of the cleanest results in decision theory. &lt;a href="https://academic.oup.com/qje/article-abstract/75/4/643/1913802"&gt;Daniel Ellsberg (1961)&lt;/a&gt; put two urns in front of subjects. Urn A: 50 red balls, 50 black. Urn B: 100 balls, red and black, ratio unknown. Pay $100 if you draw the right color. Most people chose Urn A, the known 50/50 bet. Fine so far. But here&amp;rsquo;s the problem: they chose Urn A regardless of which color they were betting on. Bet on red? Prefer Urn A. Bet on black? Still prefer Urn A. This is incoherent. If you think Urn B has fewer red balls (making you avoid it for a red bet), you should prefer it for a black bet. The subjects weren&amp;rsquo;t estimating probabilities at all. They were fleeing the &lt;em&gt;feeling&lt;/em&gt; of not knowing the probability. Ellsberg proved that people make systematically different choices when probabilities are unknown versus known, even when the unknown probabilities carry no actual informational disadvantage. &lt;a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2205821"&gt;Richard Zeckhauser&amp;rsquo;s&lt;/a&gt; contribution was to ask&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;What happens to prices when an entire market makes this choice simultaneously?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2 id="the-experimental-evidence"&gt;The experimental evidence&lt;/h2&gt;
&lt;p&gt;Ellsberg&amp;rsquo;s result spawned a body of work that, six decades later, has only strengthened the original finding. &lt;a href="https://academic.oup.com/qje/article-abstract/110/3/585/1859203"&gt;Fox and Tversky (1995)&lt;/a&gt; added a twist that matters enormously for financial markets. Their &amp;ldquo;comparative ignorance hypothesis&amp;rdquo; showed that ambiguity aversion intensifies when people can compare themselves to someone who appears more knowledgeable. In a non-comparative setting, where subjects evaluated an ambiguous bet in isolation, ambiguity aversion largely disappeared. But the moment subjects could compare their knowledge to someone else&amp;rsquo;s, the aversion came roaring back.&lt;/p&gt;
&lt;p&gt;In markets, there is always someone who appears more confident. Every sell-side note, every CNBC segment, every hedge fund manager interviewed at Davos projects certainty that you don&amp;rsquo;t feel. The comparative ignorance effect is permanently activated in financial markets. You don&amp;rsquo;t just feel uncertain. You feel uncertain relative to someone who seems to know, and the gap between their apparent confidence and your honest confusion is what drives the exit decision.&lt;/p&gt;
&lt;p&gt;Zeckhauser&amp;rsquo;s own experimental evidence in the 2006 paper extends this further. He ran lottery choice experiments comparing willingness to bet on standard probabilistic gambles versus events with unknown and unknowable (UU) outcomes (to familiarize yourself with this framework &lt;a href="http://philippdubach.com/posts/three-kinds-of-not-knowing/"&gt;start with Part 1&lt;/a&gt;.) People refused to distinguish between small probabilities of UU events even when the expected value difference was large. The feeling of not-knowing overwhelmed the arithmetic of expected value. Separately, he documented that individuals explicitly warned about overconfidence are still surprised &lt;strong&gt;35%&lt;/strong&gt; of the time on quantities where they should be surprised only &lt;strong&gt;2%&lt;/strong&gt; of the time. We simultaneously know less than we think (overconfidence) and refuse to act on what we do know when probabilities are ambiguous (ambiguity aversion).&lt;/p&gt;
&lt;h2 id="is-ambiguity-aversion-rational"&gt;Is ambiguity aversion rational?&lt;/h2&gt;
&lt;p&gt;This turns out to be a harder question than it looks, and the answer matters for how you think about the mispricing mechanism. The case for &amp;ldquo;yes, it&amp;rsquo;s rational&amp;rdquo; is surprisingly strong. &lt;a href="https://www.sciencedirect.com/science/article/abs/pii/0304406889900189"&gt;Gilboa and Schmeidler (1989)&lt;/a&gt; proved that a decision maker who evaluates bets by the worst-case probability in their set of plausible priors is behaving in a way that satisfies all the standard axioms of rational choice except one: the Sure-Thing Principle that Ellsberg&amp;rsquo;s experiment violates. Their maxmin expected utility model says: if you don&amp;rsquo;t know the probability, evaluate the bet as if the probability is the worst one consistent with your information. This is formally coherent. It&amp;rsquo;s also roughly what a good risk manager does when facing an uncertain tail risk. &lt;a href="https://link.springer.com/article/10.1007/s102030200006"&gt;Bewley (2002)&lt;/a&gt;, as I discussed in &lt;a href="http://philippdubach.com/posts/three-kinds-of-not-knowing/"&gt;Part 1&lt;/a&gt;, showed that dropping the completeness axiom produces a framework where inertia, refusing to act, is the rational response when you cannot rank the alternatives. If you can&amp;rsquo;t tell which bet is better, sticking with the status quo isn&amp;rsquo;t lazy. It&amp;rsquo;s defensible.&lt;/p&gt;
&lt;p&gt;The case for &amp;ldquo;no, it&amp;rsquo;s a bias&amp;rdquo; rests on the Ellsberg experiment itself. The subjects preferred a known 50% chance over an unknown chance that they could bet on either side of. There is no informational disadvantage. The probability they&amp;rsquo;re fleeing might be 50%, might be 30%, might be 70%, but since they can bet on either color, the expected value is the same regardless. The aversion is to the experience of not-knowing, not to any actual asymmetry in the bet. That looks more like a bug than a feature.&lt;/p&gt;
&lt;p&gt;I think the answer is &amp;ldquo;it depends,&amp;rdquo; and the distinction matters. Ambiguity aversion is rational when there might be a better-informed party on the other side of the trade. If you&amp;rsquo;re buying a stock and you suspect the seller knows something you don&amp;rsquo;t, demanding a discount for your ignorance is not a bias. It&amp;rsquo;s adverse selection protection. But ambiguity aversion is irrational when you can establish that nobody knows more than you do. When the ambiguity is universal, when the entire market is confused because the state space itself is new, the discount demanded by ambiguity-averse investors is a pricing error, not a risk premium.&lt;/p&gt;
&lt;p&gt;This is where I land: ambiguity aversion is a sensible default that gets systematically overweighted in specific situations. The skill is the distinction. And the distinction is judgment, not math.&lt;/p&gt;
&lt;h2 id="discomfort-as-information"&gt;Discomfort as information&lt;/h2&gt;
&lt;p&gt;Zeckhauser&amp;rsquo;s most counterintuitive move in the paper is turning ambiguity aversion from a problem into a signal. His Speculation 1 states it directly: &amp;ldquo;UUU investments drive off speculators, which creates the potential for an attractive low price.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;The logic is recursive. Your discomfort when facing an ambiguous situation tells you something, but not about the asset. It tells you about the competitive field. If you&amp;rsquo;re uncomfortable, most other potential buyers have already left. The very thing that makes you want to sell, the feeling of not-knowing, is the same thing that has thinned the competition and compressed the price. David Ricardo buying British government bonds on the eve of Waterloo was uncomfortable. Warren Buffett writing earthquake reinsurance for the California Earthquake Authority at roughly five times actuarial value was comfortable only because he had done this inference before: the discomfort of everyone else was the opportunity itself. Zeckhauser&amp;rsquo;s Maxim G puts it memorably&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Discounting for ambiguity is a natural tendency that should be overcome, just as should be overeating.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Both ambiguity aversion and overeating are evolved heuristics that served us well in ancestral environments and poorly in modern ones. In a small tribal group where the unknown reliably correlated with danger, fleeing ambiguity kept you alive. In a financial market where ambiguity-averse institutional capital mechanically exits positions it can&amp;rsquo;t model, the same instinct creates a systematic transfer of wealth from the ambiguity-averse to the ambiguity-tolerant.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://scholar.harvard.edu/files/iris_bohnet/files/trust_risk_and_betrayal.pdf"&gt;Bohnet and Zeckhauser (2004)&lt;/a&gt; identified a related mechanism they called &amp;ldquo;betrayal aversion.&amp;rdquo; People demand stronger odds when a betraying human rather than indifferent nature determines the outcome. In markets, this manifests as an extra discount demanded when the ambiguity involves a counterparty who might be exploiting your ignorance. The mere possibility that someone on the other side knows more amplifies the ambiguity premium beyond what the uncertainty alone would justify.&lt;/p&gt;
&lt;p&gt;Now apply all of this to a more recent example, the SaaSpocalypse. I &lt;a href="http://philippdubach.com/posts/the-saaspocalypse-paradox/"&gt;wrote about the details&lt;/a&gt; elsewhere, but the relevant point here is the mechanism. When Anthropic released the Claude Cowork plugins in late January, institutional investors didn&amp;rsquo;t sit down and estimate the probability that AI would replace CRM. They faced something worse: they couldn&amp;rsquo;t define what &amp;ldquo;replacing CRM&amp;rdquo; would even mean. The state space was undefined, as I argued in &lt;a href="http://philippdubach.com/posts/three-kinds-of-not-knowing/"&gt;Part 1&lt;/a&gt;. And when the state space is undefined, the entire institutional machinery for processing uncertainty breaks down simultaneously.&lt;/p&gt;
&lt;p&gt;Fiduciary duty requires estimable risk. Compliance models require defined scenarios. Portfolio managers face career risk: losing money on a position you can&amp;rsquo;t explain is a firing offense; missing a rally in something you sold is merely embarrassing. The institutional constraints compound the ambiguity aversion. Each layer of oversight demands a model, and the model requires defined states, and the states don&amp;rsquo;t exist yet. The rational response for any individual institutional actor was to sell. The collective result was an IGV drawdown of &lt;strong&gt;32%&lt;/strong&gt; while sector earnings grew &lt;strong&gt;17%&lt;/strong&gt;, an RSI of &lt;strong&gt;18&lt;/strong&gt;, and &lt;strong&gt;$2 trillion&lt;/strong&gt; in evaporated market cap.&lt;/p&gt;
&lt;p&gt;The sellers weren&amp;rsquo;t acting on information. They were acting on ambiguity aversion, amplified by comparative ignorance (everyone else seemed to be selling too), amplified by career risk (nobody gets fired for selling software before the AI disruption), amplified by betrayal aversion (maybe the AI insiders knew something the market didn&amp;rsquo;t). Stack these amplifiers on top of Ellsberg&amp;rsquo;s basic finding, and you get a price that reflects the intensity of collective discomfort rather than any assessment of fundamentals.&lt;/p&gt;
&lt;p&gt;Zeckhauser describes the investor&amp;rsquo;s challenge with a bridge analogy: you have to make peace with good decisions that lead to bad outcomes. Buying the IGV at $80 with an 18 RSI and 17% earnings growth is, on the framework, a good decision. If it drops to $70 first, that doesn&amp;rsquo;t make it a bad decision. But making that distinction under ambiguity is not an analytical skill. It&amp;rsquo;s a temperamental one. It requires accepting that &amp;ldquo;I don&amp;rsquo;t know&amp;rdquo; is not disqualifying and that the discomfort you feel is shared, priced in, and possibly overpriced. That&amp;rsquo;s harder than any calculation.&lt;/p&gt;
&lt;p&gt;Knowing that ambiguity aversion creates mispricing is the easy part. The hard part is what comes next: when you&amp;rsquo;re facing someone on the other side of a trade in a UU world, how do you figure out whether they know something you don&amp;rsquo;t, or whether they&amp;rsquo;re just less uncomfortable than you are? That&amp;rsquo;s the domain of sidecar investing and strategic inference. That&amp;rsquo;s Part 3 &lt;em&gt;(coming soon)&lt;/em&gt;.&lt;/p&gt;</description></item><item><title>Three Kinds of Not-Knowing</title><link>http://philippdubach.com/posts/three-kinds-of-not-knowing/</link><pubDate>Sat, 04 Apr 2026 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/three-kinds-of-not-knowing/</guid><description>&lt;p&gt;&lt;em&gt;Investing at the Edge of Knowledge, Part 1&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;David Ricardo made a fortune buying British government bonds four days before the Battle of Waterloo. He was not a military analyst. He had no basis to compute the odds of Napoleon&amp;rsquo;s defeat, or victory, or any of the ambiguous outcomes in between. But he understood something that most of his contemporaries did not: the nature of his own ignorance was the same as everyone else&amp;rsquo;s, the seller was desperate, competition was thin, and the pounds he&amp;rsquo;d gain if Wellington won were worth far more than the pounds he&amp;rsquo;d lose if Wellington fell.&lt;/p&gt;
&lt;p&gt;Ricardo&amp;rsquo;s edge was not information. It was a correct assessment of what kind of not-knowing he was facing.&lt;/p&gt;
&lt;p&gt;That distinction, between different kinds of not-knowing, is mostly absent from finance. Richard Zeckhauser, the Frank P. Ramsey Professor of Political Economy at Harvard, made it the foundation of his 2006 paper &amp;ldquo;&lt;a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2205821"&gt;Investing in the Unknown and Unknowable&lt;/a&gt;,&amp;rdquo; published in &lt;em&gt;Capitalism and Society&lt;/em&gt;. The paper takes no derivatives and runs no regressions. What it does instead is more valuable: it provides a taxonomy of not-knowing, and then shows why the category that finance theory handles worst is the one where the biggest fortunes have been made.&lt;/p&gt;
&lt;p&gt;This is Part 1 of a five-part series that works through Zeckhauser&amp;rsquo;s framework and extends it. The goal is not a literature review. It&amp;rsquo;s an attempt to build a working vocabulary for the kind of investing that modern portfolio theory was never designed to address.&lt;/p&gt;
&lt;h2 id="the-taxonomy"&gt;The taxonomy&lt;/h2&gt;
&lt;p&gt;Zeckhauser presents three categories of not-knowing. Each demands different skills. Each rewards a different kind of investor. And the jump between them is not a smooth gradient. It&amp;rsquo;s a cliff.&lt;/p&gt;
&lt;a href="#lightbox-three-boxes-of-not-knowing-png-0" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/three-boxes-of-not-knowing.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/three-boxes-of-not-knowing.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/three-boxes-of-not-knowing.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/three-boxes-of-not-knowing.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/three-boxes-of-not-knowing.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/three-boxes-of-not-knowing.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/three-boxes-of-not-knowing.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/three-boxes-of-not-knowing.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/three-boxes-of-not-knowing.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/three-boxes-of-not-knowing.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/three-boxes-of-not-knowing.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/three-boxes-of-not-knowing.png"
alt="Zeckhauser&amp;#39;s three categories of not-knowing in investing: risk with known distributions, uncertainty with unknown probabilities, and ignorance where states are undefined"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;p&gt;The first box is risk. Probabilities are known, distributions of returns are known, and the challenge is optimization. This is the world of the capital asset pricing model, of mean-variance portfolios, of the efficient frontier. You hold a 60/40 stock-bond portfolio and rebalance quarterly. The math is clean. The Nobel Prizes were awarded. Finance education lives here.&lt;/p&gt;
&lt;p&gt;The second box is uncertainty. You can identify the possible states of the world, but you can&amp;rsquo;t assign reliable probabilities. A corporate bond analyst looking at incomplete financials knows the company might default or might not, knows the recovery rate might be 40 cents or 60 cents, but can&amp;rsquo;t compute a precise probability for either. The skill that pays here is Bayesian estimation: forming the best prior you can from limited data, updating as information arrives, and having the temperament to act on imperfect beliefs. This is harder than Box 1, but it&amp;rsquo;s still recognizable territory. Decision theory was built for it.&lt;/p&gt;
&lt;p&gt;The third box is ignorance. Zeckhauser abbreviates it UU: unknown and unknowable. Here, even the identity of possible future states is undefined. You don&amp;rsquo;t have a distribution to estimate because you can&amp;rsquo;t enumerate what you&amp;rsquo;re estimating over. The question isn&amp;rsquo;t &amp;ldquo;what&amp;rsquo;s the probability of outcome X?&amp;rdquo; It&amp;rsquo;s &amp;ldquo;what even is X?&amp;rdquo; This is where Ricardo was standing at Waterloo. This is where Warren Buffett was standing in 1996 when he wrote a &lt;a href="https://www.berkshirehathaway.com/letters/1996.html"&gt;$1.5 billion reinsurance policy&lt;/a&gt; for the California Earthquake Authority at a premium far above actuarial estimates, coverage that the capital markets had failed to place. The New York financial community couldn&amp;rsquo;t model the risk. Buffett&amp;rsquo;s insight was that nobody could, that the Authority was not better informed about seismic activity than he was, and that the price was absurdly favorable given the symmetry of ignorance.&lt;/p&gt;
&lt;p&gt;The boxes are not a spectrum. You don&amp;rsquo;t get from Box 2 to Box 3 by adding more uncertainty. You get there when the state space itself is undefined. In Box 2, you might not know whether a company will default, but you know that &amp;ldquo;default&amp;rdquo; and &amp;ldquo;no default&amp;rdquo; are the relevant categories. In Box 3, you don&amp;rsquo;t even know the categories. That&amp;rsquo;s a qualitative difference, not a quantitative one.&lt;/p&gt;
&lt;h2 id="why-finance-forgot-the-third-box"&gt;Why finance forgot the third box&lt;/h2&gt;
&lt;p&gt;The strange thing is that the third box was identified a century ago. Twice, independently, in the same year.&lt;/p&gt;
&lt;p&gt;Frank Knight published &lt;a href="https://oll.libertyfund.org/titles/knight-risk-uncertainty-and-profit"&gt;&lt;em&gt;Risk, Uncertainty and Profit&lt;/em&gt;&lt;/a&gt; in 1921. His central argument, the origin of what economists now call Knightian uncertainty, was that entrepreneurial profit is compensation for bearing true uncertainty: situations where probabilities cannot be meaningfully calculated. Risk, in Knight&amp;rsquo;s framework, is insurable. Uncertainty is not. The distinction is not about the degree of confidence in your estimate. It&amp;rsquo;s about whether the concept of a probability estimate even applies.&lt;/p&gt;
&lt;p&gt;John Maynard Keynes published &lt;a href="https://archive.org/details/treatiseonprobab007528mbp"&gt;&lt;em&gt;A Treatise on Probability&lt;/em&gt;&lt;/a&gt; the same year. His angle was different but convergent. Keynes introduced the idea of the &amp;ldquo;weight of evidence&amp;rdquo;: a thin body of evidence yields low weight even when the point estimate looks reasonable. In his &lt;a href="https://academic.oup.com/qje/article-abstract/51/2/209/1939387"&gt;1937 &lt;em&gt;Quarterly Journal of Economics&lt;/em&gt; article&lt;/a&gt;, he made the distinction explicit: &amp;ldquo;By &amp;lsquo;uncertain&amp;rsquo; knowledge, let me explain, I do not mean merely to distinguish what is known for certain from what is only probable. The game of roulette is not subject, in this sense, to uncertainty.&amp;rdquo; Roulette is risky. The future of interest rates, the price of copper twenty years out, the obsolescence of a technology: these are uncertain in the deeper sense. The distinction mattered to Keynes, and it should matter to anyone building a portfolio.&lt;/p&gt;
&lt;p&gt;Both arguments lost. The discipline moved toward formalization, and formalization required calculable probabilities. The efficient markets hypothesis, rational expectations, CAPM, Black-Scholes: all of these live in Box 1 or assume that Box 2 can be reduced to Box 1 with sufficient data and computing power. This isn&amp;rsquo;t a criticism of these models within their domain. They&amp;rsquo;re brilliant engineering for the problems they were designed to solve. It&amp;rsquo;s a claim about the boundaries of that domain, and about how much of real-world investing sits outside it.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://doi.org/10.1086/261461"&gt;LeRoy and Singell (1987)&lt;/a&gt; offered a provocative reinterpretation in the &lt;em&gt;Journal of Political Economy&lt;/em&gt;: Knight&amp;rsquo;s real distinction, they argued, was about insurability, not probability. Uncertainty describes situations where insurance markets collapse because of moral hazard and adverse selection, not simply because probabilities are subjective. This reading is more radical than the standard one. It says the breakdown isn&amp;rsquo;t epistemic (we don&amp;rsquo;t know enough) but structural (the market itself can&amp;rsquo;t price the risk). That structural breakdown is precisely what happened in 1996 when Wall Street couldn&amp;rsquo;t write the California earthquake policy, and again in 2025 when insurance markets in parts of the American Southeast and West simply stopped functioning.&lt;/p&gt;
&lt;p&gt;Kay and King picked up this thread in their 2020 book &lt;a href="https://wwnorton.com/books/9781324004776"&gt;&lt;em&gt;Radical Uncertainty&lt;/em&gt;&lt;/a&gt;, arguing that the conflation of risk and uncertainty has caused systematic mismanagement across finance and policy. Their prescription is &amp;ldquo;narrative reasoning&amp;rdquo; rather than probabilistic optimization for decisions facing genuine uncertainty. I&amp;rsquo;m not sure narrative reasoning is sufficient, but I&amp;rsquo;m confident that probabilistic optimization is insufficient. The honest position is somewhere in between, and Zeckhauser&amp;rsquo;s framework gives you the vocabulary to think about where.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://link.springer.com/article/10.1007/s102030200006"&gt;Bewley (2002)&lt;/a&gt; formalized the problem differently. Working from a &lt;a href="https://elischolar.library.yale.edu/cowles-discussion-paper-series/1050/"&gt;1986 Cowles Foundation paper&lt;/a&gt;, he dropped the completeness axiom from expected utility theory. In standard theory, you can always rank alternatives: you prefer A to B, or B to A, or you&amp;rsquo;re indifferent. Bewley said: sometimes you simply can&amp;rsquo;t rank them. When alternatives are incomparable, sticking with the status quo is rational, not a bias. This gives mathematical expression to something practitioners know in their bones: there&amp;rsquo;s a difference between &amp;ldquo;I&amp;rsquo;m going to hold because I think the price will go up&amp;rdquo; and &amp;ldquo;I&amp;rsquo;m going to hold because I have no coherent basis for predicting what will happen and the cost of acting without a basis is higher than the cost of staying put.&amp;rdquo;&lt;/p&gt;
&lt;h2 id="why-knightian-uncertainty-is-growing"&gt;Why Knightian uncertainty is growing&lt;/h2&gt;
&lt;p&gt;This hundred-year-old taxonomy feels more relevant in 2026 than it did in 2006. Technological change creates entirely new categories of outcomes faster than models can absorb them. The state space itself is expanding.&lt;/p&gt;
&lt;p&gt;I wrote recently about &lt;a href="http://philippdubach.com/posts/the-saaspocalypse-paradox/"&gt;the SaaSpocalypse paradox&lt;/a&gt;: the market simultaneously pricing AI capex failure and AI destroying all enterprise software, when both cannot be true. That sell-off is a textbook example of Zeckhauser&amp;rsquo;s third box. The problem wasn&amp;rsquo;t that investors struggled to estimate the probability of known outcomes. The problem was that the outcomes themselves were undefined. What does &amp;ldquo;CRM&amp;rdquo; mean when AI agents replace human users? What does &amp;ldquo;per-seat licensing&amp;rdquo; mean when the number of seats might go to zero or might multiply by ten as agents proliferate? What does &amp;ldquo;enterprise software moat&amp;rdquo; mean when the moat was always the trained-user interface and the interface is now natural language? These aren&amp;rsquo;t questions with difficult probability estimates. They&amp;rsquo;re questions where the categories haven&amp;rsquo;t been invented yet.&lt;/p&gt;
&lt;p&gt;Nobody in January 2026 could enumerate the states of the world for enterprise software post-Claude Cowork plugins. Not &amp;ldquo;the probabilities were hard to estimate.&amp;rdquo; The states themselves were undefined. That&amp;rsquo;s not Box 2. That&amp;rsquo;s Box 3.&lt;/p&gt;
&lt;p&gt;And Box 3 is where the IGV software ETF fell &lt;strong&gt;32%&lt;/strong&gt; from its September peak, where hedge funds made &lt;strong&gt;$24 billion&lt;/strong&gt; shorting the sector, where the RSI hit &lt;strong&gt;18&lt;/strong&gt; (the most oversold reading in the ETF&amp;rsquo;s history), and where earnings growth continued at &lt;strong&gt;17%&lt;/strong&gt;. The disconnect between operating results and market prices is exactly what Zeckhauser&amp;rsquo;s framework predicts: when the state space is undefined, investors who require defined state spaces to make decisions leave the market. Their departure compresses prices beyond what any fundamental analysis would justify. The mispricing lives in the gap between what the asset is worth and what institutions are able to hold.&lt;/p&gt;
&lt;p&gt;This pattern will recur. AI is not the last technology that will generate new categories of outcomes that nobody anticipated. Every time it happens, the same sequence plays out: Box 3 conditions emerge, institutions flee because their models require Box 1 or Box 2 inputs, prices overshoot, and unconstrained investors who understand the nature of their own ignorance pick up the pieces. Zeckhauser wrote his paper two decades ago. The mechanism he described has, if anything, accelerated.&lt;/p&gt;
&lt;p&gt;The taxonomy tells you what kind of problem you&amp;rsquo;re facing. It doesn&amp;rsquo;t tell you what to do about it. That requires understanding why most investors run from Box 3, and whether running is rational. That&amp;rsquo;s Part 2 &lt;em&gt;(coming soon)&lt;/em&gt;.&lt;/p&gt;</description></item><item><title>On-Device AI Models Will Be The New Reason to Upgrade Your Phone</title><link>http://philippdubach.com/posts/on-device-ai-models-will-be-the-new-reason-to-upgrade-your-phone/</link><pubDate>Wed, 25 Mar 2026 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/on-device-ai-models-will-be-the-new-reason-to-upgrade-your-phone/</guid><description>&lt;p&gt;The iPhone 17 runs a &lt;a href="https://machinelearning.apple.com/research/introducing-apple-foundation-models"&gt;3 billion parameter language model on-device&lt;/a&gt; at 30 tokens per second. Obviously, the average consumer has no idea what that sentence means, and Apple hasn&amp;rsquo;t figured out how to make them care.&lt;/p&gt;
&lt;p&gt;I believe that&amp;rsquo;s about to change. Apple now has &lt;a href="https://9to5mac.com/2026/03/25/new-details-on-apple-google-ai-deal-revealed-including-gemini-changes-report/"&gt;complete access to Google&amp;rsquo;s Gemini model&lt;/a&gt; in its own data centers, with &lt;a href="https://www.theinformation.com/newsletters/ai-agenda/apple-can-distill-googles-big-gemini-model"&gt;the ability to distill it into smaller models&lt;/a&gt; built for iPhones and iPads. Knowledge distillation works like this: you take a large model, have it perform tasks with detailed reasoning, then feed those reasoning traces to a smaller model until the student learns to mimic the teacher. The smaller model ends up far more capable than if you&amp;rsquo;d trained it from scratch on the same data. Apple can now do this with the full Gemini, not just their own in-house models, and the distilled output runs locally. No internet required.&lt;/p&gt;
&lt;p&gt;Smartphones haven&amp;rsquo;t had a real upgrade story in years. The camera is great. The screen is great. The processor was fast enough three generations ago. &lt;a href="https://www.sellcell.com/blog/how-often-do-people-upgrade-their-phone/"&gt;Battery life has overtaken price as the top purchase driver&lt;/a&gt; for the first time. The global &lt;a href="https://sqmagazine.co.uk/smartphone-statistics/"&gt;replacement cycle has stretched to 3.5 years&lt;/a&gt;. People hold onto their phones because nothing about the new one feels different enough. &lt;a href="https://www.deloitte.com/us/en/insights/industry/technology/technology-media-and-telecom-predictions/2025/gen-ai-on-smartphones.html"&gt;Deloitte&amp;rsquo;s 2025 TMT Predictions report&lt;/a&gt; frames on-device generative AI as the feature that could break this cycle, if the experience delivers on the promise. On-device AI might become the next reason.&lt;/p&gt;
&lt;h2 id="the-spec"&gt;The spec&lt;/h2&gt;
&lt;p&gt;In the late 1990s it was megahertz: Intel and AMD raced clock speeds past the point where consumers could distinguish real-world performance differences, but the number on the box still drove purchases. Then it was megapixels. Samsung shipped a &lt;a href="https://semiconductor.samsung.com/news-events/tech-blog/isocell-hp3-200mp-image-sensor-for-epic-details/"&gt;200 MP camera sensor&lt;/a&gt; knowing that most phones use 16-to-1 pixel binning to output a &lt;strong&gt;12.5 MP&lt;/strong&gt; image by default.&lt;/p&gt;
&lt;p&gt;Parameters could be next. The &lt;a href="https://www.apple.com/iphone-17/specs/"&gt;iPhone 17&amp;rsquo;s standard A19 chip&lt;/a&gt; has 8GB of RAM. The &lt;a href="https://www.apple.com/iphone-17-pro/specs/"&gt;Pro gets 12GB&lt;/a&gt; with faster memory bandwidth, which determines how large a model the phone can run and how quickly. Samsung&amp;rsquo;s 2026 flagships with the &lt;a href="https://semiconductor.samsung.com/processor/mobile-processor/exynos-2600/"&gt;Exynos 2600 hit &lt;strong&gt;80 TOPS&lt;/strong&gt;&lt;/a&gt; on a 2nm process, more than double the prior generation. These are already the numbers in press releases. It&amp;rsquo;s not hard to imagine an Apple keynote where someone says, with rehearsed enthusiasm, that the iPhone 18 Pro runs a 7 billion parameter model while the standard model is limited to 3 billion.&lt;/p&gt;
&lt;p&gt;The difference from previous spec wars is that this one might actually correlate with user experience. Megahertz past a certain threshold didn&amp;rsquo;t make Word open faster. Megapixels past 12 MP didn&amp;rsquo;t make photos look better on a phone screen. But a 7 billion parameter model running locally outperforms a 3 billion one on nearly every task. It handles longer documents, follows more complex instructions, holds better conversational context.&lt;/p&gt;
&lt;h2 id="breaking-the-stalemate"&gt;Breaking the stalemate&lt;/h2&gt;
&lt;p&gt;&lt;a href="https://www.gartner.com/en/newsroom/press-releases/2025-09-09-gartner-says-worldwide-generative-artificial-intelligence-smartphone-end-user-spending-to-total-us-dollars-298-billion-by-the-end-of-2025"&gt;Gartner projects&lt;/a&gt; GenAI smartphone spending will reach &lt;strong&gt;$393 billion&lt;/strong&gt; in 2026, up 32% from &lt;strong&gt;$298 billion&lt;/strong&gt; in 2025. &lt;a href="https://my.idc.com/getdoc.jsp?containerId=prUS52478124"&gt;IDC reports&lt;/a&gt; GenAI smartphone shipments growing &lt;strong&gt;73%&lt;/strong&gt; year over year. &lt;a href="https://finance.yahoo.com/news/exclusive-samsung-double-mobile-devices-030312758.html"&gt;Samsung has publicly committed&lt;/a&gt; to 800 million AI-enabled devices by end of 2026, doubling its 2025 footprint. &lt;a href="https://www.cnbc.com/2024/12/13/apple-is-a-top-pick-for-2025-as-ai-will-drive-iphone-upgrade-cycle-morgan-stanley-says.html"&gt;Morgan Stanley&amp;rsquo;s latest survey&lt;/a&gt; found iPhone upgrade intentions at &lt;strong&gt;37%&lt;/strong&gt;, an all-time high, with FY26 shipment forecasts of 260 million units sitting 3% above Street consensus.&lt;/p&gt;
&lt;p&gt;On-device AI creates hard hardware requirements in a way that camera improvements and screen upgrades never did. You cannot run a 3 billion parameter model on an iPhone 14. The Neural Engine isn&amp;rsquo;t powerful enough and the memory bandwidth isn&amp;rsquo;t there. &lt;a href="https://support.apple.com/en-us/121115"&gt;Apple Intelligence requires an A17 Pro or later&lt;/a&gt;, which means the feature itself creates an upgrade floor. Every year that floor rises. When Apple ships distilled Gemini models that need the A19 Pro&amp;rsquo;s 12GB of RAM, every phone older than 2025 is locked out.&lt;/p&gt;
&lt;p&gt;The Gemini deal matters for the hardware cycle because of the distillation pipeline. Apple doesn&amp;rsquo;t need to build frontier-scale models from scratch. They can take Gemini&amp;rsquo;s best capabilities, run them through distillation, and compress the results into models sized for their hardware tiers. A 3 billion parameter model for the standard iPhone. A 5 billion version for the Pro. Maybe a 10 billion model for a future iPad Pro with enough memory and thermal headroom.&lt;/p&gt;
&lt;p&gt;Google is playing a similar game from the other side. The original &lt;a href="https://en.wikipedia.org/wiki/Gemini_(language_model)"&gt;Gemini Nano shipped at 1.8 billion parameters&lt;/a&gt;; the updated Nano-2 rose to 3.25 billion. Samsung&amp;rsquo;s &lt;a href="https://news.samsung.com/global/samsung-unveils-galaxy-s26-series-the-most-intuitive-galaxy-ai-phone-yet"&gt;Galaxy S26 ships with on-device Gemini&lt;/a&gt; running on NPUs that are 39% faster than the prior generation. On-device models get larger every hardware generation. Each generation&amp;rsquo;s models don&amp;rsquo;t run well on older hardware. You see where this goes.&lt;/p&gt;
&lt;p&gt;I find it plausible that within two product cycles, on-device model capability becomes the primary differentiator between phone tiers and between generations. The data isn&amp;rsquo;t there yet: &lt;a href="https://www.twice.com/research/the-smartphone-upgrade-cycle-slows"&gt;only 17% of Americans&lt;/a&gt; say AI is a major purchase influence today, Apple Intelligence &lt;a href="https://finance.yahoo.com/markets/stocks/articles/morgan-stanley-stark-message-investors-164700952.html"&gt;ranked seventh globally&lt;/a&gt; as a reason to upgrade in Morgan Stanley&amp;rsquo;s survey, and &lt;a href="https://www.phonearena.com/news/is-the-ai-boom-destroying-your-next-flagship-phones-value_id176913"&gt;over 40% of users&lt;/a&gt; have privacy concerns about smartphone AI, with half unwilling to pay extra for it. But you can&amp;rsquo;t tell the difference between a 48 MP photo and a 12 MP photo on your phone screen. You can absolutely tell the difference between an AI assistant that understands your question and one that doesn&amp;rsquo;t. The feedback loop is immediate and personal. If the bigger model actually works better, and if the distillation pipeline from Gemini delivers real capability gains, the upgrade incentive is self-reinforcing. People will upgrade not because the spec sheet says they should, but because they tried their friend&amp;rsquo;s phone and the AI was better.&lt;/p&gt;
&lt;p&gt;Whether this arrives with iOS 27 this fall or takes another generation to mature, I don&amp;rsquo;t know. But the next reason to buy a new phone will much more likely be the model than the camera.&lt;/p&gt;</description></item><item><title>AI Can Now Design Drugs in Seconds; We Still Can't Tell You If They Work.</title><link>http://philippdubach.com/posts/ai-can-now-design-drugs-in-seconds-we-still-cant-tell-you-if-they-work./</link><pubDate>Wed, 18 Mar 2026 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/ai-can-now-design-drugs-in-seconds-we-still-cant-tell-you-if-they-work./</guid><description>&lt;blockquote&gt;
&lt;p&gt;No AI-discovered drug has ever received FDA approval. That sentence should sit uncomfortably next to every headline about Alphabet&amp;rsquo;s drug discovery spinoff.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;On February 10, &lt;a href="https://www.isomorphiclabs.com/articles/the-isomorphic-labs-drug-design-engine-unlocks-a-new-frontier"&gt;Isomorphic Labs&lt;/a&gt;, the Google DeepMind spinoff focused on computational drug design, released IsoDDE: its Drug Design Engine. This isn&amp;rsquo;t a model or an AlphaFold upgrade. IsoDDE is a unified in silico drug discovery system that runs protein structure prediction, ligand binding, affinity estimation, and pocket identification in concert, generating in seconds what used to take days of physics-based simulation. On the hardest molecular prediction tasks, the &amp;ldquo;Runs N&amp;rsquo; Poses&amp;rdquo; benchmark designed to test generalization to unfamiliar proteins, IsoDDE hits a &lt;strong&gt;50%&lt;/strong&gt; success rate. AlphaFold 3 manages roughly &lt;strong&gt;23%&lt;/strong&gt;. On antibody-antigen modeling, IsoDDE beats AlphaFold 3 by 2.3× and the open-source Boltz-2 by 19.8×. On binding affinity prediction, it achieves a Pearson correlation of 0.85, beating the physics-based gold standard FEP+ at 0.78. &lt;a href="#lightbox-isodde-benchmark-performance-png-0" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/isodde-benchmark-performance.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/isodde-benchmark-performance.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/isodde-benchmark-performance.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/isodde-benchmark-performance.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/isodde-benchmark-performance.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/isodde-benchmark-performance.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/isodde-benchmark-performance.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/isodde-benchmark-performance.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/isodde-benchmark-performance.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/isodde-benchmark-performance.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/isodde-benchmark-performance.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/isodde-benchmark-performance.png"
alt="IsoDDE benchmark performance: 50% protein-ligand prediction vs AlphaFold 3 at 23%, 2.3x antibody-antigen improvement, 0.85 binding affinity correlation vs FEP&amp;#43; at 0.78"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
I would assume that these are large enough improvements that the computational bottleneck in drug design may no longer be the binding question.&lt;/p&gt;
&lt;h2 id="what-pharma-believes"&gt;What pharma believes&lt;/h2&gt;
&lt;p&gt;Isomorphic has signed partnerships with &lt;a href="https://www.prnewswire.com/news-releases/isomorphic-labs-announces-strategic-multi-target-research-collaboration-with-lilly-302027392.html"&gt;Eli Lilly&lt;/a&gt;, &lt;a href="https://www.prnewswire.com/news-releases/isomorphic-labs-announces-strategic-multi-target-research-collaboration-with-novartis-302027387.html"&gt;Novartis&lt;/a&gt;, and &lt;a href="https://pharmaphorum.com/news/jj-bets-isomorphic-ai-powered-drug-hunt"&gt;Johnson &amp;amp; Johnson&lt;/a&gt; worth a combined &lt;strong&gt;$4 billion+&lt;/strong&gt; in potential value. But look at the structure. Lilly paid $45 million upfront against $1.7 billion in milestones. Novartis paid $37.5 million upfront against $1.2 billion. That&amp;rsquo;s a 50:1 ratio between what pharma promises in biobucks and what it actually wires. &lt;a href="#lightbox-isomorphic-deal-structure-png-1" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/isomorphic-deal-structure.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/isomorphic-deal-structure.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/isomorphic-deal-structure.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/isomorphic-deal-structure.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/isomorphic-deal-structure.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/isomorphic-deal-structure.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/isomorphic-deal-structure.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/isomorphic-deal-structure.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/isomorphic-deal-structure.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/isomorphic-deal-structure.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/isomorphic-deal-structure.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/isomorphic-deal-structure.png"
alt="Isomorphic Labs pharma deal structure: Eli Lilly $45M upfront vs $1.7B milestones, Novartis $37.5M vs $1.2B&amp;#43;, totaling $4B headline value against $82.5M cash"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
This ratio is standard across AI drug discovery deals in 2025. Pharma is enthusiastic enough to sign but cautious enough to make nearly all the economics contingent on clinical results that don&amp;rsquo;t exist yet. The upfront payments fund research. The milestone payments are structured so that pharma loses almost nothing if the drugs fail. The royalties only matter if a drug reaches blockbuster status, which for an AI-designed molecule has never happened.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://www.isomorphiclabs.com/articles/isomorphic-labs-announces-novartis-collaboration-expansion"&gt;Novartis expanded its partnership in February 2025&lt;/a&gt;, doubling the number of programs to six, targeting what Novartis described as &amp;ldquo;particularly challenging&amp;rdquo; and previously undruggable targets, on the same financial terms. That&amp;rsquo;s a positive signal: it means internal results impressed Novartis scientists enough to commit more targets. The J&amp;amp;J deal, announced January 2026, goes further, covering small molecules, antibodies, peptides, and molecular glues. But &amp;ldquo;expanded partnerships&amp;rdquo; and &amp;ldquo;approved drugs&amp;rdquo; remain separated by the most unforgiving filter in business: human biology.&lt;/p&gt;
&lt;h2 id="phase-ii-wall"&gt;Phase II wall&lt;/h2&gt;
&lt;p&gt;Most commentary on AI drug discovery stops too early. &lt;a href="https://www.sciencedirect.com/science/article/pii/S135964462400134X"&gt;Jayatunga et al. (2024)&lt;/a&gt;, in the first systematic analysis of AI-discovered drugs in clinical trials, showed AI-discovered molecules achieving &lt;strong&gt;80-90%&lt;/strong&gt; success rates in Phase I trials, well above the historical 40-65% average. AI is good at designing molecules that are safe and have decent pharmacokinetic properties: they get absorbed, distributed, metabolized, and excreted the way you&amp;rsquo;d want. Phase I is mostly about safety. AI passes it.&lt;/p&gt;
&lt;p&gt;But Phase II is about efficacy. Does the drug actually treat the disease? And here the numbers are sobering: AI-discovered drugs show roughly 40% Phase II success rates, which is &lt;a href="https://www.science.org/content/blog-post/ai-drugs-so-far"&gt;about the same as traditionally discovered drugs&lt;/a&gt;. AI has not yet demonstrated it can predict whether a molecule will work in a patient, only that it can predict whether a molecule will be tolerable in a patient. &lt;a href="#lightbox-ai-drug-phase2-wall-png-2" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/ai-drug-phase2-wall.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/ai-drug-phase2-wall.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/ai-drug-phase2-wall.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/ai-drug-phase2-wall.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/ai-drug-phase2-wall.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/ai-drug-phase2-wall.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/ai-drug-phase2-wall.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/ai-drug-phase2-wall.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/ai-drug-phase2-wall.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/ai-drug-phase2-wall.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/ai-drug-phase2-wall.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/ai-drug-phase2-wall.png"
alt="AI drug clinical trial success rates: 80-90% Phase I vs 40-65% traditional, but roughly 40% Phase II for both AI and traditional, projecting 9-18% end-to-end vs historical 5-10%"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
If both trends hold, end-to-end success rates could rise from the historical 5-10% to something like 9-18%. That would roughly double R&amp;amp;D productivity, which in a trillion-dollar industry is worth an enormous amount. &lt;a href="https://www.mckinsey.com/capabilities/mckinsey-digital/our-insights/the-economic-potential-of-generative-ai-the-next-productivity-frontier"&gt;McKinsey estimates&lt;/a&gt; generative AI could generate $60-110 billion annually in economic value for pharma and medical products. But it&amp;rsquo;s a far cry from the narrative that generative AI will &amp;ldquo;solve&amp;rdquo; drug discovery. It would make drug development somewhat cheaper and faster. An improvement, not a revolution.&lt;/p&gt;
&lt;p&gt;The counterargument, and it&amp;rsquo;s a reasonable one, is that IsoDDE represents a qualitative leap that could crack the efficacy problem. Its ability to model induced fits, where proteins reshape to accommodate a drug, and to identify cryptic binding pockets, like the cereblon site that took experimentalists 15 years to find, means it&amp;rsquo;s capturing biological dynamics that earlier AI systems missed entirely. If better structural understanding translates to better efficacy prediction, the Phase II wall might eventually come down.&lt;/p&gt;
&lt;p&gt;I find this plausible but unproven. We&amp;rsquo;ll know more when Isomorphic&amp;rsquo;s first candidates enter trials, targeted for late 2026.&lt;/p&gt;
&lt;h2 id="where-isomorphic-fits-in-the-competitive-stack"&gt;Where Isomorphic fits in the competitive stack&lt;/h2&gt;
&lt;p&gt;Isomorphic&amp;rsquo;s competitive position is unusual. It leads on computational benchmarks but trails on clinical progress. &lt;a href="https://insilico.com/"&gt;Insilico Medicine&lt;/a&gt; has the most advanced clinical portfolio: its IPF drug ISM001-055 (now called rentosertib) reached Phase IIa with &lt;a href="https://www.nature.com/articles/s41591-025-03743-2"&gt;positive results published in &lt;em&gt;Nature Medicine&lt;/em&gt; in June 2025&lt;/a&gt;, and Insilico has 10+ IND approvals across 31 programs. &lt;a href="https://ir.recursion.com/news-releases/news-release-details/recursion-and-exscientia-two-leaders-ai-drug-discovery-space"&gt;Recursion Pharmaceuticals&lt;/a&gt;, which &lt;a href="https://pharmaphorum.com/news/ai-biotechs-exscientia-and-recursion-agree-688m-merger"&gt;absorbed Exscientia in a $688 million merger&lt;/a&gt;, takes a different approach entirely, running millions of phenomics experiments weekly on 65 petabytes of biological imaging data. Both companies own wet-lab infrastructure that Isomorphic lacks.&lt;/p&gt;
&lt;p&gt;What Isomorphic has: the AlphaFold lineage, Alphabet-scale compute, and a unified architecture where each prediction task informs the others. On talent, the company appears to be doing well: 4.7/5 on Glassdoor, 100% CEO approval. They hired Dr. Ben Wolf as CMO in June 2025, formerly at Relay Therapeutics with FDA approval experience for Ayvakit and Gavreto. They opened a Cambridge, Massachusetts office. These are the moves of a company staffing up for clinical reality, not just publishing papers.&lt;/p&gt;
&lt;p&gt;The open-source threat is real but manageable in the near term. &lt;a href="https://techcrunch.com/2026/01/16/from-openais-offices-to-a-deal-with-eli-lilly-how-chai-discovery-became-one-of-the-flashiest-names-in-ai-drug-development/"&gt;Chai Discovery&lt;/a&gt; (backed by OpenAI at a &lt;a href="https://techcrunch.com/2025/12/15/openai-backed-biotech-firm-chai-discovery-raises-130m-series-b-at-1-3b-valuation/"&gt;$1.3 billion valuation&lt;/a&gt;, now partnered with Lilly on biologics) and &lt;a href="https://www.genengnews.com/topics/artificial-intelligence/pharma-bets-big-on-ai-platforms-with-flurry-of-new-year-deals/"&gt;Boltz&lt;/a&gt; (partnered with Pfizer) are both making progress. But the gap between IsoDDE&amp;rsquo;s numbers and the best open-source alternatives is wide enough that Isomorphic has time, maybe 18-24 months, to convert its computational lead into clinical evidence before the field catches up.&lt;/p&gt;
&lt;h2 id="alphabets-asymmetric-position"&gt;Alphabet&amp;rsquo;s asymmetric position&lt;/h2&gt;
&lt;p&gt;For Alphabet, Isomorphic is a rounding error that could become a franchise. The Other Bets segment posted a $3.6 billion operating loss in 2025. Alphabet&amp;rsquo;s net income was $132 billion. The &lt;a href="https://www.isomorphiclabs.com/articles/isomorphic-labs-announces-600m-external-investment-round"&gt;$600 million funding round&lt;/a&gt; led by Thrive Capital in March 2025 suggests the company understands the urgency of getting to the clinic, but Alphabet can sustain this bet indefinitely while the underlying science matures, and that patience is itself a competitive advantage most biotech startups don&amp;rsquo;t have. But does better computation translate to better medicine? IsoDDE&amp;rsquo;s benchmarks are the best evidence so far that AI can model molecular interactions at this resolution. But Demis Hassabis &lt;a href="https://www.isomorphiclabs.com/our-tech"&gt;said it himself&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;We know we&amp;rsquo;re never going to solve drug design with AlphaFold alone. We&amp;rsquo;ll need half a dozen more breakthroughs of that magnitude.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;IsoDDE might be one of those breakthroughs. The clinical data, when it arrives, will tell us whether it&amp;rsquo;s the kind that matters.&lt;/p&gt;</description></item><item><title>The Last Architecture Designed by Hand</title><link>http://philippdubach.com/posts/the-last-architecture-designed-by-hand/</link><pubDate>Mon, 16 Mar 2026 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/the-last-architecture-designed-by-hand/</guid><description>&lt;blockquote&gt;
&lt;p&gt;I bet there is another new architecture to find that is gonna be as big of a gain as transformers were over LSTMs.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Sam Altman, the CEO of the company most invested in the transformer is telling a room of students it isn&amp;rsquo;t the final form. So what comes after the transformer? He&amp;rsquo;s probably right that something will, and the evidence is no longer anecdotal. Several recent papers have proved that the transformer&amp;rsquo;s worst properties are structural, not engineering problems to be fixed with better data or more compute, but mathematical lower bounds.&lt;/p&gt;
&lt;p&gt;The transformer, born from the 2017 paper &lt;a href="https://arxiv.org/abs/1706.03762"&gt;&amp;ldquo;Attention Is All You Need,&amp;rdquo;&lt;/a&gt; took us from barely-coherent GPT-2 to GPT-4 in five years. An extraordinary run. But &lt;a href="https://arxiv.org/abs/2209.04881"&gt;Duman Keles et al.&lt;/a&gt; proved that O(n²) attention complexity isn&amp;rsquo;t an implementation detail. It&amp;rsquo;s a necessary lower bound unless a foundational conjecture in complexity theory turns out to be wrong. Double the context, quadruple the cost. The KV cache for a 70B model at one-million-token context eats roughly &lt;strong&gt;320 GB&lt;/strong&gt; of GPU memory. Most hardware can&amp;rsquo;t hold it.&lt;/p&gt;
&lt;a href="#lightbox-last-architecture-quadratic-attention-1-png-0" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/last-architecture-quadratic-attention-1.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/last-architecture-quadratic-attention-1.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/last-architecture-quadratic-attention-1.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/last-architecture-quadratic-attention-1.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/last-architecture-quadratic-attention-1.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/last-architecture-quadratic-attention-1.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/last-architecture-quadratic-attention-1.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/last-architecture-quadratic-attention-1.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/last-architecture-quadratic-attention-1.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/last-architecture-quadratic-attention-1.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/last-architecture-quadratic-attention-1.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/last-architecture-quadratic-attention-1.png"
alt="Quadratic attention scaling: a 4x4 attention matrix requires 16 computations while an 8x8 matrix requires 64, showing how doubling context quadruples cost in transformer architectures"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;p&gt;The problems run deeper than compute costs. &lt;a href="https://arxiv.org/abs/2311.14648"&gt;Kalai and Vempala&lt;/a&gt; proved that any calibrated language model &lt;em&gt;must&lt;/em&gt; hallucinate at a certain rate. A &lt;a href="https://arxiv.org/abs/2509.04664"&gt;2025 follow-up&lt;/a&gt; goes further: no computable LLM can be universally correct on unbounded queries. Not fixable with better training data. Not fixable with RLHF. A statistical property of how these models generate text.&lt;/p&gt;
&lt;p&gt;On reasoning: &lt;a href="https://arxiv.org/abs/2305.18654"&gt;Dziri et al.&lt;/a&gt; showed transformers collapse multi-step reasoning into pattern matching. Performance drops exponentially as task complexity rises. GPT-4 gets &lt;strong&gt;59%&lt;/strong&gt; on 3-digit multiplication. &lt;a href="https://arxiv.org/abs/2603.10123"&gt;Chowdhury&lt;/a&gt; proved the &amp;ldquo;lost in the middle&amp;rdquo; problem, models performing 20-30% worse on information buried mid-context, is a geometric property of the architecture itself. Present at initialization already, before any training occurs.&lt;/p&gt;
&lt;p&gt;These are theorems. The architecture that runs every frontier AI system has a ceiling, and the ceiling is proved.&lt;/p&gt;
&lt;h2 id="the-post-transformer-stack-is-already-in-production"&gt;The post-transformer stack is already in production&lt;/h2&gt;
&lt;p&gt;A &lt;a href="https://arxiv.org/abs/2510.05364"&gt;survey by Fichtl et al.&lt;/a&gt; checked the top 10 models on every major benchmark. Zero were non-transformer. The transformer is still winning on the leaderboards. But the field is moving toward hybrid architectures. Over &lt;strong&gt;60%&lt;/strong&gt; of frontier models released in 2025 already use Mixture of Experts. &lt;a href="https://arxiv.org/abs/2412.19437"&gt;DeepSeek-V3&lt;/a&gt; has 671B total parameters but activates only 37B per token. It trained for &lt;strong&gt;2.788 million H800 GPU hours&lt;/strong&gt;, a fraction of what a comparable dense model would require, and matched frontier closed-source performance. By late 2025, &lt;a href="https://c3.unu.edu/blog/inside-deepseeks-end-of-year-ai-breakthrough-what-the-new-models-deliver"&gt;DeepSeek-V3.2 reportedly hit GPT-5-level performance at 90% lower training cost&lt;/a&gt;. MoE doesn&amp;rsquo;t replace the transformer. It changes the economics so radically that it&amp;rsquo;s arguably the single biggest practical advance since the original architecture.&lt;/p&gt;
&lt;a href="#lightbox-last-architecture-moe-routing-1-png-2" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/last-architecture-moe-routing-1.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/last-architecture-moe-routing-1.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/last-architecture-moe-routing-1.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/last-architecture-moe-routing-1.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/last-architecture-moe-routing-1.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/last-architecture-moe-routing-1.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/last-architecture-moe-routing-1.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/last-architecture-moe-routing-1.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/last-architecture-moe-routing-1.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/last-architecture-moe-routing-1.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/last-architecture-moe-routing-1.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/last-architecture-moe-routing-1.png"
alt="Mixture of Experts routing: an input token passes through a router that activates only 2 of 8 expert blocks, meaning DeepSeek-V3 uses just 37B of its 671B total parameters per token"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;p&gt;The more interesting part is what happens when you blend attention with state space models. &lt;a href="https://goombalab.github.io/blog/2024/mamba2-part1-model/"&gt;Gu and Dao (2024)&lt;/a&gt; proved SSMs and attention are mathematically dual: two views of the same computation. That theoretical result is showing up in production. &lt;a href="https://www.ai21.com/jamba/"&gt;AI21&amp;rsquo;s Jamba&lt;/a&gt; runs a 1:7 attention-to-Mamba ratio and gets &lt;strong&gt;256K&lt;/strong&gt; context at &lt;strong&gt;3x&lt;/strong&gt; throughput over Mixtral. Alibaba&amp;rsquo;s Qwen3-Next shipped the first top-tier model with a hybrid backbone: &lt;a href="https://github.com/rasbt/LLMs-from-scratch/blob/main/ch04/08_deltanet/README.md"&gt;Gated DeltaNet&lt;/a&gt; for linear attention at a 3:1 ratio with full attention. Microsoft&amp;rsquo;s Phi-4-mini-flash-reasoning is 75% Mamba layers with &lt;strong&gt;10x&lt;/strong&gt; throughput at &lt;strong&gt;2-3x&lt;/strong&gt; lower latency.&lt;/p&gt;
&lt;a href="#lightbox-last-architecture-hybrid-layer-stack-1-png-3" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/last-architecture-hybrid-layer-stack-1.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/last-architecture-hybrid-layer-stack-1.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/last-architecture-hybrid-layer-stack-1.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/last-architecture-hybrid-layer-stack-1.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/last-architecture-hybrid-layer-stack-1.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/last-architecture-hybrid-layer-stack-1.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/last-architecture-hybrid-layer-stack-1.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/last-architecture-hybrid-layer-stack-1.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/last-architecture-hybrid-layer-stack-1.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/last-architecture-hybrid-layer-stack-1.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/last-architecture-hybrid-layer-stack-1.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/last-architecture-hybrid-layer-stack-1.png"
alt="Hybrid layer stack comparison: a traditional transformer uses 8 attention layers while Jamba uses a 1:7 attention-to-Mamba ratio, achieving 256K context at 3x throughput with the same quality"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;p&gt;Diffusion language models are the wild card. &lt;a href="https://arxiv.org/abs/2502.09992"&gt;LLaDA&lt;/a&gt;, the first 8B-parameter diffusion LLM, treats text generation as denoising rather than sequential token prediction. It matches Llama3-8B and does something no autoregressive model can: it solves the &amp;ldquo;reversal curse,&amp;rdquo; outperforming GPT-4o on reversal tasks. &lt;a href="https://medium.com/@ML-today/diffusion-models-for-language-from-early-promise-to-a-bold-new-frontier-with-llada-and-the-rise-of-ee80c7ffb8fa"&gt;Gemini Diffusion&lt;/a&gt; hit &lt;strong&gt;1,479 tokens per second&lt;/strong&gt;. Over 50 papers on diffusion LLMs appeared in 2025. If parallel generation works reliably at scale, inference economics change completely.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://arxiv.org/pdf/2510.05364"&gt;Alman and Yu&lt;/a&gt; proved there are tasks where every subquadratic alternative has a fundamental theoretical gap. That&amp;rsquo;s the strongest mathematical argument for why hybrids, not clean replacements, are what comes next.&lt;/p&gt;
&lt;h2 id="the-search-is-no-longer-human-speed"&gt;The search is no longer human-speed&lt;/h2&gt;
&lt;p&gt;The part of this I find most interesting is the recursion. AI systems are now running the search for their own architectural successors.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://deepmind.google/blog/alphaevolve-a-gemini-powered-coding-agent-for-designing-advanced-algorithms/"&gt;AlphaEvolve&lt;/a&gt; an evolutionary coding agent built on Gemini 2.0 found a way to multiply 4x4 complex matrices in 48 scalar multiplications: the first improvement on Strassen&amp;rsquo;s 56-year-old bound. Across &lt;a href="https://www.infoq.com/news/2025/05/google-alpha-evolve/"&gt;50+ open math problems&lt;/a&gt;, it matched the best known solutions 75% of the time and beat them 20% of the time. The recursive part: AlphaEvolve found a &lt;a href="https://cloud.google.com/blog/products/ai-machine-learning/alphaevolve-on-google-cloud"&gt;23% speedup on a kernel inside Gemini&amp;rsquo;s own architecture&lt;/a&gt;, cutting Gemini&amp;rsquo;s training time by 1% and recovering &lt;strong&gt;0.7%&lt;/strong&gt; of Google&amp;rsquo;s total compute. Gemini making Gemini faster.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://www.marktechpost.com/2026/03/08/andrej-karpathy-open-sources-autoresearch-a-630-line-python-tool-letting-ai-agents-run-autonomous-ml-experiments-on-single-gpus/"&gt;Karpathy&amp;rsquo;s AutoResearch&lt;/a&gt;, released March 7, 2026, is a 630-line Python script that lets an AI agent modify training code, run 5-minute experiments, check results, and iterate. He pointed it at his own highly-tuned &amp;ldquo;Time to GPT-2&amp;rdquo; codebase. The agent found about 20 additive improvements that transferred to larger models, cutting the metric by &lt;strong&gt;11%&lt;/strong&gt;. &lt;a href="https://officechai.com/ai/andrej-karpathys-autoresearch-project-lets-agents-run-100-ai-research-experiments-while-you-sleep/"&gt;Shopify CEO Tobi Lutke tried it overnight&lt;/a&gt;: 37 experiments, 19% validation improvement, a 0.8B model outperforming a 1.6B one. &lt;a href="https://github.com/SakanaAI/AI-Scientist-v2"&gt;Sakana AI&amp;rsquo;s AI Scientist v2&lt;/a&gt; went further and produced the first AI-authored paper accepted through standard peer review. &lt;a href="https://controlai.news/p/the-ultimate-risk-recursive-self"&gt;OpenAI said publicly in late 2025&lt;/a&gt; that it&amp;rsquo;s researching how to safely build AI systems capable of recursive self-improvement. Two years ago this was a thought experiment.&lt;/p&gt;
&lt;h2 id="what-the-hardware-decides"&gt;What the hardware decides&lt;/h2&gt;
&lt;p&gt;The transformer won not because attention was theoretically prettier than recurrence. It won because it parallelized well on GPUs. Whatever comes next has to clear the same bar.&lt;/p&gt;
&lt;p&gt;Pre-training scaling for dense transformers is flattening. &lt;a href="https://fortune.com/2025/02/25/what-happened-gpt-5-openai-orion-pivot-scaling-pre-training-llm-agi-reasoning/"&gt;OpenAI spent at least $500 million per major training run on Orion&lt;/a&gt;. The model hit GPT-4 performance after 20% of training; the remaining 80% gave diminishing returns. They downgraded it from GPT-5 to GPT-4.5. &lt;a href="https://artificialintelligencemonaco.substack.com/p/ilya-sutskever-on-superintelligence"&gt;Sutskever&lt;/a&gt; at NeurIPS 2024: &amp;ldquo;Pre-training as we know it will end. The data is not growing because we have but one internet.&amp;rdquo; His startup SSI has &lt;a href="https://www.arturmarkus.com/ilya-sutskevers-ssi-raises-1b-at-30b-valuation-with-zero-revenue-6x-jump-in-5-months-redefines-ai-investment-logic/"&gt;raised to a $32 billion valuation with about 20 employees and zero revenue&lt;/a&gt;. A bet that the next leap requires something architecturally new.&lt;/p&gt;
&lt;p&gt;But test-time compute opened a different axis entirely. OpenAI&amp;rsquo;s o3 hit &lt;strong&gt;87.5%&lt;/strong&gt; on ARC-AGI, beating most humans. DeepSeek-R1 matched o1-level reasoning at &lt;strong&gt;70%&lt;/strong&gt; lower cost. &lt;a href="https://aibusiness.com/language-models/ai-model-scaling-isn-t-over-it-s-entering-a-new-era"&gt;OpenAI&amp;rsquo;s inference spending reached $2.3 billion in 2024&lt;/a&gt;: &lt;strong&gt;15x&lt;/strong&gt; what they spent training GPT-4.5. &lt;a href="https://www.dwarkesh.com/p/dario-amodei"&gt;Dario Amodei&lt;/a&gt; at Morgan Stanley in March 2026: &amp;ldquo;We do not see hitting the wall. We don&amp;rsquo;t see a wall.&amp;rdquo; He&amp;rsquo;s talking about this axis, inference-time compute and RL from verifiable rewards, not about pre-training bigger dense models. The Densing Law now shows capability per parameter doubling every &lt;strong&gt;3.5 months&lt;/strong&gt; through better data, MoE, and distillation. Last year&amp;rsquo;s frontier, matched with a fraction of the parameters.&lt;/p&gt;
&lt;p&gt;Inference demand is projected to &lt;a href="https://v-chandra.github.io/on-device-llms/"&gt;exceed training demand by 118x&lt;/a&gt;. Global data center power is heading toward &lt;a href="https://www.iea.org/reports/energy-and-ai/executive-summary"&gt;945 TWh by 2030&lt;/a&gt;, roughly Japan&amp;rsquo;s total electricity consumption. An architecture that scores 2x better on benchmarks but runs 3x worse at inference won&amp;rsquo;t win. What ships is whatever fits the hardware. The transformer isn&amp;rsquo;t going away. It&amp;rsquo;s becoming one component in a larger stack: attention for recall, SSMs for cheap sequence processing, MoE for capacity, maybe diffusion for parallel output. &lt;a href="https://www.ai21.com/jamba/"&gt;Jamba&lt;/a&gt;, &lt;a href="https://arxiv.org/html/2411.13676v1"&gt;Hymba&lt;/a&gt;, and Qwen3-Next already ship this way. That&amp;rsquo;s not a prediction. It&amp;rsquo;s what&amp;rsquo;s in production.&lt;/p&gt;
&lt;p&gt;How fast the stack evolves is the open question. The answer, given AlphaEvolve and AutoResearch and AI Scientist v2, is faster than any previous architectural transition. I don&amp;rsquo;t know whether the transformer remains the dominant layer for two years or five. But I&amp;rsquo;m fairly confident that whatever comes next, humans won&amp;rsquo;t have designed it alone.&lt;/p&gt;</description></item><item><title>MCP vs A2A in 2026: How the AI Protocol War Ends</title><link>http://philippdubach.com/posts/mcp-vs-a2a-in-2026-how-the-ai-protocol-war-ends/</link><pubDate>Sun, 15 Mar 2026 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/mcp-vs-a2a-in-2026-how-the-ai-protocol-war-ends/</guid><description>&lt;p&gt;On March 26, 2025, Sam Altman posted the following &lt;a href="https://x.com/sama/status/1904957253456941061"&gt;three sentences&lt;/a&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;people love MCP and we are excited to add support across our products.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;MCP is Anthropic&amp;rsquo;s Model Context Protocol. OpenAI is Anthropic&amp;rsquo;s most direct competitor. Altman was endorsing a rival&amp;rsquo;s standard. That post may be the most significant event in enterprise AI infrastructure this year. When your main competitor adopts your protocol, the war is close to over. I&amp;rsquo;ve been watching this play out since &lt;a href="https://www.anthropic.com/news/model-context-protocol"&gt;Anthropic launched MCP in November 2024&lt;/a&gt;, and I want to work through what&amp;rsquo;s happening: who controls what, what &amp;ldquo;interoperability&amp;rdquo; means in practice, and whether any of this follows patterns we&amp;rsquo;ve seen before.&lt;/p&gt;
&lt;h2 id="what-is-mcp"&gt;What is MCP&lt;/h2&gt;
&lt;p&gt;MCP is a client-server protocol, licensed MIT, built on JSON-RPC 2.0. The mental model is simple: an AI agent (the host) connects through a client to MCP servers that expose tools, data sources, and context. Instead of building a bespoke integration every time Claude or GPT needs to talk to Salesforce, GitHub, or your internal database, you build one MCP server. Any compatible host can then use it.&lt;/p&gt;
&lt;p&gt;The problem it solves, which explains why it spread so fast, is that without a standard like this, integration complexity grows quadratically. Every new AI model times every new tool equals a new custom integration. MCP tries to make it linear.&lt;/p&gt;
&lt;p&gt;By December 2025, &lt;a href="https://www.anthropic.com/news/donating-the-model-context-protocol-and-establishing-of-the-agentic-ai-foundation"&gt;Anthropic&amp;rsquo;s own count&lt;/a&gt; put the public MCP server ecosystem at &lt;strong&gt;10,000+&lt;/strong&gt; active servers and &lt;strong&gt;97 million&lt;/strong&gt; monthly SDK downloads across the Python and TypeScript SDKs. &lt;a href="https://github.blog/news-insights/octoverse/octoverse-a-new-developer-joins-github-every-second-as-ai-leads-typescript-to-1/"&gt;GitHub&amp;rsquo;s 2025 Octoverse report&lt;/a&gt; flagged MCP as a standout, hitting &lt;strong&gt;37,000 stars&lt;/strong&gt; in eight months. The unofficial registry mcp.so lists over 18,000 servers. Official SDKs now cover ten languages, including Python, TypeScript, Java, C#, Go, Kotlin, Rust, and Swift.&lt;/p&gt;
&lt;p&gt;The companies building MCP integrations: Microsoft, Salesforce, Cloudflare, GitHub, Stripe, Atlassian, Figma, Snowflake, Databricks, New Relic. At &lt;a href="https://blog.cloudflare.com/mcp-demo-day/"&gt;Cloudflare&amp;rsquo;s MCP Demo Day in May 2025&lt;/a&gt;, Asana, PayPal, Sentry, and Webflow all shipped remote servers in a single afternoon. Gartner predicts 75% of API gateway vendors will have MCP features by 2026.&lt;/p&gt;
&lt;p&gt;OpenAI&amp;rsquo;s adoption went beyond Altman&amp;rsquo;s post. MCP support rolled out across their Agents SDK (March 2025), &lt;a href="https://openai.com/index/new-tools-and-features-in-the-responses-api/"&gt;Responses API (May 2025)&lt;/a&gt;, &lt;a href="https://openai.com/index/introducing-gpt-realtime/"&gt;Realtime API (August 2025)&lt;/a&gt;, and &lt;a href="https://help.openai.com/en/articles/12584461-developer-mode-and-mcp-apps-in-chatgpt-beta"&gt;ChatGPT Developer Mode (September 2025)&lt;/a&gt;. The two companies later &lt;a href="http://blog.modelcontextprotocol.io/posts/2025-11-21-mcp-apps/"&gt;co-authored the MCP Apps Extension&lt;/a&gt;. You don&amp;rsquo;t see that often between direct competitors.&lt;/p&gt;
&lt;p&gt;One performance claim circulates in blog posts and marketing materials: that organizations implementing MCP report &amp;ldquo;40–60% faster agent deployment times.&amp;rdquo; I have not found a primary source for this. No survey, no case study, no named company. I&amp;rsquo;d treat it as marketing content until someone produces the underlying data.&lt;/p&gt;
&lt;h2 id="googles-a2a-fills-a-different-layer"&gt;Google&amp;rsquo;s A2A fills a different layer&lt;/h2&gt;
&lt;p&gt;&lt;a href="https://developers.googleblog.com/en/a2a-a-new-era-of-agent-interoperability/"&gt;Google launched A2A, the Agent-to-Agent protocol, at Cloud Next on April 9, 2025&lt;/a&gt;, five months after MCP. Google didn&amp;rsquo;t position A2A as MCP replacement. They called it a complement. I think that&amp;rsquo;s honest, but it takes a minute to see why.&lt;/p&gt;
&lt;p&gt;MCP connects an agent to tools; A2A connects agents to each other, the two protocols produce different behavior.&lt;/p&gt;
&lt;p&gt;When an MCP host calls an MCP server, it knows exactly what it&amp;rsquo;s getting: structured tool descriptions, specific function signatures, predictable outputs. The agent can see inside the tool. A2A works differently. Agents remain opaque to each other. An A2A agent publishes an &amp;ldquo;Agent Card,&amp;rdquo; a JSON metadata document at a well-known URL, describing its capabilities and authentication requirements. Other agents discover it, negotiate tasks through a defined lifecycle (submitted, working, input-required, completed), and collaborate without sharing memory or internal state.&lt;/p&gt;
&lt;p&gt;Google&amp;rsquo;s own documentation uses a repair shop analogy. MCP is how the mechanic uses diagnostic equipment. A2A is how the customer talks to the shop manager, or how the manager coordinates with a parts supplier. It works: both conversations happen in a real repair shop, and cutting either one doesn&amp;rsquo;t simplify anything.&lt;/p&gt;
&lt;p&gt;A2A &lt;a href="https://developers.googleblog.com/en/a2a-a-new-era-of-agent-interoperability/"&gt;launched with 50+ partner organizations&lt;/a&gt; and &lt;a href="https://cloud.google.com/blog/products/ai-machine-learning/agent2agent-protocol-is-getting-an-upgrade"&gt;grew to 150+ by July 2025&lt;/a&gt;. The list includes Atlassian, Salesforce, SAP, ServiceNow, McKinsey, BCG, Accenture. &lt;a href="https://developers.googleblog.com/en/google-cloud-donates-a2a-to-linux-foundation/"&gt;Google donated A2A to the Linux Foundation in June 2025&lt;/a&gt;. &lt;a href="https://lfaidata.foundation/communityblog/2025/08/29/acp-joins-forces-with-a2a-under-the-linux-foundations-lf-ai-data/"&gt;IBM&amp;rsquo;s competing Agent Communication Protocol merged into A2A in August&lt;/a&gt;, with IBM&amp;rsquo;s engineers joining the technical steering committee. As of February 2026, A2A has roughly &lt;strong&gt;21,900 GitHub stars&lt;/strong&gt;, about 40% of MCP&amp;rsquo;s total. &lt;a href="#lightbox-mcp-vs-a2a-protocol-race-png-1" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/mcp-vs-a2a-protocol-race.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/mcp-vs-a2a-protocol-race.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/mcp-vs-a2a-protocol-race.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/mcp-vs-a2a-protocol-race.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/mcp-vs-a2a-protocol-race.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/mcp-vs-a2a-protocol-race.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/mcp-vs-a2a-protocol-race.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/mcp-vs-a2a-protocol-race.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/mcp-vs-a2a-protocol-race.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/mcp-vs-a2a-protocol-race.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/mcp-vs-a2a-protocol-race.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/mcp-vs-a2a-protocol-race.png"
alt="Exhibit comparing MCP and A2A protocol adoption: MCP leads with 37,000 GitHub stars, 18,000&amp;#43; public servers, 97M monthly SDK downloads, and 10 SDK languages versus A2A at 21,900 stars, no public registry, and 3 languages"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/p&gt;
&lt;h2 id="what-history-can-tell-us-about-how-this-ends"&gt;What history can tell us about how this ends&lt;/h2&gt;
&lt;p&gt;AI agent protocol wars have a consistent pattern. The winner is almost never the technically superior option. It&amp;rsquo;s the one that ships first and gets adopted before anyone can catch up.&lt;/p&gt;
&lt;p&gt;TCP/IP and OSI are the canonical example. The OSI model, published by ISO in 1983, was architecturally more rigorous than TCP/IP&amp;rsquo;s four-layer stack. It had real institutional backing: the US Commerce Department published its GOSIP mandate in August 1988, with formal enforcement beginning in 1990. European governments followed. OSI still lost. TCP/IP won because it had running code, freely available implementations bundled with BSD Unix workstations, while OSI remained elegant theory trapped in committee processes. By 1994 the outcome was obvious. David Clark&amp;rsquo;s &lt;a href="https://groups.csail.mit.edu/ana/People/DDC/future_ietf_92.pdf"&gt;IETF motto captures why&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;We reject kings, presidents and voting. We believe in rough consensus and running code.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;VHS versus Betamax is the other lesson people cite, often incorrectly. Betamax had better picture quality. VHS won anyway, and the usual explanation is the movie library. That&amp;rsquo;s part of it. But JVC openly licensed VHS to manufacturers across the industry, which drove prices down and built a content ecosystem Sony couldn&amp;rsquo;t match. By 1987, &lt;a href="https://en.wikipedia.org/wiki/Videotape_format_war"&gt;VHS held 90% of the US VCR market&lt;/a&gt;. Sony conceded in 1988 by manufacturing VHS players. Ecosystem breadth, once established, creates a gravitational field that technical superiority alone can&amp;rsquo;t escape.&lt;/p&gt;
&lt;p&gt;USB is a more recent example with a twist. The consortium, Compaq, DEC, IBM, Intel, Microsoft, NEC, Nortel, formed in 1994 and &lt;a href="https://ethw.org/Milestones:Universal_Serial_Bus_(USB),_1996"&gt;shipped USB 1.0 in January 1996&lt;/a&gt;. Adoption was sluggish until &lt;a href="https://en.wikipedia.org/wiki/IMac_G3"&gt;Apple shipped the iMac G3 in August 1998&lt;/a&gt; with only USB ports, forcing the entire peripheral industry to follow. One player is so central to the ecosystem that their adoption forces everyone else&amp;rsquo;s hand. OpenAI adopting MCP in March 2025 is MCP&amp;rsquo;s iMac moment.&lt;/p&gt;
&lt;p&gt;But USB also offers a warning. USB-C&amp;rsquo;s physical connector won universally, then the underlying protocol fragmented. The same connector could carry anything from USB 2.0 to USB4, 5W to 240W of power, depending on what you plugged together. &lt;a href="https://single-market-economy.ec.europa.eu/sectors/electrical-and-electronic-engineering-industries-eei/radio-equipment-directive-red/one-common-charging-solution-all_en"&gt;The EU eventually legislated convergence through its Radio Equipment Directive, which took effect December 28, 2024&lt;/a&gt;. A standard can win and still fragment when nobody governs the details. &lt;a href="#lightbox-standards-war-precedents-png-2" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/standards-war-precedents.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/standards-war-precedents.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/standards-war-precedents.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/standards-war-precedents.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/standards-war-precedents.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/standards-war-precedents.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/standards-war-precedents.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/standards-war-precedents.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/standards-war-precedents.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/standards-war-precedents.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/standards-war-precedents.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/standards-war-precedents.png"
alt="Exhibit comparing historical standards wars: TCP/IP versus OSI decided by running code, VHS versus Betamax decided by open licensing, USB decided by Apple iMac catalyst event, all paralleling MCP ecosystem-first trajectory"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/p&gt;
&lt;h2 id="what-now"&gt;What now?&lt;/h2&gt;
&lt;p&gt;&lt;a href="https://www.linuxfoundation.org/press/linux-foundation-announces-the-formation-of-the-agentic-ai-foundation"&gt;The Linux Foundation&amp;rsquo;s Agentic AI Foundation (AAIF), launched December 9, 2025&lt;/a&gt; with Anthropic, OpenAI, and Block as co-founders, &lt;a href="https://www.linuxfoundation.org/press/agentic-ai-foundation-welcomes-97-new-members"&gt;now has 146 member organizations&lt;/a&gt;, including JPMorgan Chase, American Express, Autodesk, Red Hat, and Huawei. A2A has its own Linux Foundation governance body. MCP sits within AAIF. Both are under the same umbrella, but they&amp;rsquo;re not the same project.&lt;/p&gt;
&lt;p&gt;This is the governance structure you typically see after a standards war has been decided in principle but before the implementation details have been hammered out. Think of the W3C in 1994, not the W3C in 1998. For anyone making architectural decisions right now, the practical question isn&amp;rsquo;t MCP versus A2A. Most major enterprise platforms already support both. Salesforce, SAP, IBM, Microsoft, and AWS have committed to both. The question is sequencing and depth.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://research.isg-one.com/analyst-perspectives/a2a-v-mcp-why-ai-agents-need-both"&gt;ISG analyst David Menninger&lt;/a&gt; put it clearly: &amp;ldquo;MCP first for sharing context; then A2A for dynamic interaction among agents.&amp;rdquo; That&amp;rsquo;s the sequence I&amp;rsquo;d follow. MCP is the more mature protocol with the larger server ecosystem. The 10,000+ existing servers represent integration work that doesn&amp;rsquo;t need to be rebuilt. Start there. Layer A2A on top when your use cases require multi-agent coordination across organizational boundaries, supply chain, cross-platform orchestration, which is exactly where the Tyson Foods and Adobe deployments have landed.&lt;/p&gt;
&lt;p&gt;MCP security deserves a separate conversation. &lt;a href="https://astrix.security/learn/blog/state-of-mcp-server-security-2025/"&gt;Astrix Security&amp;rsquo;s research&lt;/a&gt; found that 53% of MCP servers rely on static credentials rather than OAuth. A critical vulnerability in the mcp-remote npm package (CVE-2025-6514) exposed 437,000+ installations to shell injection. TCP/IP had its share of early-stage security problems in the 1980s, so I&amp;rsquo;m not calling this fatal. But these are real vulnerabilities, and they will cause real incidents before the posture matures.&lt;/p&gt;
&lt;p&gt;Multiple analyst firms converge on an agentic AI market of roughly &lt;strong&gt;$7–8 billion in 2025&lt;/strong&gt;, growing at 40–50% annually, with projections ranging from &lt;a href="https://www.grandviewresearch.com/industry-analysis/ai-agents-market-report"&gt;$50 billion by 2030&lt;/a&gt; to &lt;a href="https://www.precedenceresearch.com/agentic-ai-market"&gt;$199 billion by 2034&lt;/a&gt;. NVIDIA&amp;rsquo;s CUDA is the comparison that matters: 4 million developers, 15 years of compounding library investment, and switching costs that produce &lt;a href="https://nvidianews.nvidia.com/news/nvidia-announces-financial-results-for-fourth-quarter-and-fiscal-2025"&gt;$130.5 billion in annual revenue at 73% gross margins&lt;/a&gt;. MCP&amp;rsquo;s 97 million monthly downloads aren&amp;rsquo;t CUDA yet. But the trajectory points the same direction. &lt;a href="#lightbox-agentic-ai-market-trajectory-png-4" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/agentic-ai-market-trajectory.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/agentic-ai-market-trajectory.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/agentic-ai-market-trajectory.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/agentic-ai-market-trajectory.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/agentic-ai-market-trajectory.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/agentic-ai-market-trajectory.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/agentic-ai-market-trajectory.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/agentic-ai-market-trajectory.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/agentic-ai-market-trajectory.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/agentic-ai-market-trajectory.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/agentic-ai-market-trajectory.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/agentic-ai-market-trajectory.png"
alt="Exhibit showing agentic AI market projections from $7-8 billion in 2025 to $50 billion by 2030 and up to $199 billion by 2034, with consensus 45% CAGR and comparison to NVIDIA CUDA $131B annual revenue"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;My best guess (and I want to be clear it&amp;rsquo;s a guess): MCP becomes the infrastructure layer, A2A becomes the coordination layer, much as TCP handles transport while HTTP handles application-layer communication. Different floors of the same building. The question remains whether 146 AAIF members can hold coherent standards against the competitive pressure of &lt;a href="https://tracxn.com/d/sectors/agentic-ai/__oyRAfdUfHPjf2oap110Wis0Qg12Gd8DzULlDXPJzrzs"&gt;over 1,000 active agentic AI startups&lt;/a&gt;, each with economic incentives to differentiate.&lt;/p&gt;</description></item><item><title>AI Models Are the New Rebar</title><link>http://philippdubach.com/posts/ai-models-are-the-new-rebar/</link><pubDate>Wed, 11 Mar 2026 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/ai-models-are-the-new-rebar/</guid><description>&lt;p&gt;&lt;a href="https://huggingface.co/Qwen/Qwen3.5-35B-A3B"&gt;Qwen 3.5-35B-A3B&lt;/a&gt;, a model released by Alibaba in February 2026, runs on a single consumer GPU with 24 gigabytes of VRAM. A secondhand RTX 4090, available for around $2,000, generates 60 to 100 tokens per second with it. On select benchmarks per Alibaba&amp;rsquo;s own evaluations, it matches or beats Claude Sonnet 4.5. The Qwen 3.5 Flash tier costs &lt;a href="https://www.alibabacloud.com/help/en/model-studio/model-pricing"&gt;&lt;strong&gt;$0.10 per million input tokens&lt;/strong&gt;&lt;/a&gt; through Alibaba&amp;rsquo;s API. &lt;a href="https://www.anthropic.com/news/claude-sonnet-4-5"&gt;Claude Sonnet 4.5 costs &lt;strong&gt;$3.00&lt;/strong&gt;&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;That&amp;rsquo;s a 97 percent discount. For comparable performance.&lt;/p&gt;
&lt;p&gt;I&amp;rsquo;m not cherry-picking. Zhipu AI&amp;rsquo;s &lt;a href="https://medium.com/@mlabonne/glm-5-chinas-first-public-ai-company-ships-a-frontier-model-a068cecb74e3"&gt;GLM-5 scores 1,452 on the Chatbot Arena leaderboard&lt;/a&gt;, the highest Elo rating of any open-source model, and its developer&amp;rsquo;s own figures put it at roughly 95 percent of closed-model performance at around 15 percent of the cost. Moonshot AI&amp;rsquo;s &lt;a href="https://www.kimi.com/blog/kimi-k2-5"&gt;Kimi K2.5&lt;/a&gt;, a trillion-parameter model, scores 99.0 on HumanEval and 96.1 on AIME 2025, with a Chatbot Arena Elo of 1,447, at roughly 88 percent less than Claude Opus 4.5 per token. The &lt;a href="https://hai.stanford.edu/ai-index/2025-ai-index-report/technical-performance"&gt;Stanford HAI 2025 AI Index&lt;/a&gt; found the performance gap between open-source and proprietary AI models on the Chatbot Arena leaderboard shrank from &lt;strong&gt;8 percent to 1.7 percent in a single year&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;This is not an IP story. It is not a China story. It is an industrial economics story. And we know how those end. &lt;a href="#lightbox-ai-performance-vs-price-png-0" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/ai-performance-vs-price.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/ai-performance-vs-price.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/ai-performance-vs-price.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/ai-performance-vs-price.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/ai-performance-vs-price.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/ai-performance-vs-price.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/ai-performance-vs-price.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/ai-performance-vs-price.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/ai-performance-vs-price.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/ai-performance-vs-price.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/ai-performance-vs-price.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/ai-performance-vs-price.png"
alt="Exhibit showing open-source AI models have crossed the performance threshold at a fraction of the price, with GLM-5, Kimi K2.5, DeepSeek V3, and Qwen 3.5 Flash all landing in the high-performance low-cost quadrant below $1 per million tokens while Claude Opus 4.5 sits at $15 and GPT-4o at $2.50"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/p&gt;
&lt;h2 id="what-the-steel-mills-can-tell-us"&gt;What the steel mills can tell us&lt;/h2&gt;
&lt;p&gt;In the mid-1960s, electric arc furnace mini-mills entered the steel market at the lowest-quality segment: rebar. Capital costs ran one-fifth to one-seventh of what an integrated plant required. Nucor, the most aggressive operator, built its first mill for $6 million when a comparable integrated facility cost $500 million or more. The response from companies like U.S. Steel was rational: retreat from low-margin rebar, harvest the better-margin products, improve average profitability in the short term. Sensible but wrong.&lt;/p&gt;
&lt;p&gt;Each segment mini-mills conquered had higher margins than the last. From rebar to structural steel, from structural steel to sheet metal, the disruptors climbed the value chain until there was nowhere left to climb. The American steel industry &lt;a href="https://www.chicagotribune.com/news/ct-xpm-1990-06-04-9002150481-story.html"&gt;lost money for five consecutive years in the early 1980s&lt;/a&gt;, posting aggregate losses of &lt;strong&gt;$3.38 billion in 1982 alone&lt;/strong&gt;. U.S. Steel shed more than half its workforce, pivoted to oil and gas, and by &lt;a href="https://investors.ussteel.com/news-events/news-releases/detail/659/nippon-steel-corporation-nsc-to-acquire-u-s-steel"&gt;June 2025 accepted a $14.9 billion acquisition by Nippon Steel&lt;/a&gt;, a fraction of its inflation-adjusted peak valuation. Nucor, the mini-mill, became the largest American steelmaker.&lt;/p&gt;
&lt;p&gt;Clayton Christensen spent a career documenting this pattern of disruptive innovation. The incumbents never failed because they made bad decisions. They failed because they made good decisions for their existing customers while the market shifted beneath them. OpenAI is serving demanding enterprise customers with the most capable models available. Anthropic is building trust with regulated industries. These are the correct moves for their current customers. They may also be exactly the wrong moves for the next five years.&lt;/p&gt;
&lt;h2 id="the-cost-decline-eats-strategy"&gt;The cost decline eats strategy&lt;/h2&gt;
&lt;p&gt;&lt;a href="https://epoch.ai/data-insights/llm-inference-price-trends"&gt;Epoch AI&amp;rsquo;s research&lt;/a&gt;, published in 2025, found that AI inference prices are declining at a &lt;strong&gt;median rate of 50x per year&lt;/strong&gt; for equivalent performance levels, with a range spanning 9x to 900x depending on the task. Achieving GPT-4&amp;rsquo;s original performance on PhD-level science questions cost $30 per million input tokens when GPT-4 launched in early 2023. Through open-source alternatives today, the same performance costs under $0.10. A roughly 300-fold reduction in three years, at a pace that dwarfs Moore&amp;rsquo;s Law.&lt;/p&gt;
&lt;p&gt;David Cahn at Sequoia Capital put the structural problem plainly in his &lt;a href="https://sequoiacap.com/article/ais-600b-question/"&gt;&amp;quot;$600 Billion Question&amp;quot;&lt;/a&gt; analysis: &amp;ldquo;GPU computing is increasingly turning into a commodity, metered per hour. Without a monopoly or oligopoly, high fixed cost plus low marginal cost businesses almost always see prices competed down to marginal cost, like airlines.&amp;rdquo; The airline analogy is more foreboding than it sounds. The global airline industry generated cumulative net profits of $36 billion between 1945 and 2000, a net margin of 0.8 percent across 55 years. In the 2000s, the industry lost more than it had earned in the prior half-century combined. Even today, &lt;a href="https://www.iata.org/en/pressroom/2025-releases/2025-12-09-01"&gt;IATA projects airlines&amp;rsquo; return on invested capital at 6.8 percent&lt;/a&gt;, below their weighted average cost of capital of 8.2 percent.&lt;/p&gt;
&lt;p&gt;The difference between AI and airlines is that switching a flight carrier requires rebooking. Switching an AI model requires changing two lines of code. &lt;a href="#lightbox-inference-cost-collapse-png-2" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/inference-cost-collapse.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/inference-cost-collapse.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/inference-cost-collapse.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/inference-cost-collapse.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/inference-cost-collapse.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/inference-cost-collapse.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/inference-cost-collapse.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/inference-cost-collapse.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/inference-cost-collapse.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/inference-cost-collapse.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/inference-cost-collapse.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/inference-cost-collapse.png"
alt="Exhibit showing GPT-4 level performance went from $30 to $0.10 per million tokens in three years, with closed proprietary models shown alongside open-source alternatives that now match frontier performance at a fraction of the cost, representing a 300x cost reduction"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/p&gt;
&lt;h2 id="switching-costs-that-approach-zero"&gt;Switching costs that approach zero&lt;/h2&gt;
&lt;p&gt;The OpenAI API format has become the de facto industry standard, supported by virtually every major model provider and open-source inference engine. &lt;a href="https://github.com/BerriAI/litellm"&gt;LiteLLM&lt;/a&gt;, an open-source gateway with approximately 37,000 GitHub stars, provides a unified interface to over 100 providers through a single configuration change. OpenRouter offers managed access to more than 400 models. Setup time: under five minutes.&lt;/p&gt;
&lt;p&gt;Enterprise behavior already reflects this. Perplexity&amp;rsquo;s own data shows 92 percent of Fortune 500 employees use multi-model AI platforms, and their top enterprise accounts access an average of 30 different models. These are Perplexity&amp;rsquo;s internal figures, not independent market research: treat them as directional. The one meaningful source of lock-in is custom fine-tuned models, which are provider-specific and cannot be directly ported. That affects a small fraction of deployments. For the vast majority of inference calls, the model is interchangeable, and the customer buys on price.&lt;/p&gt;
&lt;h2 id="what-openais-numbers-actually-require"&gt;What OpenAI&amp;rsquo;s numbers actually require&lt;/h2&gt;
&lt;p&gt;On February 27, 2026, &lt;a href="https://openai.com/index/scaling-ai-for-everyone/"&gt;OpenAI closed a $110 billion funding round&lt;/a&gt;, the largest private capital raise in history, at a post-money valuation of &lt;strong&gt;$840 billion&lt;/strong&gt;. Amazon committed $50 billion. SoftBank $30 billion. Nvidia $30 billion. The valuation implies extraordinary confidence in OpenAI&amp;rsquo;s ability to maintain pricing power and grow revenue to somewhere between $200 and $280 billion by 2030. At 42x trailing revenue, it is priced not for today&amp;rsquo;s market but for a specific version of the future.&lt;/p&gt;
&lt;p&gt;OpenAI reported &lt;a href="https://openai.com/index/scaling-ai-for-everyone/"&gt;&lt;strong&gt;$20 billion in annualized recurring revenue&lt;/strong&gt;&lt;/a&gt; as of January 2026, up 233 percent year over year. Impressive. But the adjusted gross margin fell to 33 percent in 2025, down from 40 percent the prior year, as &lt;a href="https://the-decoder.com/openai-adds-111-billion-to-its-cash-burn-forecast-as-ai-costs-spiral-beyond-projections/"&gt;inference costs quadrupled to $8.4 billion&lt;/a&gt;. In the first half of 2025 alone, OpenAI lost $13.5 billion. Compute and technical talent costs consume approximately 75 percent of total revenue, and Microsoft takes another 20 percent through 2032. That leaves very little room for the margin expansion the valuation demands.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://www.anthropic.com/news/anthropic-raises-30-billion-series-g-funding-380-billion-post-money-valuation"&gt;Anthropic&lt;/a&gt; tells a similar story at a smaller scale. At a &lt;strong&gt;$380 billion valuation&lt;/strong&gt; on $14 billion in run-rate revenue, 27x, the company is also unprofitable, projecting positive cash flow somewhere around 2027 to 2028. Both companies are betting they can simultaneously grow revenue and expand margins. In commoditized markets, that is the bet that fails.&lt;/p&gt;
&lt;p&gt;Part of the financing is also circular. Amazon invests $50 billion in OpenAI; a portion flows back to AWS as compute spending. Nvidia invests $30 billion; the same money returns as GPU purchases. This inflates revenue figures while obscuring how much of the demand is genuinely independent. &lt;a href="#lightbox-openai-margin-squeeze-png-3" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/openai-margin-squeeze.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/openai-margin-squeeze.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/openai-margin-squeeze.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/openai-margin-squeeze.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/openai-margin-squeeze.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/openai-margin-squeeze.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/openai-margin-squeeze.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/openai-margin-squeeze.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/openai-margin-squeeze.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/openai-margin-squeeze.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/openai-margin-squeeze.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/openai-margin-squeeze.png"
alt="Exhibit showing OpenAI financials: $20B ARR up 233% but gross margin fell from 40% to 33% as inference costs quadrupled to $8.4B, net loss of $13.5B in H1 2025, with the $840B valuation requiring 43% revenue CAGR to 2030 while expanding margins against open-source price pressure"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/p&gt;
&lt;h2 id="who-actually-wins-when-the-model-layer-is-a-commodity"&gt;Who actually wins when the model layer is a commodity&lt;/h2&gt;
&lt;p&gt;Before writing off the incumbents, two historical cases are worth sitting with.&lt;/p&gt;
&lt;p&gt;Amazon Web Services has cut prices &lt;a href="https://docs.aws.amazon.com/wellarchitected/latest/cost-optimization-pillar/cost_cloud_financial_management_scheduled.html"&gt;134 times since 2006&lt;/a&gt;, yet its operating margins expanded to a record &lt;a href="https://www.cnbc.com/2025/05/01/aws-q1-earnings-report-2025.html"&gt;39.5 percent in Q1 2025&lt;/a&gt;. Apple captures roughly 80 to 85 percent of global smartphone operating profits with around 18 to 21 percent of unit shipments, while commodity Android manufacturers earn negligible margins. Both got there the same way: years of accumulated switching costs, vertical integration, ecosystems that cost real money to leave. The question is whether AI model providers can build any of that. I don&amp;rsquo;t think they can, not at the model layer. An API endpoint returning text is not an iPhone. You change it in a config file on a Tuesday afternoon.&lt;/p&gt;
&lt;p&gt;So who does benefit? Nvidia and cloud providers collect rent regardless of which model runs on their hardware. That position is durable. The application layer looks better still: companies embedding AI into domain-specific workflows with proprietary data, where the model is an input rather than the product. As &lt;a href="https://eqtgroup.com/thinq/technology/why-ai-value-wont-just-accrue-to-foundational-models"&gt;Andrew Lewis at EQT&lt;/a&gt; put it, &amp;ldquo;Over time, the value is likely to accrue to the application layer and the product companies.&amp;rdquo; And then there are the platforms with distribution so large they can integrate AI at near-zero marginal cost: Meta embedding Llama into Instagram and WhatsApp, Google weaving Gemini into Search and Workspace. When Mark Zuckerberg open-sources Llama, he is deliberately commoditizing the model layer to prevent any single player from owning the stack above his distribution. When a $1.6 trillion company is your most committed price-cutter, that tells you something about where the margins are going.&lt;/p&gt;</description></item><item><title>AI Capex Arms Race: Who Blinks First?</title><link>http://philippdubach.com/posts/ai-capex-arms-race-who-blinks-first/</link><pubDate>Sun, 08 Mar 2026 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/ai-capex-arms-race-who-blinks-first/</guid><description>&lt;p&gt;Alphabet&amp;rsquo;s free cash flow is projected to fall roughly &lt;strong&gt;90%&lt;/strong&gt; in 2026. Not because the business is in trouble. Because the company has committed to spending &lt;strong&gt;$83–93 billion more&lt;/strong&gt; on capital expenditure than it did last year.&lt;/p&gt;
&lt;p&gt;That is what $660–690 billion in AI capex looks like up close. &lt;a href="https://finance.yahoo.com/news/amazon-200-billion-ai-spending-153341517.html"&gt;Amazon guided to &lt;strong&gt;$200 billion&lt;/strong&gt; alone&lt;/a&gt;. Meta&amp;rsquo;s long-term debt more than doubled to &lt;a href="https://www.sec.gov/Archives/edgar/data/1326801/000162828026003832/meta-12312025xexhibit991.htm"&gt;&lt;strong&gt;$58.7 billion&lt;/strong&gt;&lt;/a&gt; to help finance its share. &lt;a href="https://www.goldmansachs.com/insights/articles/why-ai-companies-may-invest-more-than-500-billion-in-2026"&gt;Goldman Sachs projects&lt;/a&gt; cumulative 2025–2027 spending across the Big 4 at &lt;strong&gt;$1.15 trillion&lt;/strong&gt;, more than double the $477 billion spent over the prior three years combined. BofA credit strategists found this will consume &lt;a href="https://techblog.comsoc.org/2025/11/01/ai-spending-boom-accelerates-big-tech-to-invest-invest-an-aggregate-of-400-billion-in-2025-more-in-2026/"&gt;&lt;strong&gt;94% of operating cash flow minus dividends and buybacks&lt;/strong&gt;&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;At what revenue growth rate does any of this pay for itself? And what happens if inference costs fall 100-fold before the infrastructure is fully depreciated? We want to think about this the way a credit analyst would. Not as a technology story but as a corporate finance story. Because the numbers, assembled from earnings releases and analyst reports through February 2026, look less like a technology platform buildout and more like a leveraged buyout of the future. &lt;a href="#lightbox-ai-capex-hockey-stick-png-0" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/ai-capex-hockey-stick.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/ai-capex-hockey-stick.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/ai-capex-hockey-stick.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/ai-capex-hockey-stick.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/ai-capex-hockey-stick.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/ai-capex-hockey-stick.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/ai-capex-hockey-stick.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/ai-capex-hockey-stick.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/ai-capex-hockey-stick.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/ai-capex-hockey-stick.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/ai-capex-hockey-stick.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/ai-capex-hockey-stick.png"
alt="Exhibit showing 2025 actual versus 2026 guided capex for Big 4 hyperscalers: Amazon at $200B guided up 52%, Alphabet at $175-185B up 97%, Meta at $60-65B, Microsoft at $100-120B up 25%, totaling $610-655B combined up 63%"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/p&gt;
&lt;h2 id="the-lbo"&gt;The LBO&lt;/h2&gt;
&lt;p&gt;An LBO thesis goes like this: we borrow heavily today, acquire an asset, generate enough cash flow to service the debt, and eventually sell or refinance at a profit. The bet works if the returns from the acquired asset exceed the cost of capital. It fails if the asset underperforms, the cost of capital rises, or the timeline extends beyond what the capital structure can absorb.&lt;/p&gt;
&lt;p&gt;The hyperscaler capex thesis has the same structure, substituting &amp;ldquo;equity&amp;rdquo; and &amp;ldquo;operating cash flow&amp;rdquo; for debt. Each company is telling shareholders: we will deploy enormous capital today, accept near-zero or negative free cash flow for 18 to 36 months, and recoup that investment through AI revenue growth. Sundar Pichai put the bull case plainly &lt;a href="https://www.fool.com/earnings/call-transcripts/2024/07/23/alphabet-googl-q2-2024-earnings-call-transcript/"&gt;at Alphabet&amp;rsquo;s Q2 2024 earnings&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The risk of underinvesting is dramatically greater than the risk of overinvesting for us here.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;At five-year straight-line on $175 billion in Alphabet capex, you get $35 billion in annual depreciation. Add a conservative 10% cost of capital on the incremental investment, and the hurdle gets harder still. For the full &lt;strong&gt;$690 billion&lt;/strong&gt; in 2026 hyperscaler capex, the annual depreciation burden alone approaches &lt;strong&gt;$115–140 billion&lt;/strong&gt; at five-year lives. That is before interest, power, operations, or the cost of next year&amp;rsquo;s upgrade cycle.&lt;/p&gt;
&lt;p&gt;The revenue side of this ledger is far smaller than the capex side. Rough estimates place direct AI revenue across the ecosystem at &lt;strong&gt;$40–60 billion in 2025&lt;/strong&gt;, against AI-specific capex of roughly $300 billion. Coverage ratio: approximately &lt;strong&gt;0.15x&lt;/strong&gt;. &lt;a href="https://sequoiacap.com/article/ais-600b-question/"&gt;Sequoia&amp;rsquo;s David Cahn&lt;/a&gt; calculated that the AI ecosystem needs to generate &lt;strong&gt;$600 billion in annual revenue&lt;/strong&gt; to justify current infrastructure spending, against perhaps $50–100 billion it is actually generating. By 2026, with AI revenue perhaps reaching $80–120 billion and AI capex at $450 billion, the ratio improves to roughly &lt;strong&gt;0.25x&lt;/strong&gt;. Still not a business. &lt;a href="#lightbox-ai-revenue-coverage-gap-png-1" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/ai-revenue-coverage-gap.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/ai-revenue-coverage-gap.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/ai-revenue-coverage-gap.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/ai-revenue-coverage-gap.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/ai-revenue-coverage-gap.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/ai-revenue-coverage-gap.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/ai-revenue-coverage-gap.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/ai-revenue-coverage-gap.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/ai-revenue-coverage-gap.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/ai-revenue-coverage-gap.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/ai-revenue-coverage-gap.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/ai-revenue-coverage-gap.png"
alt="Exhibit showing AI revenue of roughly $50B in 2025 against $300B in AI-specific capex and the $600B revenue threshold estimated by Sequoia Capital, with coverage ratios of 0.17x in 2025 and 0.25x projected for 2026"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/p&gt;
&lt;h2 id="what-would-have-to-be-true"&gt;What would have to be true&lt;/h2&gt;
&lt;p&gt;The spending is not obviously irrational. The bull case is worth taking seriously: the right moment to build infrastructure for a platform shift is before the platform fully exists. Railroads were overbuilt. Fiber was overbuilt. Both excesses funded genuinely useful infrastructure that later ran at capacity. If AI becomes the general-purpose technology that most proponents claim, the AI infrastructure being deployed today could look like the most prescient investment since Standard Oil.&lt;/p&gt;
&lt;p&gt;But that argument requires you to believe some very specific things about revenue growth that have not yet materialized. The 2025–2030 revenue ramp embedded in current capex implies AI revenue growing from roughly $60 billion today to somewhere between $600 billion and $2 trillion by 2030, depending on which bullish scenario you pick. Bain calculates that even under the most aggressive adoption scenario, AI generates $1.2 trillion in revenue, against the $2 trillion the spending requires to break even.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://economics.mit.edu/news/daron-acemoglu-what-do-we-know-about-economics-ai"&gt;MIT&amp;rsquo;s Daron Acemoglu&lt;/a&gt;, who won the 2024 Nobel Prize in Economics, projects AI will deliver a total GDP increase of just &lt;strong&gt;1.1–1.6% over ten years&lt;/strong&gt;: roughly a &lt;strong&gt;0.05% annual productivity gain&lt;/strong&gt;. Only about 5% of economic tasks, he estimates, are cost-effectively automatable at current prices. Goldman Sachs&amp;rsquo; Jim Covello made a similar argument in a &lt;a href="https://www.datacenterdynamics.com/en/news/goldman-sachs-1tn-to-be-spent-on-ai-data-centers-chips-and-utility-upgrades-with-little-to-show-for-it-so-far/"&gt;June 2024 note&lt;/a&gt;: &amp;ldquo;Replacing low-wage jobs with tremendously costly technology is basically the polar opposite of the prior technology transitions I&amp;rsquo;ve witnessed in my thirty years of closely following the tech industry.&amp;rdquo; Neither of these is a fringe view. If either is roughly right, the revenue scenarios baked into current capex budgets do not close. And yet the same market is &lt;a href="http://philippdubach.com/posts/the-saaspocalypse-paradox/"&gt;destroying software stocks&lt;/a&gt; because AI adoption is supposedly too strong. Both readings cannot be true.&lt;/p&gt;
&lt;p&gt;Dario Amodei, who is himself building the infrastructure, &lt;a href="https://www.dwarkesh.com/p/dario-amodei-2"&gt;put it very bluntly on the Dwarkesh Podcast in February 2026&lt;/a&gt;: &amp;ldquo;If my revenue is not $1 trillion, if it&amp;rsquo;s even $800 billion, there&amp;rsquo;s no force on Earth, there&amp;rsquo;s no hedge on Earth that could stop me from going bankrupt if I buy that much compute.&amp;rdquo; He was describing his own spending discipline relative to peers. The companies spending three times as much as Anthropic apparently believe they have found the hedge he could not.&lt;/p&gt;
&lt;h2 id="depreciation-time-bomb"&gt;Depreciation time bomb&lt;/h2&gt;
&lt;p&gt;One risk most analysis underweights: AI hardware obsoletes faster than any previous infrastructure cycle.&lt;/p&gt;
&lt;p&gt;Hyperscalers have extended server useful lives from four to five and six years, saving billions in annual depreciation. But Amazon reversed course: in Q4 2024 it took a &lt;a href="https://behindthebalancesheet.substack.com/p/amazons-ai-reality-check"&gt;&lt;strong&gt;$920 million&lt;/strong&gt; charge to early-retire certain servers and networking equipment&lt;/a&gt;, then effective January 1, 2025 it shortened useful lives for a subset of servers from six to five years, citing &amp;ldquo;the increased pace of technology development, particularly in the area of artificial intelligence,&amp;rdquo; a decision expected to reduce 2025 operating income by a further $700 million. Jensen Huang, not a man known for underselling his own products, said of H100 GPUs once Blackwell shipped: &lt;a href="https://www.rev.com/transcripts/gtc-keynote-with-nvidia-ceo-jensen-huang"&gt;&amp;ldquo;You couldn&amp;rsquo;t give Hoppers away.&amp;rdquo;&lt;/a&gt; Nvidia now releases new architectures annually, where it previously released them every two years.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://www.cnbc.com/2025/11/11/big-short-investor-michael-burry-accuses-ai-hyperscalers-of-artificially-boosting-earnings.html"&gt;Michael Burry&lt;/a&gt;, who spent 2005 correctly modeling the mortgage market&amp;rsquo;s hidden risks, estimates that hyperscalers will understate depreciation by roughly &lt;strong&gt;$176 billion&lt;/strong&gt; in aggregate between 2026 and 2028, causing them to overreport earnings by more than 20%. I have no idea whether Burry is right on the specific number. But the direction is correct. If the useful life of a Blackwell GPU is closer to three years than five because Rubin replaces it in 2027, the depreciation math gets far worse.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://epoch.ai/data-insights/llm-inference-price-trends"&gt;Epoch AI measured&lt;/a&gt; inference costs falling at a median &lt;strong&gt;50 times per year&lt;/strong&gt;, accelerating to &lt;strong&gt;200 times per year&lt;/strong&gt; after January 2024. GPT-3-era processing cost around $20 per million tokens at launch in 2020. By early 2026, models of comparable capability cost roughly &lt;strong&gt;$0.07&lt;/strong&gt; per million tokens. That is a roughly 280-fold decline over five years, and there is no obvious reason for it to stop. &lt;a href="#lightbox-ai-inference-cost-cliff-png-3" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/ai-inference-cost-cliff.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/ai-inference-cost-cliff.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/ai-inference-cost-cliff.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/ai-inference-cost-cliff.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/ai-inference-cost-cliff.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/ai-inference-cost-cliff.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/ai-inference-cost-cliff.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/ai-inference-cost-cliff.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/ai-inference-cost-cliff.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/ai-inference-cost-cliff.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/ai-inference-cost-cliff.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/ai-inference-cost-cliff.png"
alt="Exhibit showing inference cost per million tokens falling from $20 at GPT-3 launch in 2020 to $0.07 in early 2026 on a log scale, with Epoch AI measuring acceleration to 200x per year decline after January 2024"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
The hyperscaler response to this is Jevons, &lt;a href="http://philippdubach.com/posts/does-ai-mean-the-demand-on-labor-goes-up/"&gt;an argument I explored in January&lt;/a&gt;: cheaper inference will explode demand, and the total compute consumed will far exceed what efficiency gains removed. They may be right. But the timing matters. Infrastructure being deployed today, at today&amp;rsquo;s GPU prices, needs to generate enough revenue before the next architecture cycle renders it economically obsolete. The payback window is not 36 months. It may be 18.&lt;/p&gt;
&lt;h2 id="arms-race-logic"&gt;Arms race logic&lt;/h2&gt;
&lt;p&gt;&lt;a href="https://fortune.com/2025/09/19/zuckerberg-ai-bubble-definitely-possibility-sam-altman-collapse/"&gt;Mark Zuckerberg acknowledged&lt;/a&gt; the possibility of an AI bubble &amp;ldquo;definitely&amp;rdquo; in September 2025, then spent $72 billion anyway. This is not irrationality. It is game theory. If AI really does create winner-take-most outcomes, slowing down is a bet that the platform shift is smaller than your competitors believe. Most boards are not willing to make that bet. So everyone keeps spending, and as I &lt;a href="http://philippdubach.com/posts/every-bulge-bracket-bank-agrees-on-ai/"&gt;wrote last week&lt;/a&gt;, every bulge bracket bank agrees they should.&lt;/p&gt;
&lt;p&gt;But the same logic drove WorldCom&amp;rsquo;s Bernie Ebbers. The same logic drove Global Crossing. The specific claim driving the 1990s telecom bubble was that internet traffic was &amp;ldquo;doubling every 100 days.&amp;rdquo; It was false: &lt;a href="https://www-users.cse.umn.edu/~odlyzko/doc/internet.growth.myth2.pdf"&gt;researcher Andrew Odlyzko traced it to misleading WorldCom/UUNET claims&lt;/a&gt;, and actual traffic doubled roughly once per year. By 2001, only &lt;strong&gt;5% of installed fiber capacity was in use&lt;/strong&gt;. The infrastructure eventually ran at capacity; it just took a decade and several dozen bankruptcies to get there.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://www.oaktreecapital.com/insights/memo/is-it-a-bubble"&gt;Howard Marks published a December 2025 memo&lt;/a&gt; asking, with characteristic deliberateness, &amp;ldquo;Is It a Bubble?&amp;rdquo; He noted hyperscalers&amp;rsquo; capex was outpacing revenue momentum and lenders were sweetening terms to keep deal flow alive. J.P. Morgan projects &lt;strong&gt;$300 billion in investment-grade bonds&lt;/strong&gt; for AI data centers in 2026 alone. That is the same fragility that destroyed the telecom builders: cheap debt financing infrastructure before anyone has proved the revenue exists to service it.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://fortune.com/2026/02/23/ai-capex-us-gdp-negative-pantheon/"&gt;Without AI spending, Pantheon Macroeconomics calculated in February 2026&lt;/a&gt;, U.S. corporate capex would currently be negative. The entire infrastructure investment story depends on this cycle continuing: total U.S. GDP grew just 1.4% annualized in H1 2025, and AI-related investment accounted for essentially all of it.&lt;/p&gt;
&lt;aside class="disclaimer" role="note" aria-label="Disclaimer"&gt;
&lt;div class="disclaimer-content"&gt;&lt;p&gt;&lt;strong&gt;Disclaimer:&lt;/strong&gt; All opinions expressed are my own. This is not investment, financial, tax, or legal advice. Past performance does not indicate future results. Do your own research and consult qualified professionals before making financial decisions. No liability accepted for any losses.&lt;/p&gt;&lt;/div&gt;
&lt;/aside&gt;</description></item><item><title>93% of Developers Use AI Coding Tools. Productivity Hasn't Moved.</title><link>http://philippdubach.com/posts/93-of-developers-use-ai-coding-tools.-productivity-hasnt-moved./</link><pubDate>Wed, 04 Mar 2026 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/93-of-developers-use-ai-coding-tools.-productivity-hasnt-moved./</guid><description>&lt;p&gt;A &lt;a href="https://arxiv.org/abs/2507.09089"&gt;study published in July 2025&lt;/a&gt; gave AI coding tools their most credible test yet. Sixteen experienced open-source developers, 246 real tasks, randomized controlled design. The researchers expected to measure how much faster AI made them. What they found: developers using AI took &lt;strong&gt;19% longer&lt;/strong&gt; to complete tasks than those working without it.&lt;/p&gt;
&lt;p&gt;The developers themselves thought they were 20% faster.&lt;/p&gt;
&lt;p&gt;That &lt;strong&gt;39-point gap&lt;/strong&gt; between perception and reality is the most important number in &lt;a href="https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/"&gt;METR&amp;rsquo;s paper&lt;/a&gt;. It lands inside two years of adoption data pointing in the opposite direction. &lt;a href="https://getdx.com/"&gt;DX&lt;/a&gt; surveyed 121,000 developers across 450+ companies and found &lt;strong&gt;92.6%&lt;/strong&gt; use AI coding tools at least monthly. &lt;a href="https://blog.jetbrains.com/ai/2026/02/the-best-ai-models-for-coding-accuracy-integration-and-developer-fit/"&gt;JetBrains&amp;rsquo; AI Pulse&lt;/a&gt; measured 93%. The &lt;a href="https://dora.dev/dora-report-2025"&gt;DORA 2025 report&lt;/a&gt; put it at 90%. On the productivity side: six independent research efforts converge on roughly the same ceiling, &lt;strong&gt;10%&lt;/strong&gt; at the system level, if you&amp;rsquo;re being generous.&lt;a href="#lightbox-ai-coding-perception-gap-png-0" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/ai-coding-perception-gap.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/ai-coding-perception-gap.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/ai-coding-perception-gap.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/ai-coding-perception-gap.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/ai-coding-perception-gap.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/ai-coding-perception-gap.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/ai-coding-perception-gap.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/ai-coding-perception-gap.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/ai-coding-perception-gap.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/ai-coding-perception-gap.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/ai-coding-perception-gap.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/ai-coding-perception-gap.png"
alt="Exhibit showing METR study results: developers using AI took 19% longer to complete tasks while believing they were 20% faster, a 39-point perception gap across 246 tasks with 56% of AI suggestions rejected"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/p&gt;
&lt;h2 id="the-bottleneck-was-never-the-typing"&gt;The bottleneck was never the typing&lt;/h2&gt;
&lt;p&gt;&lt;a href="https://www.tocinstitute.org/theory-of-constraints.html"&gt;Goldratt&amp;rsquo;s Theory of Constraints&lt;/a&gt; makes the following prediction: optimizing a step that isn&amp;rsquo;t the bottleneck doesn&amp;rsquo;t improve system throughput. You can make the fastest machine on the factory floor twice as fast. If it&amp;rsquo;s feeding a queue that&amp;rsquo;s already backed up, you&amp;rsquo;ve accomplished nothing at the output level.&lt;/p&gt;
&lt;p&gt;Writing code has never been that bottleneck. &lt;a href="https://www.bain.com/insights/from-pilots-to-payoff-generative-ai-in-software-development-technology-report-2025/"&gt;Bain&amp;rsquo;s analysis&lt;/a&gt; found that writing and testing code accounts for roughly 25-35% of the total software development lifecycle. The rest goes to code review, understanding requirements, debugging, meetings, documentation. Even with a 100% speedup on the coding step, that gives you a 15-25% overall improvement, and that&amp;rsquo;s before accounting for what happens downstream when you generate a lot more code. Gergely Orosz, who runs The Pragmatic Engineer, &lt;a href="https://aws.amazon.com/blogs/enterprise-strategy/measuring-the-impact-of-ai-assistants-on-software-development/"&gt;put it directly&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Speed of typing out code has never been the bottleneck for software development.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;What the data shows now is that AI tools don&amp;rsquo;t just fail to clear the bottleneck. They move it downstream and make it worse. &lt;a href="#lightbox-ai-coding-impact-ceiling-png-1" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/ai-coding-impact-ceiling.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/ai-coding-impact-ceiling.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/ai-coding-impact-ceiling.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/ai-coding-impact-ceiling.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/ai-coding-impact-ceiling.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/ai-coding-impact-ceiling.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/ai-coding-impact-ceiling.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/ai-coding-impact-ceiling.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/ai-coding-impact-ceiling.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/ai-coding-impact-ceiling.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/ai-coding-impact-ceiling.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/ai-coding-impact-ceiling.png"
alt="Exhibit showing coding is 25-35% of the software development lifecycle with developers writing code only 52 minutes per day, meaning even a 100% coding speedup yields at most 15% system improvement under Amdahl&amp;#39;s Law"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/p&gt;
&lt;h2 id="the-code-review-bottleneck"&gt;The code review bottleneck&lt;/h2&gt;
&lt;p&gt;&lt;a href="https://www.faros.ai/ai-productivity-paradox"&gt;Faros AI&lt;/a&gt; measured this across 10,000+ developers on 1,255 teams in June 2025. Teams with high AI adoption completed 21% more tasks and merged 98% more pull requests. PR size grew 154%. Then: review time up 91%, bugs up 9%, organizational DORA metrics flat.&lt;/p&gt;
&lt;p&gt;More PRs, bigger PRs, slower reviews, more bugs, no throughput improvement. The coding step accelerated. The review step, already a constraint, got worse. Michael Truell, &lt;a href="https://fortune.com/2025/12/19/cursor-ai-coding-startup-graphite-competition-heats-up/"&gt;Cursor&amp;rsquo;s CEO&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Cursor has made it much faster to write production code. However, for most engineering teams, reviewing code looks the same as it did three years ago&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Cursor then &lt;a href="https://cursor.com/blog/graphite"&gt;acquired Graphite&lt;/a&gt;, a code review startup. The acquisition is a more honest statement about where the constraint lives than anything in Cursor&amp;rsquo;s marketing. The &lt;a href="https://dora.dev/research/2024/dora-report/"&gt;DORA 2024 report&lt;/a&gt; found that for every 25 percentage point increase in AI adoption, delivery throughput dropped 1.5% and delivery stability dropped 7.2%. &lt;a href="https://dora.dev/dora-report-2025"&gt;DORA 2025&lt;/a&gt;, at 90% adoption, put it tersely: &amp;ldquo;AI doesn&amp;rsquo;t fix a team; it amplifies what&amp;rsquo;s already there.&amp;rdquo; The negative relationship with stability holds even as adoption saturates. &lt;a href="#lightbox-ai-coding-bottleneck-shift-png-2" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/ai-coding-bottleneck-shift.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/ai-coding-bottleneck-shift.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/ai-coding-bottleneck-shift.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/ai-coding-bottleneck-shift.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/ai-coding-bottleneck-shift.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/ai-coding-bottleneck-shift.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/ai-coding-bottleneck-shift.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/ai-coding-bottleneck-shift.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/ai-coding-bottleneck-shift.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/ai-coding-bottleneck-shift.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/ai-coding-bottleneck-shift.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/ai-coding-bottleneck-shift.png"
alt="Exhibit showing Faros AI data across 10,000&amp;#43; developers: high AI adoption teams merged 98% more pull requests but review time increased 91%, bugs rose 9%, and DORA delivery metrics were unchanged"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/p&gt;
&lt;h2 id="41-of-what"&gt;41% of what?&lt;/h2&gt;
&lt;p&gt;One number circulates constantly in press coverage: 41% of code is now AI-generated. It comes from Emad Mostaque, who took GitHub&amp;rsquo;s figure about the share of code accepted by Copilot users and &lt;a href="https://decrypt.co/147191/no-human-programmers-five-years-ai-stability-ceo"&gt;extrapolated it&lt;/a&gt; into a claim about all code everywhere. The original figure applied only to developers already using Copilot, a fraction of GitHub&amp;rsquo;s user base at the time. The extrapolation doesn&amp;rsquo;t hold.&lt;/p&gt;
&lt;p&gt;The more defensible numbers: &lt;a href="https://shiftmag.dev/this-cto-says-93-of-developers-use-ai-but-productivity-is-still-10-8013/"&gt;DX&amp;rsquo;s measurement across 4.2 million developers&lt;/a&gt; puts AI-generated production code at 26.9%. A &lt;a href="https://arxiv.org/abs/2506.08945"&gt;study published in Science&lt;/a&gt; found roughly 30% of Python functions from U.S. contributors on GitHub were AI-generated by late 2024. &lt;a href="https://fortune.com/2024/10/30/googles-code-ai-sundar-pichai/"&gt;Sundar Pichai&lt;/a&gt; said more than a quarter of all new code at Google is AI-generated. These numbers cluster around 25-30%.&lt;/p&gt;
&lt;p&gt;The inflated figure matters because it supports a specific argument: that AI has already crossed some threshold, that the transformation is done, that the productivity gains are already baked in. At 27%, AI is a meaningful contributor to software production. At 41%, you&amp;rsquo;re telling a different story, and the decisions that follow from it are different decisions.&lt;/p&gt;
&lt;p&gt;The quality picture at 27% is not reassuring. &lt;a href="https://www.businesswire.com/news/home/20250730694951/en/AI-Generated-Code-Poses-Major-Security-Risks-in-Nearly-Half-of-All-Development-Tasks-Veracode-Research-Reveals"&gt;Veracode tested 100+ LLMs&lt;/a&gt; across 80 coding tasks and found 45% of AI-generated code introduced OWASP Top 10 vulnerabilities. &lt;a href="https://www.coderabbit.ai/blog/state-of-ai-vs-human-code-generation"&gt;CodeRabbit&amp;rsquo;s analysis&lt;/a&gt; found AI-generated code contains 2.74x more security vulnerabilities than human-written code. &lt;a href="https://www.blackduck.com/blog/open-source-trends-ossra-report.html"&gt;Black Duck&amp;rsquo;s 2026 OSSRA report&lt;/a&gt; found vulnerabilities per codebase up 107% year over year, the mean codebase going from 280 to 581 known vulnerabilities. &lt;a href="https://thenewstack.io/martin-fowler-on-preparing-for-ais-nondeterministic-computing/"&gt;Martin Fowler&amp;rsquo;s framing&lt;/a&gt; is still the most honest I&amp;rsquo;ve seen: &amp;ldquo;Treat every slice as a PR from a rather dodgy collaborator who&amp;rsquo;s very productive in the lines-of-code sense, but you can&amp;rsquo;t trust a thing they&amp;rsquo;re doing.&amp;rdquo;&lt;/p&gt;
&lt;h2 id="perception-is-reality"&gt;Perception is reality&lt;/h2&gt;
&lt;p&gt;The 19% slowdown number has been contested, fairly: the CI is wide (+2% to +39%), the study covered experienced developers on complex codebases, and METR has acknowledged design limitations. In February 2026, &lt;a href="https://metr.org/blog/2026-02-24-uplift-update/"&gt;METR published an update&lt;/a&gt; changing their experiment design after discovering that 30-50% of invited developers declined to participate without AI access, a selection effect that biased the original sample toward developers who benefit least from AI. Their newer cohort (800+ tasks, 57 developers) showed a -4% slowdown with a CI of -15% to +9%, substantially less negative. METR&amp;rsquo;s conclusion: &amp;ldquo;AI likely provides productivity benefits in early 2026.&amp;rdquo; The perception gap and the bottleneck problem remain real, but the exact magnitude of the July 2025 finding should be read with that caveat.&lt;/p&gt;
&lt;p&gt;METR&amp;rsquo;s companion &lt;a href="https://arxiv.org/abs/2503.14499"&gt;Horizon benchmark&lt;/a&gt; (Kwa et al., 2025) puts numbers to that curve: the 50%-task-completion time horizon for Claude 3.7 Sonnet was 60 minutes. Claude Opus 4.6, released February 2026, reached 719 minutes. The doubling time from 2023 is approximately 128 days. METR frames the productivity result as a point on that trend, not a fixed constant, though they also note that their benchmark tasks are cleaner than real production work and performance on &amp;ldquo;messier&amp;rdquo; tasks may improve more slowly. But the perception gap itself is more robust than the exact slowdown figure, and it replicates.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://survey.stackoverflow.co/2025/ai/"&gt;Stack Overflow&amp;rsquo;s 2025 Developer Survey&lt;/a&gt; found favorable views of AI tools dropped from 70% to 60%, with 46% not trusting AI output and 66% citing &amp;ldquo;almost right but not quite&amp;rdquo; as their top frustration. &lt;a href="https://www.software.com/reports/code-time-report"&gt;Software.com&amp;rsquo;s monitoring&lt;/a&gt; of 250,000 developers found the median developer codes for 52 minutes per day, about 11% of a 40-hour week. The tools are fighting over 11% of the workday.&lt;/p&gt;
&lt;p&gt;A &lt;a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4945566"&gt;field experiment across 4,867 developers&lt;/a&gt; from MIT, Princeton, Wharton, and Microsoft found that above-median-tenure developers showed no significant productivity increase from AI tools. The people capable of using AI most effectively are also the people most likely to catch when it&amp;rsquo;s wrong and fix it. It&amp;rsquo;s why the tools work better for junior developers on simple tasks than for senior developers on the things that actually matter most.&lt;/p&gt;
&lt;h2 id="githubs-2022-copilot-study"&gt;GitHub&amp;rsquo;s 2022 Copilot study&lt;/h2&gt;
&lt;p&gt;&lt;a href="https://arxiv.org/abs/2302.06590"&gt;GitHub&amp;rsquo;s 2022 Copilot study&lt;/a&gt;, the &amp;ldquo;55% faster&amp;rdquo; figure, still appears in enterprise sales decks in 2026. One JavaScript task: implementing a web server with HTTP endpoints. Thirty-five completers. No assessment of output quality, test coverage, or whether the code would survive production. Confidence interval: 21% to 89%. Participants knew they were being timed for productivity.&lt;/p&gt;
&lt;p&gt;What the study actually shows is that when you pick a task specifically suited to AI assistance and measure completion time without checking correctness, AI looks fast. That&amp;rsquo;s a real finding. It&amp;rsquo;s just not the one being used to justify eight-figure licensing deals.&lt;/p&gt;
&lt;h2 id="macro-data"&gt;Macro data&lt;/h2&gt;
&lt;p&gt;&lt;a href="https://www.apolloacademy.com/waiting-for-the-ai-j-curve/"&gt;Apollo&amp;rsquo;s Torsten Slok&lt;/a&gt; wrote in early 2026: &amp;ldquo;AI is everywhere except in the incoming macroeconomic data.&amp;rdquo; An &lt;a href="https://www.nber.org/papers/w34836"&gt;NBER paper from February 2026&lt;/a&gt; surveying nearly 6,000 executives found over 80% of firms reported AI had no impact on productivity over the preceding three years. Expected improvement over the next three: 1.4%.&lt;/p&gt;
&lt;p&gt;Daron Acemoglu, who shared the 2024 Nobel Prize in Economics partly for his work on technology and labor markets, &lt;a href="https://www.nber.org/papers/w32487"&gt;projected&lt;/a&gt; a 0.5% total factor productivity increase from AI over the next decade. His reasoning: the economic value of AI concentrates in a narrow set of tasks that don&amp;rsquo;t represent enough of total economic activity to move aggregate numbers. The Bain arithmetic, at macroeconomic scale.&lt;/p&gt;
&lt;p&gt;The standard optimist response is the IT comparison: computers entered enterprises in the 1970s and 1980s without producing measurable productivity improvements for a decade, then the gains came in the mid-1990s. It&amp;rsquo;s a reasonable historical parallel. I&amp;rsquo;m genuinely uncertain whether it applies. Computers replaced manual processes wholesale. AI coding tools are a faster ingredient inside a process whose other ingredients haven&amp;rsquo;t changed: the requirements still need to be understood, the review still needs to happen, the tests still need to pass. The productivity lag might resolve. Or the structure of the workflow might mean it doesn&amp;rsquo;t, even eventually. I don&amp;rsquo;t know, and the honest answer is that nobody does yet.&lt;/p&gt;
&lt;h2 id="where-the-value-actually-lands"&gt;Where the value actually lands&lt;/h2&gt;
&lt;p&gt;Exploration is faster. When I&amp;rsquo;m working on something unfamiliar, a library I haven&amp;rsquo;t used, an API I&amp;rsquo;m integrating for the first time, the startup cost drops. A working first draft arrives in minutes rather than hours. That&amp;rsquo;s real, and I notice it. Whether it shows up in throughput metrics is a different question, and the data suggests mostly not, because the constraint was never the first draft.&lt;/p&gt;
&lt;p&gt;Boilerplate, test scaffolding, documentation: these genuinely benefit too. The tasks that are well-scoped and low-stakes if approximately wrong are where these tools earn their keep. Anyone who&amp;rsquo;s used them seriously already knew this before the research said so.&lt;/p&gt;
&lt;p&gt;Simon Willison, in an &lt;a href="https://www.npr.org/2025/10/21/nx-s1-5506141/ai-code-software-productivity-claims"&gt;NPR interview&lt;/a&gt;: &amp;ldquo;Our job is not to type code into a computer. Our job is to deliver systems that solve problems.&amp;rdquo; The tools handle the first part better than they did a year ago. The second part hasn&amp;rsquo;t changed.&lt;/p&gt;
&lt;h2 id="the-right-question"&gt;The right question&lt;/h2&gt;
&lt;p&gt;The useful product question, if the bottleneck is now review, is what makes review faster and more reliable, not what generates more code faster. AI tools that flag security issues, catch logic errors, and surface context about why code was written a certain way would attack the actual constraint. This is at least part of what Cursor is working toward with Graphite.&lt;/p&gt;
&lt;p&gt;The harder problem is cultural. &lt;a href="https://www.bain.com/insights/from-pilots-to-payoff-generative-ai-in-software-development-technology-report-2025/"&gt;Bain&lt;/a&gt; and DORA say the same thing from different angles: AI amplifies what&amp;rsquo;s already there. Teams with good review practices and clear requirements get leverage. Teams without them produce more code that still doesn&amp;rsquo;t ship on time. The organizations that most want a tool to fix their velocity tend to be the ones with the process debt that prevents any tool from working.&lt;/p&gt;
&lt;p&gt;I have no idea what the five-year picture looks like. The Solow paradox took a decade to resolve and resolved in ways nobody expected. Maybe the AI productivity gains show up in 2029 and the 2026 skeptics look naive. Genuinely possible. I try to hold that view honestly rather than dismiss it.&lt;/p&gt;
&lt;p&gt;What the data shows now: at 92.6% monthly adoption and roughly 27% of production code AI-generated, the experiment has run at real scale. Organizational throughput hasn&amp;rsquo;t moved past 10%. Experienced developers are slower with AI assistance than without it. Bugs are up, review times are up, code quality metrics are declining, and DORA stability goes the wrong way as adoption increases.&lt;/p&gt;</description></item><item><title>Peter Thiel's Physics Department</title><link>http://philippdubach.com/posts/peter-thiels-physics-department/</link><pubDate>Mon, 02 Mar 2026 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/peter-thiels-physics-department/</guid><description>&lt;p&gt;On December 11, &lt;a href="https://en.wikipedia.org/wiki/Jimmy_Carr"&gt;Jimmy Carr&lt;/a&gt; sat on the &lt;a href="https://www.youtube.com/watch?v=mWDCZIvLrS4"&gt;TRIGGERnometry podcast&lt;/a&gt; and delivered a riff that sounded like Peter Thiel&amp;rsquo;s stagnation thesis filtered through a comedian&amp;rsquo;s timing:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Minus the screens from any room, we&amp;rsquo;re living in the 1970s. Nothing&amp;rsquo;s happened in physics since &amp;lsquo;72. String theory has not got us anywhere. But if you take the compute power of AI and point it at physics, what happens? We could have a world of plenty. I hope that&amp;rsquo;s the world we live in. But it could go another way.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Two months later, on February 13, GPT-5.2 &lt;a href="https://thequantuminsider.com/2026/02/13/ai-scientist-spots-what-physicists-missed-in-gluon-scattering/"&gt;derived and formally proved&lt;/a&gt; a new result in theoretical physics: single-minus gluon scattering amplitudes, long assumed to vanish, are nonzero in the half-collinear regime. Nima Arkani-Hamed at the Institute for Advanced Study called the formulas &amp;ldquo;strikingly simple&amp;rdquo; after fifteen years of personal curiosity about the problem. Nathaniel Craig at UC Santa Barbara called it &amp;ldquo;journal-level research advancing the frontiers of theoretical physics.&amp;rdquo;&lt;/p&gt;
&lt;h2 id="thiels-stagnation-case"&gt;Thiel&amp;rsquo;s stagnation case&lt;/h2&gt;
&lt;p&gt;Carr was paraphrasing Thiel, who has been making this argument for fifteen years. The &lt;a href="https://www.scribd.com/document/61379051/What-Happened-to-the-Future-Founders-Fund-Manifesto"&gt;Founders Fund manifesto&lt;/a&gt; (2011) put it bluntly: &amp;ldquo;We wanted flying cars, instead we got 140 characters.&amp;rdquo; Thiel&amp;rsquo;s framework distinguishes progress in bits from progress in atoms: spectacular digital gains since 1970, physical-world stagnation. Tyler Cowen named the broader phenomenon the Great Stagnation. On the &lt;a href="https://singjupost.com/a-i-mars-and-immortality-are-we-dreaming-big-enough-peter-thiel-transcript/"&gt;Douthat podcast&lt;/a&gt; Thiel was more measured: &amp;ldquo;The claim was that the velocity had slowed, it wasn&amp;rsquo;t zero.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;The data supports the velocity claim. Total factor productivity growth, the metric that captures genuine scientific progress and technological improvement, ran at roughly 1.7% annually from 1947 to 1973. Since 2004, it has averaged 0.4%. Robert Gordon&amp;rsquo;s &lt;em&gt;The Rise and Fall of American Growth&lt;/em&gt; argues the &amp;ldquo;special century&amp;rdquo; of 1870 to 1970 was a one-time event. &lt;a href="https://mattsclancy.substack.com/p/science-is-getting-harder"&gt;Bloom, Jones, Van Reenen, and Webb&lt;/a&gt; showed in the &lt;em&gt;American Economic Review&lt;/em&gt; that maintaining Moore&amp;rsquo;s Law required 18x more researchers in 2014 versus 1971.&lt;/p&gt;
&lt;a href="#lightbox-tfp-growth-stagnation-png-0" style="display: block; width: 90%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/tfp-growth-stagnation.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/tfp-growth-stagnation.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/tfp-growth-stagnation.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/tfp-growth-stagnation.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/tfp-growth-stagnation.png 1200w"
sizes="90vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/tfp-growth-stagnation.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/tfp-growth-stagnation.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/tfp-growth-stagnation.png 1440w"
sizes="90vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/tfp-growth-stagnation.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/tfp-growth-stagnation.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/tfp-growth-stagnation.png 2000w"
sizes="90vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/tfp-growth-stagnation.png"
alt="Peter Thiel&amp;#39;s stagnation thesis in data: US Total Factor Productivity growth by era showing 1.7 percent annually from 1947 to 1973 during the postwar boom, collapsing to 0.5 percent from 1973 to 1996, briefly recovering to 2.0 percent during the IT revival of 1996 to 2004, then falling back to 0.4 percent from 2004 to present, a 76 percent decline from the postwar peak"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;p&gt;The Standard Model of particle physics was essentially complete by the early 1970s. Since then, we have confirmed things we already predicted: the Higgs boson (2012, 48 years after prediction), gravitational waves (2015, 99 years after Einstein), the accelerating expansion of the universe (1998). Important experimental work. But confirmations, not revolutions. No supersymmetric particles. No extra dimensions. No new fundamental energy sources. No unified field theory. String theory, the leading candidate for physics beyond the Standard Model, has produced &lt;a href="https://www.researchgate.net/publication/334607591_The_String_Theory_Landscape"&gt;zero experimentally confirmed predictions&lt;/a&gt; in 55 years and admits roughly 10^500 possible solutions, which is another way of saying it predicts everything and therefore nothing. &lt;a href="https://www.goodreads.com/author/quotes/17201066.Sabine_Hossenfelder"&gt;Sabine Hossenfelder&lt;/a&gt; captured the frustration:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Theoretical physicists used to explain what was observed. Now they try to explain why they can&amp;rsquo;t explain what was not observed.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2 id="what-ai-has-already-done-for-science"&gt;What AI has already done for science&lt;/h2&gt;
&lt;p&gt;&lt;a href="https://x.com/demishassabis/status/1845864764469334239"&gt;AlphaFold&lt;/a&gt; predicted the three-dimensional structures of 214 million proteins, solving the protein folding problem for structural biology. It won the 2024 Nobel Prize in Chemistry for Demis Hassabis and John Jumper, and has been used by over 2 million researchers in 190 countries. DeepMind&amp;rsquo;s &lt;a href="https://deepmind.google/discover/blog/millions-of-new-materials-discovered-with-deep-learning/"&gt;GNoME&lt;/a&gt; identified 2.2 million new crystal structures and 381,000 predicted-stable materials, equivalent to roughly 800 years of prior human discovery in materials science. Lawrence Berkeley Lab&amp;rsquo;s A-Lab robotically synthesized 41 of these in &lt;a href="https://deepmind.google/blog/millions-of-new-materials-discovered-with-deep-learning/"&gt;17 days&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;In fusion, &lt;a href="https://deepmind.google/blog/bringing-ai-to-the-next-generation-of-fusion-energy/"&gt;DeepMind trained a reinforcement learning system&lt;/a&gt; to autonomously control plasma in a real tokamak at EPFL, sculpting it into configurations no human operator had achieved. &lt;a href="https://engineering.princeton.edu/news/2024/02/21/engineers-use-ai-wrangle-fusion-power-grid"&gt;Princeton researchers&lt;/a&gt; predicted tearing instabilities 300 milliseconds in advance and adjusted reactor parameters in real time: the first demonstration of preventing, not just suppressing, the instabilities that have plagued fusion for decades. &lt;a href="https://www.cleanenergy-platform.com/insight/inside-taes-2025-plasma-breakthroughand-how-it-changed-fusions-trajectory"&gt;TAE Technologies&lt;/a&gt; used AI-optimized beam injection to sustain plasma above 70 million degrees C. At Lawrence Livermore, the CogSim AI framework &lt;a href="https://lasers.llnl.gov/news/llnl-researchers-employed-ai-driven-model-predict-fusion-ignition-shot"&gt;predicted a 74% probability of ignition&lt;/a&gt; days before the December 2022 shot that achieved it.&lt;/p&gt;
&lt;p&gt;Microsoft and Pacific Northwest National Lab &lt;a href="https://www.datacenterdynamics.com/en/news/microsoft-and-pnnl-use-ai-and-hpc-for-battery-materials-research/"&gt;screened 32.6 million inorganic materials&lt;/a&gt; in roughly 80 hours, identified 18 finalists, and produced a &lt;a href="https://techround.co.uk/news/microsofts-ai-powered-battery-discovery-could-replace-lithium/"&gt;working battery prototype&lt;/a&gt; using 70% less lithium within nine months. In drug discovery, at least &lt;a href="https://pmc.ncbi.nlm.nih.gov/articles/PMC11800368/"&gt;75 AI-discovered drugs&lt;/a&gt; have entered clinical trials, up from 3 in 2016, with Phase I success rates of 80 to 90% compared to the traditional 40%.&lt;/p&gt;
&lt;p&gt;And then, GPT-5.2 produced a new result in theoretical physics. A proof that human physicists had not found. The mathematical reasoning timeline tells the story. &lt;a href="https://deepmind.google/discover/blog/alphageometry-an-olympiad-level-ai-system-for-geometry/"&gt;AlphaGeometry&lt;/a&gt; solved 25 of 30 Olympiad geometry problems in January 2024. By July 2024, &lt;a href="https://deepmind.google/blog/ai-solves-imo-problems-at-silver-medal-level/"&gt;AlphaProof earned a silver medal&lt;/a&gt; at the International Mathematical Olympiad. By 2025, &lt;a href="https://deepmind.google/blog/advanced-version-of-gemini-with-deep-think-officially-achieves-gold-medal-standard-at-the-international-mathematical-olympiad/"&gt;Gemini Deep Think scored gold&lt;/a&gt;: 5 of 6 problems, 35 points, end-to-end in natural language. Terence Tao &lt;a href="https://siliconreckoner.substack.com/p/terence-tao-on-machine-assisted-proofs"&gt;revised his prediction&lt;/a&gt; for superhuman AI mathematics from 2029 to 2026.&lt;/p&gt;
&lt;h2 id="751-compute-gap"&gt;75:1 compute gap&lt;/h2&gt;
&lt;p&gt;Here is the number that matters. Big Tech spent over &lt;strong&gt;$250 billion&lt;/strong&gt; on AI infrastructure in 2024 and 2025. Total US federal AI R&amp;amp;D spending: &lt;a href="https://federalbudgetiq.com/insights/federal-ai-and-it-research-and-development-spending-analysis/"&gt;&lt;strong&gt;$3.3 billion&lt;/strong&gt; per year&lt;/a&gt;. That is a compute divide of roughly 75:1 between commercial and scientific AI investment. The &lt;a href="https://cset.georgetown.edu/article/the-nairr-pilot-estimating-compute/"&gt;NAIRR pilot&lt;/a&gt; allocated about 3.2 yottaFLOPs to academic researchers, enough to train GPT-3.5 once but not enough for a single GPT-4-class run.&lt;/p&gt;
&lt;a href="#lightbox-compute-gap-75-to-1-png-1" style="display: block; width: 90%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/compute-gap-75-to-1.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/compute-gap-75-to-1.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/compute-gap-75-to-1.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/compute-gap-75-to-1.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/compute-gap-75-to-1.png 1200w"
sizes="90vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/compute-gap-75-to-1.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/compute-gap-75-to-1.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/compute-gap-75-to-1.png 1440w"
sizes="90vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/compute-gap-75-to-1.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/compute-gap-75-to-1.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/compute-gap-75-to-1.png 2000w"
sizes="90vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/compute-gap-75-to-1.png"
alt="The 75 to 1 AI compute gap between industry and science: Big Tech AI capex at over 250 billion dollars per year versus total federal AI R&amp;amp;D spending at 3.3 billion, DOE FASST at 2.4 billion authorized but pending, DOE Genesis at 320 million one-time, and NSF core AI at 494 million per year"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;p&gt;The DOE&amp;rsquo;s &lt;a href="https://www.anl.gov/article/what-were-argonnes-top-science-research-breakthroughs-in-2025"&gt;Genesis Mission&lt;/a&gt; announced $320 million in December 2025. That is less than what Meta spends on AI infrastructure in a week. The &lt;a href="https://federalbudgetiq.com/insights/federal-ai-and-it-research-and-development-spending-analysis/"&gt;FASST initiative&lt;/a&gt; authorized $2.4 billion per year for five years, $12 billion total, but congressional appropriations are still pending. The US has three exascale supercomputers at national labs. These serve all of science, not just AI.&lt;/p&gt;
&lt;p&gt;If AI has already produced results in theoretical physics, materials science, fusion energy, and drug discovery with what amounts to scraps from the commercial table, what happens when someone makes a serious allocation? &lt;a href="https://fortune.com/2026/02/11/demis-hassabis-nobel-google-deepmind-predicts-ai-renaissance-radical-abundance/"&gt;Hassabis told Fortune&lt;/a&gt; in February 2026 that in 10 to 15 years &amp;ldquo;we&amp;rsquo;ll be in a kind of new golden era of discovery, a kind of new renaissance.&amp;rdquo; He described a vision of &amp;ldquo;radical abundance&amp;rdquo; where AI has &amp;ldquo;successfully bottled the scientific method.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;&lt;a href="https://www.goldmansachs.com/insights/articles/generative-ai-could-raise-global-gdp-by-7-percent"&gt;Goldman Sachs estimates&lt;/a&gt; generative AI could raise global GDP by 7%, roughly $7 trillion. &lt;a href="https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-next-innovation-revolution-powered-by-ai"&gt;McKinsey pegs&lt;/a&gt; R&amp;amp;D-specific value at $360 to $560 billion annually, but explicitly noted they did not attempt to estimate&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;the value of truly breakthrough innovations that transform markets (if, for example, nuclear fusion was to enable limitless, clean electricity production).&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2 id="the-bear-case-pattern-matching-is-not-physics"&gt;The bear case: pattern matching is not physics&lt;/h2&gt;
&lt;p&gt;The bear case is simple and serious. AI is the best pattern-matching system ever built. Physics does not advance by pattern matching. It advances by conceptual revolution: Riemannian geometry for general relativity, an entirely new mathematical framework for quantum mechanics, gauge theory for the Standard Model. None of these were discoverable in existing data.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://medium.com/@abdullrahmanburhan36/noam-chomsky-on-the-false-promise-of-chatgpt-18c70cda5e24"&gt;Noam Chomsky&lt;/a&gt; argued in the &lt;em&gt;New York Times&lt;/em&gt; that AI&amp;rsquo;s deepest flaw &amp;ldquo;is the absence of the most critical capacity of any intelligence: to say not only what is the case &amp;hellip; but also what is not the case and what could and could not be the case.&amp;rdquo; A commenter on &lt;a href="https://www.math.columbia.edu/~woit/wordpress/?p=15362"&gt;Peter Woit&amp;rsquo;s blog&lt;/a&gt; at Columbia spent &amp;ldquo;over 100 hours probing these models&amp;rdquo; on open problems and found they &amp;ldquo;basically never try to come up with something new&amp;rdquo; when the answer is not already in the training data.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://www.darioamodei.com/essay/machines-of-loving-grace"&gt;Dario Amodei&lt;/a&gt; was notably careful in &amp;ldquo;Machines of Loving Grace.&amp;rdquo; He predicted AI could compress 50 to 100 years of biological progress into 5 to 10 years, but on physics he hedged: particle physicists are &amp;ldquo;limited by data from particle accelerators&amp;rdquo; and &amp;ldquo;it&amp;rsquo;s not clear that they would do drastically better if they were superintelligent.&amp;rdquo; Some problems are not compute-limited. They are experiment-limited, or concept-limited, or both.&lt;/p&gt;
&lt;p&gt;Stephen Wolfram&amp;rsquo;s principle of computational irreducibility poses the hardest theoretical limit: some systems cannot be predicted by any shortcut. The only way to know what they do is to run them. If fundamental physics contains computationally irreducible problems, no amount of AI compute will crack them.&lt;/p&gt;
&lt;p&gt;But &lt;a href="https://mariokrenn.wordpress.com/"&gt;Mario Krenn&lt;/a&gt; at Max Planck offers a counterpoint from the lab bench. His team published in &lt;em&gt;Physical Review X&lt;/em&gt; on AI-discovered gravitational wave detector designs that outperform human designs, and in &lt;em&gt;Science Advances&lt;/em&gt; on an AI-discovered violation of Bell inequality with unentangled photons. He does not claim AI understands physics. He claims it finds things physicists miss: &amp;ldquo;I let the algorithm run, and within a few hours it found exactly the solution that we as human scientists couldn&amp;rsquo;t find for many weeks.&amp;rdquo;&lt;/p&gt;
&lt;a href="#lightbox-ai-science-paradox-png-3" style="display: block; width: 90%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/ai-science-paradox.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/ai-science-paradox.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/ai-science-paradox.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/ai-science-paradox.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/ai-science-paradox.png 1200w"
sizes="90vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/ai-science-paradox.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/ai-science-paradox.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/ai-science-paradox.png 1440w"
sizes="90vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/ai-science-paradox.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/ai-science-paradox.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/ai-science-paradox.png 2000w"
sizes="90vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/ai-science-paradox.png"
alt="The AI scientific discovery paradox: quantity metrics surging with 3x more papers published, 4.8x more citations received, and 33 percent more arXiv preprints, but quality metrics declining with 4.6 percent less topical territory covered, 22 percent less cross-paper engagement, and researchers herding toward the same topics"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;h2 id="two-roads"&gt;Two roads&lt;/h2&gt;
&lt;p&gt;The nuclear parallel is the one that matters. Fission was discovered in Berlin in December 1938. Hiroshima was August 1945. Seven years from pure physics to weapon. The first nuclear power plant came nine years later. Oppenheimer captured the dynamic: &amp;ldquo;When you see something that is technically sweet, you go ahead and do it, and you argue about what to do about it only after you have had your technical success.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;Every AI-accelerated physics breakthrough is inherently dual-use technology. The &lt;a href="https://www.peaknano.com/blog/the-iaea-world-fusion-outlook-2025"&gt;IAEA reports&lt;/a&gt; 35 of 45 private fusion companies expect commercial pilot plants between 2030 and 2035. Commonwealth Fusion Systems has raised roughly $3 billion. &lt;a href="https://english.news.cn/20250724/213ed7ff0e954935bd5645b30a9dafe3/c.html"&gt;China established a state-owned fusion company&lt;/a&gt; in July 2025. The fusion market is projected at $430 billion by 2030. The same plasma control AI that keeps a tokamak stable could, in principle, optimize weapons physics.&lt;/p&gt;
&lt;p&gt;I don&amp;rsquo;t know which road we&amp;rsquo;re on. I&amp;rsquo;m not sure anyone does. But the velocity of AI scientific discovery, from Olympiad geometry problems to a gold medal at the International Mathematical Olympiad to a result in theoretical physics, all within 25 months, suggests the question will be answered empirically rather than philosophically. And probably sooner than the physicists expect.&lt;/p&gt;
&lt;p&gt;The cost of intelligence has fallen roughly &lt;a href="https://blog.samaltman.com/three-observations"&gt;150x&lt;/a&gt; in two years. The cost of pointing it at physics is a policy choice, not a technical constraint. The 75:1 compute gap between commercial and scientific AI spending is the number that determines how fast this goes. Whether it should go fast is a different question entirely.&lt;/p&gt;</description></item><item><title>Every Bulge Bracket Bank Agrees on AI</title><link>http://philippdubach.com/posts/every-bulge-bracket-bank-agrees-on-ai/</link><pubDate>Sun, 01 Mar 2026 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/every-bulge-bracket-bank-agrees-on-ai/</guid><description>&lt;a href="#lightbox-pdf_covers_overview-png-0" style="display: block; width: 100%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/pdf_covers_overview.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/pdf_covers_overview.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/pdf_covers_overview.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/pdf_covers_overview.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/pdf_covers_overview.png 1200w"
sizes="100vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/pdf_covers_overview.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/pdf_covers_overview.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/pdf_covers_overview.png 1440w"
sizes="100vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/pdf_covers_overview.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/pdf_covers_overview.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/pdf_covers_overview.png 2000w"
sizes="100vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/pdf_covers_overview.png"
alt="Cover pages of 12 AI research reports from Goldman Sachs, JPMorgan, Morgan Stanley, UBS, Barclays, Bank of America, HSBC, Citi, Deutsche Bank, and Santander."
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;p&gt;I spent the last week reading 12 bank AI research reports from nine of the world&amp;rsquo;s largest financial institutions: Goldman Sachs, JPMorgan, Morgan Stanley (three separate reports), UBS, Barclays, Bank of America, HSBC, Citi, Deutsche Bank, and Santander. I wanted to understand how institutions that collectively manage trillions of dollars and employ thousands of analysts actually see this technology heading into 2026: where they agree, where they diverge, and what they&amp;rsquo;re being less than forthcoming about.&lt;/p&gt;
&lt;p&gt;What I found is useful, sometimes impressive, and &lt;em&gt;(mostly)&lt;/em&gt; worth reading.&lt;/p&gt;
&lt;h2 id="concerning-consensus"&gt;Concerning consensus&lt;/h2&gt;
&lt;p&gt;Every single institution frames AI as a general-purpose technology, not a product cycle. The analogies converge almost word-for-word: &lt;a href="https://www.goldmansachs.com/what-we-do/investment-banking/insights/articles/powering-the-ai-era/report.pdf"&gt;Goldman Sachs&lt;/a&gt; draws the line through railroads, electrification, and telecom. &lt;a href="https://www.santander.com/en/press-room/the-year-ahead-2025/the-macroeconomic-effects-of-artificial-intelligence"&gt;Santander&lt;/a&gt; deploys a formal three-stage GPT framework: steam, ICT, AI. &lt;a href="https://www.morganstanley.com/im/en-us/individual-investor/insights/tales-from-the-emerging-world/ais-silicon-backbone.html"&gt;Morgan Stanley&amp;rsquo;s semiconductor team&lt;/a&gt; writes that AI is &amp;ldquo;closer to electricity than consumer gadgets.&amp;rdquo; Deutsche Bank projects &lt;strong&gt;+$7 trillion&lt;/strong&gt; in global GDP over the decade. &lt;a href="https://www.ubs.com/global/en/wealthmanagement/insights/artificial-intelligence.html"&gt;UBS&lt;/a&gt; puts the AI revenue opportunity at &lt;strong&gt;$2.6 trillion&lt;/strong&gt; by 2030.&lt;/p&gt;
&lt;p&gt;Not one of the twelve reports seriously entertains the possibility that AI is more like 3D printing: genuinely useful in pockets, broadly disappointing in aggregate. Santander comes closest, citing &lt;a href="https://www.nber.org/papers/w32487"&gt;Daron Acemoglu&amp;rsquo;s&lt;/a&gt; conservative &lt;strong&gt;+0.7% cumulative TFP&lt;/strong&gt; estimate over ten years, but even Santander frames that as the floor of the range, not the central case. The optimistic end of the same distribution sits at &lt;strong&gt;+10–15%&lt;/strong&gt;. That&amp;rsquo;s not a rounding error. It&amp;rsquo;s a fundamental disagreement about whether AI will re-run the productivity miracle of electrification or prove more modest in aggregate, and most banks quietly pick the point on the distribution that best supports their commercial positioning.&lt;/p&gt;
&lt;p&gt;The chart below plots each bank by how bullish they are on AI&amp;rsquo;s economic impact against how grounded their analysis is in current empirical data versus forward projections. Bank of America sits alone in the top-right: data-driven and moderately bullish. Goldman sits at the bottom-right: maximally bullish, maximally projective. Santander is the lone occupant of the top-left: empirical and cautious.&lt;/p&gt;
&lt;a href="#lightbox-exhibit-1-macro-conviction1-png-1" style="display: block; width: 90%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/exhibit-1-macro-conviction1.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/exhibit-1-macro-conviction1.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/exhibit-1-macro-conviction1.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/exhibit-1-macro-conviction1.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/exhibit-1-macro-conviction1.png 1200w"
sizes="90vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/exhibit-1-macro-conviction1.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/exhibit-1-macro-conviction1.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/exhibit-1-macro-conviction1.png 1440w"
sizes="90vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/exhibit-1-macro-conviction1.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/exhibit-1-macro-conviction1.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/exhibit-1-macro-conviction1.png 2000w"
sizes="90vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/exhibit-1-macro-conviction1.png"
alt="Bank AI research reports compared on two axes: macro conviction (cautious to bullish) and evidence basis (projective to empirical). BofA is the only data-driven bull. Goldman Sachs is a projective bull. Santander is the only data-driven skeptic. Most institutions cluster in the bullish-projective quadrant."
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;p&gt;That chart is an editorial interpretation, not a precise measurement. But the shape is right. Bank of America is the only institution that consistently anchors its claims to actual GDP data rather than projections. Goldman Sachs, at the other extreme, produces a report that reads as a pitch to every infrastructure CFO and sovereign wealth fund in the world. Both can be making valid arguments. They&amp;rsquo;re just not making the same kind.&lt;/p&gt;
&lt;h2 id="whats-happening-vs-what-might-happen"&gt;What’s happening vs. what might happen&lt;/h2&gt;
&lt;p&gt;BofA and Santander are the two worth pausing on, because they&amp;rsquo;re doing something different from the rest: they&amp;rsquo;re reporting what&amp;rsquo;s happening rather than what might happen.&lt;/p&gt;
&lt;p&gt;Bank of America, using Bureau of Labor Statistics and Bureau of Economic Analysis data, finds that AI capex contributed &lt;strong&gt;1.4–1.5 percentage points&lt;/strong&gt; to US GDP growth in H1 2025. Headline growth rates were running around 2% in that period. So AI infrastructure spending was the single largest driver of US economic expansion. That&amp;rsquo;s a real number from real data, and it&amp;rsquo;s the most important figure in any of these reports.&lt;/p&gt;
&lt;p&gt;BofA also finds a &lt;em&gt;positive&lt;/em&gt; correlation between AI adoption and employment in white-collar sectors: software developers are up &lt;strong&gt;+17.9%&lt;/strong&gt;, while insurance appraisers, a role where AI substitutes directly for human judgment, are down &lt;strong&gt;-20%&lt;/strong&gt;. The disruption is concentrated in specific tasks. It hasn&amp;rsquo;t shown up in aggregate employment. Yet.&lt;/p&gt;
&lt;p&gt;Then there&amp;rsquo;s Santander, which writes the most academically rigorous report of the twelve and includes numbers the consensus would rather not linger on. The enterprise AI adoption rate data is sobering: only around &lt;strong&gt;10% of US companies&lt;/strong&gt; are actually using AI to produce goods and services. &lt;strong&gt;42% of companies abandoned GenAI projects in 2024&lt;/strong&gt;, a figure corroborated by &lt;a href="https://mlq.ai/media/quarterly_decks/v0.1_State_of_AI_in_Business_2025_Report.pdf"&gt;MIT&amp;rsquo;s 2025 GenAI Divide research&lt;/a&gt;, which found 95% of enterprise pilots fail to reach production. Only &lt;strong&gt;1%&lt;/strong&gt; of companies describe their rollouts as mature. Meanwhile, 78% say they use AI in at least one function. The gap between &amp;ldquo;we have a pilot&amp;rdquo; and &amp;ldquo;this is generating value&amp;rdquo; is enormous.&lt;/p&gt;
&lt;p&gt;Goldman&amp;rsquo;s &lt;strong&gt;$800 million per day&lt;/strong&gt; in hyperscaler capex and Santander&amp;rsquo;s 42% abandonment rate aren&amp;rsquo;t as contradictory as they look. Capex precedes productivity in every infrastructure cycle. That part is historically unambiguous. The question is how long the gap lasts, and whether the eventual productivity gains justify what&amp;rsquo;s been spent getting there.&lt;/p&gt;
&lt;h2 id="dotcom-comparison"&gt;Dotcom comparison&lt;/h2&gt;
&lt;p&gt;Every report that addresses the bubble question reaches the same conclusion: this isn&amp;rsquo;t the late 1990s.&lt;/p&gt;
&lt;p&gt;The primary evidence is valuation. Nvidia trades at &lt;strong&gt;25–30x forward earnings&lt;/strong&gt; versus Cisco&amp;rsquo;s &lt;strong&gt;~140x&lt;/strong&gt; at the March 2000 peak. The Magnificent 6 sit at roughly &lt;strong&gt;35x&lt;/strong&gt; versus &lt;strong&gt;55x&lt;/strong&gt; for the TMT index at its apex. &lt;a href="https://www.morganstanley.com/im/en-us/individual-investor/insights/tales-from-the-emerging-world/ais-silicon-backbone.html"&gt;Morgan Stanley&amp;rsquo;s Silicon Backbone report&lt;/a&gt; makes this comparison rigorously, and I think they&amp;rsquo;re right that the earnings quality is categorically different from dot-com era technology stocks.&lt;/p&gt;
&lt;p&gt;But the comparison works less cleanly when you look at concentration rather than individual valuations. Deutsche Bank notes that the top 10 S&amp;amp;P 500 companies now represent &lt;strong&gt;40% of total market cap&lt;/strong&gt;, an extreme not seen at the dot-com peak. A &lt;a href="https://www.investing.com/news/stock-market-news/bofas-survey-shows-54-of-investors-say-ai-in-bubble-60-say-stocks-overvalued-4284842"&gt;Bank of America fund manager survey&lt;/a&gt; from October 2025 found &lt;strong&gt;54% of global managers believe AI equities are in a bubble&lt;/strong&gt;, and &lt;strong&gt;60% view global equities as overvalued&lt;/strong&gt;. You can simultaneously hold that Nvidia&amp;rsquo;s PE is reasonable and that a portfolio with 40% weight in ten companies carries concentration risk that PE comparisons don&amp;rsquo;t capture. Reassuring on one axis. Alarming on another. Most sell-side AI research cites whichever data point supports its preferred conclusion and leaves the tension sitting there unaddressed.&lt;/p&gt;
&lt;p&gt;There&amp;rsquo;s also a subtler version of the bubble question that none of the twelve reports asks directly. The &amp;ldquo;infrastructure comes before productivity&amp;rdquo; argument is historically correct: railroads were overbuilt before they transformed commerce; the internet fibre glut of 1999–2000 eventually became the backbone of the digital economy. But the investors who financed Global Crossing and 360networks still lost everything. The infrastructure thesis being correct in the long run isn&amp;rsquo;t the same as every current valuation being justified. Goldman&amp;rsquo;s report is particularly careful to avoid addressing that distinction. The implicit message, &amp;ldquo;we financed the pipes before and it worked out,&amp;rdquo; skips past the question of which financiers got paid and which got wiped out in the transition.&lt;/p&gt;
&lt;h2 id="sell-side"&gt;Sell side&lt;/h2&gt;
&lt;p&gt;The following chart maps risk awareness against bullishness of tone, and the clustering is revealing.&lt;/p&gt;
&lt;a href="#lightbox-exhibit-3-risk-bullishness1-png-4" style="display: block; width: 90%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/exhibit-3-risk-bullishness1.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/exhibit-3-risk-bullishness1.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/exhibit-3-risk-bullishness1.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/exhibit-3-risk-bullishness1.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/exhibit-3-risk-bullishness1.png 1200w"
sizes="90vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/exhibit-3-risk-bullishness1.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/exhibit-3-risk-bullishness1.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/exhibit-3-risk-bullishness1.png 1440w"
sizes="90vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/exhibit-3-risk-bullishness1.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/exhibit-3-risk-bullishness1.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/exhibit-3-risk-bullishness1.png 2000w"
sizes="90vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/exhibit-3-risk-bullishness1.png"
alt="Goldman Sachs and UBS AI research reports plotted as aggressively bullish and risk-dismissive. Santander and BofA are measured and risk-aware. HSBC is an optimistic hand-waver. Chart maps risk awareness vs bullishness of tone across 12 bank AI research reports."
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;p&gt;Goldman and UBS are in the bottom-right: aggressively bullish, risk-dismissive. Santander and BofA are in the top-left, actually wrestling with the uncertainty. HSBC is the clearest case of motivated reasoning: the report is written explicitly to stop private banking clients from panic-selling their SaaS positions after multiple quarters of multiple compression. &lt;em&gt;(Whether that advice turns out to be right is a separate question.)&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;I don&amp;rsquo;t think this makes any of these reports dishonest. But the reader needs to supply the discount rate that each institution&amp;rsquo;s interests warrant.&lt;/p&gt;
&lt;p&gt;Goldman Sachs earns advisory fees on the data centre and energy deals it describes. Barclays lends to energy infrastructure projects. Morgan Stanley is selling both EM equity exposure and second-order stock-picking strategies through its asset management arm. UBS provides a clean three-layer investment framework that maps directly to its wealth management product shelf. Citi frames AI as accelerating the electronification of markets, the very trend that drives Citi&amp;rsquo;s trading revenue. &lt;a href="https://fortune.com/2026/02/18/will-ai-destroy-jobs-deutsche-bank-asks-ai-to-predict/"&gt;Deutsche Bank&lt;/a&gt;, most self-aware of the ten, used AI to generate its AI report. The meta-commentary is right there in the methodology.&lt;/p&gt;
&lt;p&gt;Not a single report concludes &amp;ldquo;this may be overhyped and you should meaningfully reduce exposure.&amp;rdquo; Every institution has a commercial interest in the AI narrative staying bullish. That doesn&amp;rsquo;t mean the narrative is wrong. It does mean unanimous conviction from nine sell-side AI research teams is not the same thing as nine independent analyses reaching the same conclusion.&lt;/p&gt;
&lt;h2 id="second-order-ai-beneficiaries"&gt;Second-order AI beneficiaries&lt;/h2&gt;
&lt;p&gt;The next two charts contain what I think is the most interesting tension across all twelve reports.&lt;/p&gt;
&lt;a href="#lightbox-exhibit-2-value-chain1-png-5" style="display: block; width: 90%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/exhibit-2-value-chain1.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/exhibit-2-value-chain1.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/exhibit-2-value-chain1.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/exhibit-2-value-chain1.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/exhibit-2-value-chain1.png 1200w"
sizes="90vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/exhibit-2-value-chain1.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/exhibit-2-value-chain1.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/exhibit-2-value-chain1.png 1440w"
sizes="90vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/exhibit-2-value-chain1.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/exhibit-2-value-chain1.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/exhibit-2-value-chain1.png 2000w"
sizes="90vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/exhibit-2-value-chain1.png"
alt="Value chain focus vs time horizon: which banks favour first-order AI enablers (chips, data centres) vs second-order AI beneficiaries (deploying companies). Goldman Sachs and Barclays are near-term first-order plays. Morgan Stanley second-order report sits in long-term deployers quadrant."
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;a href="#lightbox-exhibit-4-disruption-timeline1-png-6" style="display: block; width: 90%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/exhibit-4-disruption-timeline1.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/exhibit-4-disruption-timeline1.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/exhibit-4-disruption-timeline1.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/exhibit-4-disruption-timeline1.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/exhibit-4-disruption-timeline1.png 1200w"
sizes="90vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/exhibit-4-disruption-timeline1.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/exhibit-4-disruption-timeline1.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/exhibit-4-disruption-timeline1.png 1440w"
sizes="90vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/exhibit-4-disruption-timeline1.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/exhibit-4-disruption-timeline1.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/exhibit-4-disruption-timeline1.png 2000w"
sizes="90vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/exhibit-4-disruption-timeline1.png"
alt="AI disruption magnitude vs timeline across 12 bank research reports. Goldman Sachs and Barclays expect large near-term disruption. Santander sees incremental long-term change. Morgan Stanley robotics and JPMorgan see radical but distant disruption. BofA sees moderate disruption already underway."
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;p&gt;&lt;a href="https://www.morganstanley.com/im/en-us/individual-investor/insights/articles/investing-in-second-order-effects.html"&gt;Morgan Stanley&amp;rsquo;s Counterpoint Global team&lt;/a&gt;, in the second-order effects report, presents historical data that should make the rest of this collection at least slightly uncomfortable. In the railroad era, Walmart&amp;rsquo;s equivalent outperformed Ford&amp;rsquo;s equivalent by &lt;strong&gt;1,622x to 23x&lt;/strong&gt;. In the internet era, Netflix returned &lt;strong&gt;519x&lt;/strong&gt; versus Cisco&amp;rsquo;s &lt;strong&gt;4x&lt;/strong&gt;. It&amp;rsquo;s the same pattern every time: the companies that &lt;em&gt;use&lt;/em&gt; the infrastructure to serve customers dramatically outperform the companies that &lt;em&gt;build&lt;/em&gt; it.&lt;/p&gt;
&lt;p&gt;Yet nearly every bank&amp;rsquo;s actual investment positioning sits in Nvidia, ASML, hyperscalers, data centre REITs, nuclear utilities, overwhelmingly first-order enablers. Either the historical pattern won&amp;rsquo;t repeat this time (possible, but not argued anywhere in these reports), or there&amp;rsquo;s a valid timing explanation (first-order wins in the buildout phase, second-order wins in deployment) or most of these recommendations will look dated within five years.&lt;/p&gt;
&lt;p&gt;Morgan Stanley&amp;rsquo;s own three reports collectively make the case for second-order investing over the long run while still recommending first-order plays in the near term. That&amp;rsquo;s not quite inconsistent. But the tension deserves more acknowledgment than it gets.&lt;/p&gt;
&lt;h2 id="power"&gt;Power&lt;/h2&gt;
&lt;p&gt;If I had to pick one analytical claim that holds up regardless of where the productivity debate lands, it&amp;rsquo;s this: power is the binding constraint, and the infrastructure required to relieve it is real, expensive, and already being built.&lt;/p&gt;
&lt;p&gt;The numbers are consistent across institutions. US data centre power consumption runs at &lt;strong&gt;150–175 TWh&lt;/strong&gt; today. &lt;a href="https://www.ib.barclays/our-insights/ai-revolution-meeting-massive-infrastructure-demand.html"&gt;Barclays&lt;/a&gt; projects &lt;strong&gt;560 TWh by 2030&lt;/strong&gt;, approximately 13% of total US electricity. Goldman Sachs estimates &lt;strong&gt;60%&lt;/strong&gt; of new data centre power through 2030 will require net-new generation capacity. The US power grid has an average age of &lt;strong&gt;40 years&lt;/strong&gt;. Token consumption grew &lt;strong&gt;4,274%&lt;/strong&gt; in a single year. Data centre construction spending has grown roughly &lt;strong&gt;60% year-on-year&lt;/strong&gt; since ChatGPT launched in late 2022.&lt;/p&gt;
&lt;p&gt;Barclays frames this as a Jevons paradox: efficiency improvements in model inference will, counterintuitively, increase total energy consumption because they make AI cheaper and drive higher usage. I think that&amp;rsquo;s right. It&amp;rsquo;s exactly how personal computing and the internet played out. Every report that addresses energy lands on nuclear as the preferred long-term solution: &lt;a href="https://www.energy.gov/ne/articles/9-key-takeaways-president-trumps-executive-orders-nuclear-energy"&gt;four executive orders&lt;/a&gt; in early 2025, a 400 GW capacity target by 2050, the &lt;a href="https://www.constellationenergy.com/news/2024/Constellation-to-Launch-Crane-Clean-Energy-Center-Restoring-Jobs-and-Carbon-Free-Power-to-The-Grid.html"&gt;Three Mile Island restart&lt;/a&gt;. That consensus may prove correct. It may also be the sector where the infrastructure-before-returns gap runs longest.&lt;/p&gt;
&lt;h2 id="what-the-reports-dont-say"&gt;What the reports don&amp;rsquo;t say&lt;/h2&gt;
&lt;p&gt;The quadrant charts map where the banks are looking. They&amp;rsquo;re less revealing about what&amp;rsquo;s off the frame entirely.&lt;/p&gt;
&lt;p&gt;No report models a structured downside scenario: AI capex producing disappointing returns, hyperscalers pulling back, or a major data centre financing default triggering something worse. The closest is Santander&amp;rsquo;s 42% abandonment statistic, but even Santander doesn&amp;rsquo;t ask what happens if that number climbs to 60%.&lt;/p&gt;
&lt;p&gt;No report discusses AI safety or alignment risks. &lt;a href="https://www.ubs.com/global/en/wealthmanagement/insights/artificial-intelligence.html"&gt;UBS&lt;/a&gt; notes that AI task completion duration has doubled every seven months and explicitly references the AGI trajectory, then moves directly to investment implications, as if &amp;ldquo;AGI trajectory&amp;rdquo; carries no risk premium at all. I find that strange.&lt;/p&gt;
&lt;p&gt;The collision between AI energy demand and climate commitments gets almost no treatment. Only &lt;a href="https://www.ib.barclays/our-insights/ai-revolution-meeting-massive-infrastructure-demand.html"&gt;Barclays&lt;/a&gt; mentions that global CO2 emissions hit a record &lt;strong&gt;37.7 gigatonnes&lt;/strong&gt; &lt;a href="https://www.iea.org/reports/global-energy-review-2025/co2-emissions"&gt;in 2023&lt;/a&gt;. The institutions projecting AI consuming 13% of US electricity by 2030 don&amp;rsquo;t reconcile that with the net-zero commitments in their own sustainability reports.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://www.jpmorganchase.com/content/dam/jpmorganchase/documents/center-for-geopolitics/decoding-the-new-global-operating-system.pdf"&gt;JPMorgan&lt;/a&gt;, which provides the most detailed geopolitical analysis of the twelve, never models a Taiwan Strait disruption scenario. &lt;a href="https://www.morganstanley.com/im/en-us/individual-investor/insights/tales-from-the-emerging-world/ais-silicon-backbone.html"&gt;Morgan Stanley&lt;/a&gt; identifies Taiwan, Korea, and China as &amp;ldquo;irreplaceable&amp;rdquo; nodes in the AI hardware supply chain, while calling emerging market semiconductor exposure &amp;ldquo;long-term infrastructure participation.&amp;rdquo; Those two characterisations sit in very uncomfortable proximity, and neither report acknowledges it.&lt;/p&gt;
&lt;p&gt;I came away from this with real respect for several of these pieces, particularly BofA&amp;rsquo;s empirical rigour and Santander&amp;rsquo;s willingness to cite unflattering numbers. The energy infrastructure thesis seems to me the most durable of the lot: the power bottleneck is real regardless of where you land on the productivity question.&lt;/p&gt;
&lt;p&gt;But I also came away convinced that this consensus is shaped as much by institutional incentive as by analytical independence. When nine institutions with combined AI-related revenue exposure in the hundreds of billions all agree you should increase AI exposure, the interesting question isn&amp;rsquo;t whether they&amp;rsquo;re right. They may well be.&lt;/p&gt;</description></item><item><title>When AI Labs Become Defense Contractors</title><link>http://philippdubach.com/posts/when-ai-labs-become-defense-contractors/</link><pubDate>Sun, 01 Mar 2026 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/when-ai-labs-become-defense-contractors/</guid><description>&lt;p&gt;&lt;a href="https://airandspace.si.edu/collection-objects/lockheed-vega-5b-amelia-earhart/nasm_A19670093000"&gt;Lockheed started by building Amelia Earhart&amp;rsquo;s favorite plane&lt;/a&gt;. Then came a government loan guarantee in 1971 (the L-1011 TriStar nearly killed the company), a Cold War, decades of consolidation, and now a business that earns &lt;a href="https://news.lockheedmartin.com/2025-01-28-Lockheed-Martin-Reports-Fourth-Quarter-and-Full-Year-2024-Financial-Results"&gt;&lt;strong&gt;92.5%&lt;/strong&gt; of its revenue from government contracts&lt;/a&gt;, with the F-35 alone accounting for &lt;strong&gt;26%&lt;/strong&gt; of its $71 billion in annual sales. The process took about 50 years. AI labs becoming defense contractors will happen faster.&lt;/p&gt;
&lt;p&gt;On February 27, 2026, two things happened within hours of each other. President Trump ordered every federal agency to &lt;a href="https://www.cnbc.com/2026/02/27/trump-anthropic-ai-pentagon.html"&gt;&amp;ldquo;IMMEDIATELY CEASE all use of Anthropic&amp;rsquo;s technology&amp;rdquo;&lt;/a&gt; after CEO Dario Amodei refused to strip safety constraints from Claude&amp;rsquo;s Pentagon deployment, &lt;a href="https://www.anthropic.com/news/statement-department-of-war"&gt;specifically prohibitions on mass domestic surveillance and fully autonomous weapons&lt;/a&gt;. Defense Secretary Pete Hegseth then labeled Anthropic a &lt;a href="https://www.cbsnews.com/news/hegseth-declares-anthropic-supply-chain-risk/"&gt;&amp;ldquo;Supply-Chain Risk to National Security,&amp;rdquo;&lt;/a&gt; a designation previously reserved for foreign adversaries like Huawei, &lt;a href="https://fortune.com/2026/02/28/openai-pentagon-deal-anthropic-designated-supply-chain-risk-unprecedented-action-damage-its-growth/"&gt;never before applied to an American company&lt;/a&gt;. That evening, Sam Altman announced that OpenAI had signed a deal to deploy its models on the Pentagon&amp;rsquo;s classified network, &lt;a href="https://x.com/sama/status/2027578652477821175"&gt;posting that the Department of War &amp;ldquo;displayed a deep respect for safety.&amp;rdquo;&lt;/a&gt; (Whether that reflects the Pentagon&amp;rsquo;s actual position or Altman&amp;rsquo;s political optimism, remains unclear for now.)&lt;/p&gt;
&lt;p&gt;Most coverage has framed this as an ethics dispute. I think that framing is going to age poorly. What I see is the economics of defense spending doing what they have always done to every company they touch, and the ethics arguments becoming less audible as the financial gravity increases.&lt;/p&gt;
&lt;h2 id="the-last-supper-and-defense-industry-consolidation"&gt;The Last Supper and defense industry consolidation&lt;/h2&gt;
&lt;p&gt;In the summer of 1993, Secretary of Defense Les Aspin and Deputy Secretary William Perry invited the CEOs of America&amp;rsquo;s defense firms to dinner at the Pentagon and told them, in so many words, that most of them would not survive. Cold War budget cuts meant the government could sustain roughly one prime contractor per equipment category. &lt;a href="https://www.defensenews.com/industry/2024/02/20/the-pentagon-wants-industry-to-transform-again-to-meet-demand-can-it/"&gt;Norman Augustine, then CEO of Martin Marietta, named it the Last Supper.&lt;/a&gt; The message was clear: consolidate or die, and the government would not stop you from consolidating.&lt;/p&gt;
&lt;p&gt;The restructuring that followed was fast, even by M&amp;amp;A standards. &lt;a href="https://en.wikipedia.org/wiki/Last_Supper_(defense_industry)"&gt;Within four years, &lt;strong&gt;51 prime defense contractors collapsed into five&lt;/strong&gt;&lt;/a&gt;: &lt;a href="https://www.ftc.gov/news-events/news/press-releases/1995/05/lockheed-corporation"&gt;Lockheed merged with Martin Marietta in 1995 ($10 billion)&lt;/a&gt;, &lt;a href="https://boeing.mediaroom.com/1997-07-31-Boeing-Completes-McDonnell-Douglas-Merger"&gt;Boeing absorbed McDonnell Douglas in 1997 ($13.3 billion)&lt;/a&gt;, Raytheon folded in Hughes Electronics and Texas Instruments&amp;rsquo; defense unit. Between 2011 and 2015, &lt;a href="https://www.defensenews.com/breaking-news/2017/12/14/american-exodus-17000-us-defense-suppliers-may-have-left-the-defense-sector/"&gt;an additional &lt;strong&gt;17,000 U.S. companies exited the defense industry&lt;/strong&gt;&lt;/a&gt;, a contraction that hollowed out the supplier base the Big Five still depend on today.&lt;/p&gt;
&lt;p&gt;The revenue dependency data shows what happens to the companies on the inside of that consolidation. Boeing before 1997 was, as &lt;a href="https://www.cnn.com/2024/01/30/business/boeing-history-of-problems"&gt;Bank of America analyst Ron Epstein put it&lt;/a&gt;, &amp;ldquo;a company where engineers were high church.&amp;rdquo; Post-merger, Boeing relocated its headquarters from Seattle&amp;rsquo;s engineering center to Chicago, physically separating leadership from manufacturing. &lt;a href="https://boeing.mediaroom.com/2025-01-28-Boeing-Reports-Fourth-Quarter-Results"&gt;Defense rose to &lt;strong&gt;35.8% of Boeing&amp;rsquo;s FY2024 revenue&lt;/strong&gt; ($23.9 billion)&lt;/a&gt;. The cultural shift that merger carried, financial discipline over engineering judgment, is what most 737 MAX post-mortems eventually trace back to. Companies don&amp;rsquo;t plan to end up here. They respond to incentives, and the incentives compound.&lt;/p&gt;
&lt;a href="#lightbox-government-revenue-dependency-png-0" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/government-revenue-dependency.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/government-revenue-dependency.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/government-revenue-dependency.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/government-revenue-dependency.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/government-revenue-dependency.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/government-revenue-dependency.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/government-revenue-dependency.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/government-revenue-dependency.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/government-revenue-dependency.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/government-revenue-dependency.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/government-revenue-dependency.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/government-revenue-dependency.png"
alt="Government revenue dependency across defense primes and AI defense contractors: Lockheed Martin at 92.5%, RTX at 55%, Boeing at 35.8%, Palantir at 53.7%, OpenAI at 5%, and Anthropic at 2%, showing how classified defense work creates a one-way revenue ratchet"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;p&gt;The AI industry will face the same incentives, just faster, and through a different mechanism: not M&amp;amp;A but access to classified networks and government-funded compute.&lt;/p&gt;
&lt;h2 id="how-pentagon-ai-spending-reshapes-a-company"&gt;How Pentagon AI spending reshapes a company&lt;/h2&gt;
&lt;p&gt;&lt;a href="https://defensescoop.com/2024/03/11/pentagon-ai-budget-request-2025/"&gt;The FY2025 DoD AI budget was &lt;strong&gt;$1.8 billion&lt;/strong&gt;&lt;/a&gt;, a figure that nearly everyone involved described as insufficient. &lt;a href="https://defensescoop.com/2025/06/26/dod-fy26-budget-request-autonomy-unmanned-systems/"&gt;The FY2026 budget request earmarks &lt;strong&gt;$13.4 billion&lt;/strong&gt; for AI and autonomous systems&lt;/a&gt;, a roughly 7x increase in a single budget cycle, and the first time these technologies have their own standalone line item inside a total defense request of &lt;strong&gt;$892.6 billion&lt;/strong&gt;. For context: &lt;a href="https://siliconangle.com/2026/02/12/anthropic-closes-30b-round-annualized-revenue-tops-14b/"&gt;Anthropic&amp;rsquo;s full annualized revenue as of February 2026 was approximately &lt;strong&gt;$14 billion&lt;/strong&gt;&lt;/a&gt;. The Pentagon just made AI a budget category larger than most of the companies selling it.&lt;/p&gt;
&lt;a href="#lightbox-dod-ai-budget-context-png-1" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/dod-ai-budget-context.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/dod-ai-budget-context.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/dod-ai-budget-context.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/dod-ai-budget-context.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/dod-ai-budget-context.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/dod-ai-budget-context.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/dod-ai-budget-context.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/dod-ai-budget-context.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/dod-ai-budget-context.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/dod-ai-budget-context.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/dod-ai-budget-context.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/dod-ai-budget-context.png"
alt="Pentagon AI budget FY2026 at $13.4 billion compared to AI lab revenues: a 7x jump from $1.8 billion in FY2025, set against Anthropic annualized revenue of $14 billion, OpenAI FY2025 revenue of $13.1 billion, and Palantir FY2025 revenue of $4.48 billion"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;p&gt;Anthropic burns an estimated $3–5 billion annually; &lt;a href="https://www.cnbc.com/2026/02/20/openai-resets-spend-expectations-targets-around-600-billion-by-2030.html"&gt;OpenAI burned approximately &lt;strong&gt;$8 billion in 2025&lt;/strong&gt;&lt;/a&gt;. Neither has a clear path to profitability before 2027 at earliest. Government AI contracts offer something consumer businesses cannot: predictable, multi-year, politically protected revenue streams that don&amp;rsquo;t churn when a competitor releases a better model.&lt;/p&gt;
&lt;p&gt;The defense procurement structures deepen that dependency over time. &lt;a href="https://www.congress.gov/crs-product/IF12558"&gt;IDIQ contracts (Indefinite Delivery, Indefinite Quantity), which now account for roughly &lt;strong&gt;56% of DoD contract award dollars&lt;/strong&gt;&lt;/a&gt;, run five years with extension options. &lt;a href="https://defensescoop.com/2025/05/23/dod-palantir-maven-smart-system-contract-increase/"&gt;Palantir&amp;rsquo;s Maven Smart System contract started at $480 million and expanded to &lt;strong&gt;nearly $1.3 billion through 2029&lt;/strong&gt;&lt;/a&gt;. The JWCC cloud contract, which replaced the &lt;a href="https://www.cnbc.com/2021/07/06/pentagon-cancels-10-billion-jedi-cloud-contract.html"&gt;cancelled $10 billion JEDI contract&lt;/a&gt;, placed over &lt;strong&gt;$3.9 billion in task orders within three years&lt;/strong&gt; of award to AWS, Google, Microsoft, and Oracle. Once embedded in classified systems, switching costs become close to prohibitive. A competitor cannot simply offer better inference speed.&lt;/p&gt;
&lt;p&gt;Security clearances are maybe the most underappreciated asset in the defense tech ecosystem. &lt;a href="https://federalnewsnetwork.com/defense-main/2025/05/dcsa-backlog-of-security-clearance-investigations-down-24/"&gt;Processing a clearance takes an average of &lt;strong&gt;243 days end-to-end&lt;/strong&gt;&lt;/a&gt;, up to a year for TS/SCI with polygraph. Only around &lt;strong&gt;4.2 million Americans&lt;/strong&gt; hold active clearances, roughly 2.5% of the labor force, and an estimated 500,000 to 700,000 cleared positions currently sit unfilled. &lt;a href="https://news.clearancejobs.com/2025/03/20/national-security-compensation-reaches-new-high-despite-workforce-challenges/"&gt;Average cleared professional compensation hit &lt;strong&gt;$119,131 in 2025&lt;/strong&gt;; full-scope-polygraph holders averaged &lt;strong&gt;$141,299&lt;/strong&gt;&lt;/a&gt;. For AI labs accustomed to hiring from MIT, Cambridge, and ETH Zürich, the cleared talent pool is thin and gets more expensive every year.&lt;/p&gt;
&lt;p&gt;Any lab serious about classified work has to build a parallel organizational structure: separate hiring pipeline, separate facilities, separate operational security requirements. The lab that builds that structure first has a moat no competitor can cross quickly.&lt;/p&gt;
&lt;h2 id="palantirs-trajectory-as-the-defense-tech-blueprint"&gt;Palantir&amp;rsquo;s trajectory as the defense tech blueprint&lt;/h2&gt;
&lt;p&gt;The clearest view of where this ends is Palantir, which has been running the experiment at scale for a decade. &lt;a href="https://www.cnbc.com/2026/02/02/palantir-pltr-q4-2025-earnings.html"&gt;It posted &lt;strong&gt;$4.48 billion in FY2025 revenue&lt;/strong&gt;, up 56% year-over-year&lt;/a&gt;, with government comprising &lt;strong&gt;53.7%&lt;/strong&gt; of the total, down from a peak of &lt;strong&gt;58.2% in 2021&lt;/strong&gt; as its commercial AIP platform gained traction. &lt;a href="https://www.army.mil/article/287506/u_s_army_awards_enterprise_service_agreement_to_enhance_military_readiness_and_drive_operational_efficiency"&gt;Its $10 billion U.S. Army Enterprise Agreement in July 2025 consolidated 75 existing software contracts into a single framework&lt;/a&gt;. Its market capitalization reached roughly &lt;strong&gt;$320 billion&lt;/strong&gt; by late February 2026, making it worth nearly twice Boeing. The model, government as the client that funds and validates the technology, commercial as the client that justifies the valuation, is what the AI labs are now building toward.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://news.crunchbase.com/venture/openai-raise-largest-ai-venture-deal-ever/"&gt;OpenAI at an &lt;strong&gt;$840 billion valuation&lt;/strong&gt;&lt;/a&gt; with a classified Pentagon network deal is already further down that road than most coverage acknowledges. It has &lt;a href="https://openai.com/index/openai-appoints-retired-us-army-general/"&gt;appointed retired General Paul Nakasone&lt;/a&gt;, former NSA director, to its board. It hired Dane Stuckey, who spent a decade at Palantir and served as its CISO for six of those years, &lt;a href="https://techcrunch.com/2024/10/15/former-palantir-ciso-dane-stuckey-joins-openai-to-lead-security/"&gt;as its own CISO&lt;/a&gt;. It has active job postings for Government Account Directors in Defense requiring Top Secret clearance and defense revenue targets exceeding $2 million per year.&lt;/p&gt;
&lt;p&gt;The publishing record is moving the same way. &lt;a href="https://openai.com/index/introducing-openai/"&gt;OpenAI&amp;rsquo;s 2015 founding post&lt;/a&gt; promised researchers &amp;ldquo;will be strongly encouraged to publish their work.&amp;rdquo; GPT-1 shipped with open-sourced code. GPT-2 was partially withheld in 2019, GPT-3 fully closed in 2020, GPT-4&amp;rsquo;s architecture undisclosed in 2023. OpenAI released smaller open-source models in August 2025 (its first since GPT-2, six years later) but they were text-only, trained on synthetic data, not frontier systems. &lt;a href="https://www.bloomberg.com/news/articles/2025-02-04/google-removes-language-on-weapons-from-public-ai-principles"&gt;Google removed the &amp;ldquo;AI applications we will not pursue&amp;rdquo; section from its principles in February 2025&lt;/a&gt;, including the explicit weapons prohibition. &lt;a href="https://about.fb.com/news/2024/11/open-source-ai-america-global-security/"&gt;Meta opened Llama to defense agencies and contractors including Lockheed Martin and Anduril in November 2024&lt;/a&gt;. Anthropic has never open-sourced a Claude model. Every major lab is moving in the same direction.&lt;/p&gt;
&lt;a href="#lightbox-openness-retreat-timeline-png-4" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/openness-retreat-timeline.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/openness-retreat-timeline.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/openness-retreat-timeline.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/openness-retreat-timeline.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/openness-retreat-timeline.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/openness-retreat-timeline.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/openness-retreat-timeline.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/openness-retreat-timeline.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/openness-retreat-timeline.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/openness-retreat-timeline.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/openness-retreat-timeline.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/openness-retreat-timeline.png"
alt="Timeline of AI lab research openness from 2015 to 2026, showing the retreat from open-source to classified military AI work: OpenAI moved from open-source GPT-1 to classified Pentagon deployment, Google removed its weapons prohibition, Meta opened Llama to defense contractors, and Anthropic was labeled a supply-chain risk"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;p&gt;The counterargument, and it&amp;rsquo;s a real one, is that defense R&amp;amp;D has historically generated civilian spillovers: ARPANET, GPS, jet engines, the semiconductor supply chain. &lt;a href="https://direct.mit.edu/rest/article/107/1/14/114751/The-Intellectual-Spoils-of-War-Defense-R-amp-D"&gt;Moretti, Steinwender, and Van Reenen, writing in the &lt;em&gt;Review of Economics and Statistics&lt;/em&gt; (2025)&lt;/a&gt;, found that a 10% increase in government-funded defense R&amp;amp;D generates a 5–6% increase in privately funded R&amp;amp;D in the same industry: crowding-in, not crowding-out. The estimated total effect: U.S. private R&amp;amp;D investment is &lt;strong&gt;$85 billion higher&lt;/strong&gt; than it would be without government defense spending.&lt;/p&gt;
&lt;p&gt;But there&amp;rsquo;s a difference between how much research gets done and what it gets pointed at. Lockheed&amp;rsquo;s R&amp;amp;D is now probably almost entirely classified hypersonics and directed-energy weapons. What it learns there does not flow back to commercial applications in any useful timeframe. The research volume expands; the scope narrows. Bell Labs devoted a substantial share of its personnel to government contracts at its Cold War peak; &lt;a href="https://cepr.org/voxeu/columns/how-antitrust-enforcement-can-spur-innovation-bell-labs-and-1956-consent-decree"&gt;the 1956 AT&amp;amp;T Consent Decree forced royalty-free patent licensing on the transistor&lt;/a&gt;, which accidentally accelerated the civilian semiconductor industry by giving Texas Instruments and Fairchild Semiconductor access to the core technology. AI labs operating under classification will not be forced to open-license anything. That mechanism does not exist for software under ITAR.&lt;/p&gt;
&lt;p&gt;I&amp;rsquo;m more confident in the direction of this analysis than in the timeline. The Anthropic supply-chain-risk designation may not survive legal challenge. The $13.4 billion FY2026 AI budget might not survive unchanged. Amodei might find a compromise that others in the industry treat as a ceiling rather than a floor. What I don&amp;rsquo;t think reverses is the structural pull. The defense budget is the largest single purchaser of advanced technology on earth, it&amp;rsquo;s growing, it operates on multi-year contract cycles that reward incumbents, and it is willing to use blunt regulatory tools against companies that don&amp;rsquo;t cooperate, as Anthropic learned in about six hours on February 27.&lt;/p&gt;
&lt;p&gt;The Last Supper logic applies here too: the government will not block consolidation, and it will not save the AI defense contractors that don&amp;rsquo;t participate. It will just find a different partner who will.&lt;/p&gt;</description></item><item><title>People Live in Levels, Not Rates</title><link>http://philippdubach.com/posts/people-live-in-levels-not-rates/</link><pubDate>Sat, 28 Feb 2026 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/people-live-in-levels-not-rates/</guid><description>&lt;blockquote&gt;
&lt;p&gt;Economics doesn&amp;rsquo;t take into account what&amp;rsquo;s best for society. The goal of economics in a capitalist system is to make the most amount of money for your shareholders.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;That&amp;rsquo;s Jon Stewart, &lt;a href="https://podcasts.apple.com/us/podcast/the-irrational-economy-with-richard-thaler/id1583132133?i=1000747991551"&gt;telling a Nobel laureate&lt;/a&gt; what his own field is about. On February 4, Stewart hosted Richard Thaler on &amp;ldquo;The Weekly Show&amp;rdquo; to discuss behavioral economics. Thaler, the Chicago Booth professor who won the 2017 Nobel for &lt;a href="https://news.uchicago.edu/story/richard-thaler-wins-nobel-prize-his-contributions-behavioural-economics"&gt;his work on how real humans deviate from rational-agent models&lt;/a&gt;, spent 92 minutes patiently explaining things Stewart had already decided weren&amp;rsquo;t true. &lt;a href="https://x.com/jasonfurman/status/2021395695081750874"&gt;Jason Furman&lt;/a&gt;, Harvard professor and former Obama CEA chair, called it &amp;ldquo;the single worst interview I&amp;rsquo;ve ever done&amp;rdquo; (referencing his own 2024 Stewart appearance). That tweet hit 754,000 views. &lt;a href="https://www.theargumentmag.com/p/jon-stewart-has-become-his-own-worst"&gt;Jerusalem Demsas&lt;/a&gt; wrote the sharpest rebuttal, arguing Stewart &amp;ldquo;has no idea what economics actually is.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;The pile-on was deserved in its specifics and wrong in its framing. Stewart got basic things wrong. He also pointed at something real that the profession keeps failing to address.&lt;/p&gt;
&lt;h2 id="the-carbon-tax-moment"&gt;The carbon tax moment&lt;/h2&gt;
&lt;p&gt;The episode&amp;rsquo;s most telling exchange involved climate policy. Thaler offered the textbook answer: a carbon tax. &lt;a href="https://singjupost.com/the-irrational-economy-w-nobel-laureate-richard-thaler-transcript/"&gt;Every economist agrees on this&lt;/a&gt;, Thaler said, and he&amp;rsquo;s roughly right. Stewart rejected it on political grounds: the moment energy prices rise, voters punish the party in power. Fair enough. But then Stewart &lt;a href="https://podcasts.happyscribe.com/the-weekly-show-with-jon-stewart/the-irrational-economy-with-richard-thaler"&gt;proposed his own solution&lt;/a&gt;: &amp;ldquo;create a model that creates robust markets in damage mitigation and carbon mitigation.&amp;rdquo; Thaler paused. That is a carbon tax. Stewart had arrived at the standard economic answer while believing he was overturning it.&lt;/p&gt;
&lt;p&gt;This moment captures the entire problem. Stewart&amp;rsquo;s instinct, that political feasibility should constrain policy design, is not a &amp;ldquo;bizarre non sequitur&amp;rdquo; as some economists claimed. It&amp;rsquo;s the reason we don&amp;rsquo;t have a carbon tax. But his conviction that economics as a discipline has nothing to say about society&amp;rsquo;s wellbeing is wrong in a way that matters. Thaler&amp;rsquo;s own career is a direct counterexample: behavioral nudges have &lt;a href="https://www.bi.team/about-us/who-we-are/"&gt;enrolled 10 million UK workers&lt;/a&gt; into pension savings, and Thaler &lt;a href="https://singjupost.com/the-irrational-economy-w-nobel-laureate-richard-thaler-transcript/"&gt;told Stewart&lt;/a&gt; that renaming an ACA plan tier from &amp;ldquo;catastrophic&amp;rdquo; to &amp;ldquo;economy&amp;rdquo; cut the uninsured rate by &lt;strong&gt;10%&lt;/strong&gt;. But I think the economists who piled on Stewart missed the more interesting question he was circling: if the economy is working well by standard measures, why does it feel broken to so many people? The answer is what I&amp;rsquo;d call the levels-vs-rates problem, and it explains both the vibecession and the trust gap between economists and the public they claim to serve.&lt;/p&gt;
&lt;h2 id="levels-vs-rates-disconnect"&gt;Levels-vs-rates disconnect&lt;/h2&gt;
&lt;p&gt;The headline data is strong. &lt;a href="https://www.bea.gov/news/2026/gross-domestic-product-3rd-quarter-2025-updated-estimate-gdp-industry-and-corporate"&gt;GDP grew 4.4% annualized in Q3 2025&lt;/a&gt;. &lt;a href="https://tradingeconomics.com/united-states/unemployment-rate"&gt;Unemployment sits at 4.3%&lt;/a&gt;. &lt;a href="https://www.cnbc.com/2026/02/13/heres-the-inflation-breakdown-for-january-2026-in-one-chart.html"&gt;Inflation has fallen to 2.4%&lt;/a&gt;, with core CPI at 2.5%, its lowest since April 2021. Real wages have outpaced inflation every month since June 2023. The S&amp;amp;P 500 posted &lt;a href="https://www.fool.com/investing/2026/01/22/the-sp-500-just-did-something-weve-never-seen-befo/"&gt;three consecutive years of double-digit gains&lt;/a&gt;, returning 86% cumulative. An economist looking at these numbers would say the economy is performing well. Thaler more or less said exactly that.&lt;/p&gt;
&lt;p&gt;But rates and levels are different things, and people live in levels. The cumulative CPI increase since early 2020 is roughly &lt;strong&gt;25%&lt;/strong&gt;. &lt;a href="https://www.traceone.com/resources/plm-compliance-blog/grocery-store-items-that-have-increased-most-in-price"&gt;Food-at-home prices are up 29.4% since March 2020&lt;/a&gt;. The $150 grocery bill became $186 and will never go back to $150. &lt;a href="https://www.cotality.com/insights/articles/2025-housing-market-moderation-and-rebalancing"&gt;Housing affordability is at its lowest point since the 1980s&lt;/a&gt;, with home prices up 30-45% from pre-pandemic levels and &lt;a href="https://www.freddiemac.com/pmms"&gt;mortgage rates near 6%&lt;/a&gt;, more than double the 2.65% pandemic low. &lt;a href="https://time.com/7327333/health-insurance-costs-increasing-2026/"&gt;ACA marketplace premiums rose 21.7-26% for 2026&lt;/a&gt;, the largest increase in nearly a decade. The rate of change has normalized. The level shift is permanent.&lt;/p&gt;
&lt;a href="#lightbox-macro-vs-sentiment-disconnect-png-0" style="display: block; width: 90%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/macro-vs-sentiment-disconnect.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/macro-vs-sentiment-disconnect.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/macro-vs-sentiment-disconnect.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/macro-vs-sentiment-disconnect.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/macro-vs-sentiment-disconnect.png 1200w"
sizes="90vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/macro-vs-sentiment-disconnect.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/macro-vs-sentiment-disconnect.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/macro-vs-sentiment-disconnect.png 1440w"
sizes="90vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/macro-vs-sentiment-disconnect.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/macro-vs-sentiment-disconnect.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/macro-vs-sentiment-disconnect.png 2000w"
sizes="90vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/macro-vs-sentiment-disconnect.png"
alt="Exhibit showing the disconnect between macro indicators and lived experience, with the left column showing GDP growth of 4.4 percent, unemployment of 4.3 percent, inflation of 2.4 percent, S&amp;amp;P 500 up 86 percent cumulative, and 17 months of real wage gains, versus the right column showing University of Michigan consumer sentiment at 57.3 in the 3rd percentile, grocery prices up 29.4 percent, mortgage rates at 6.09 percent up from 2.65 percent, ACA premiums up 26 percent, and 8 percent of households with zero or negative net worth"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;a href="#lightbox-levels-vs-rates1-png-1" style="display: block; width: 90%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/levels-vs-rates1.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/levels-vs-rates1.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/levels-vs-rates1.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/levels-vs-rates1.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/levels-vs-rates1.png 1200w"
sizes="90vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/levels-vs-rates1.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/levels-vs-rates1.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/levels-vs-rates1.png 1440w"
sizes="90vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/levels-vs-rates1.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/levels-vs-rates1.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/levels-vs-rates1.png 2000w"
sizes="90vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/levels-vs-rates1.png"
alt="Exhibit showing cumulative price increases since March 2020 by essential spending category with housing up roughly 38 percent, food at home up 29.4 percent, health insurance premiums up 26 percent, and aggregate CPI up roughly 25 percent, contrasted with the current inflation rate of 2.4 percent, illustrating that the rate has normalized but the level shift is permanent"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;p&gt;This is what Stewart was gesturing at. He expressed it as &amp;ldquo;economics doesn&amp;rsquo;t care about people,&amp;rdquo; which is wrong. What he actually meant: the metrics economists use to declare success (rates of change, aggregate growth, unemployment) don&amp;rsquo;t capture what households experience at the grocery store, the mortgage broker, or the insurance renewal. The economic perception gap isn&amp;rsquo;t irrational. It&amp;rsquo;s a measurement problem.&lt;/p&gt;
&lt;h2 id="k-shaped-wages-the-recovery-that-reversed"&gt;K-shaped wages: the recovery that reversed&lt;/h2&gt;
&lt;p&gt;The distributional story makes both sides&amp;rsquo; aggregate claims misleading. &lt;a href="https://www.epi.org/blog/low-wage-workers-faced-worsening-affordability-in-2025/"&gt;The Economic Policy Institute reported&lt;/a&gt; in February 2026 that low-wage workers&amp;rsquo; real wages declined in 2025, ending five years of historically fast gains. High earners held steady at 4.5% wage growth. &lt;a href="https://fortune.com/2025/11/10/k-shaped-economy-wage-growth-wealthiest-poorest-americans-diverge/"&gt;The lowest-income quartile fell from 7.5% to roughly 3.5%&lt;/a&gt;. The pandemic-era wage compression that closed up to one-third of the post-1979 wage gap, documented by Autor, Dube, and McGrew in their 2023 paper, has reversed. &lt;a href="https://www.cnbc.com/2026/01/30/wealth-inequality-k-shaped-economy-united-states-consumer-spending-trump.html"&gt;CNBC reported&lt;/a&gt; that the top 1% now hold roughly 32% of US net worth (about $52 trillion), while the bottom 50% hold 2.5%. Eight percent of American households have zero or negative net worth.&lt;/p&gt;
&lt;a href="#lightbox-k-shaped-wage-divergence2-png-2" style="display: block; width: 90%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/k-shaped-wage-divergence2.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/k-shaped-wage-divergence2.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/k-shaped-wage-divergence2.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/k-shaped-wage-divergence2.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/k-shaped-wage-divergence2.png 1200w"
sizes="90vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/k-shaped-wage-divergence2.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/k-shaped-wage-divergence2.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/k-shaped-wage-divergence2.png 1440w"
sizes="90vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/k-shaped-wage-divergence2.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/k-shaped-wage-divergence2.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/k-shaped-wage-divergence2.png 2000w"
sizes="90vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/k-shaped-wage-divergence2.png"
alt="Exhibit showing K-shaped wage divergence from 2021 to 2025, with bottom quartile wage growth peaking at 7.5 percent in 2022 then collapsing to 3.5 percent by late 2025, while top quartile wage growth held steady at 4.5 percent throughout, illustrating the reversal of pandemic-era wage compression that had closed up to one-third of the post-1979 inequality gap"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;p&gt;The &lt;a href="https://www.advisorperspectives.com/dshort/updates/2026/02/06/consumer-sentiments-marginal-gains-six-month-peak-still-feels-like-a-valley"&gt;University of Michigan consumer sentiment index&lt;/a&gt; sits at 57.3: the 3rd percentile of its entire historical range. The &lt;a href="https://markets.financialcontent.com/stocks/article/marketminute-2026-2-11-the-great-sentiment-schism-us-consumer-confidence-hits-12-year-low-amid-radical-partisan-divide"&gt;Conference Board&amp;rsquo;s Consumer Confidence Index hit 84.5 in January&lt;/a&gt;, a 12-year low. Charles Schwab&amp;rsquo;s Kevin Gordon &lt;a href="https://finance.yahoo.com/video/economic-data-isnt-moving-sentiment-190033538.html"&gt;coined the term &amp;ldquo;vibepression&amp;rdquo;&lt;/a&gt; in December 2025 as sentiment hit new lows. Kyla Scanlon&amp;rsquo;s &lt;a href="https://www.mercatus.org/macro-musings/kyla-scanlon-vibecession-vibe-economy-and-path-growing-american-wealth"&gt;original &amp;ldquo;vibecession&amp;rdquo; concept from 2022&lt;/a&gt; never actually resolved; it just got a bleaker name.&lt;/p&gt;
&lt;p&gt;I&amp;rsquo;m not sure consumer sentiment surveys mean much anymore. A &lt;a href="https://markets.financialcontent.com/stocks/article/marketminute-2026-2-11-the-great-sentiment-schism-us-consumer-confidence-hits-12-year-low-amid-radical-partisan-divide"&gt;50-point partisan gap&lt;/a&gt; between Republicans and Democrats renders the aggregate figure almost meaningless as an economic indicator. It tells you about political identity, not lived experience. But the affordability data underneath the sentiment numbers is not a polling artifact. &lt;a href="https://newsletter.mikekonczal.com/p/why-affordability-and-the-vibecession"&gt;Mike Konczal&amp;rsquo;s February 2026 analysis&lt;/a&gt; showed that budget shares devoted to essentials, food, shelter, transportation, and healthcare, have increased even as aggregate real incomes recovered. He called this the &amp;ldquo;essentials squeeze,&amp;rdquo; and dismissed the standard economist response (&amp;ldquo;it&amp;rsquo;s just money illusion&amp;rdquo;) as inadequate. I think he&amp;rsquo;s right.&lt;/p&gt;
&lt;h2 id="economists-communication-problem"&gt;Economists&amp;rsquo; communication problem&lt;/h2&gt;
&lt;p&gt;The &lt;a href="https://www.nominalnews.com/p/jon-stewart-thaler-economics-debate"&gt;Nominal News&lt;/a&gt; author, a PhD economist, made the argument I find most persuasive: Stewart&amp;rsquo;s view of economics is wrong, but the reason he holds it is because the profession has failed to distinguish between what it actually does and the policy opinions individual economists express in op-eds and cable news appearances. When a prominent economist dismisses wealth taxes by citing implementation costs as if costs alone settle the question, they&amp;rsquo;re blending analysis with preference. Do that enough times, and people like Stewart conclude that the entire field is an exercise in defending the status quo.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://www.theargumentmag.com/p/jon-stewart-has-become-his-own-worst"&gt;Demsas catalogued&lt;/a&gt; economics&amp;rsquo; accomplishments: Alvin Roth&amp;rsquo;s kidney exchange, RCTs delivering tutoring to 5 million Indian children, longer intervals between recessions. These are real. But the profession&amp;rsquo;s most visible public-facing moments, failing to predict 2008, designing a response perceived as rescuing banks over households, insisting the economy is strong while consumer sentiment sits near all-time lows, have eroded trust in ways that one good Substack post can&amp;rsquo;t repair.&lt;/p&gt;
&lt;p&gt;Stewart then &lt;a href="https://singjupost.com/the-wealth-of-wall-street-with-oren-cass-transcript/"&gt;brought on Oren Cass&lt;/a&gt; to discuss financialization, which &lt;a href="https://www.theargumentmag.com/p/jon-stewart-has-become-his-own-worst"&gt;prompted Demsas to write&lt;/a&gt;: &amp;ldquo;Damn, I felt bad for a second but Stewart may be beyond help.&amp;rdquo; The irony is thick. The populist left (Stewart) and the populist right (Cass) are making the same structural complaint about economics from opposite directions. Stewart says economics serves capital. Cass says economics serves free trade orthodoxy. Both are wrong about the discipline and right that its public-facing representatives have blurred the line between analysis and advocacy for decades.&lt;/p&gt;
&lt;p&gt;Here&amp;rsquo;s where I land. Nudge theory &lt;a href="https://www.sciencenews.org/article/nudge-theory-behavioral-science-psychology-structural-change"&gt;works in specific, bounded domains&lt;/a&gt;: default enrollment, plan labeling, organ donation opt-outs. A &lt;a href="https://theconversation.com/nudge-theory-what-15-years-of-research-tells-us-about-its-promises-and-politics-210534"&gt;meta-analysis by Maier et al.&lt;/a&gt; found that real-world nudges increase desired behavior by an average of &lt;strong&gt;1.4 percentage points&lt;/strong&gt; after correcting for publication bias, versus 8.7 in lab settings. Useful but limited. Stewart&amp;rsquo;s &amp;ldquo;nudge vs. shove&amp;rdquo; framing is crude. Thaler&amp;rsquo;s point that mandates become dangerous when political control shifts (&amp;ldquo;Sometimes Trump is President&amp;rdquo;) is underrated. But neither addressed the actual hard question: what do you do about a permanent 25% price level shift in essentials that no nudge can reverse and no rate-of-change metric captures?&lt;/p&gt;
&lt;p&gt;I don&amp;rsquo;t think anyone has a good answer to that. The economists who piled on Stewart for not understanding Pigouvian taxation weren&amp;rsquo;t wrong. They just weren&amp;rsquo;t answering the question their audience was asking.&lt;/p&gt;</description></item><item><title>Novo Was Europe's Most Valuable Company</title><link>http://philippdubach.com/posts/novo-was-europes-most-valuable-company/</link><pubDate>Mon, 23 Feb 2026 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/novo-was-europes-most-valuable-company/</guid><description>&lt;p&gt;Novo Nordisk was Europe&amp;rsquo;s most valuable company 20 months ago. Today its market capitalization falls behind ASML, LVMH, Hermès, L&amp;rsquo;Oréal, SAP, Prosus, Siemens, Inditex, Deutsche Telekom, and Santander.&lt;/p&gt;
&lt;p&gt;The stock has lost roughly &lt;strong&gt;75%&lt;/strong&gt; since its June 2024 peak of $142.44, falling from a &lt;strong&gt;$640 billion&lt;/strong&gt; market cap to under &lt;strong&gt;$160 billion&lt;/strong&gt;. Shares dropped another 16% this morning after CagriSema, the follow-on obesity drug that was supposed to restore Novo&amp;rsquo;s competitive story, &lt;a href="https://www.globenewswire.com/news-release/2026/02/23/3242381/0/en/Novo-Nordisk-A-S-CagriSema-demonstrated-23-weight-loss-in-an-open-label-head-to-head-REDEFINE-4-trial-in-people-with-obesity-the-primary-endpoint-was-not-achieved.html"&gt;failed its head-to-head trial&lt;/a&gt; against Eli Lilly&amp;rsquo;s Zepbound. The REDEFINE 4 results confirm what a former Novo advisor &lt;a href="https://www.alpha-sense.com/"&gt;told AlphaSense&lt;/a&gt; back in December: CagriSema is &amp;ldquo;not particularly impressive.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;I like this stock over the long term. The GLP-1 market is real, the addressable population is enormous, and Novo still sells more semaglutide than anyone. But liking a stock and holding on to it no matter the outlook are not the same thing. Or as Warren Buffett would say:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The most important thing to do if you find yourself in a hole is to stop digging&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The problems are compounding. US pricing is resetting structurally lower through MFN and IRA. The pipeline just lost its best competitive argument. International patents are falling away faster than expected. Eli Lilly is pulling ahead on every axis. And Novo &lt;a href="https://www.cnbc.com/2026/02/03/novo-nordisk-2025-earnings-wegovy-ozempic.html"&gt;guided for its &lt;strong&gt;first revenue decline in modern history&lt;/strong&gt;&lt;/a&gt; in 2026: adjusted sales down &lt;strong&gt;5-13%&lt;/strong&gt;. A former senior district sales manager at Novo described that guidance as &amp;ldquo;very tepid,&amp;rdquo; and added that the severity of the market reaction suggests investors may be pricing in further downside beyond what management disclosed.&lt;/p&gt;
&lt;h2 id="stock-collapse"&gt;Stock collapse&lt;/h2&gt;
&lt;p&gt;The speed of the decline matters. In June 2024, NVO hit $142.44. Then, in sequence: a July 2025 guidance cut after Q2 results showed US pricing headwinds worse than expected (shares dropped roughly 22% in a session). A September 2025 announcement of &lt;a href="https://www.cnbc.com/2025/09/10/wegovy-maker-novo-nordisk-to-cut-around-9000-jobs.html"&gt;9,000 job cuts&lt;/a&gt; and DKK 8 billion in restructuring charges under new CEO Maziar Mike Doustdar, read not as efficiency but as admission of trouble. February 4, 2026 full-year results &lt;a href="https://www.fiercepharma.com/pharma/novo-shares-plummet-sales-profit-warning-26"&gt;guiding adjusted sales growth at -5% to -13%&lt;/a&gt; (the stock &lt;a href="https://www.euronews.com/business/2026/02/04/novo-nordisk-stock-sinks-by-17-after-bleak-2026-forecast"&gt;cratered 18% in Copenhagen&lt;/a&gt;). And today, REDEFINE 4.&lt;/p&gt;
&lt;p&gt;The 52-week range was $43.08 to $93.80 before today&amp;rsquo;s open. NVO is now trading around $40, a new low. The all-time high was less than two years ago.&lt;/p&gt;
&lt;a href="#lightbox-novo-peer-valuation-png-0" style="display: block; width: 90%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/novo-peer-valuation.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/novo-peer-valuation.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/novo-peer-valuation.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/novo-peer-valuation.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/novo-peer-valuation.png 1200w"
sizes="90vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/novo-peer-valuation.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/novo-peer-valuation.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/novo-peer-valuation.png 1440w"
sizes="90vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/novo-peer-valuation.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/novo-peer-valuation.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/novo-peer-valuation.png 2000w"
sizes="90vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/novo-peer-valuation.png"
alt="Exhibit showing Novo Nordisk trading cheaper than every large-cap pharma peer except Pfizer, with NVO at 13x forward PE and minus 48 percent one-year return versus Eli Lilly at 30x and plus 17 percent, AstraZeneca at 20x and plus 40 percent, Merck at 14x, and AbbVie at 17x, with NVO the only company guiding for negative FY2026 revenue growth"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;p&gt;Novo now trades at a lower forward multiple than Merck and below Pfizer, which is dealing with its own post-COVID structural decline. Whether that valuation is justified is the real question.&lt;/p&gt;
&lt;a href="#lightbox-novo-valuation-compression-png-1" style="display: block; width: 90%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/novo-valuation-compression.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/novo-valuation-compression.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/novo-valuation-compression.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/novo-valuation-compression.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/novo-valuation-compression.png 1200w"
sizes="90vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/novo-valuation-compression.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/novo-valuation-compression.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/novo-valuation-compression.png 1440w"
sizes="90vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/novo-valuation-compression.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/novo-valuation-compression.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/novo-valuation-compression.png 2000w"
sizes="90vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/novo-valuation-compression.png"
alt="Exhibit showing Novo Nordisk forward PE compressing 75 percent from 41.8x in FY2023 to 13.2x in FY2026E while Eli Lilly remains at approximately 30x, with event markers for the July 2025 guidance cut, September 2025 job cuts, and February 2026 revenue decline guidance"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;h2 id="2026-guidance"&gt;2026 guidance&lt;/h2&gt;
&lt;p&gt;It is rare to see a company of Novo&amp;rsquo;s stature guide for a sales decline. This is not a biotech that lost a coin-flip Phase 3. This is the global leader in GLP-1s telling investors that revenue will shrink.&lt;/p&gt;
&lt;a href="#lightbox-novo-2026-guidance-png-2" style="display: block; width: 90%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/novo-2026-guidance.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/novo-2026-guidance.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/novo-2026-guidance.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/novo-2026-guidance.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/novo-2026-guidance.png 1200w"
sizes="90vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/novo-2026-guidance.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/novo-2026-guidance.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/novo-2026-guidance.png 1440w"
sizes="90vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/novo-2026-guidance.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/novo-2026-guidance.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/novo-2026-guidance.png 2000w"
sizes="90vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/novo-2026-guidance.png"
alt="Exhibit showing Novo Nordisk FY2026 guidance with adjusted sales growth of minus 5 to minus 13 percent CER, adjusted operating profit growth of minus 5 to minus 13 percent, reported DKK sales growth of minus 8 to minus 16 percent, and reported operating profit growth of minus 10 to minus 18 percent, with capex of DKK 55 billion and free cash flow of DKK 35-45 billion"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;a href="#lightbox-novo-revenue-growth-inflection-png-3" style="display: block; width: 90%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/novo-revenue-growth-inflection.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/novo-revenue-growth-inflection.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/novo-revenue-growth-inflection.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/novo-revenue-growth-inflection.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/novo-revenue-growth-inflection.png 1200w"
sizes="90vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/novo-revenue-growth-inflection.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/novo-revenue-growth-inflection.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/novo-revenue-growth-inflection.png 1440w"
sizes="90vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/novo-revenue-growth-inflection.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/novo-revenue-growth-inflection.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/novo-revenue-growth-inflection.png 2000w"
sizes="90vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/novo-revenue-growth-inflection.png"
alt="Exhibit showing Novo Nordisk year-over-year revenue growth from FY2016 to FY2026E, with growth accelerating from plus 10.9 percent in FY2021 to plus 31.3 percent in FY2023 before decelerating to plus 6.4 percent in FY2025 and turning negative at minus 7.4 percent in FY2026E, the first revenue decline in modern company history"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;p&gt;Three structural forces are driving the decline, each on a different timeline.&lt;/p&gt;
&lt;p&gt;The &lt;a href="https://www.whitehouse.gov/fact-sheets/2025/11/fact-sheet-president-donald-j-trump-announces-major-developments-in-bringing-most-favored-nation-pricing-to-american-patients/"&gt;November 2025 MFN deal&lt;/a&gt; with the Trump Administration cut Wegovy&amp;rsquo;s government price to &lt;strong&gt;$349/month&lt;/strong&gt; and set Medicare/Medicaid rates at roughly &lt;strong&gt;$245/month&lt;/strong&gt;, a &lt;a href="https://www.cnbc.com/2025/11/06/trump-eli-lilly-novo-nordisk-deal-obesity-drug-prices.html"&gt;60-80% reduction&lt;/a&gt; from prior list prices. Insulin was capped at $35/month. Lilly took a similar deal (Zepbound at $346/month), so neither company gained competitive advantage, but both lost pricing power permanently in the government channel. The commercial channel is following. Payers who previously paid $800-1,000 per month for Wegovy are now pointing at the government rate and demanding comparable terms.&lt;/p&gt;
&lt;p&gt;Internationally, the patent picture is worse than most investors realize. Semaglutide&amp;rsquo;s compound patent lapsed in Canada in January 2026 after Novo failed to pay a &lt;a href="https://fortune.com/2025/06/17/novo-nordisk-ozempic-wegovy-semaglutide-canada-patent-protection-fee/"&gt;maintenance fee&lt;/a&gt; of roughly CAD 250 (&lt;em&gt;on a self reflective note, maybe this story alone should have made me leave&lt;/em&gt;). Sandoz and Apotex are preparing generic launches. &lt;a href="https://www.theglobeandmail.com/business/economy/article-generic-ozempic-canada-drugmakers/"&gt;Dr. Reddy&amp;rsquo;s has filed in 87 countries&lt;/a&gt;. In China, at least 15 manufacturers are in development. Brazil&amp;rsquo;s federal court denied a patent extension. The US patent thicket (320 applications, 154 granted, settlements pushing generics to roughly 2031-32) provides breathing room domestically, but international operations generated DKK 112 billion in 2025 revenue, and the erosion has started.&lt;/p&gt;
&lt;p&gt;Meanwhile, &lt;a href="https://stateline.org/2025/11/28/states-retreat-from-covering-drugs-for-weight-loss/"&gt;several states have dropped Medicaid coverage&lt;/a&gt; for GLP-1 obesity drugs since late 2025: California, Pennsylvania, New Hampshire, South Carolina. &lt;a href="https://www.kff.org/medicaid/medicaid-coverage-of-and-spending-on-glp-1s/"&gt;Only 13 states still cover them&lt;/a&gt;. The IRA&amp;rsquo;s &lt;a href="https://www.fiercepharma.com/pharma/medicare-unveils-price-reductions-15-drugs-including-novos-semaglutide"&gt;Round 2 negotiations&lt;/a&gt;, effective January 2027, set Ozempic at &lt;strong&gt;$274/month&lt;/strong&gt; (71% below list) and Wegovy at &lt;strong&gt;$385/month&lt;/strong&gt;. With 2.3 million Medicare semaglutide users, that is a massive revenue compression event arriving in twelve months.&lt;/p&gt;
&lt;a href="#lightbox-novo-margin-erosion-png-5" style="display: block; width: 90%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/novo-margin-erosion.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/novo-margin-erosion.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/novo-margin-erosion.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/novo-margin-erosion.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/novo-margin-erosion.png 1200w"
sizes="90vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/novo-margin-erosion.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/novo-margin-erosion.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/novo-margin-erosion.png 1440w"
sizes="90vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/novo-margin-erosion.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/novo-margin-erosion.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/novo-margin-erosion.png 2000w"
sizes="90vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/novo-margin-erosion.png"
alt="Exhibit showing Novo Nordisk income statement from FY2021 to FY2026E with gross margin falling 370 basis points from 84.7 percent in FY2024 to 81.0 percent in FY2025, R&amp;amp;D spending rising from 12.6 to 16.8 percent of revenue, and EBITDA margin compressing from 50.8 to 48.4 percent, with FY2026E projecting further deterioration across all metrics"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;p&gt;A former director at Novo anticipates strong GLP-1 growth for at least five to seven years but warns that pricing pressures from biosimilars and generics will force significant price cuts in that period. Long-term share, in this person&amp;rsquo;s view, depends on real-world efficacy and the ability to secure additional indications, not on the brand franchise alone.&lt;/p&gt;
&lt;h2 id="cagrisema-a-pipeline-crisis"&gt;CagriSema: a pipeline crisis&lt;/h2&gt;
&lt;p&gt;I want to push back on the framing already circulating in some analyst notes, which is that REDEFINE 4 is &amp;ldquo;disappointing but manageable.&amp;rdquo; It is not manageable. This was the trial that was supposed to prove Novo could compete with Lilly on superior efficacy. It proved the opposite.&lt;/p&gt;
&lt;a href="#lightbox-novo-efficacy-comparison-png-6" style="display: block; width: 90%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/novo-efficacy-comparison.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/novo-efficacy-comparison.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/novo-efficacy-comparison.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/novo-efficacy-comparison.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/novo-efficacy-comparison.png 1200w"
sizes="90vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/novo-efficacy-comparison.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/novo-efficacy-comparison.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/novo-efficacy-comparison.png 1440w"
sizes="90vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/novo-efficacy-comparison.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/novo-efficacy-comparison.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/novo-efficacy-comparison.png 2000w"
sizes="90vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/novo-efficacy-comparison.png"
alt="Exhibit comparing weight loss efficacy across injectable and oral obesity drugs, showing Eli Lilly&amp;#39;s retatrutide at 28.7 percent, Zepbound at 25.5 percent, Novo&amp;#39;s CagriSema at 23.0 percent, injectable Wegovy at approximately 15 percent, Lilly&amp;#39;s orforglipron at approximately 14.7 percent, and the Wegovy pill at approximately 13.6 percent, with CagriSema trailing Zepbound by 2.5 percentage points and retatrutide by nearly 6 percentage points"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;p&gt;The 2.5 percentage point gap on the on-treatment estimand is bad enough. The 3.4 point gap on intention-to-treat is worse, because it suggests CagriSema also has a tolerability or adherence problem relative to tirzepatide. &lt;a href="https://www.nejm.org/doi/full/10.1056/NEJMoa2502081"&gt;Only 57% of REDEFINE 1 patients&lt;/a&gt; reached the highest CagriSema dose, hinting at a ceiling.&lt;/p&gt;
&lt;p&gt;A former senior director at Novo expressed disappointment with the REDEFINE trial designs, which allowed for patient down-titration, potentially diluting the efficacy signal. This person regards the asset as safe but questions its commercial strength against aggressive competition. A former Novo advisor was blunter: if Lilly&amp;rsquo;s retatrutide launches before CagriSema gains traction, it would be a &amp;ldquo;marketing car crash&amp;rdquo; for Novo, potentially relegating CagriSema to &amp;ldquo;second best&amp;rdquo; status.&lt;/p&gt;
&lt;p&gt;Novo&amp;rsquo;s management pointed to the blinded REDEFINE 11 trial (flexible dosing) and a planned higher-dose CagriSema study as paths to demonstrating &amp;ldquo;full weight-loss potential.&amp;rdquo; Maybe. But REDEFINE 11 results won&amp;rsquo;t arrive until the &lt;strong&gt;first half of 2027&lt;/strong&gt;, and by then Lilly will likely have retatrutide data showing roughly 29% weight loss, plus an approved orforglipron pill without the fasting restrictions.&lt;/p&gt;
&lt;p&gt;CagriSema will still probably get FDA approval in late 2026, based on the REDEFINE 1 and 2 placebo data. But launching a drug with clinical proof of inferiority to the market leader is a very different commercial proposition than launching one with a credible superiority story. Pricing, formulary positioning, and physician adoption all get harder. A former director at Eli Lilly told AlphaSense that Lilly&amp;rsquo;s retatrutide appears superior to both Zepbound and CagriSema based on available data, and that CagriSema lacks a compelling differentiation story, particularly on muscle preservation. The obesity market, this person believes, will double or triple over the next decade, but price reductions will be the primary driver of that expansion.&lt;/p&gt;
&lt;h2 id="eli-lilly-is-pulling-ahead"&gt;Eli Lilly is pulling ahead&lt;/h2&gt;
&lt;p&gt;This is the part I think the Novo bull case underweights. Lilly is pulling ahead on efficacy, pipeline breadth, oral convenience, manufacturing capacity, and patent duration, all at once.&lt;/p&gt;
&lt;p&gt;By end of Q3 2025, Lilly held &lt;a href="https://www.cnbc.com/2026/02/04/eli-lilly-novo-nordisk-earnings-glp1-market.html"&gt;63% of US branded anti-obesity prescription share and 57% of total US GLP-1 scripts&lt;/a&gt;. Zepbound&amp;rsquo;s Q4 US revenue was &lt;strong&gt;$4.2 billion&lt;/strong&gt; (+122% YoY). Full-year 2025 tirzepatide revenue reached &lt;a href="https://investor.lilly.com/news-releases/news-release-details/lilly-reports-fourth-quarter-2025-financial-results-and-provides"&gt;&lt;strong&gt;$36.5 billion&lt;/strong&gt;&lt;/a&gt;, making it the world&amp;rsquo;s best-selling drug molecule. Lilly guided 2026 revenue at &lt;strong&gt;$80-83 billion&lt;/strong&gt;, implying roughly 25% growth. Novo guided for a decline.&lt;/p&gt;
&lt;p&gt;Three pipeline assets make the gap worse over time.&lt;/p&gt;
&lt;p&gt;Orforglipron, Lilly&amp;rsquo;s oral non-peptide GLP-1, has an FDA decision expected April-May 2026. No food restrictions, no fasting window. It &lt;a href="https://www.clinicaltrialsarena.com/news/lillys-orforglipron-trumps-oral-semaglutide-in-head-to-head-trial/"&gt;beat oral semaglutide head-to-head&lt;/a&gt; in the ACHIEVE-3 diabetes trial. &lt;a href="https://www.goldmansachs.com/pdfs/insights/pages/gs-research/weighing-the-glp1-market/report.pdf"&gt;Goldman Sachs projects&lt;/a&gt; 60% oral GLP-1 market share by 2030. An obesity physician familiar with both compounds views the orforglipron launch as a turning point precisely because it lacks the &amp;ldquo;strict rules&amp;rdquo; associated with oral Wegovy: fasting, water restrictions, the administration burden that limits real-world compliance. If efficacy is comparable, this person argues, the lower-friction option wins.&lt;/p&gt;
&lt;p&gt;Retatrutide, the triple agonist (GLP-1/GIP/glucagon), showed &lt;a href="https://investor.lilly.com/news-releases/news-release-details/lillys-triple-agonist-retatrutide-delivered-weight-loss-average"&gt;&lt;strong&gt;28.7%&lt;/strong&gt; weight loss at 68 weeks&lt;/a&gt; in TRIUMPH-4. That is 5+ points above CagriSema&amp;rsquo;s best showing. NDA filing is projected for late 2026. &lt;a href="https://www.clinicaltrialsarena.com/news/lilly-retatrutide-data-phase-iii-trial/"&gt;GlobalData forecasts&lt;/a&gt; $15.6 billion in 2031 sales.&lt;/p&gt;
&lt;p&gt;Manufacturing: Lilly has committed &lt;a href="https://www.cnbc.com/2025/09/23/eli-lilly-plans-6point5-billion-texas-manufacturing-plant-for-obesity-pill.html"&gt;&lt;strong&gt;$50 billion+&lt;/strong&gt; in investment&lt;/a&gt; since 2020, including a &lt;a href="https://investor.lilly.com/news-releases/news-release-details/lilly-plans-build-new-65-billion-facility-manufacture-active"&gt;$6.5 billion Texas oral pill facility&lt;/a&gt;. Tirzepatide patents extend through the back half of the 2030s, giving Lilly 5-7 more years of US exclusivity than semaglutide.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://www.biospace.com/business/lillys-weight-loss-trio-could-top-100b-in-revenue-thanks-to-oral-option"&gt;Truist estimates&lt;/a&gt; Lilly&amp;rsquo;s obesity/diabetes trio could reach $101 billion in combined peak sales worldwide, before retatrutide even enters the market.&lt;/p&gt;
&lt;h2 id="the-wegovy-pill-one-bright-spot"&gt;The Wegovy pill: one bright spot&lt;/h2&gt;
&lt;p&gt;Credit where it&amp;rsquo;s due. Oral Wegovy, approved December 22, 2025 and launched January 5, 2026, &lt;a href="https://www.nbcnews.com/health/health-news/170000-people-us-are-taking-wegovy-pill-novo-nordisk-says-rcna257395"&gt;reached over 170,000 patients within four weeks&lt;/a&gt;. Weekly prescriptions hit roughly 50,000 by late January. &lt;a href="https://www.cnbc.com/2026/01/16/novo-nordisk-shares-wegovy-obesity-pill-launch.html"&gt;TD Cowen noted&lt;/a&gt; it generated roughly 15x more prescriptions than injectable Wegovy at the same post-launch stage, and double Zepbound&amp;rsquo;s trajectory.&lt;/p&gt;
&lt;p&gt;But about 90% of those prescriptions are self-pay at $149/month, because formulary coverage for the new formulation is limited. That is great for patient access and terrible for revenue per patient compared to the injectable franchise. CEO Doustdar acknowledged the tension: the pill launch is strong, but &amp;ldquo;the price hit on the existing business trumps the great pill launch.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;Clinically, oral semaglutide 25mg delivers roughly 13.6% weight loss in all-comers, well below injectable Wegovy (roughly 15%) and further below Zepbound (20%+). A former senior diabetes care specialist at Novo expressed skepticism about the oral format&amp;rsquo;s long-term success, noting the challenging administration requirements contrast poorly with Lilly&amp;rsquo;s track record of marketing easier-to-take products. A second former specialist at Novo offered the counterpoint: if Novo prices the Wegovy pill aggressively enough, it could capture share despite the convenience gap. The pricing lever is there. Whether management pulls it hard enough, fast enough, is the question.&lt;/p&gt;
&lt;p&gt;When orforglipron arrives with comparable efficacy and no fasting requirement, the Wegovy pill&amp;rsquo;s competitive position narrows. The window is months, not years.&lt;/p&gt;
&lt;h2 id="where-i-come-out"&gt;Where I come out&lt;/h2&gt;
&lt;p&gt;I keep going back and forth on this one, and I think that ambivalence is the right response.&lt;/p&gt;
&lt;p&gt;The case for buying: Novo at 11x earnings is pricing in a catastrophe. The GLP-1 market is &lt;a href="https://www.jpmorgan.com/insights/global-research/current-events/obesity-drugs"&gt;projected to reach $100-150 billion by 2030&lt;/a&gt;. Novo still has the most prescribed semaglutide franchise on the planet. The Wegovy pill launch is legitimately strong. The balance sheet is healthy (debt/equity roughly 0.67x post-Catalent, free cash flow guided at DKK 35-45 billion). The dividend yield is approaching 4%. If the obesity treatment market is multi-winner rather than winner-take-all, Novo at these levels could compound nicely over 5+ years.&lt;/p&gt;
&lt;p&gt;The case for waiting: there is no positive catalyst before May at the earliest. Orforglipron approval could arrive any day and further pressure the oral franchise. Post-CagriSema analyst target revisions haven&amp;rsquo;t happened yet. European institutional selling may have further to run. &lt;a href="https://www.marketbeat.com/stocks/NYSE/NVO/short-interest/"&gt;Short interest is under 1%&lt;/a&gt; of shares outstanding, meaning the 75% decline has been driven overwhelmingly by longs selling, not shorts pressing, with the implication that forced selling from funds that haven&amp;rsquo;t yet adjusted positions could continue. And the fundamental problem remains: Lilly has proven clinical superiority in injectables, will likely have a better oral, and has a triple agonist coming that makes both companies&amp;rsquo; current drugs look modest.&lt;/p&gt;
&lt;a href="#lightbox-novo-institutional-positioning-png-7" style="display: block; width: 90%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/novo-institutional-positioning.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/novo-institutional-positioning.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/novo-institutional-positioning.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/novo-institutional-positioning.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/novo-institutional-positioning.png 1200w"
sizes="90vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/novo-institutional-positioning.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/novo-institutional-positioning.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/novo-institutional-positioning.png 1440w"
sizes="90vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/novo-institutional-positioning.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/novo-institutional-positioning.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/novo-institutional-positioning.png 2000w"
sizes="90vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/novo-institutional-positioning.png"
alt="Exhibit showing Q4 2025 13F filing data for Novo Nordisk ADR holders, with long-only funds Capital International and Fidelity cutting positions by 36.7 percent and 28.8 percent respectively, while options desks and quant funds including Citadel in put options plus 47.4 percent, Goldman Sachs in shares plus 63.6 percent, Jane Street in put options plus 86.9 percent, and D.E. Shaw in shares plus 126.2 percent are building positions"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;p&gt;My instinct is that the stock is closer to a bottom than a top, but that the bottom may not be in yet. Forced selling, analyst downgrades from today&amp;rsquo;s CagriSema miss, and the looming orforglipron approval create a window where further downside is plausible. Barclays noted that some will call the 2026 guide a &amp;ldquo;kitchen sink&amp;rdquo; that management will beat, but as they pointed out, the same was said last year and it proved wrong.&lt;/p&gt;
&lt;p&gt;For investors with a 3-5 year horizon who can tolerate further near-term downside, this is getting interesting. For anyone who needs to see improving fundamentals before committing capital, there is no rush. The problems I&amp;rsquo;ve laid out here are structural, and structural takes quarters to fix, not days.&lt;/p&gt;
&lt;aside class="disclaimer" role="note" aria-label="Disclaimer"&gt;
&lt;div class="disclaimer-content"&gt;This analysis is based on publicly available information as of February 23, 2026. It reflects the author&amp;rsquo;s personal interpretation and opinion, not investment, financial, or legal advice. I hold no position in NVO or LLY at the time of writing. All projections involve uncertainty and forward-looking statements may prove wrong. Key data sources: Novo Nordisk annual report FY2025, Novo Nordisk press releases, Eli Lilly Q4 2025 earnings, sell-side research (DNB Carnegie, Deutsche Bank, TD Cowen, Canaccord Genuity, CFRA, KeyBanc, Jefferies), FDA.gov, CMS.gov, company filings. Expert quotes are from third-party sources, not the author&amp;rsquo;s direct conversations.&lt;/div&gt;
&lt;/aside&gt;</description></item><item><title>The Absolute Insider Mess of Prediction Markets</title><link>http://philippdubach.com/posts/the-absolute-insider-mess-of-prediction-markets/</link><pubDate>Sun, 22 Feb 2026 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/the-absolute-insider-mess-of-prediction-markets/</guid><description>&lt;p&gt;Someone at Google, or close enough to Google, &lt;a href="https://www.inc.com/ava-levinson/polymarket-million-dollar-google-win-raises-questions/91274626"&gt;deposited $3 million into Polymarket&lt;/a&gt; on December 3, 2025, bet on 23 separate &amp;ldquo;Google Year in Search&amp;rdquo; outcomes, &lt;a href="https://gizmodo.com/polymarket-user-accused-of-1-million-insider-trade-on-google-search-markets-2000696258"&gt;got 22 right&lt;/a&gt;, and walked away with &lt;strong&gt;$1.15 million&lt;/strong&gt; in profit in under 24 hours. One of those bets: that &lt;a href="https://thedefiant.io/news/defi/polymarket-users-suspect-insider-trading-after-google-trend-markets-crown-surprise-winner"&gt;d4vd would be the most-searched person of 2025&lt;/a&gt;, purchased at roughly 5 cents when the market gave it a 0.2% probability.&lt;/p&gt;
&lt;p&gt;The wallet, originally called AlphaRacoon, had previously made over $150,000 correctly &lt;a href="https://finance.yahoo.com/news/polymarket-user-makes-over-1-155738322.html"&gt;predicting the exact launch window&lt;/a&gt; of Google&amp;rsquo;s Gemini 3.0 in November 2025. As blockchain engineer &lt;a href="https://x.com/JeongHaeju"&gt;Haeju Jeong&lt;/a&gt;, who first flagged the account, put it: this is a Google insider milking Polymarket for quick money. The wallet later changed its username to 0xafEe, which might be the most half-hearted attempt at anonymity since an MIT researcher &lt;a href="https://www.cnbc.com/2017/07/13/mit-scientist-googled-insider-trading-then-got-arrested-for-insider-trading.html"&gt;Googled &amp;ldquo;how sec detect unusual trade&amp;rdquo;&lt;/a&gt; before insider trading.&lt;/p&gt;
&lt;p&gt;I&amp;rsquo;ve been following prediction markets for a while, mostly for the macro forecasting angle, but also because the regulatory ambiguity is fascinating and there are some market inefficiencies worth watching. But the last three months have produced a concentration of insider trading cases that made me want to work through the problem more carefully. The AlphaRacoon case is the most entertaining. The two that followed are more serious.&lt;/p&gt;
&lt;h2 id="three-cases-three-months-zero-enforcement"&gt;Three cases, three months, zero enforcement&lt;/h2&gt;
&lt;p&gt;On February 12, 2026, Israeli authorities &lt;a href="https://www.timesofisrael.com/two-indicted-for-using-classified-info-to-place-online-bets-on-military-operations/"&gt;indicted two people&lt;/a&gt; for using classified military intelligence to bet on Polymarket during &lt;a href="https://www.npr.org/2026/02/12/nx-s1-5712801/polymarket-bets-traders-israel-military"&gt;Israel&amp;rsquo;s 12-day war with Iran&lt;/a&gt; in June 2025. A Polymarket account called &lt;a href="https://gizmodo.com/israel-accuses-two-polymarket-bettors-of-trading-on-classified-military-operations-2000721224"&gt;&amp;ldquo;ricosuave666&amp;rdquo;&lt;/a&gt; placed seven bets on questions like &amp;ldquo;Will Israel attack Iran on Friday?&amp;rdquo; and got every one correct. The most profitable single wager: &lt;a href="https://coinpaper.com/14565/israel-indicts-two-over-polymarket-iran-bets"&gt;nearly $129,000&lt;/a&gt; that Israel would strike by a specified date. Total winnings: roughly &lt;a href="https://www.middleeasteye.net/news/israeli-soldier-indicted-allegedy-using-classified-intelligence-bet-attacks-mena"&gt;$150,000-$152,000&lt;/a&gt;. The &lt;a href="https://www.nbcnews.com/world/israel/israel-charges-reservist-classified-information-bet-polymarket-rcna258709"&gt;Shin Bet, Israel Police, and Defense Ministry&lt;/a&gt; called it a real security risk to IDF operations. This is the &lt;a href="https://www.npr.org/2026/02/12/nx-s1-5712801/polymarket-bets-traders-israel-military"&gt;first criminal prosecution&lt;/a&gt; anywhere in the world tied to prediction market insider trading.&lt;/p&gt;
&lt;p&gt;In between, there was Venezuela. On the evening of January 2, 2026, an account called &amp;ldquo;Burdensome-Mix,&amp;rdquo; created less than a week earlier, placed over $20,000 in bets that Maduro would be removed from power by January 31. Less than an hour after the final bet, &lt;a href="https://www.cbsnews.com/news/polymarket-maduro-capture-bet-400000/"&gt;Trump ordered the military strike&lt;/a&gt;. By 4:21 AM, Maduro was captured. The account&amp;rsquo;s $33,934 across 13 bets &lt;a href="https://fortune.com/2026/01/05/prediction-markets-insider-trading-problem/"&gt;returned &lt;strong&gt;$436,759&lt;/strong&gt;&lt;/a&gt;. &lt;a href="https://www.ms.now/news/lucrative-bets-on-venezuela-trigger-insider-trading-scrutiny"&gt;Chainalysis found&lt;/a&gt; the trader cashed out through mainstream U.S. exchanges with no apparent effort to hide their identity. The trader has never been identified.&lt;/p&gt;
&lt;p&gt;Each case escalates. AlphaRacoon is someone profiting from corporate knowledge. Burdensome-Mix had advance knowledge of U.S. foreign policy. The Israeli soldiers were monetizing classified operational intelligence during wartime. The surface area for insider trading on prediction markets is, to use the technical term, enormous. &lt;a href="#lightbox-three-insider-cases-png-0" style="display: block; width: 90%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/three-insider-cases.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/three-insider-cases.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/three-insider-cases.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/three-insider-cases.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/three-insider-cases.png 1200w"
sizes="90vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/three-insider-cases.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/three-insider-cases.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/three-insider-cases.png 1440w"
sizes="90vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/three-insider-cases.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/three-insider-cases.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/three-insider-cases.png 2000w"
sizes="90vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/three-insider-cases.png"
alt="Exhibit showing three insider trading cases compared side by side: AlphaRacoon with $1.15M profit from 22 of 23 Google bets using corporate information with no enforcement, Burdensome-Mix with $436K profit from 13 of 13 Maduro bets using government information with no enforcement, and ricosuave666 with $152K profit from 7 of 7 Israel-Iran strike bets using classified military intelligence prosecuted by Israel only"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/p&gt;
&lt;h2 id="now-theres-an-ai-that-hunts-them"&gt;Now there&amp;rsquo;s an AI that hunts them&lt;/h2&gt;
&lt;p&gt;Peter Liu, a former Google DeepMind research scientist now co-founding Twenty Labs, &lt;a href="https://x.com/peterjliu/status/2024901585806225723"&gt;published results&lt;/a&gt; from Compound AI&amp;rsquo;s Polymarket integration that systematically detects suspected insiders. The system built a custom database optimized for AI agent queries rather than relying on Polymarket&amp;rsquo;s rate-limited API. Liu described the agents as &amp;ldquo;super-human at making data science queries,&amp;rdquo; noting that each agent operates like 10 concurrent human analysts.&lt;/p&gt;
&lt;p&gt;Compound AI independently rediscovered AlphaRacoon despite the username change. More interestingly, it found that AlphaRacoon has friends: a user called &amp;ldquo;yicici&amp;rdquo; who made money in the same Google markets, suggesting a coordinated network rather than a lone wolf. When pointed at OpenAI, the system found accounts &amp;ldquo;oddly good at predicting OpenAI launch dates for models and products,&amp;rdquo; with at least one that exclusively traded OpenAI events.&lt;/p&gt;
&lt;p&gt;It&amp;rsquo;s not just Compound AI. &lt;a href="https://gizmodo.com/tracking-insider-trading-on-polymarket-is-turning-into-a-business-of-its-own-2000709286"&gt;Polysights&lt;/a&gt;, built by 29-year-old Canadian trader Tre Upshaw, has attracted &lt;a href="https://www.bloomberg.com/news/articles/2026-01-13/prediction-market-insider-trading-drawing-increased-scrutiny"&gt;24,000 users&lt;/a&gt; and is closing a $2 million funding round after receiving a &lt;a href="https://gizmodo.com/tracking-insider-trading-on-polymarket-is-turning-into-a-business-of-its-own-2000709286"&gt;$25,000 Polymarket grant&lt;/a&gt;. Roughly 85% of flagged trades turned out to be winners. Individual programmers have built &lt;a href="https://www.civolatility.com/p/polymarkets-insider-trading-problem"&gt;copytrading bots&lt;/a&gt; that follow suspected insiders, with one reportedly turning $5,700 into $80,000 by tailing signals during the Maduro event.&lt;/p&gt;
&lt;p&gt;The irony is rich. Blockchain&amp;rsquo;s radical transparency, the thing that was supposed to make financial markets honest, is simultaneously enabling insider detection and insider copytrading. The same data pipeline that lets Compound AI catch cheaters also lets copytraders amplify their profits.&lt;/p&gt;
&lt;h2 id="regulation"&gt;Regulation&lt;/h2&gt;
&lt;p&gt;On regulated stock markets, insider trading law is well-established. &lt;a href="https://en.wikipedia.org/wiki/SEC_Rule_10b-5"&gt;SEC Rule 10b-5&lt;/a&gt;, decades of case law, a well-staffed enforcement division, cooperation agreements with every broker-dealer in America. Everyone in the industry knows it&amp;rsquo;s illegal.&lt;/p&gt;
&lt;p&gt;On prediction markets, almost none of that infrastructure exists. SEC Rule 10b-5 doesn&amp;rsquo;t apply because &lt;a href="https://www.corporatecomplianceinsights.com/prediction-markets-sports-betting-insider-trading/"&gt;prediction market contracts are swaps, not securities&lt;/a&gt;. That puts them under the &lt;a href="https://en.wikipedia.org/wiki/Commodity_Futures_Trading_Commission"&gt;CFTC&lt;/a&gt;, which has historically focused on commodity manipulation (spoofing, cornering), not information-based trading. The CFTC has brought &lt;a href="https://www.dlnews.com/articles/regulation/prediction-markets-bend-insider-trading-rules-will-they-break/"&gt;exactly zero enforcement actions&lt;/a&gt; for prediction market insider trading.&lt;/p&gt;
&lt;p&gt;The CFTC does have &lt;a href="https://www.corporatecomplianceinsights.com/prediction-markets-sports-betting-insider-trading/"&gt;Rule 180.1&lt;/a&gt;, modeled on 10b-5, which prohibits trading on material nonpublic information. But with a distinction that matters: it requires proof of a breached &amp;ldquo;pre-existing duty.&amp;rdquo; In securities law, nearly any MNPI-based trade violates the law. In commodities law, trading on proprietary information is the entire point: a farmer trading grain futures based on their own crop outlook is how the market is supposed to work. Former CFTC Commissioner &lt;a href="https://en.wikipedia.org/wiki/Kalshi"&gt;Caroline Pham&lt;/a&gt; has argued that importing securities-law concepts into derivatives markets is analytically confused.&lt;/p&gt;
&lt;p&gt;Daniel Barabander of Variant Fund &lt;a href="https://variant.fund/articles/thoughts-law-insider-trading-prediction-markets/"&gt;published an analysis&lt;/a&gt; on February 6 that crystallized the problem. Insider trading is fundamentally about breaching a promise: a Tesla employee trading on a &amp;ldquo;Will TSLA beat Q4 estimates?&amp;rdquo; prediction market violates their confidentiality obligations. But someone who overhears investment bankers discussing a deal at a restaurant generally commits no crime, because no promise exists to breach. Prediction markets, &amp;ldquo;by making almost anything tradable,&amp;rdquo; expand valuable inside information into contexts where the existence of any relevant promise is far less clear.&lt;/p&gt;
&lt;p&gt;The strongest enforcement tool may be criminal wire fraud. At the Securities Enforcement Forum on February 5, SDNY U.S. Attorney Jay Clayton was &lt;a href="https://natlawreview.com/article/betting-future-enforcement-risks-prediction-markets"&gt;asked&lt;/a&gt; whether prediction market participants were beyond the reach of fraud statutes. His answer: &amp;ldquo;No.&amp;rdquo; Asked whether to expect enforcement actions: &amp;ldquo;Yes.&amp;rdquo; But &lt;a href="https://www.corporatecomplianceinsights.com/prediction-markets-sports-betting-insider-trading/"&gt;Polymarket&amp;rsquo;s terms of service&lt;/a&gt; don&amp;rsquo;t specifically mention insider trading, which complicates the wire fraud theory.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://en.wikipedia.org/wiki/Matt_Levine_(journalist)"&gt;Matt Levine&lt;/a&gt;, who has written about this topic at least three times between December 2025 and February 2026, &lt;a href="https://www.bloomberg.com/opinion/newsletters/2026-02-12/insider-trading-on-war"&gt;puts it best&lt;/a&gt;. His core argument: insider trading is not about fairness. It&amp;rsquo;s about theft. The problem isn&amp;rsquo;t that you have information the market doesn&amp;rsquo;t. You&amp;rsquo;re supposed to try to get information the market doesn&amp;rsquo;t; that&amp;rsquo;s the entire point of financial markets. The problem is that you&amp;rsquo;re using information that belongs to someone else, your employer or client or country, without their permission. You&amp;rsquo;ve breached a duty.&lt;/p&gt;
&lt;p&gt;This framing matters because prediction market enthusiasts instinctively believe insider trading is good for their markets: it makes prices more accurate. Levine acknowledged this directly. But he also identified the fatal flaw: if prediction markets are full of insider traders, there&amp;rsquo;d be no one to trade against. He estimated that the first 20 people to get arrested for insider trading on Kalshi &amp;ldquo;will be very surprised.&amp;rdquo;&lt;/p&gt;
&lt;h2 id="why-regulation-matters-the-lemons-problem"&gt;Why regulation matters: the lemons problem&lt;/h2&gt;
&lt;p&gt;The economic case for regulating insider trading on prediction markets goes beyond fairness or legality: it&amp;rsquo;s about whether these markets can survive.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://en.wikipedia.org/wiki/George_Akerlof"&gt;George Akerlof&amp;rsquo;s&lt;/a&gt; 1970 &lt;a href="https://en.wikipedia.org/wiki/The_Market_for_Lemons"&gt;&amp;ldquo;Market for Lemons&amp;rdquo;&lt;/a&gt; paper described a dynamic where information asymmetry between buyers and sellers causes markets to collapse. When sellers know more than buyers about product quality, buyers reduce their willingness to pay. Honest sellers with good products leave the market because they can&amp;rsquo;t get fair prices. This raises the average &amp;ldquo;lemon&amp;rdquo; rate among remaining sellers, causing more buyers to withdraw. The process continues until only lemons remain.&lt;/p&gt;
&lt;p&gt;Applied to prediction markets: if insiders consistently win, uninformed participants recognize they&amp;rsquo;re trading against counterparties with superior information and leave. Market makers widen spreads or exit entirely. Dartmouth economist &lt;a href="https://faculty.tuck.dartmouth.edu/eric-zitzewitz/"&gt;Eric Zitzewitz&lt;/a&gt;, who studies prediction markets, has stated this directly: prediction markets &amp;ldquo;require loads of uninformed investors to function&amp;rdquo; for liquidity. If liquidity providers worry about &lt;a href="https://en.wikipedia.org/wiki/Adverse_selection"&gt;adverse selection&lt;/a&gt;, they provide less liquidity, and any accuracy benefit from insider trading is more than offset by the participation loss. &lt;a href="#lightbox-adverse-selection-spiral-png-1" style="display: block; width: 90%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/adverse-selection-spiral.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/adverse-selection-spiral.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/adverse-selection-spiral.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/adverse-selection-spiral.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/adverse-selection-spiral.png 1200w"
sizes="90vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/adverse-selection-spiral.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/adverse-selection-spiral.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/adverse-selection-spiral.png 1440w"
sizes="90vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/adverse-selection-spiral.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/adverse-selection-spiral.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/adverse-selection-spiral.png 2000w"
sizes="90vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/adverse-selection-spiral.png"
alt="Exhibit showing Akerlof&amp;#39;s Market for Lemons dynamic applied to prediction markets as a five-step adverse selection spiral: Step 1 insiders profit at extreme win rates, Step 2 uninformed traders absorb systematic losses, Step 3 participants withdraw from the market, Step 4 liquidity collapses with wider spreads, Step 5 forecasting accuracy degrades as the cycle repeats"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;Wall Street firms are entering prediction markets at speed: &lt;a href="https://www.drw.com/work-at-drw/listings/prediction-markets-trader-3332253"&gt;DRW&lt;/a&gt; is building a dedicated desk at $175,000-$200,000 base salary, &lt;a href="https://news.kalshi.com/p/liquid-prediction-markets-are-finally-here"&gt;Susquehanna became Kalshi&amp;rsquo;s first official market maker&lt;/a&gt;, &lt;a href="https://www.bloomberg.com/news/articles/2026-02-09/jump-trading-poised-to-gain-stakes-in-kalshi-and-polymarket"&gt;Jump Trading&lt;/a&gt; is taking equity stakes in both platforms, and &lt;a href="https://www.cnbc.com/2026/01/15/goldman-sachs-ceo-looks-at-how-to-get-involved-in-prediction-markets.html"&gt;Goldman Sachs CEO David Solomon&lt;/a&gt; has met leadership of both Kalshi and Polymarket. These firms are there to &lt;a href="https://www.financemagnates.com/fintech/wall-street-quants-move-into-prediction-markets-to-hunt-for-arbitrage-not-to-bet/"&gt;make markets, not to bet on whether Israel will strike Iran&lt;/a&gt;. Market makers who systematically take the other side of trades bleed money when their counterparties have inside information. If the institutional players conclude the game is rigged, the resulting liquidity withdrawal would hollow out the market.&lt;/p&gt;
&lt;p&gt;Combined Polymarket and Kalshi weekly volume &lt;a href="https://europeanbusinessmagazine.com/business/prediction-markets-are-now-a-6b-a-week-industry-heres-whos-winning/"&gt;exceeded $6 billion&lt;/a&gt; by early 2026. Full-year 2025 volume across all platforms &lt;a href="https://www.gamblinginsider.com/in-depth/110180/prediction-market-statistics"&gt;reached approximately &lt;strong&gt;$44 billion&lt;/strong&gt;&lt;/a&gt;, a roughly 300x increase from early 2024. &lt;a href="https://www.npr.org/2026/01/17/nx-s1-5672615/kalshi-polymarket-prediction-market-boom-traders-slang-glossary"&gt;Bloomberg terminals now carry prediction market data&lt;/a&gt;. &lt;a href="https://www.npr.org/2026/01/17/nx-s1-5672615/kalshi-polymarket-prediction-market-boom-traders-slang-glossary"&gt;CNN&lt;/a&gt; struck a deal to integrate Kalshi markets into its coverage.&lt;/p&gt;
&lt;h2 id="prediction-markets-as-macroeconomic-forecasting-tools"&gt;Prediction markets as macroeconomic forecasting tools&lt;/h2&gt;
&lt;p&gt;On February 12, 2026, the same day Israeli authorities announced the first-ever prediction market insider trading prosecution, Federal Reserve Board economist &lt;a href="https://www.federalreserve.gov/econres/anthony-m-diercks.htm"&gt;Anthony Diercks&lt;/a&gt;, along with Jared Dean Katz (Northwestern) and Jonathan Wright (&lt;a href="https://www.nber.org/papers/w34702"&gt;Johns Hopkins/NBER&lt;/a&gt;), &lt;a href="https://www.federalreserve.gov/econres/feds/kalshi-and-the-rise-of-macro-markets.htm"&gt;published&lt;/a&gt; &amp;ldquo;Kalshi and the Rise of Macro Markets&amp;rdquo; through the &lt;a href="https://www.federalreserve.gov/econres/feds/index.htm"&gt;Fed&amp;rsquo;s Finance and Economics Discussion Series&lt;/a&gt;. It&amp;rsquo;s the &lt;a href="https://natlawreview.com/article/federal-reserve-researchers-find-prediction-markets-deliver-forecasting-value"&gt;most thorough empirical study yet&lt;/a&gt; on whether prediction markets work as macroeconomic forecasting tools.&lt;/p&gt;
&lt;p&gt;The headline finding: Kalshi&amp;rsquo;s macro markets perform as well as, and in some cases better than, traditional forecasting instruments. For &lt;a href="https://en.wikipedia.org/wiki/Federal_funds_rate"&gt;federal funds rate&lt;/a&gt; decisions, Kalshi&amp;rsquo;s median and mode forecasts &lt;a href="https://defirate.com/news/federal-reserve-study-finds-kalshi-markets-rival-traditional-economic-forecast-tools/"&gt;matched the actual policy outcome&lt;/a&gt; on the day before every FOMC meeting since 2022. That&amp;rsquo;s a perfect record. The mean absolute error for rate forecasts 150 days out was comparable to the &lt;a href="https://www.newyorkfed.org/markets/survey-market-participants"&gt;New York Fed&amp;rsquo;s Survey of Market Expectations&lt;/a&gt;, a survey of professional forecasters. For headline CPI, Kalshi forecasts &lt;a href="https://www.cryptonewsz.com/federal-reserve-study-kalshi-macro-forecast/"&gt;statistically outperformed the Bloomberg consensus&lt;/a&gt; in certain windows.&lt;/p&gt;
&lt;p&gt;The paper identifies a specific structural advantage. &lt;a href="https://en.wikipedia.org/wiki/Federal_funds_rate#Federal_funds_futures"&gt;Fed funds futures&lt;/a&gt; force a binomial assumption: two possible outcomes per meeting. Kalshi&amp;rsquo;s contract structure assigns nonzero probability to seven or more distinct rate outcomes simultaneously. After speeches by Fed Governors &lt;a href="https://en.wikipedia.org/wiki/Christopher_Waller"&gt;Waller&lt;/a&gt; and Bowman, Kalshi markets adjusted the implied probability of a July 2025 rate cut to around 25% within hours. That probability dropped after the June employment report beat forecasts. This is what the authors call &amp;ldquo;rich intraday dynamics&amp;rdquo;: the market updates continuously as information arrives, unlike surveys that provide snapshots every six weeks.&lt;/p&gt;
&lt;p&gt;The Fed paper is preliminary research, not official policy. But the central bank&amp;rsquo;s own economists are treating prediction markets as credible information infrastructure. The authors &lt;a href="https://natlawreview.com/article/federal-reserve-researchers-find-prediction-markets-deliver-forecasting-value"&gt;intend to make the underlying data publicly available&lt;/a&gt;, which would further normalize prediction market data as a standard input to policy analysis.&lt;/p&gt;
&lt;p&gt;If prediction markets are valuable enough that the Federal Reserve is studying them as forecasting tools for &lt;a href="https://en.wikipedia.org/wiki/Monetary_policy_of_the_United_States"&gt;monetary policy&lt;/a&gt;, the insider trading problem becomes a question of whether a tool the central bank wants to rely on can maintain the informational integrity that makes it useful. Insiders trading on classified military intelligence don&amp;rsquo;t make the Fed&amp;rsquo;s rate probability distributions more accurate. They make them less trustworthy.&lt;/p&gt;
&lt;h2 id="the-regulatory-picture-fractured"&gt;The regulatory picture, fractured&lt;/h2&gt;
&lt;p&gt;There are two regulatory tracks, and they aren&amp;rsquo;t converging.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://en.wikipedia.org/wiki/Kalshi"&gt;Kalshi&lt;/a&gt; is &lt;a href="https://news.kalshi.com/p/how-kalshi-keeps-traders-safe"&gt;CFTC-regulated&lt;/a&gt;, explicitly prohibits insider trading, runs an in-house surveillance system called &amp;ldquo;Poirot,&amp;rdquo; has completed over 200 investigations in the past year, and requires &lt;a href="https://en.wikipedia.org/wiki/Know_your_customer"&gt;KYC/AML&lt;/a&gt; verification. &lt;a href="https://en.wikipedia.org/wiki/Polymarket"&gt;Polymarket&lt;/a&gt;&amp;rsquo;s international platform, operated by a Panama-incorporated entity, allows &lt;a href="https://www.corporatecomplianceinsights.com/prediction-markets-sports-betting-insider-trading/"&gt;permissionless crypto wallets without identity verification&lt;/a&gt;. Its terms of service don&amp;rsquo;t specifically mention insider trading. &lt;a href="#lightbox-kalshi-vs-polymarket-regulation-png-3" style="display: block; width: 90%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/kalshi-vs-polymarket-regulation.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/kalshi-vs-polymarket-regulation.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/kalshi-vs-polymarket-regulation.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/kalshi-vs-polymarket-regulation.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/kalshi-vs-polymarket-regulation.png 1200w"
sizes="90vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/kalshi-vs-polymarket-regulation.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/kalshi-vs-polymarket-regulation.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/kalshi-vs-polymarket-regulation.png 1440w"
sizes="90vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/kalshi-vs-polymarket-regulation.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/kalshi-vs-polymarket-regulation.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/kalshi-vs-polymarket-regulation.png 2000w"
sizes="90vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/kalshi-vs-polymarket-regulation.png"
alt="Exhibit comparing Kalshi and Polymarket regulatory postures across seven dimensions: Kalshi has CFTC regulation, full KYC, explicit insider trading prohibition, Poirot surveillance system, institutional market makers, and USD settlement, while Polymarket has no regulator, permissionless crypto wallets, no insider trading policy, no surveillance, and its CEO calls insider trading super cool"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;CFTC Chairman Michael Selig, confirmed in December 2025, laid out a &lt;a href="https://www.sidley.com/en/insights/newsupdates/2026/02/us-cftc-signals-imminent-rulemaking-on-prediction-markets"&gt;four-part plan&lt;/a&gt; on January 29: withdraw the Biden-era proposed ban on political event contracts (done February 4), begin drafting new rules, assess ongoing litigation, and support market development. On February 17, he &lt;a href="https://www.cnbc.com/2026/02/17/cftc-defends-prediction-market-enforcement-states-challenge.html"&gt;published a Wall Street Journal op-ed&lt;/a&gt; asserting exclusive CFTC jurisdiction over prediction markets and filed an amicus brief supporting Crypto.com against Nevada gaming regulators. Selig announced an advisory committee whose planned members include both &lt;a href="https://en.wikipedia.org/wiki/Polymarket"&gt;Polymarket CEO Shayne Coplan&lt;/a&gt; and &lt;a href="https://en.wikipedia.org/wiki/Kalshi"&gt;Kalshi CEO Tarek Mansour&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Rep. Ritchie Torres (D-NY) &lt;a href="https://ritchietorres.house.gov/posts/in-response-to-suspicious-polymarket-trade-preceding-maduro-operation-rep-ritchie-torres-introduces-legislation-to-crack-down-on-insider-trading-on-prediction-markets"&gt;introduced legislation&lt;/a&gt; in late January, directly responding to the Maduro trade, that would ban federal officials from trading prediction market contracts related to government activity. The bill targets a real problem, Levine&amp;rsquo;s point about government officials profiting from events they can influence, but it doesn&amp;rsquo;t create a general insider trading prohibition. It wouldn&amp;rsquo;t have stopped AlphaRacoon or the Israeli soldiers.&lt;/p&gt;
&lt;p&gt;I&amp;rsquo;m genuinely unsure where this lands. The libertarian case for prediction market insider trading, that it makes prices more accurate and the market should be a pure information aggregation mechanism, has intellectual appeal. The Akerlof case against it, that unchecked adverse selection destroys the market&amp;rsquo;s ability to function, has empirical support. The &lt;a href="https://www.federalreserve.gov/econres/feds/kalshi-and-the-rise-of-macro-markets.htm"&gt;Diercks, Katz, and Wright paper&lt;/a&gt; suggests the stakes are higher than either camp acknowledges: these aren&amp;rsquo;t just gambling venues. They&amp;rsquo;re becoming part of the plumbing that central banks and institutional investors use to make real decisions.&lt;/p&gt;
&lt;p&gt;My instinct, and I want to be honest that it&amp;rsquo;s more instinct than conclusion at this point, is that the prediction market industry will end up roughly where securities markets were after the &lt;a href="https://en.wikipedia.org/wiki/Securities_Exchange_Act_of_1934"&gt;Securities Exchange Act of 1934&lt;/a&gt;. Some insider trading enforcement is necessary to maintain market integrity, not because trading on private information is inherently wrong, but because without it, the adverse selection spiral will destroy the markets that are otherwise proving genuinely useful. The question is whether that enforcement framework gets built proactively or whether it takes a scandal large enough to force it.&lt;/p&gt;
&lt;p&gt;Polymarket&amp;rsquo;s CEO has &lt;a href="https://gizmodo.com/israel-accuses-two-polymarket-bettors-of-trading-on-classified-military-operations-2000721224"&gt;called insider trading &amp;ldquo;super cool.&amp;rdquo;&lt;/a&gt; The Fed is &lt;a href="https://www.federalreserve.gov/econres/feds/kalshi-and-the-rise-of-macro-markets.htm"&gt;studying his platform&amp;rsquo;s macro forecasting ability&lt;/a&gt;. The Israeli military is &lt;a href="https://www.npr.org/2026/02/12/nx-s1-5712801/polymarket-bets-traders-israel-military"&gt;prosecuting soldiers&lt;/a&gt; who bet on it.&lt;/p&gt;</description></item><item><title>Economics of a Super Bowl Ad</title><link>http://philippdubach.com/posts/economics-of-a-super-bowl-ad/</link><pubDate>Fri, 20 Feb 2026 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/economics-of-a-super-bowl-ad/</guid><description>&lt;p&gt;A 30-second Super Bowl ad costs &lt;strong&gt;$8 million&lt;/strong&gt;. That&amp;rsquo;s $267,000 per second, roughly the median U.S. home price for every tick of the clock. Super Bowl LX drew &lt;a href="https://www.nielsen.com/news-center/2026/super-bowl-lx-delivers-124-9-million-viewers/"&gt;124.9 million average viewers with a peak of 137.8 million&lt;/a&gt;, the highest peak audience in American television history. The NFL accounted for &lt;a href="https://www.sportico.com/business/media/2026/sportico-top-100-nfl-towers-over-us-media-landscape-1234880235/"&gt;84 of the top 100 most-watched U.S. telecasts&lt;/a&gt; in 2025. The Oscars, by comparison, managed 19.7 million.&lt;/p&gt;
&lt;p&gt;Ro (that&amp;rsquo;s the name of the direct-to-patient telehealth company) CEO Zachariah Reitano, writing from direct experience as a 2026 Super Bowl advertiser, &lt;a href="https://ro.co/perspectives/super-bowl-economics/"&gt;published a detailed cost breakdown&lt;/a&gt; based on his own spending and interviews with 10+ brands. The picture that emerges is considerably more expensive than the headline number. Production runs $1–4 million for studio, crew, and post-production before any famous face enters the frame. Celebrity endorsement talent adds $1–5 million, with the current A-list sweet spot at $3–5 million &lt;a href="https://www.hollywoodreporter.com/business/business-news/2026-super-bowl-ads-stars-ai-comedy-1236490270/"&gt;according to WME agent Tim Curtis&lt;/a&gt;. Then comes the companion buy: for every 30-second slot, advertisers are generally required to commit to spending an equivalent amount on other programs broadcast by the same network. For NBC&amp;rsquo;s 2026 Super Bowl, that meant additional inventory across the Winter Olympics and NBA All-Star Game, adding another $7–10 million to the tab.&lt;/p&gt;
&lt;p&gt;Total committed spend: &lt;strong&gt;$16–23 million&lt;/strong&gt; for a single 30-second spot. &lt;a href="https://www.cfo.com/news/a-cfo-guide-to-super-bowl-ad-spend-jason-hershman-point-/811381/"&gt;CFO.com&amp;rsquo;s Jason Hershman&lt;/a&gt; brackets the full range at $15–50 million depending on ambition.&lt;/p&gt;
&lt;p&gt;For companies already spending nine figures annually on marketing, the framing of a Super Bowl ad as a &amp;ldquo;portfolio bet with capped downside&amp;rdquo; applies to virtually any marketing investment at that scale. It&amp;rsquo;s whether that $10 million generates more value here than in the other places you&amp;rsquo;ve been spending $10 million. The observation is reductive but directionally useful: the special-ness of the Super Bowl needs to be demonstrated in the data, not assumed from the vibes. But then on the other hand, as &lt;a href="https://www.bloomberg.com/opinion/newsletters/2026-02-09/predicting-the-big-game?cmpid=BBD020926_MONEYSTUFF"&gt;Matt Levin&lt;/a&gt; puts it, it&amp;rsquo;s comparably cheap:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;One thing that the ads made me think about is how cheap Super Bowl advertising is, for an AI company. A Super Bowl spot costs something like $10 million for airtime plus another few million to produce, for a total at the high end of maybe $20 or $30 million, or roughly the cost of paying one employee for one month at a leading AI lab. Mark Zuckerberg carries around $30 million in his wallet in case he runs into an OpenAI engineer at Starbucks. The cost of creating a cutting-edge AI model — in compute and researcher pay — is astronomical in a way that makes the cost of any advertising, even Super Bowl advertising, look like nothing.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;But let&amp;rsquo;s look at the data.&lt;/p&gt;
&lt;h2 id="the-cpm-looks-reasonable-everything-else-is-complicated"&gt;The CPM looks reasonable. Everything else is complicated.&lt;/h2&gt;
&lt;p&gt;At $8 million reaching roughly 125 million viewers, the Super Bowl&amp;rsquo;s &lt;a href="https://adwave.com/resources/super-bowl-commercial-cost"&gt;effective CPM lands around $63–65 per thousand impressions&lt;/a&gt;. Standard primetime TV runs $20–30. Streaming TV sits at $15–35. TikTok charges $5–10. &lt;a href="https://digiday.com/marketing/heres-what-else-a-8m-30-second-super-bowl-budget-can-purchase-in-2026/"&gt;Digiday calculated&lt;/a&gt; that for the same $8 million media buy, an advertiser could purchase 1.6 billion TikTok impressions, 267 million Google search impressions, or a primetime network TV spot every night for four months.&lt;/p&gt;
&lt;p&gt;But CPM comparisons are misleading here because they treat all impressions as equivalent. They aren&amp;rsquo;t. The Super Bowl is the last true monoculture event in American media, and the only advertising environment where the ads are the product. People rewatch them, rank them, discuss them at work Monday morning. The Today Show airs them as content. &lt;a href="https://www.edo.com/resources/how-tv-advertisers-can-win-super-bowl-and-beyond"&gt;EDO&lt;/a&gt;, a TV outcomes measurement company, found that a single Super Bowl ad generates the same brand-search engagement as &lt;strong&gt;1,056 primetime ads&lt;/strong&gt;.&lt;/p&gt;
&lt;h2 id="there-is-academic-evidence-on-super-bowl-ad-roi"&gt;There is academic evidence on Super Bowl ad ROI&lt;/h2&gt;
&lt;p&gt;The cleanest causal evidence comes from &lt;a href="https://www.gsb.stanford.edu/insights/do-super-bowl-ads-really-work"&gt;Wesley Hartmann at Stanford GSB and Daniel Klapper at Humboldt University&lt;/a&gt;, published in &lt;em&gt;Marketing Science&lt;/em&gt;. Using &lt;a href="https://web.stanford.edu/~wesleyr/SuperBowl.pdf"&gt;Nielsen data across 55 media markets and six years of Super Bowls&lt;/a&gt;, they exploited exogenous variation in viewership (specifically, ratings spikes caused by local team participation) to estimate causal effects. Their results: Budweiser earned an extra $96 million from Super Bowl advertising, a &lt;strong&gt;172% return on investment&lt;/strong&gt;. Budweiser&amp;rsquo;s short-run sales revenue ran 15.75% higher per household than competitors in the weeks following the game.&lt;/p&gt;
&lt;p&gt;But Hartmann and Klapper&amp;rsquo;s most important finding on ad effectiveness is that when two brands in the same product category both advertise, neither gains incremental profit. The effects cancel out. Coca-Cola and Pepsi have both advertised annually in the Super Bowl for years. The researchers found no statistically significant volume increase for Coca-Cola regardless of whether it advertised, and the direction of the coefficients, if anything, suggested a negative relationship. The entire soda category&amp;rsquo;s Super Bowl spending appears to be a value-destroying exercise that neither side can unilaterally exit.&lt;/p&gt;
&lt;p&gt;This is a textbook prisoner&amp;rsquo;s dilemma. Game theory applied to advertising predicts exactly this outcome: if Bud Light and Coors Light both spend $50 million on ads, they each profit $200 million. If both spend only $10 million, they each profit $240 million. Both rationally choose $50 million. &lt;a href="#lightbox-super-bowl-prisoners-dilemma-png-0" style="display: block; width: 90%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/super-bowl-prisoners-dilemma.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/super-bowl-prisoners-dilemma.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/super-bowl-prisoners-dilemma.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/super-bowl-prisoners-dilemma.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/super-bowl-prisoners-dilemma.png 1200w"
sizes="90vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/super-bowl-prisoners-dilemma.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/super-bowl-prisoners-dilemma.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/super-bowl-prisoners-dilemma.png 1440w"
sizes="90vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/super-bowl-prisoners-dilemma.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/super-bowl-prisoners-dilemma.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/super-bowl-prisoners-dilemma.png 2000w"
sizes="90vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/super-bowl-prisoners-dilemma.png"
alt="Super Bowl advertising prisoner&amp;#39;s dilemma payoff matrix showing two competing beer brands where both rationally choose heavy spend of $50M each yielding $200M profit apiece at Nash equilibrium, versus the Pareto optimal outcome of light spend at $10M each yielding $240M profit apiece, destroying $80M in collective profit that the NFL captures"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;Anheuser-Busch understood this and paid to avoid it. The company &lt;a href="https://www.marketingdive.com/news/NFL-Anheuser-Busch-InBev-Super-Bowl-Advertising/625707/"&gt;held exclusive beer advertising rights for 33 consecutive years&lt;/a&gt; (1989–2022), spending &lt;a href="https://money.cnn.com/2016/02/05/news/anheuser-busch-super-bowl-advertising/"&gt;&lt;strong&gt;$278 million over a decade&lt;/strong&gt;&lt;/a&gt; partly to prevent competitive neutralization. When exclusivity ended in 2023, the Super Bowl immediately featured nine beer ads from multiple brands. Budweiser&amp;rsquo;s ROI almost certainly declined.&lt;/p&gt;
&lt;p&gt;Stock price studies paint a muddier picture. An &lt;a href="https://doi.org/10.3390/su12176686"&gt;MDPI Sustainability study&lt;/a&gt; examining 272 ads from 142 firms (2010–2019) found positive cumulative abnormal returns of 2.35% over 10 days post-game. &lt;a href="https://bridgewise.com/blog/super-bowl-stock-price-fumble/"&gt;Bridgewise&lt;/a&gt;, covering 2021–2024, found the opposite: a portfolio of Super Bowl advertisers underperformed the S&amp;amp;P 500 by 9.2% after six months, with only 25% of individual advertisers outperforming. &lt;a href="https://www.kantar.com/north-america/company-news/in-game-ad-revenue-for-super-bowl-lvi-increased-by-more-than-143-million"&gt;Kantar&amp;rsquo;s analysis&lt;/a&gt; reports an average ROI of $4.60 per dollar spent, a figure broadly consistent with their multi-year tracking. A &lt;a href="https://digitalcommons.georgiasouthern.edu/marketing-facpubs/19/"&gt;Georgia Southern study by Eastman and Iyer&lt;/a&gt; found that USA Today Ad Meter likeability scores, the industry&amp;rsquo;s most-cited metric for judging Super Bowl ads, had no significant relationship with financial effectiveness.&lt;/p&gt;
&lt;h2 id="attribution-is-difficult"&gt;Attribution is difficult&lt;/h2&gt;
&lt;p&gt;I always wondered how well attribution works. It seems mostly guesswork to me. The evidence suggests this is more right than wrong, though &amp;ldquo;guesswork&amp;rdquo; understates the sophistication of modern marketing attribution tools while overstating their accuracy. A &lt;a href="https://www.revsure.ai/the-state-of-marketing-attribution-in-2024"&gt;2024 Ascend2 survey&lt;/a&gt; found that only 29% of marketers are &amp;ldquo;extremely confident&amp;rdquo; in their attribution accuracy. More than a third of CMOs do not fully trust their own marketing data. The problems are real: privacy signal loss from GDPR, CCPA, and iOS opt-outs has degraded observable data. Cross-device fragmentation means customers touch 3–5+ devices before converting. Platform self-reporting creates systematic overcounting, with Google, Meta, and Amazon each claiming credit for the same sale.&lt;/p&gt;
&lt;p&gt;For Super Bowl ads specifically, the attribution challenge is amplified by confounding. Brands run concurrent promotions, digital retargeting campaigns, influencer activations, and PR blitzes. Many release ads days before the game. Academic research suggests pricing relative to competition has 20–25x greater impact on sales than total advertising across all channels, which means a coincidental price change during Super Bowl week can wash out the advertising signal entirely.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://blog.cloudflare.com/super-bowl-lviii/"&gt;Cloudflare&amp;rsquo;s DNS data&lt;/a&gt; showed TurboTax saw a &lt;strong&gt;24,875%&lt;/strong&gt; traffic increase above baseline after its 2024 Super Bowl ad. e.l.f. Cosmetics saw 8,118%. Poppi saw 7,329%. But a &lt;a href="https://www.similarweb.com/blog/insights/super-bowl-impact/"&gt;Similarweb analysis&lt;/a&gt; of 28-day post-game traffic found an average increase of only &lt;strong&gt;~1%&lt;/strong&gt; across all advertisers. The spike is enormous and ephemeral. &lt;a href="https://adage.com/article/special-report-super-bowl/super-bowl-glow-measure-weeks/296864/"&gt;YouGov BrandIndex&lt;/a&gt; found that only 10 of roughly 50+ advertisers saw positive buzz lift above the margin of error, with a maximum duration of two weeks. &lt;a href="#lightbox-super-bowl-spike-vs-sustain-png-1" style="display: block; width: 90%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/super-bowl-spike-vs-sustain.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/super-bowl-spike-vs-sustain.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/super-bowl-spike-vs-sustain.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/super-bowl-spike-vs-sustain.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/super-bowl-spike-vs-sustain.png 1200w"
sizes="90vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/super-bowl-spike-vs-sustain.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/super-bowl-spike-vs-sustain.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/super-bowl-spike-vs-sustain.png 1440w"
sizes="90vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/super-bowl-spike-vs-sustain.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/super-bowl-spike-vs-sustain.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/super-bowl-spike-vs-sustain.png 2000w"
sizes="90vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/super-bowl-spike-vs-sustain.png"
alt="Super Bowl ad attribution gap showing real-time Cloudflare DNS traffic spikes of 24,875 percent for TurboTax, 8,118 percent for e.l.f. Cosmetics, and 7,329 percent for Poppi contrasted against Similarweb 28-day sustained lift of only approximately 1 percent across all Super Bowl advertisers"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;&lt;a href="https://www.cfo.com/news/a-cfo-guide-to-super-bowl-ad-spend-jason-hershman-point-/811381/"&gt;CFO.com&amp;rsquo;s Hershman&lt;/a&gt; had the clearest framing for anyone trying to evaluate this honestly: marketing will come back with impressions, social mentions, and &amp;ldquo;earned media value,&amp;rdquo; which he described as Wall Street&amp;rsquo;s least favorite made-up metric. The only meaningful number is incremental contribution profit. At 40% gross margin, a $13 million all-in Super Bowl investment needs &lt;strong&gt;$32.5 million in incremental revenue&lt;/strong&gt; just to break even on pure acquisition economics.&lt;/p&gt;
&lt;h2 id="the-nfls-advertising-pricing-machine"&gt;The NFL&amp;rsquo;s advertising pricing machine&lt;/h2&gt;
&lt;p&gt;The NFL operates a price-discrimination machine that has outpaced inflation for 60 years. Super Bowl ad prices have increased from $37,500 in 1967 to $8 million in 2026, &lt;a href="https://www.superbowl-ads.com/cost-of-super-bowl-advertising-breakdown-by-year/"&gt;a 213x nominal increase&lt;/a&gt; and roughly 22–23x in real terms. The compound annual growth rate of approximately 9.6% is more than double average CPI inflation over the same period. Only three year-over-year price decreases have occurred in the entire 60-year history. &lt;a href="#lightbox-super-bowl-price-history-png-2" style="display: block; width: 90%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/super-bowl-price-history.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/super-bowl-price-history.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/super-bowl-price-history.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/super-bowl-price-history.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/super-bowl-price-history.png 1200w"
sizes="90vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/super-bowl-price-history.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/super-bowl-price-history.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/super-bowl-price-history.png 1440w"
sizes="90vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/super-bowl-price-history.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/super-bowl-price-history.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/super-bowl-price-history.png 2000w"
sizes="90vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/super-bowl-price-history.png"
alt="Super Bowl ad cost history from 1967 to 2026 showing price growth from $37,500 at Super Bowl I to $1.2M first seven figures to $2.1M at the Dot-Com Bowl to $8M at Super Bowl LX, a 213x nominal increase at 9.6 percent CAGR with only three year-over-year price decreases in 59 years"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;The NFL&amp;rsquo;s leverage comes from a structural scarcity that it actively maintains. Super Bowl ad inventory sells out months in advance. &lt;a href="https://www.hellomagazine.com/film/881951/2026-super-bowl-commercial-cost-breaks-records/"&gt;NBC sold out its 2026 inventory&lt;/a&gt; before the NFL season even started, with some companies paying $10 million or more due to what NBCUniversal&amp;rsquo;s Mike Marshall called &amp;ldquo;the marketplace demand.&amp;rdquo; &lt;a href="https://www.foxcorporation.com/news/business/2025/super-bowl-lix-on-fox-and-tubi-generates-more-than-800-million-in-gross-advertising-revenue/"&gt;Fox reported $800+ million&lt;/a&gt; in gross ad revenue from Super Bowl LIX in 2025, a record that industry analysts expect to become a billion dollars within two to three years. The mandatory companion buys force advertisers into additional network inventory they might not otherwise purchase, extracting surplus beyond the headline slot price.&lt;/p&gt;
&lt;p&gt;Viewership has cooperated. The Super Bowl drew 51.2 million viewers in 1967 and &lt;strong&gt;127.7 million&lt;/strong&gt; in 2025. Streaming hasn&amp;rsquo;t fragmented the audience; it&amp;rsquo;s expanded it. &lt;a href="https://www.foxsports.com/presspass/blog/2025/02/11/fox-sports-presentation-of-super-bowl-lix-delivers-most-watched-super-bowl-of-all-time-with-127-7-million-viewers-across-all-platforms/"&gt;Tubi alone delivered 13.6 million streaming viewers&lt;/a&gt; for Super Bowl LIX, a 94% increase over Fox&amp;rsquo;s previous Super Bowl. &lt;a href="https://www.tvtechnology.com/news/fox-sports-super-bowl-viewership-peaks-at-record-135-7-million"&gt;AdImpact data showed streaming at 49% of total viewership&lt;/a&gt;, up from 41.5% in 2024. The audience skews younger on streaming: Tubi&amp;rsquo;s Super Bowl audience was 38% more likely to be 18–34 than the overall game audience, which is exactly the demographic advertisers pay premiums to reach.&lt;/p&gt;
&lt;p&gt;The result is a market where the seller has near-monopoly pricing power, the buyers face a prisoner&amp;rsquo;s dilemma that prevents collective resistance, and the audience keeps growing. The NFL has essentially created a Veblen good in advertising: the price itself signals legitimacy, which makes the price self-sustaining. The &lt;a href="https://en.wikipedia.org/wiki/Dot-com_commercials_during_Super_Bowl_XXXIV"&gt;2000 &amp;ldquo;Dot-Com Bowl&amp;rdquo;&lt;/a&gt; saw 14+ internet companies advertise, using the Super Bowl as a credibility play. At least eight went bust within a decade. The &lt;a href="https://www.cnbc.com/2022/11/30/crypto-crash-may-leave-ad-supported-businesses-with-hole-in-budget.html"&gt;2022 &amp;ldquo;Crypto Bowl&amp;rdquo;&lt;/a&gt; featured Coinbase, FTX, Crypto.com, and eToro spending a collective $54 million. FTX collapsed into bankruptcy within nine months. The pattern repeats because the mechanism works: bubble industries pay the premium precisely because appearing in the Super Bowl signals they belong among established brands. That this signal is often false doesn&amp;rsquo;t reduce its price.&lt;/p&gt;
&lt;h2 id="the-cases-that-define-the-genre"&gt;The cases that define the genre&lt;/h2&gt;
&lt;p&gt;Apple&amp;rsquo;s &lt;a href="https://en.wikipedia.org/wiki/1984_(advertisement)"&gt;&amp;ldquo;1984&amp;rdquo; ad&lt;/a&gt; cost approximately $750,000–$900,000 to produce plus $800,000 in airtime, roughly $4 million in today&amp;rsquo;s dollars. Apple&amp;rsquo;s board hated it and ordered the time sold back. Steve Jobs intervened. The ad &lt;a href="https://heidicohen.com/content-quality-lesson-apple-1984-super-bowl-ad/"&gt;generated &lt;strong&gt;$155 million&lt;/strong&gt; in Macintosh sales&lt;/a&gt; within three months. Apple sold 250,000 Macs in the first year against a 30,000-unit break-even target. It sits &lt;a href="https://americanhistory.si.edu/explore/stories/remembering-apples-1984-super-bowl-ad"&gt;in the Smithsonian&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Coinbase&amp;rsquo;s 2022 QR code ad &lt;a href="https://www.thedrum.com/news/2022/02/14/ad-the-day-coinbase-breaks-internet-with-qr-code-super-bowl-stunt"&gt;cost $14 million&lt;/a&gt; for 60 seconds of a bouncing QR code on a black screen. The landing page &lt;a href="https://www.cnn.com/2022/02/14/investing/coinbase-qr-code-app"&gt;received &lt;strong&gt;20+ million hits in one minute&lt;/strong&gt;&lt;/a&gt;, crashing the app. Downloads &lt;a href="https://techcrunch.com/2022/02/17/super-bowl-ads-boosted-crypto-app-downloads-by-279-led-by-coinbase/"&gt;surged 309% week-over-week&lt;/a&gt;. The ad won the Clio &amp;ldquo;Super Clio&amp;rdquo; and finished dead last in USA Today&amp;rsquo;s Ad Meter consumer rankings simultaneously. Then the crypto market collapsed, Coinbase laid off 18% of staff, and the massive awareness evaporated. A reminder that advertising cannot fix a product&amp;rsquo;s relationship to reality.&lt;/p&gt;
&lt;p&gt;GoDaddy advertised in every Super Bowl from 2005 to 2015, deliberately courting controversy with provocative ads. Their first appearance &lt;a href="https://mbaknol.com/management-case-studies/case-study-godaddys-super-bowl-commercials/"&gt;generated a 378% website traffic spike&lt;/a&gt; and 51.4% share of voice among all advertisers, largely because &lt;a href="https://adage.com/article/special-report-super-bowl/fox-killed-airing-super-bowl-godaddy-ad/45076"&gt;Fox pulled the second airing&lt;/a&gt; and created a news cycle. Today over 60% of visitors go to GoDaddy.com directly rather than through search. The company grew to 21 million customers before going public. Provocation as a launch strategy worked, until the brand matured and pivoted away.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://system1group.com/blog/dont-forget-to-brand-why-codification-is-the-key-to-super-bowl-advertising-success"&gt;System1 data&lt;/a&gt; offers a sobering counterpoint to these highlights: &lt;strong&gt;21%&lt;/strong&gt; of viewers in 2025 couldn&amp;rsquo;t recall which brand was behind the ad they&amp;rsquo;d just watched. That means roughly one in five Super Bowl ads converts millions in ad spend into brandless entertainment. The audience enjoyed the show. They just have no idea who paid for it.&lt;/p&gt;
&lt;h2 id="what-the-economics-of-a-super-bowl-ad-tell-us"&gt;What the economics of a Super Bowl ad tell us&lt;/h2&gt;
&lt;p&gt;I keep coming back to &lt;a href="https://www.gsb.stanford.edu/faculty-research/publications/super-bowl-ads"&gt;Hartmann and Klapper&amp;rsquo;s central result&lt;/a&gt; because it&amp;rsquo;s the one that reshapes how you think about the entire exercise. The Super Bowl ad works brilliantly as an investment, but only when the advertiser has category exclusivity. The moment a competitor shows up, the gains evaporate. What looks like an advertising problem is actually a competitive strategy problem.&lt;/p&gt;
&lt;p&gt;Anheuser-Busch paid for exclusivity for 33 years because the company understood this. The &lt;a href="https://money.cnn.com/2016/02/05/news/anheuser-busch-super-bowl-advertising/"&gt;$278 million over a decade&lt;/a&gt; wasn&amp;rsquo;t a media buy. It was an entry barrier. The moment that barrier &lt;a href="https://www.cnn.com/2022/07/15/business-food/anheuser-busch-molson-coors-super-bowl-deal/index.html"&gt;fell in 2023&lt;/a&gt;, the category filled with nine competing brands and the collective value of Super Bowl beer advertising almost certainly declined. The NFL captured the difference.&lt;/p&gt;
&lt;p&gt;This means the honest answer to &amp;ldquo;should you buy a Super Bowl ad?&amp;rdquo; isn&amp;rsquo;t about CPMs or brand lift or even ROI in the traditional sense. It&amp;rsquo;s about whether your competitive position allows you to capture the value or whether you&amp;rsquo;re paying for a prisoner&amp;rsquo;s dilemma that the NFL designed.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://ro.co/perspectives/super-bowl-economics/"&gt;Reitano&amp;rsquo;s asymmetric upside thesis&lt;/a&gt; is logically sound for a company like Ro, which was advertising in a healthcare category without heavy Super Bowl competition and used the spot as a genuine brand awareness play. But the framework breaks down when applied generally. The 2026 Super Bowl featured Novo Nordisk, Ro, Hims &amp;amp; Hers, Novartis, Boehringer Ingelheim, and Eli Lilly all running health-related ads. Northwestern&amp;rsquo;s Tim Calkins &lt;a href="https://fortune.com/2026/02/06/super-bowl-ads-cost-budweiser-lays-amazon-meta-anthropic-ring/"&gt;called it&lt;/a&gt; the &amp;ldquo;GLP-1 Super Bowl.&amp;rdquo; If the Hartmann-Klapper result holds across categories, those brands collectively spent north of &lt;strong&gt;$100 million&lt;/strong&gt; on ads whose effects substantially cancelled each other out.&lt;/p&gt;</description></item><item><title>The Impossible Backhand</title><link>http://philippdubach.com/posts/the-impossible-backhand/</link><pubDate>Tue, 17 Feb 2026 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/the-impossible-backhand/</guid><description>&lt;p&gt;In the latest issue of &lt;a href="https://lab.philippdubach.com"&gt;The AI Lab Newsletter&lt;/a&gt;, I featured a ByteDance &lt;a href="https://x.com/AngryTomtweets/status/2021194266517832057"&gt;Seedance 2.0&lt;/a&gt; clip: two men playing tennis at what looked like an ATP tournament. Photorealistic. I probably wouldn&amp;rsquo;t be able to tell it wasn&amp;rsquo;t real footage if I didn&amp;rsquo;t know. A co-worker who played junior pro-am tennis watched the same clip and said: &amp;ldquo;That backhand doesn&amp;rsquo;t exist. Nobody plays it like that.&amp;rdquo; His domain expertise spotted an error that probably fooled everyone else.&lt;/p&gt;
&lt;p&gt;We ended up in a long conversation about what that means. AI can get to maybe the 95th or 98th percentile of creating something that looks perfect, but then it isn&amp;rsquo;t, and if you have deep knowledge you can spot it immediately. The consensus narrative treats this as a temporary limitation. But it might be structural. And I think the evidence, once you lay it out, points to a genuinely contrarian conclusion: domain expertise is appreciating in value, not depreciating, precisely because AI hits a quality ceiling it can&amp;rsquo;t easily push past.&lt;/p&gt;
&lt;h2 id="approaching-the-ai-quality-ceiling"&gt;Approaching the AI quality ceiling&lt;/h2&gt;
&lt;p&gt;I&amp;rsquo;ve &lt;a href="http://philippdubach.com/posts/the-most-expensive-assumption-in-ai/"&gt;written before&lt;/a&gt; about Sara Hooker&amp;rsquo;s work on diminishing returns from scaling. The investment side of that argument, the &lt;a href="http://philippdubach.com/posts/the-saaspocalypse-paradox/"&gt;$690 billion in hyperscaler capex&lt;/a&gt; chasing a 4% revenue coverage ratio, has been well covered. What hasn&amp;rsquo;t been covered as precisely is why AI output quality hits a ceiling, and why that ceiling is structural rather than temporary.&lt;/p&gt;
&lt;p&gt;Ben Affleck, of all people, gave the clearest non-technical explanation on &lt;a href="https://faroutmagazine.co.uk/ben-affleck-dismisses-existential-potential-ai-hollywood/"&gt;The Joe Rogan Experience&lt;/a&gt; in January 2026:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;If you try to get ChatGPT or Claude or Gemini to write you something, it&amp;rsquo;s really shitty. And it&amp;rsquo;s shitty because by its nature it goes to the mean, to the average. Now, it&amp;rsquo;s a useful tool if you&amp;rsquo;re a writer&amp;hellip; but I don&amp;rsquo;t think it&amp;rsquo;s actually very likely that it&amp;rsquo;s going to write anything meaningful, or that it&amp;rsquo;s going to be making movies from whole cloth. That&amp;rsquo;s bullshit.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;He&amp;rsquo;s more right than he probably knows. The convergence to the mean isn&amp;rsquo;t a solvable engineering problem. It operates at three distinct levels, each compounding the others.&lt;/p&gt;
&lt;p&gt;(1) The mathematics of next-token prediction. LLMs generate the most statistically probable continuation of a sequence. Probable, by definition, means average. The model isn&amp;rsquo;t trying to produce the best output; it&amp;rsquo;s producing the most expected one given the distribution it learned. Outlier quality, the kind that makes writing or analysis distinctive, lives in the tails of the distribution. The architecture systematically avoids those tails.&lt;/p&gt;
&lt;p&gt;(2) RLHF makes it worse. Research shows that human annotators prefer familiar-sounding responses, and the learned reward function weights typicality at α=0.57. Models are quite literally being trained to sound typical rather than merely correct or good. The reinforcement signal pushes outputs toward the center of the quality distribution, not toward its upper bound.&lt;/p&gt;
&lt;p&gt;(3) model collapse. &lt;a href="https://www.nature.com/articles/s41586-024-07566-y"&gt;Shumailov et al.&lt;/a&gt; documented this in their Nature paper: as models increasingly train on AI-generated content, they &amp;ldquo;forget the true underlying data distribution,&amp;rdquo; losing the tails first and converging toward a point estimate with minimal variance. The internet is filling with AI-generated text. The next generation of models trains on that text. The tails shrink further. This is a positive feedback loop running in the wrong direction.&lt;/p&gt;
&lt;p&gt;MIT researchers &lt;a href="https://arxiv.org/abs/2007.05558"&gt;Thompson, Greenewald, Lee, and Manso&lt;/a&gt; quantified the cost side: computational resources scale with at least the fourth power of improvement in theory, the ninth power in practice. To halve an error rate requires more than 500× the computational resources. When AlexNet trained on two GPUs in 2012, it took six days. By 2018, NASNet-A cut the error rate in half using more than 1,000× as much compute. &lt;a href="#lightbox-ninth-power-curve-2-png-0" style="display: block; width: 90%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/ninth-power-curve-2.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/ninth-power-curve-2.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/ninth-power-curve-2.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/ninth-power-curve-2.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/ninth-power-curve-2.png 1200w"
sizes="90vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/ninth-power-curve-2.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/ninth-power-curve-2.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/ninth-power-curve-2.png 1440w"
sizes="90vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/ninth-power-curve-2.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/ninth-power-curve-2.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/ninth-power-curve-2.png 2000w"
sizes="90vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/ninth-power-curve-2.png"
alt="AI quality ceiling ninth-power scaling curve: computational cost scales from AlexNet in 2012 on two GPUs to NASNet-A in 2018 requiring over 1000x compute to halve error rate, showing diminishing returns that explain why AI output quality plateaus and domain expertise remains irreplaceable"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;Affleck captured the commercial implication of this better than most analysts:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;I think a lot of that rhetoric comes from people who are trying to justify valuations around companies where they go, &amp;ldquo;We&amp;rsquo;re going to change everything in two years.&amp;rdquo; Well, the reason they&amp;rsquo;re saying that is because they need to ascribe a valuation for investment that can warrant the capex spend they&amp;rsquo;re going to make on these data centers. Except that ChatGPT 5 is about 25 percent better than ChatGPT 4, and costs about four times as much in the way of electricity and data.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;He&amp;rsquo;s describing the ninth-power curve in plain English. Each marginal improvement costs exponentially more. The curve bends away from you the harder you push.&lt;/p&gt;
&lt;h2 id="humanitys-last-exam"&gt;Humanity&amp;rsquo;s Last Exam&lt;/h2&gt;
&lt;p&gt;The hardest measurement of where AI actually stands against domain expertise is &lt;a href="https://artificialanalysis.ai/evaluations/humanitys-last-exam"&gt;Humanity&amp;rsquo;s Last Exam&lt;/a&gt; (HLE), published in Nature in early 2025 by the Center for AI Safety and Scale AI. Built with approximately 1,000 subject-matter experts across 500+ institutions, it consists of 2,500 expert-crafted questions spanning 100+ academic domains, designed to be &amp;ldquo;Google-proof&amp;rdquo;: questions that require genuine understanding rather than information retrieval.&lt;/p&gt;
&lt;p&gt;As of February 2026, the top model (Gemini 3 Pro Preview) scores &lt;strong&gt;37.5%&lt;/strong&gt;. Most models sit below 30%. Human domain experts average roughly &lt;strong&gt;90%&lt;/strong&gt;. That&amp;rsquo;s a 53-point gap. In specialized domains like advanced chemical kinetics or medieval philology, AI barely outperforms random guessing while experts score comfortably in the 80s and 90s. &lt;a href="#lightbox-hle-gap-chart-png-2" style="display: block; width: 90%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/hle-gap-chart.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/hle-gap-chart.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/hle-gap-chart.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/hle-gap-chart.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/hle-gap-chart.png 1200w"
sizes="90vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/hle-gap-chart.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/hle-gap-chart.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/hle-gap-chart.png 1440w"
sizes="90vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/hle-gap-chart.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/hle-gap-chart.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/hle-gap-chart.png 2000w"
sizes="90vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/hle-gap-chart.png"
alt="Humanity&amp;#39;s Last Exam 2026 benchmark scores showing 53-point gap between human domain experts at roughly 90 percent and top AI models including Gemini 3 Deep Think at 48.4 percent and Gemini 3 Pro Preview at 37.5 percent, evidence that AI capability frontier remains far behind human expertise on specialist questions"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
The models are also systematically overconfident. Calibration errors on HLE &lt;a href="https://www.letsdatascience.com/blog/humanitys-last-exam-the-test-thats-humbling-the-worlds-smartest-ai"&gt;range from 34% to 89%&lt;/a&gt;, meaning AI systems are saying &amp;ldquo;I&amp;rsquo;m 90% sure&amp;rdquo; when they should be saying &amp;ldquo;I&amp;rsquo;m guessing.&amp;rdquo; That gap between confidence and accuracy, that AI overconfidence, is where real-world harm concentrates.&lt;/p&gt;
&lt;p&gt;In legal applications, Yale researcher &lt;a href="https://law.stanford.edu/2024/01/11/hallucinating-law-legal-mistakes-with-large-language-models-are-pervasive/"&gt;Matthew Dahl&lt;/a&gt; found hallucination rates of 69% to 88% on specific queries. Damien Charlotin&amp;rsquo;s database now tracks 914 cases of AI-generated hallucinated content in legal filings worldwide, growing from two cases per week to two to three per day. In medicine, the &lt;a href="https://www.annfammed.org/content/23/1/1/tab-e-letters"&gt;Annals of Family Medicine&lt;/a&gt; warns that AI hallucinations are &amp;ldquo;far more insidious&amp;rdquo; because &amp;ldquo;a subtle misstep like a misplaced clinical guideline, an incorrect dosage, or an invented side effect may not raise immediate suspicion.&amp;rdquo; These aren&amp;rsquo;t edge cases. They&amp;rsquo;re the expected behavior of systems operating in professional domains where training data is sparse.&lt;/p&gt;
&lt;p&gt;The structural explanation is what Kandpal et al. demonstrated at ICML 2023: there&amp;rsquo;s a strong correlational and causal relationship between an LLM&amp;rsquo;s ability to answer questions and how many relevant documents appeared in pre-training data. Common knowledge gets learned well. Specialized knowledge appears infrequently online, so models learn it poorly. &lt;a href="https://x.com/alive_eth/status/1286650402356641792"&gt;Ali Yahya&lt;/a&gt; of a16z framed it sharply: neural networks are &amp;ldquo;fantastic interpolators but terrible extrapolators,&amp;rdquo; powerful pattern matchers that are &amp;ldquo;blind to the mechanisms that generate the data in the first place.&amp;rdquo; &lt;a href="#lightbox-domain-risk-map-png-3" style="display: block; width: 90%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/domain-risk-map.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/domain-risk-map.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/domain-risk-map.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/domain-risk-map.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/domain-risk-map.png 1200w"
sizes="90vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/domain-risk-map.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/domain-risk-map.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/domain-risk-map.png 1440w"
sizes="90vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/domain-risk-map.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/domain-risk-map.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/domain-risk-map.png 2000w"
sizes="90vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/domain-risk-map.png"
alt="AI hallucination rates across professional domains: legal research at 69 to 88 percent failure rated critical risk, clinical medicine rated critical with subtle errors, financial analysis at roughly 45 percent, expert academics at 62.5 percent failure on Humanity&amp;#39;s Last Exam, mapping the AI capability frontier by domain"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
My colleague who spotted the impossible backhand is a fantastic extrapolator. He has an embodied model of how tennis biomechanics work that no amount of video footage can teach a diffusion model. The model can produce outputs that are statistically plausible. He can identify outputs that are physically impossible. That distinction is the gap.&lt;/p&gt;
&lt;h2 id="the-centaur-model-for-human-ai-collaboration"&gt;The centaur model for human-AI collaboration&lt;/h2&gt;
&lt;p&gt;The consensus framing positions AI and human expertise as substitutes: AI gets better, humans become less relevant. The empirical evidence on AI augmentation versus replacement says the opposite. Human-AI collaboration, what researchers call the centaur model, outperforms either alone, consistently, across domains, and the quality of the human contribution matters a lot.&lt;/p&gt;
&lt;p&gt;The Harvard/BCG study tested 758 consultants, 7% of BCG&amp;rsquo;s consulting workforce, on realistic tasks using GPT-4. The researchers described a &amp;ldquo;&lt;a href="https://www.hbs.edu/faculty/Pages/item.aspx?num=64700"&gt;jagged technological frontier&lt;/a&gt;&amp;rdquo; where some tasks fall within AI&amp;rsquo;s capabilities and others, though seemingly similar, do not. For tasks within that frontier, consultants using AI completed 12.2% more tasks, finished 25.1% faster, and produced results 40% higher in quality. Below-average performers saw a 43% improvement in knowledge worker productivity. AI as skill equalizer. But for tasks outside AI&amp;rsquo;s frontier, consultants using AI were &lt;strong&gt;19 percentage points&lt;/strong&gt; less likely to produce correct solutions. The researchers observed that &amp;ldquo;professionals who had a negative performance when using AI tended to blindly adopt its output and interrogate it less.&amp;rdquo; &lt;a href="#lightbox-centaur-effect-png-5" style="display: block; width: 90%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/centaur-effect.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/centaur-effect.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/centaur-effect.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/centaur-effect.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/centaur-effect.png 1200w"
sizes="90vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/centaur-effect.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/centaur-effect.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/centaur-effect.png 1440w"
sizes="90vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/centaur-effect.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/centaur-effect.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/centaur-effect.png 2000w"
sizes="90vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/centaur-effect.png"
alt="Harvard BCG centaur model study results on human-AI collaboration and knowledge worker productivity: within AI capability frontier showing plus 40 percent quality, plus 12.2 percent more tasks, plus 25.1 percent faster; outside frontier showing minus 19 percentage points accuracy for blind delegators"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;That second finding doesn&amp;rsquo;t get enough attention. It means the value of the human in the loop depends entirely on whether the human can identify when the AI is wrong. Which requires precisely the domain expertise that AI supposedly makes obsolete.&lt;/p&gt;
&lt;p&gt;The &lt;a href="https://www.lsu.edu/business/news/2025/7/research-ai-collaboration.php"&gt;&amp;ldquo;centaur analyst&amp;rdquo; study from LSU Finance&lt;/a&gt; (winner of the Fama-DFA Best Paper Award) confirmed this human-AI partnership over an 18-year dataset. AI alone beat human stock analysts in 54.5% of cases. The human-AI hybrid outperformed AI-only in nearly 55% of forecasts and reduced extreme prediction errors by roughly 90% compared to human analysts alone. In clinical decision-making experiments with the Mayo Clinic, the ranking was consistent: human-algorithm centaur, then algorithm alone, then human experts alone. The human adds most value at the extremes, catching the cases where the model&amp;rsquo;s convergence to the mean produces confidently wrong answers.&lt;/p&gt;
&lt;p&gt;Affleck, who has thought about this more carefully than his reputation might suggest, landed on the same conclusion:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The way I see the technology and what it&amp;rsquo;s good at and what it&amp;rsquo;s not, it&amp;rsquo;s gonna be good at filling in all the places that are expensive and burdensome, and it&amp;rsquo;s always gonna rely fundamentally on the human artistic aspects of it.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Labor economics research broadly confirms this. Oxford researchers&lt;a href="https://arxiv.org/abs/2412.19754"&gt; Mäkelä and Stephany&lt;/a&gt; analyzed 12 million U.S. job vacancies and found that complementary effects of AI are 1.7× larger than substitution effects. The World Economic Forum projects 170 million new jobs created by 2030 versus 92 million displaced, a net gain of 78 million. &lt;a href="https://www.nber.org/system/files/working_papers/w28257/revisions/w28257.rev1.pdf"&gt;Acemoglu, Autor, Hazell, and Restrepo&lt;/a&gt; found that while AI-exposed firms reduce hiring in non-AI positions:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;the aggregate impacts of AI-labor substitution on employment and wage growth&amp;hellip; is currently too small to be detectable.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;a href="https://www.mckinsey.com.br/capabilities/tech-and-ai/our-insights/building-the-ai-muscle-of-your-business-leaders"&gt;McKinsey&lt;/a&gt; captures the strategic implication: &amp;ldquo;When you have built a bench of AI-capable domain owners, your company has a real competitive advantage. That&amp;rsquo;s because these leaders are hard to replicate.&amp;rdquo; Yet only 23% of organizations believe they are building sustainable AI advantages, despite 79% reporting competitors are making similar investments.&lt;/p&gt;
&lt;h2 id="ai-deskilling-is-a-trap"&gt;AI deskilling is a trap&lt;/h2&gt;
&lt;p&gt;If a generation of junior analysts learns to use AI before developing independent judgment, they never build the pattern recognition that lets them spot when the model is wrong. If junior lawyers lean on AI for legal research before reading enough case law to develop intuition for what&amp;rsquo;s plausible, they can&amp;rsquo;t catch the 69-88% hallucination rates. If aspiring filmmakers generate scenes with Seedance 2.0 instead of learning how cameras, bodies, and physics actually interact, they can&amp;rsquo;t identify the impossible backhand. &lt;a href="https://www.gartner.com/en/articles/ai-lock-in"&gt;Gartner predicts&lt;/a&gt; that by 2030, half of enterprises will face irreversible skill shortages in at least two critical job roles because of unchecked automation. This AI skill erosion creates a vicious cycle: fewer skilled workers, greater dependence on AI, higher costs to fill the gaps.&lt;/p&gt;
&lt;p&gt;Acemoglu warns that technology &amp;ldquo;does not automatically benefit workers.&amp;rdquo; In 19th-century England, the benefits of mechanization only spread after decades of worker activism. The parallel risk with AI isn&amp;rsquo;t mass unemployment. It&amp;rsquo;s a hollowing out of the skill base that makes the centaur model function. You lose not the jobs but the expertise that makes the jobs valuable.&lt;/p&gt;
&lt;p&gt;David Autor&amp;rsquo;s vision is more optimistic: AI could &amp;ldquo;extend the relevance, reach, and value of human expertise,&amp;rdquo; democratizing it rather than eliminating it. I want to believe that&amp;rsquo;s right. But it requires treating AI as a tool that amplifies existing expertise rather than a shortcut that replaces the need to develop it. The 43% improvement that below-average BCG consultants saw from using GPT-4 is real. The 19-percentage-point penalty when those same consultants blindly trusted AI outside its frontier is equally real. The difference between those two outcomes is judgment. And judgment comes from experience, not from a larger context window.&lt;/p&gt;
&lt;p&gt;I&amp;rsquo;m more confident in the centaur framework than in any specific prediction about timelines or magnitudes. The ninth-power scaling curve, the 53-point gap on Humanity&amp;rsquo;s Last Exam, the α=0.57 typicality bias in RLHF, the 69-88% hallucination rates in legal applications, and the 95% of &lt;a href="http://philippdubach.com/posts/enterprise-ai-strategy-is-backwards/"&gt;enterprises&lt;/a&gt; seeing no measurable P&amp;amp;L returns from AI investments all point in the same direction. The question of AI augmentation versus replacement has an empirical answer: AI is a tool that makes good practitioners better and bad practitioners worse. The &lt;a href="http://philippdubach.com/posts/is-ai-really-eating-the-world/"&gt;industry narrative&lt;/a&gt; demands a story about replacement. The data tells a story about partnership, one where the human&amp;rsquo;s contribution is not a relic of an earlier era but the irreducible ingredient that makes the whole system work.&lt;/p&gt;
&lt;p&gt;The ability to spot the impossible backhand isn&amp;rsquo;t going away. If anything, it&amp;rsquo;s worth more every day.&lt;/p&gt;</description></item><item><title>Europe's $24 Trillion Payment Breakup Is Really a Bet on Infrastructure Arbitrage</title><link>http://philippdubach.com/posts/europes-24-trillion-payment-breakup-is-really-a-bet-on-infrastructure-arbitrage/</link><pubDate>Mon, 16 Feb 2026 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/europes-24-trillion-payment-breakup-is-really-a-bet-on-infrastructure-arbitrage/</guid><description>&lt;br&gt;
&lt;p&gt;On February 2, 2026, the European Payments Initiative signed a &lt;a href="https://epicompany.eu/media-insights/bancomat-bizum-epi-sibs-and-vipps-mobilepay-sign-mou-to-accelerate-the-rollout-of-sovereign-pan-european-payment-solutions"&gt;Memorandum of Understanding&lt;/a&gt; with the Alliance EuroPA, a consortium linking Spain&amp;rsquo;s Bizum, Italy&amp;rsquo;s Bancomat, Portugal&amp;rsquo;s SIBS, and the Nordic Vipps MobilePay system. The deal connects 130 million users across 13 countries into a single interoperable payment network. Headlines framed it as Europe breaking up with Visa and Mastercard. The actual story is more interesting: Europe is attempting an infrastructure arbitrage that, if it works, could reprice how money moves across the continent.&lt;/p&gt;
&lt;p&gt;This is not primarily a sovereignty play, though that is how politicians sell it. It is an attempt to exploit a structural pricing inefficiency in European payments that Visa and Mastercard have maintained for decades and that the EU&amp;rsquo;s own regulation accidentally made harder to dislodge.&lt;/p&gt;
&lt;h2 id="i-the-hidden-fee-structure"&gt;I. The hidden fee structure&lt;/h2&gt;
&lt;p&gt;The EU&amp;rsquo;s 2015 &lt;a href="https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=celex%3A32015R0751"&gt;Interchange Fee Regulation&lt;/a&gt; capped consumer debit interchange at 0.2% and credit at 0.3%. This was celebrated as a win for merchants. What happened next was predictable to anyone who has watched regulated industries: Visa and Mastercard shifted revenue to unregulated &amp;ldquo;scheme fees&amp;rdquo; for authorization, clearing, and settlement. According to &lt;a href="https://www.eurocommerce.eu/2025/06/ten-years-after-the-interchange-fee-regulation-we-need-new-action-to-tackle-new-wholesale-price-increases/"&gt;EuroCommerce&lt;/a&gt;, scheme fees rose by a cumulative 33.9% between 2018 and 2022, averaging 7.6% annually. The European Commission&amp;rsquo;s own data shows scheme fees increased by €1.46 billion between 2016 and 2021. &lt;a href="https://ecommerce-europe.eu/news-item/the-interchange-fee-regulation-turns-10/"&gt;Ecommerce Europe found&lt;/a&gt; that the average net merchant service charge nearly doubled from 0.27% to 0.44% between 2018 and 2022, effectively neutralizing the entire regulatory benefit.&lt;/p&gt;
&lt;p&gt;A card transaction through Visa or Mastercard can cost a European merchant up to 2% when all components are included. A SEPA Instant Credit Transfer, the rails that EPI&amp;rsquo;s Wero system uses, processes payments for a fraction of that with near-zero interchange and only processing fees. In Germany, S-Payment has proposed Wero merchant pricing at 0.77% plus gateway charges. That spread, roughly 100 to 120 basis points on every transaction, is the arbitrage opportunity. Applied to the &lt;a href="https://coinlaw.io/global-payment-network-statistics/"&gt;$4.7 trillion&lt;/a&gt; in combined Visa and Mastercard European volume, we are talking about tens of billions of euros annually in fees that could theoretically be disintermediated. &lt;a href="#lightbox-fee-arbitrage-european-payments-png-0" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/fee-arbitrage-european-payments.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/fee-arbitrage-european-payments.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/fee-arbitrage-european-payments.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/fee-arbitrage-european-payments.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/fee-arbitrage-european-payments.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/fee-arbitrage-european-payments.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/fee-arbitrage-european-payments.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/fee-arbitrage-european-payments.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/fee-arbitrage-european-payments.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/fee-arbitrage-european-payments.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/fee-arbitrage-european-payments.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/fee-arbitrage-european-payments.png"
alt="Horizontal bar chart comparing total merchant cost per transaction across payment methods. Card networks: Visa and Mastercard up to 2.0 percent, PayPal up to 2.3 percent. Account-to-account rails: Wero at 0.77 percent, iDEAL at a flat 0.29 euros, India UPI at 0.0 percent. Dual callout showing the IFR backfired as scheme fees rose 33.9 percent between 2018 and 2022, and the structural A2A arbitrage of 100 to 120 basis points per transaction"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;Most analysis I read over the past days focuses on whether Wero can beat Visa and Mastercard on user experience or brand recognition. But Wero does not need to win on UX. It needs to win on cost, and the cost advantage is structural because account-to-account payments simply skip an entire layer of intermediation. The question is whether that cost advantage is large enough to overcome the switching costs, and whether the political will exists to force adoption where market forces alone might not.&lt;/p&gt;
&lt;h2 id="ii-what-wero-actually-is-and-why-the-impact-of-the-europa-deal"&gt;II. What Wero actually is and why the impact of the EuroPA deal&lt;/h2&gt;
&lt;p&gt;Wero is a digital wallet built on top of SEPA Instant Credit Transfer infrastructure. Users access it through their existing banking app. Payments move directly between bank accounts in under 10 seconds using a phone number, email, or QR code. No card, no card network, no intermediary skimming basis points off each transaction.&lt;/p&gt;
&lt;p&gt;EPI launched Wero for peer-to-peer transfers in Germany on July 2, 2024, followed by France in September and Belgium in November of that year. E-commerce payments went live in Germany in November 2025, with merchants including Lidl, Decathlon, and Rossmann accepting it. Point-of-sale NFC tap payments are planned for 2026 to 2027.&lt;/p&gt;
&lt;p&gt;The 16 founding bank shareholders include &lt;a href="https://group.bnpparibas/en/news/bnp-paribas-partners-with-wero-for-e-commerce-payment-solutions"&gt;BNP Paribas&lt;/a&gt;, Crédit Agricole, Société Générale, Deutsche Bank, the Sparkassen-Finanzgruppe (which alone committed €150 million), ABN AMRO, ING, Rabobank, and pan-European acquirers Nexi and Worldline. Total committed capital sits at roughly €500 million. Membership has expanded to over 1,100 institutions, and &lt;a href="https://fintech.global/2025/12/08/n26-partners-with-epi-to-launch-wero-payment-option/"&gt;both Revolut and N26&lt;/a&gt; joined in 2025.&lt;/p&gt;
&lt;p&gt;Before the EuroPA deal, Wero was a Franco-German-Benelux payments app with roughly 47 million users and a geographic footprint that excluded most of southern and northern Europe. That is not a challenger to Visa and Mastercard. The EuroPA deal changes the math because it connects Wero with Bizum&amp;rsquo;s 30.6 million users in Spain, Bancomat&amp;rsquo;s dominant network in Italy, SIBS in Portugal, and Vipps MobilePay&amp;rsquo;s 12.5 million users across the Nordics. Crucially, it does this through a hub model rather than requiring each country to join EPI as a shareholder. This is a key architectural choice because the shareholder approach already failed once: in 2021 and 2022, &lt;a href="https://omdia.tech.informa.com/om022317/european-payments-initiative-project-pivots-after-20-banks-depart"&gt;roughly 20 banks withdrew from EPI&lt;/a&gt;, including all Spanish institutions, over disagreements about governance and cost sharing. &lt;a href="#lightbox-europa-network-scale-png-1" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/europa-network-scale.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/europa-network-scale.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/europa-network-scale.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/europa-network-scale.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/europa-network-scale.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/europa-network-scale.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/europa-network-scale.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/europa-network-scale.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/europa-network-scale.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/europa-network-scale.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/europa-network-scale.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/europa-network-scale.png"
alt="Data table showing the EuroPA alliance network after the February 2 2026 MoU. Wero core with 47 million users across Germany France Belgium Netherlands Luxembourg. Bizum EuroPA hub with 30.6 million users in Spain and 111000 merchants. Bancomat hub with approximately 30 million users in Italy. Vipps MobilePay hub with 12.5 million users across Norway Denmark Finland Sweden. SIBS MB WAY hub with approximately 6 million users in Portugal. iDEAL acquired with approximately 30 million users in Netherlands transitioning to Wero by end 2027. Combined network of over 130 million users across 13 countries covering 72 percent of EU population. Stats strip showing 1100 plus participating institutions, 500 million euros committed capital, 16 founding bank shareholders"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;The hub model lets national systems keep their local brands and governance while gaining cross-border interoperability. A Bizum user in Madrid will be able to pay a German merchant. An Italian Bancomat customer can transfer money to someone in France. 130 million users is not just a bigger number than 47 million, it is the difference between a niche product and something that forces merchant adoption.&lt;/p&gt;
&lt;p&gt;EPI also acquired two established national payment systems outright. &lt;a href="https://ideal.nl/en/epi-successfully-completes-acquisition-of-ideal-and-payconiq-international"&gt;iDEAL&lt;/a&gt; in the Netherlands processes 1.5 billion transactions annually and handles 72% of Dutch e-commerce. Payconiq/Bancontact dominates in Belgium and Luxembourg. Both acquisitions completed in October 2023. iDEAL will &lt;a href="https://epicompany.eu/media-insights/ideal-to-phase-into-wero"&gt;transition to Wero branding by end of 2027&lt;/a&gt;. In France, the pre-existing Paylib service with 35 million users was directly replaced by Wero at launch. These are not greenfield user acquisition plays. They are migrating existing transaction volumes onto a unified pan-European rail.&lt;/p&gt;
&lt;h2 id="iii-the-geopolitical-accelerant"&gt;III. The geopolitical accelerant&lt;/h2&gt;
&lt;p&gt;The economics alone might not have been enough to generate the political will for this kind of project. What changed was Russia. When Visa and Mastercard &lt;a href="https://www.americanbanker.com/news/how-visa-and-mastercards-ban-could-disrupt-russian-payments"&gt;suspended operations in Russia&lt;/a&gt; in March 2022 following the invasion of Ukraine, they severed a market where they controlled approximately 72% of card payments. The intended target was Moscow. The unintended lesson was Brussels: payment networks controlled by American corporations can be weaponized, and what gets deployed against Russia could theoretically be turned against Europe (see my earlier post on &lt;a href="https://philippdubach.com/posts/pozsars-bretton-woods-iii-three-years-later-2/2/"&gt;Bretton Woods III&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;ECB President Christine Lagarde has become the initiative&amp;rsquo;s most vocal political champion. In &lt;a href="https://www.irishtimes.com/business/2026/02/09/european-alternatives-to-visa-and-mastercard-urgently-needed-says-banking-chief/"&gt;early February 2026&lt;/a&gt; she told Irish radio that whether Europeans use a card or a phone, the transaction typically flows through Visa, Mastercard, PayPal, or Alipay, all of which originate from either the US or China. ECB Executive Board member &lt;a href="https://www.ecb.europa.eu/press/key/date/2025/html/ecb.sp250929~9a94367d26.en.html"&gt;Piero Cipollone&lt;/a&gt; has been more direct, arguing that Europe&amp;rsquo;s dependence on non-European payment solutions puts it at the mercy of decisions made elsewhere. In March 2025, ECB Chief Economist Philip Lane warned that this dependence leaves Europe &amp;ldquo;open to coercion.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;Trump&amp;rsquo;s second term has sharpened these concerns considerably. EPI CEO Martina Weimert &lt;a href="https://www.irishtimes.com/business/2026/02/09/european-alternatives-to-visa-and-mastercard-urgently-needed-says-banking-chief/"&gt;told the Financial Times&lt;/a&gt; that the problem with the digital euro is that it will arrive in a few years, perhaps after Trump&amp;rsquo;s term ends, so she thinks Europe is somewhat short on time. Tariff threats, territorial claims over Greenland, and a pro-crypto, anti-CBDC US policy agenda have turned European payment sovereignty from a technocratic aspiration into something closer to a defense priority. And European defense spending is the one area where political consensus currently exists across virtually all member states.&lt;/p&gt;
&lt;p&gt;This matters for understanding why Wero might succeed where its predecessors failed. The Monnet Project collapsed in 2012 when the European Commission refused to support multilateral interchange fees. The original EPI card-scheme vision was abandoned after the bank withdrawals. The Nordic P27 initiative collapsed in 2023. Each failure happened in a geopolitical context where the urgency was abstract. The urgency is no longer abstract. When 70 economists including Thomas Piketty published an &lt;a href="https://eutoday.net/parliament-pivotal-decision-on-ecb-digital-euro/"&gt;open letter in January 2026&lt;/a&gt; calling the digital euro &amp;ldquo;the only defence&amp;rdquo; against dependence on US payment systems, that represents a shift in the Overton window that did not exist even two years ago.&lt;/p&gt;
&lt;h2 id="iv-profitability"&gt;IV. Profitability&lt;/h2&gt;
&lt;p&gt;The EU&amp;rsquo;s own interchange fee regulation, the one designed to protect merchants from Visa and Mastercard, has inadvertently created what I think is one of the largest barriers to entry for any new European payment network.&lt;/p&gt;
&lt;p&gt;When interchange is capped at 0.2% for debit, the revenue pool available to fund a new network is tiny. Visa and Mastercard can sustain their European operations because they amortize costs across a $24 trillion global transaction base. A new European entrant has to build comparable infrastructure, convince hundreds of thousands of merchants to integrate, and acquire tens of millions of users, all while operating in a margin environment that was deliberately compressed by regulation. Weimert herself has estimated that building a viable full-scale alternative requires &amp;ldquo;several billion euros,&amp;rdquo; with private estimates cited by &lt;a href="https://fortune.com/2021/07/10/europe-digital-payments-network-epi-sepa-mastercard-visa/"&gt;Fortune&lt;/a&gt; ranging as high as €6 billion.&lt;/p&gt;
&lt;p&gt;This is what often happen with bad regulation. The regulation that was supposed to weaken the duopoly has actually strengthened its competitive moat by making the economics of entry worse. Visa and Mastercard responded to interchange caps by raising unregulated fees, so their total revenue per transaction barely changed. But a new entrant cannot charge those same scheme fees without undermining its cost advantage proposition. The revenue has to come from somewhere else.&lt;/p&gt;
&lt;p&gt;EPI&amp;rsquo;s answer is value-added services: buy-now-pay-later, digital identity, subscription management, loyalty programs. None of these exist yet. They are on the roadmap for 2027 and beyond. In the meantime, Wero operates as a cost center subsidized by its bank shareholders. The Sparkassen&amp;rsquo;s €150 million commitment is patient capital from a cooperative banking group with a 200-year time horizon. BNP Paribas and Crédit Agricole can absorb the costs as a strategic investment. But the question of when, or whether, Wero becomes self-sustaining is genuinely open.&lt;/p&gt;
&lt;h2 id="v-india-and-brazil-comparisons-are-both-more-and-less-instructive-than-they-appear"&gt;V. India and Brazil comparisons are both more and less instructive than they appear&lt;/h2&gt;
&lt;p&gt;Every article about Wero mentions India&amp;rsquo;s UPI and Brazil&amp;rsquo;s Pix as proof of concept. The numbers are undeniably impressive. UPI processed &lt;a href="https://meetanshi.com/blog/upi-statistics/"&gt;228.3 billion transactions worth approximately $3.6 trillion&lt;/a&gt; in 2025, up 29% year-over-year. The IMF &lt;a href="https://www.pib.gov.in/PressReleasePage.aspx?PRID=2200569&amp;amp;reg=3&amp;amp;lang=1"&gt;recognized it in June 2025&lt;/a&gt; as the world&amp;rsquo;s largest retail fast-payment system. Brazil&amp;rsquo;s Pix reached &lt;a href="https://en.wikipedia.org/wiki/Pix_(payment_system)"&gt;175 million users&lt;/a&gt; and processed 63.4 billion transactions worth $4.6 trillion in 2024, growing 53% year-over-year. Both systems achieved in a few years what Visa and Mastercard built over decades. &lt;a href="#lightbox-global-a2a-payment-scale-png-3" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/global-a2a-payment-scale.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/global-a2a-payment-scale.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/global-a2a-payment-scale.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/global-a2a-payment-scale.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/global-a2a-payment-scale.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/global-a2a-payment-scale.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/global-a2a-payment-scale.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/global-a2a-payment-scale.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/global-a2a-payment-scale.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/global-a2a-payment-scale.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/global-a2a-payment-scale.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/global-a2a-payment-scale.png"
alt="Horizontal bar chart comparing annual transaction volumes of sovereign account-to-account payment systems worldwide. Pix at 4.6 trillion dollars with 175 million users and 53 percent year over year growth. UPI at 3.6 trillion dollars with 491 million users and 29 percent year over year growth. Wero highlighted in red at less than 0.1 trillion dollars with 47 million users in its first year. MIR at approximately 1.4 trillion dollars estimated with 400 million plus cards and 66.7 percent domestic share. Callout noting Europe targets 4.7 trillion in annual Visa and Mastercard European volume but UPI and Pix had structural advantages Europe lacks including low card penetration and central bank mandates"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;But the structural conditions that enabled UPI and Pix do not map cleanly onto Europe. India had a large unbanked population and low existing card penetration. UPI did not have to displace an entrenched incumbent so much as fill a vacuum. Pix launched via central bank mandate requiring every financial institution to participate, with zero-cost transfers for individuals. Both countries also had single regulatory jurisdictions and populations accustomed to mobile-first payments.&lt;/p&gt;
&lt;p&gt;Europe has none of these conditions. Card penetration is high. Consumer habits are entrenched. The regulatory patchwork spans 27 member states plus associated countries, each with their own banking traditions and payment preferences. There is no single authority that can mandate participation the way Brazil&amp;rsquo;s central bank did.&lt;/p&gt;
&lt;p&gt;What Europe does have, and this is the part most analysts underweight, is a functioning SEPA infrastructure that already connects every bank account in the eurozone. Wero does not need to build new rails. It needs to build a user interface and merchant acceptance layer on top of rails that already exist and that already process trillions of euros annually. The &lt;a href="https://www.europeanpaymentscouncil.eu/news-insights/insight/wero-shaping-future-european-payments"&gt;SEPA Instant Credit Transfer regulation&lt;/a&gt; that became mandatory in 2025 means every eurozone bank must support real-time payments. Europe&amp;rsquo;s governments have already paid for the highway. Wero is building the on-ramps.&lt;/p&gt;
&lt;p&gt;The other underappreciated advantage is regulatory asymmetry. The EU&amp;rsquo;s July 2024 ruling forcing &lt;a href="https://www.macrumors.com/2024/07/11/apple-opens-iphone-nfc-access-eu/"&gt;Apple to open iPhone NFC access&lt;/a&gt; to third-party wallets means Wero can offer tap-to-pay on iPhones without going through Apple Pay. PSD3, expected in 2026, will likely further strengthen open banking requirements. The European Commission has &lt;a href="https://www.pymnts.com/news/regulation/2025/report-european-commission-looking-into-visa-and-mastercard-fees/"&gt;active investigations&lt;/a&gt; into Visa and Mastercard&amp;rsquo;s fee structures. In the UK, the &lt;a href="https://www.rte.ie/news/business/2025/0526/1514937-visa-mastercard-probe/"&gt;Competition Appeal Tribunal&lt;/a&gt; ruled unanimously in June 2025 that the networks&amp;rsquo; interchange fee structures breach competition law. Europe is building an alternative while simultaneously making the incumbent&amp;rsquo;s business model harder to sustain.&lt;/p&gt;
&lt;h2 id="vi-visa-and-mastercard"&gt;VI. Visa and Mastercard&lt;/h2&gt;
&lt;p&gt;Neither company has made extensive public statements about Wero, which is itself a strategy: do not elevate the challenger&amp;rsquo;s profile. When pressed, they emphasize the value they provide. Mastercard CEO Michael Miebach argued on an October 2025 earnings call that wherever cards are available in a competitive, level playing field, businesses and consumers opt for cards because of the protections they offer.&lt;/p&gt;
&lt;p&gt;But both companies are executing a quiet multi-rail pivot. Visa acquired European open banking leader &lt;a href="https://www.paymentsdive.com/news/visa-pay-by-bank-services-card-payments/717206/"&gt;Tink for approximately $2.2 billion&lt;/a&gt; in 2022, gaining the capability to offer the same account-to-account payment rails that Wero uses. Mastercard acquired cybersecurity firm &lt;a href="https://en.wikipedia.org/wiki/Mastercard"&gt;Recorded Future for $2.65 billion&lt;/a&gt; in September 2024, expanding into value-added services. Both are positioning themselves as payment technology platforms rather than pure card networks.&lt;/p&gt;
&lt;p&gt;This is rational. If account-to-account payments do take share from card networks in Europe, Visa and Mastercard want to be the infrastructure layer that processes those payments too. They have the merchant relationships, the fraud detection capabilities, and the global acceptance network. The risk for Wero is that even if it succeeds in shifting transactions off card rails, the toll collectors simply move to the new road.&lt;/p&gt;
&lt;h2 id="the-digital-euro"&gt;The digital euro&lt;/h2&gt;
&lt;p&gt;Running in parallel is the ECB&amp;rsquo;s digital euro project, a central bank digital currency that would serve as legal tender across the eurozone. The &lt;a href="https://www.consilium.europa.eu/en/press/press-releases/2025/12/19/single-currency-council-agrees-position-on-the-digital-euro-and-on-strengthening-the-role-of-cash/"&gt;EU Council agreed its negotiating position&lt;/a&gt; in December 2025. A European Parliament vote is expected in the first half of 2026, with potential first issuance &lt;a href="https://finance.yahoo.com/news/ecb-says-digital-euro-ready-025009411.html"&gt;around 2029&lt;/a&gt;. In October 2025, the ECB &lt;a href="https://www.ecb.europa.eu/euro/digital_euro/progress/html/ecb.deprp202510.en.html"&gt;completed its preparation phase&lt;/a&gt; and declared the digital euro technically ready.&lt;/p&gt;
&lt;p&gt;EPI positions Wero as complementary, handling private money while the digital euro handles public money. But the overlap in ambition is obvious, and it creates a coordination problem. Banks worry about deposit outflows and implementation costs estimated at &lt;a href="https://www.capco.com/intelligence/capco-intelligence/the-digital-euro-in-2025"&gt;€4 to 5.8 billion&lt;/a&gt;. There is no guaranteed parliamentary majority for the legislation. And Trump&amp;rsquo;s anti-CBDC stance, including signing the GENIUS Act for stablecoins while banning federal CBDCs, creates a strange dynamic where Europe might pursue a digital euro partly as a response to American policy that explicitly rejects the concept.&lt;/p&gt;
&lt;p&gt;My read is that the digital euro and Wero are less complementary than they are competing bets on the same thesis: that Europe needs sovereign payment infrastructure. The digital euro is the maximalist version. Wero is the pragmatic one. If I had to bet, I would bet on the pragmatic version arriving first and capturing enough transaction volume to make the digital euro&amp;rsquo;s incremental value harder to justify politically. But both could fail. And both failing would leave Europe exactly where it started, which is the outcome Visa and Mastercard are quietly optimizing for.&lt;/p&gt;
&lt;h2 id="whats-next"&gt;What&amp;rsquo;s next&lt;/h2&gt;
&lt;p&gt;The next 18 months are decisive. Cross-border P2P payments through the EuroPA hub launch in 2026. E-commerce expansion to France and Belgium follows in the second half of the year. Cross-border e-commerce and point-of-sale payments via the hub are targeted for 2027. iDEAL&amp;rsquo;s full migration to Wero should complete by end of 2027. &lt;a href="#lightbox-wero-execution-timeline-png-5" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/wero-execution-timeline.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/wero-execution-timeline.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/wero-execution-timeline.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/wero-execution-timeline.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/wero-execution-timeline.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/wero-execution-timeline.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/wero-execution-timeline.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/wero-execution-timeline.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/wero-execution-timeline.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/wero-execution-timeline.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/wero-execution-timeline.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/wero-execution-timeline.png"
alt="Timeline chart showing the critical execution window from 2024 to 2029 across five parallel tracks. Wero Core track showing P2P launch completed in 2024, e-commerce Germany completed in 2025, e-commerce France Belgium and NFC pilot active in 2026, POS rollout and BNPL digital ID planned for 2027. EuroPA Hub track showing MoU signed February 2 and cross-border P2P active in 2026, cross-border e-commerce and POS planned for 2027. iDEAL migration track showing co-branding started in 2025, dual-brand phase in 2026, full Wero migration planned end 2027. Regulation track showing Apple NFC forced open July 2024, SEPA Instant mandatory 2025, PSD3 expected and EP digital euro vote uncertain in 2026, potential digital euro issuance uncertain in 2029. Visa Mastercard response track showing acquisitions in 2025, European hub expansion 2026, multi-rail platform pivot 2027. Dual callout on the pragmatic bet versus the wildcard"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;The biggest risk is not technology or regulation. It is consumer inertia. Wero has 47 million users, but Mastercard alone has over 900 million branded cards in EU circulation. Credit cards offer credit facilities, rewards programs, and purchase protection that Wero currently cannot match. German adoption has been &lt;a href="https://insights.flagshipadvisorypartners.com/wero-the-european-challenger-digital-wallet"&gt;notably sluggish&lt;/a&gt;: despite being the first launch country, Germany accounts for only 5% of Wero&amp;rsquo;s transaction volume, with France dominating thanks to its Paylib migration. Dutch merchants have &lt;a href="https://blog.onlinepaymentplatform.com/en/weros-false-promise-higher-costs-and-more-risks"&gt;pushed back&lt;/a&gt; on the shift from iDEAL&amp;rsquo;s flat €0.29 per transaction to Wero&amp;rsquo;s percentage-based model.&lt;/p&gt;
&lt;p&gt;I think the outcome depends on whether European policymakers treat this as a market initiative or a strategic infrastructure project. If it is the former, the cost advantages may not be enough to overcome switching costs and consumer habit. If it is the latter, and if governments are willing to subsidize adoption the way they subsidize defense procurement, then the math works. The 130-million-user network created by the EuroPA deal gives Wero something no previous European payment initiative has achieved: a user base large enough to force merchant adoption through sheer volume. Whether that is enough depends on a political question, not a technical one.&lt;/p&gt;
&lt;p&gt;The $24 trillion figure in the headline refers to Visa and Mastercard&amp;rsquo;s combined global transaction volume. Europe&amp;rsquo;s share is roughly $4.7 trillion. Even capturing 10% of that would be a major rewiring of European payment infrastructure. The infrastructure arbitrage is real. The spread between card network fees and SEPA Instant costs is measurable and persistent. The question is execution, and execution in Europe is always the question.&lt;/p&gt;
&lt;aside class="disclaimer" role="note" aria-label="Disclaimer"&gt;
&lt;div class="disclaimer-content"&gt;&lt;p&gt;&lt;strong&gt;Disclaimer:&lt;/strong&gt; All opinions expressed are my own. This is not investment, financial, tax, or legal advice. Past performance does not indicate future results. Do your own research and consult qualified professionals before making financial decisions. No liability accepted for any losses.&lt;/p&gt;&lt;/div&gt;
&lt;/aside&gt;</description></item><item><title>Long Volatility Premium</title><link>http://philippdubach.com/posts/long-volatility-premium/</link><pubDate>Sat, 14 Feb 2026 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/long-volatility-premium/</guid><description>&lt;blockquote&gt;
&lt;p&gt;The real value of tail hedging is not in the hedge itself. It&amp;rsquo;s in what the hedge enables.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;In &lt;a href="http://philippdubach.com/posts/the-variance-tax/"&gt;The Variance Tax&lt;/a&gt; I wrote about the ½σ² formula: compound returns equal arithmetic returns minus half the variance, and because the penalty is quadratic, large drawdowns destroy wealth in ways that are hard to recover from. A portfolio that falls 50% needs 100% just to break even. That piece was about the problem. This one is about a potential solution, and about whether paying for crash protection can actually improve total returns rather than drag them.&lt;/p&gt;
&lt;p&gt;There is a chart circulating in quantitative finance circles that should not exist. It shows a strategy that buys put options on the S&amp;amp;P 500 and, when layered on top of a stock portfolio, &lt;em&gt;improves&lt;/em&gt; total returns while simultaneously reducing volatility and maximum drawdown. The chart comes from Patrick Causley at One River Asset Management in a paper called &lt;a href="https://one-river.nyc3.cdn.digitaloceanspaces.com/alternatives-white-papers/October2025/OR%20-%20Heretical%20Thinking%20-%20The%20Long%20Volatility%20Premium%20-%20Oct%2025%20-%20Web.pdf"&gt;&amp;ldquo;Heretical Thinking: The Long Volatility Premium&amp;rdquo;&lt;/a&gt; and it makes a specific claim: that long volatility, properly constructed, is not a cost center but a compensated factor that deserves to sit alongside value, momentum, and trend in institutional portfolios.&lt;/p&gt;
&lt;p&gt;The conventional wisdom says buying puts is a losing game. The dominant empirical finding is that a &lt;a href="https://www.cboe.com/insights/posts/white-paper-shows-volatility-risk-premium-facilitated-higher-risk-adjusted-returns-for-put-index/"&gt;volatility risk premium&lt;/a&gt; (VRP) exists: from 1990 to 2018, the average VIX level was 19.3% while average realized S&amp;amp;P 500 volatility was just 15.1%, a persistent gap of 4.2 percentage points. Options are, on average, overpriced relative to what materializes. The &lt;a href="https://en.wikipedia.org/wiki/CBOE_S&amp;amp;P_500_PutWrite_Index"&gt;CBOE S&amp;amp;P 500 PutWrite Index&lt;/a&gt;, which systematically sells S&amp;amp;P 500 puts against cash collateral, rose 1,835% from 1986 to 2018. The CBOE 5% Put Protection Index, which buys puts as a hedge, rose only 708%. As &lt;a href="https://cdn.cboe.com/resources/education/research_publications/PutWriteCBOE19_v14_by_Prof_Oleg_Bondarenko_as_of_June_14.pdf"&gt;Bondarenko (2019)&lt;/a&gt; documented, the PUT Index achieved &lt;strong&gt;9.54%&lt;/strong&gt; annualized versus &lt;strong&gt;9.80%&lt;/strong&gt; for the S&amp;amp;P 500 but with far lower volatility (9.95% vs. 14.93%), yielding a Sharpe ratio of 0.65 versus 0.33 for put buyers.&lt;/p&gt;
&lt;p&gt;So selling options earns money. Buying them bleeds money. That is the consensus. This article is about why that framing, while technically correct, misses something important. &lt;a href="#lightbox-chart2-volatility-risk-premium-png-0" style="display: block; width: 90%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/chart2-volatility-risk-premium.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/chart2-volatility-risk-premium.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/chart2-volatility-risk-premium.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/chart2-volatility-risk-premium.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/chart2-volatility-risk-premium.png 1200w"
sizes="90vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/chart2-volatility-risk-premium.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/chart2-volatility-risk-premium.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/chart2-volatility-risk-premium.png 1440w"
sizes="90vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/chart2-volatility-risk-premium.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/chart2-volatility-risk-premium.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/chart2-volatility-risk-premium.png 2000w"
sizes="90vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/chart2-volatility-risk-premium.png"
alt="Dual panel chart showing VIX implied volatility consistently trading above realized S&amp;amp;P 500 volatility from 1990 to 2024, with the VRP spread averaging &amp;#43;4.2 percentage points. Bottom panel shows annual bar chart of the spread with 2008 and 2020 as notable inversions"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/p&gt;
&lt;h2 id="i-separating-beta-from-convexity"&gt;I. Separating beta from convexity&lt;/h2&gt;
&lt;p&gt;The key insight from the One River paper is that the raw return of a put option conflates two partially independent components, and conflating them has led to a categorical error in how most allocators think about tail hedging.&lt;/p&gt;
&lt;p&gt;When you buy a put, your P&amp;amp;L is driven by delta (directional exposure to the underlying), gamma (the acceleration of that exposure as the market moves), and vega (sensitivity to implied volatility). The problem with naively holding puts is that delta embeds a massive short-beta position. Since the equity risk premium is one of the most reliable premia in equity markets, you are fighting a powerful headwind. Your puts bleed value every day the market does not crash, and that bleed overwhelms the occasional windfall when it does.&lt;/p&gt;
&lt;p&gt;Causley&amp;rsquo;s move is straightforward. Neutralize the short-beta by adding enough long equity exposure to offset the embedded delta. What remains is a beta-neutral &amp;ldquo;long volatility factor&amp;rdquo; that isolates gamma and vega. Stack this on top of an equity program and the historical results over approximately 40 years are striking: the beta-1 portfolio with long volatility outperformed a portfolio without it while producing lower volatility and a shallower maximum drawdown. &lt;a href="#lightbox-chart1-growth-of-dollar-png-1" style="display: block; width: 90%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/chart1-growth-of-dollar.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/chart1-growth-of-dollar.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/chart1-growth-of-dollar.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/chart1-growth-of-dollar.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/chart1-growth-of-dollar.png 1200w"
sizes="90vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/chart1-growth-of-dollar.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/chart1-growth-of-dollar.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/chart1-growth-of-dollar.png 1440w"
sizes="90vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/chart1-growth-of-dollar.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/chart1-growth-of-dollar.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/chart1-growth-of-dollar.png 2000w"
sizes="90vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/chart1-growth-of-dollar.png"
alt="Growth of one dollar chart from 1986 to 2024 on logarithmic scale showing three lines: Beta-1 Long Volatility plus S&amp;amp;P 500 outperforming both the S&amp;amp;P 500 alone and the PPUT index, with event markers for the GFC and COVID crashes"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
The persistence of this phenomenon, even with a simplistic implementation using monthly 5% OTM puts from the CBOE&amp;rsquo;s PPUT index, is what makes the paper interesting rather than dismissible. A more sophisticated execution (better strike selection, dynamic sizing, multi-tenor rolls) would likely improve results further. But the baseline already makes the case.&lt;/p&gt;
&lt;h2 id="ii-why-would-a-long-volatility-premium-exist"&gt;II. Why would a long volatility premium exist?&lt;/h2&gt;
&lt;p&gt;If markets are efficient, a beta-adjusted long volatility position should not deliver a positive premium. Three mechanisms suggest why it might.&lt;/p&gt;
&lt;p&gt;The first is the rebalancing premium. When you hold negatively correlated assets and rebalance systematically, you extract what the literature calls a &amp;ldquo;rebalancing bonus&amp;rdquo; where the geometric return exceeds the weighted average of individual arithmetic returns. &lt;a href="https://www.tandfonline.com/doi/full/10.1080/10293523.2025.2553254"&gt;Recent work in the Investment Analysts Journal&lt;/a&gt; formalizes this for tail hedging specifically. A long volatility position that delivers explosive gains during crashes and modest losses during calm markets, rebalanced against equities, creates a structural tailwind. You systematically sell the hedge at high prices after crashes and buy it back cheaply during calm, monetizing mean reversion.&lt;/p&gt;
&lt;p&gt;The second is that stock-volatility correlation intensifies dramatically during crashes. When equities fall sharply, implied volatility does not just rise proportionally, it spikes exponentially. The hedge&amp;rsquo;s payoff is largest precisely when the portfolio most needs it. This convexity, once beta-adjusted, can more than compensate for the ongoing cost of the position.&lt;/p&gt;
&lt;p&gt;The third is a supply-demand imbalance. Institutional investors are structurally short volatility in numerous ways: through equity ownership itself, through structured products with embedded short option positions, and through strategies that implicitly sell insurance (risk parity, short vol ETFs, pension de-risking). Meanwhile, the supply of long volatility is limited by behavioral challenges. As Jody Deio of Aearon Risk Advisors &lt;a href="https://alphaarchitect.com/the-long-volatility-premium-short-the-market-get-paid/"&gt;explains&lt;/a&gt;: &amp;ldquo;People don&amp;rsquo;t have the patience to wear these exposures for any long period of time. You&amp;rsquo;re happy being basically a wasting asset. And a wasting asset is nothing that any investment committee or client meeting wants to deal with.&amp;rdquo; This behavioral gap between the demand for protection and the willingness to supply it may create a structural premium for those who can withstand the psychology.&lt;/p&gt;
&lt;p&gt;All three mechanisms connect back to the variance tax. The ½σ² drag on compound returns means that reducing drawdown severity has a nonlinear effect on terminal wealth. By truncating left-tail outcomes, even a costly hedge can increase compound wealth through the compounding channel alone. &lt;a href="#lightbox-chart4-volatility-tax-png-3" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/chart4-volatility-tax.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/chart4-volatility-tax.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/chart4-volatility-tax.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/chart4-volatility-tax.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/chart4-volatility-tax.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/chart4-volatility-tax.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/chart4-volatility-tax.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/chart4-volatility-tax.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/chart4-volatility-tax.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/chart4-volatility-tax.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/chart4-volatility-tax.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/chart4-volatility-tax.png"
alt="Exponential recovery curve showing that a 50 percent drawdown requires 100 percent to recover, with severity zones marked as moderate, severe, and catastrophic, illustrating Spitznagel&amp;#39;s volatility tax thesis"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/p&gt;
&lt;h2 id="iii-what-the-research-actually-says"&gt;III. What the research actually says&lt;/h2&gt;
&lt;p&gt;The claim that long volatility is a compensated factor runs against a large body of literature documenting the short volatility premium. But the two are not necessarily contradictory. They operate at different levels of analysis.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://indices.cib.barclays/dms/Public%20marketing/Volatility_Risk_Premium.pdf"&gt;Research from Barclays&lt;/a&gt; found that while the VRP has positive equity market beta, it also has excess alpha above that beta exposure. A linear regression of the VRP against S&amp;amp;P 500 returns found a significant positive intercept of roughly 3.48 volatility points independent of equity market direction. This suggests that both buying and selling volatility can capture distinct premia depending on how the trade is structured and what exposures are isolated.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://am.gs.com/en-dk/advisors/insights/article/2026/finding-true-value-tail-risk-hedging"&gt;Goldman Sachs Asset Management&amp;rsquo;s 2025 analysis&lt;/a&gt; has the clearest framing I have seen. Their key finding: even an idealized, 99% reliable tail-risk hedging strategy provides a standalone annual return boost of only about &lt;strong&gt;0.8 basis points&lt;/strong&gt;. Trivial. But that is not the point. The real value comes from what the hedge enables. Because tail-risk hedges reduce the impact of severe drops, they allow a portfolio to take on more equity risk, to increase beta. The gains from this &amp;ldquo;risk budget reallocation&amp;rdquo; can be substantial, especially for institutional investors with fixed drawdown constraints. In Goldman&amp;rsquo;s framing, tail-risk hedging is not a standalone return generator. It is an offensive weapon that enables more aggressive positioning in core assets. This is philosophically closer to how Formula 1 teams think about pit stops: they cost time, but soft tires allow faster laps, resulting in a faster overall race. &lt;a href="#lightbox-chart6-portfolio-construction-png-4" style="display: block; width: 90%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/chart6-portfolio-construction.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/chart6-portfolio-construction.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/chart6-portfolio-construction.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/chart6-portfolio-construction.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/chart6-portfolio-construction.png 1200w"
sizes="90vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/chart6-portfolio-construction.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/chart6-portfolio-construction.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/chart6-portfolio-construction.png 1440w"
sizes="90vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/chart6-portfolio-construction.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/chart6-portfolio-construction.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/chart6-portfolio-construction.png 2000w"
sizes="90vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/chart6-portfolio-construction.png"
alt="Grid table showing five portfolio constructions with inline metric bars comparing CAGR, volatility, max drawdown, and Sharpe ratio. The 97 percent equity plus 3 percent tail hedge portfolio achieves 12.3 percent CAGR, beating the S&amp;amp;P 500"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;The numbers from Universa&amp;rsquo;s live track record add color. During Q1 2020, as the COVID pandemic triggered a 34% crash in the S&amp;amp;P 500, &lt;a href="https://finance.yahoo.com/news/mark-spitznagel-univesa-cio-on-risk-mitigation-204157461.html"&gt;Universa delivered a 4,144% return&lt;/a&gt;. But Spitznagel himself downplays these headline figures, noting that &amp;ldquo;any punter can devise a trade that does well in a crash. The key is how you do in a crash relative to the rest of the time.&amp;rdquo; &lt;a href="https://en.wikipedia.org/wiki/Universa_Investments"&gt;The Wall Street Journal reported&lt;/a&gt; that a strategy consisting of just a &lt;strong&gt;3.3% allocation&lt;/strong&gt; to Universa with the rest in the S&amp;amp;P 500 had a compound annual return of &lt;strong&gt;12.3%&lt;/strong&gt; over 10 years through February 2018, beating the S&amp;amp;P 500 itself. A 3.3% tail position improving total portfolio returns over a decade is not intuitive. But it follows directly from the variance tax arithmetic.&lt;/p&gt;
&lt;h2 id="iv-puts-vs-trend-the-tortoise-and-the-hare"&gt;IV. Puts vs. trend: the tortoise and the hare&lt;/h2&gt;
&lt;p&gt;&lt;a href="https://www.aqr.com/Insights/Research/White-Papers/Tail-Risk-Hedging-Contrasting-Put-and-Trend-Strategies"&gt;AQR&amp;rsquo;s research on tail hedging&lt;/a&gt;, published in the Journal of Systematic Investing, complicates the picture in a way I find genuinely useful for portfolio construction. They compare two fundamental approaches: buying out-of-the-money puts and multi-asset trend-following.&lt;/p&gt;
&lt;p&gt;Puts act as the hare. They deliver spectacular returns in sudden crashes like COVID, when put-buying strategies returned over &lt;strong&gt;+42%&lt;/strong&gt; in a single month. But they are expensive to maintain and their long-term expected return is negative. Trend-following acts as the tortoise. It &lt;a href="https://www.aqr.com/Insights/Research/Alternative-Thinking/Tail-Hedging-Strategies"&gt;cannot provide the same reliable downside protection as index puts&lt;/a&gt;, but has delivered surprisingly consistent safe-haven performance when most needed while earning positive long-run returns. &lt;a href="#lightbox-chart5-tortoise-vs-hare-png-5" style="display: block; width: 90%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/chart5-tortoise-vs-hare.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/chart5-tortoise-vs-hare.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/chart5-tortoise-vs-hare.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/chart5-tortoise-vs-hare.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/chart5-tortoise-vs-hare.png 1200w"
sizes="90vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/chart5-tortoise-vs-hare.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/chart5-tortoise-vs-hare.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/chart5-tortoise-vs-hare.png 1440w"
sizes="90vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/chart5-tortoise-vs-hare.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/chart5-tortoise-vs-hare.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/chart5-tortoise-vs-hare.png 2000w"
sizes="90vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/chart5-tortoise-vs-hare.png"
alt="Grouped horizontal bars comparing put hedging versus trend following returns across six major crises from the dot-com bust to COVID, showing puts dominate short crashes like COVID at plus 42 percent while trend following wins protracted drawdowns like the dot-com bust at plus 42 percent over 31 months"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;&lt;a href="https://www.wealthmanagement.com/equities/hedging-tail-risks-the-tortoise-versus-the-hare"&gt;AQR&amp;rsquo;s follow-up paper&lt;/a&gt; examined the five largest 60/40 drawdowns since 2000 and found that options-based strategies outperformed in shorter drawdowns while trend-following posted its most impressive returns during protracted bear markets. Since longer drawdowns are arguably more damaging to long-term wealth (they impair compounding for extended periods, which brings us back to the variance tax), AQR leans toward trend-following as the more practical hedge for most investors.&lt;/p&gt;
&lt;p&gt;But the strategies are genuinely complementary. &lt;a href="https://www.tandfonline.com/doi/full/10.1080/10293523.2025.2553254"&gt;Recent academic work&lt;/a&gt; combining both approaches via a portable alpha framework found statistically significant alpha of 0.25% per month after controlling for traditional equity factors, with the strongest outperformance during periods of market turmoil. Puts for the fast crash, trend for the slow bleed.&lt;/p&gt;
&lt;h2 id="where-most-tail-hedges-fail-and-the-benchmark-problem"&gt;Where most tail hedges fail, and the benchmark problem&lt;/h2&gt;
&lt;p&gt;Here is where most investors get burned. A &lt;a href="https://www.caia.org/sites/default/files/2013-aiar-q1-comparison.pdf"&gt;CAIA Association paper&lt;/a&gt; compared multiple tail-risk strategies against a deliberately boring benchmark: holding cash. Cash achieved a reduction of 80% of portfolio tail risk and 81% of portfolio standard deviation compared to an S&amp;amp;P 500-only portfolio, with an information ratio of 0.67. Several popular tail-risk strategies, particularly those involving short-dated VIX futures and 1-month variance swaps, actually failed to beat this cash benchmark, with performance drags of 355 and 203 basis points respectively. &lt;a href="#lightbox-chart7-strategy-efficiency-png-6" style="display: block; width: 90%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/chart7-strategy-efficiency.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/chart7-strategy-efficiency.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/chart7-strategy-efficiency.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/chart7-strategy-efficiency.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/chart7-strategy-efficiency.png 1200w"
sizes="90vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/chart7-strategy-efficiency.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/chart7-strategy-efficiency.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/chart7-strategy-efficiency.png 1440w"
sizes="90vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/chart7-strategy-efficiency.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/chart7-strategy-efficiency.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/chart7-strategy-efficiency.png 2000w"
sizes="90vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/chart7-strategy-efficiency.png"
alt="Scatter quadrant chart plotting annual cost in basis points versus crisis return for six tail-risk strategies. Trend following sits in the ideal quadrant with low cost and high crisis return. VIX futures and variance swaps fall in the expensive quadrant, underperforming even a simple cash allocation"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;This is the finding that should make any allocator uncomfortable. If your sophisticated tail hedge cannot beat holding Treasury bills, you are paying for complexity that destroys value. The specific implementation matters a lot, and many &amp;ldquo;obvious&amp;rdquo; approaches (VIX futures being the most popular) are structurally flawed because of contango decay in the VIX term structure that steadily erodes returns during calm periods.&lt;/p&gt;
&lt;p&gt;A &lt;a href="https://onlinelibrary.wiley.com/doi/full/10.1002/fut.22602"&gt;2025 paper in the Journal of Futures Markets&lt;/a&gt; adds a related finding: naïve hedging strategies outperformed more complex models for tail-risk hedging, consistent with earlier findings on variance-minimizing hedges. The explanation lies in model risk. Sophisticated approaches require more assumptions about market dynamics, and when those assumptions are wrong (as they inevitably are during the very tail events you are trying to hedge) the resulting model misspecification can leave hedging portfolios with higher-than-expected risk. This is a familiar problem in ML: the more parameters you fit on in-sample data, the worse your out-of-sample performance when the regime changes. Tail events are precisely when regimes change.&lt;/p&gt;
&lt;p&gt;There is also a benchmark problem that poisons the conversation around tail hedging. A portfolio of stocks plus put options gets compared against a portfolio of just stocks. When the market rises steadily for years, the hedged portfolio naturally underperforms and the hedge looks like a waste of money. This comparison is intellectually dishonest. A portfolio with puts has less risk than a portfolio without them. Comparing them as equivalent is like comparing a levered equity portfolio to an unlevered one and concluding that leverage &amp;ldquo;works&amp;rdquo; because it outperformed during a bull market. The appropriate comparison for a hedged portfolio is against a portfolio with similar risk, whether achieved through lower equity allocation, higher cash balances, or other risk-reducing measures. When &lt;a href="https://www.tandfonline.com/doi/full/10.1080/10293523.2025.2553254"&gt;Bhansali and Davis (2010)&lt;/a&gt; conducted this more appropriate comparison, they found that offensive tail hedging, using the freed-up risk budget to increase equity exposure, resulted in superior risk-adjusted performance. The hedge was not a drag. It was an enabler.&lt;/p&gt;
&lt;h2 id="what-i-take-away"&gt;What I take away&lt;/h2&gt;
&lt;p&gt;Most of the interesting questions in finance are not about individual positions but about what positions enable. Tail hedging is boring in isolation. What it does to the rest of the portfolio, the willingness to stay invested during drawdowns, the capacity to hold concentrated positions, the ability to rebalance into cheap assets after crashes rather than capitulating, that is where the return comes from. Spitznagel and Goldman agree on this even if they agree on little else.&lt;/p&gt;
&lt;p&gt;The optimal tail hedge allocation is a psychological question, not a mathematical one. Most practitioners suggest 1 to 5% of portfolio value, sized to offset a meaningful portion of equity losses during a severe 30 to 50% drawdown. But the right number is the one that allows you to stay invested in your core portfolio through the worst of times without abandoning the strategy. If you cannot stomach three years of negative carry on a put overlay, the correct allocation for you is zero, not five percent.&lt;/p&gt;
&lt;p&gt;The framing I find most useful is Goldman&amp;rsquo;s. Do not evaluate the hedge in isolation. Evaluate what it enables. A 3% tail hedge allocation that reduces max drawdown from 50% to 25% frees up enough risk budget to increase equity exposure by 10 to 15 percentage points. The incremental return from that higher equity allocation over a full market cycle will, in most scenarios, more than compensate for the cost of the hedge. The hedge is the enabler, not the alpha.&lt;/p&gt;
&lt;p&gt;Whether you implement this with puts, trend-following, or both depends on your time horizon and what kind of drawdown keeps you up at night. Fast crashes favor puts. Slow bleeds favor trend. If you do not know which one is coming (you do not), blend them.&lt;/p&gt;
&lt;p&gt;The caveats are real. All backtests benefit from hindsight. Transaction costs and bid-ask spreads in options markets are material and not fully captured in the CBOE indices used as benchmarks. The behavioral challenge of holding a position that bleeds money most of the time is hard to overstate, especially for allocators who report to investment committees that look at monthly returns.&lt;/p&gt;
&lt;p&gt;But the turkey metaphor from One River&amp;rsquo;s presentation is apt. A statistician turkey who, right up until Thanksgiving, can prove with perfect p-values that the farmer is benevolent. The turkey&amp;rsquo;s model is flawless within the distribution of observed data. The problem is that the data does not contain the event that matters most. Tail hedging is the strategy of the paranoid turkey. The empirical evidence suggests this paranoia can be not just protective but profitable, provided you implement it with discipline and use it not as a way to avoid risk but as a foundation for taking more of it.&lt;/p&gt;
&lt;aside class="disclaimer" role="note" aria-label="Disclaimer"&gt;
&lt;div class="disclaimer-content"&gt;&lt;p&gt;&lt;strong&gt;Disclaimer:&lt;/strong&gt; All opinions expressed are my own. This is not investment, financial, tax, or legal advice. Past performance does not indicate future results. Do your own research and consult qualified professionals before making financial decisions. No liability accepted for any losses.&lt;/p&gt;&lt;/div&gt;
&lt;/aside&gt;</description></item><item><title>The SaaSpocalypse Paradox</title><link>http://philippdubach.com/posts/the-saaspocalypse-paradox/</link><pubDate>Fri, 13 Feb 2026 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/the-saaspocalypse-paradox/</guid><description>&lt;blockquote&gt;
&lt;p&gt;The market is simultaneously pricing AI capex failure and AI destroying all software. Both cannot be true.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;a href="#lightbox-jpm-murphy-note-spread-png-0" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/jpm-murphy-note-spread.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/jpm-murphy-note-spread.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/jpm-murphy-note-spread.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/jpm-murphy-note-spread.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/jpm-murphy-note-spread.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/jpm-murphy-note-spread.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/jpm-murphy-note-spread.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/jpm-murphy-note-spread.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/jpm-murphy-note-spread.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/jpm-murphy-note-spread.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/jpm-murphy-note-spread.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/jpm-murphy-note-spread.png"
alt="JP Morgan research note on the February 2026 software sell-off by Mark R Murphy titled Software Collapse Broadens with Nowhere to Hide, questioning the leap from Claude Cowork Plugins to full enterprise software disruption"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;p&gt;Anthropic released &lt;a href="https://github.com/anthropics/knowledge-work-plugins"&gt;11 open-source plugins&lt;/a&gt; for Claude Cowork on January 30. Apache-2.0 licensed, file-based, running in a macOS-only research preview. Within a week, the IGV software ETF had fallen &lt;strong&gt;32%&lt;/strong&gt; from its September peak to a 52-week low of $79.65, roughly $2 trillion in market cap had evaporated, and hedge funds had made &lt;a href="https://www.bnnbloomberg.ca/business/2026/02/04/us-software-stocks-hit-by-anthropic-wake-up-call-on-ai-disruption/"&gt;$24 billion&lt;/a&gt; shorting the sector. The RSI hit 18, the most oversold reading &lt;a href="https://articles.stockcharts.com/article/the-claude-crash-how-ai-triggered-a-historic-selloff-in-software-stocks/"&gt;since 1990&lt;/a&gt;. JP Morgan titled their note &amp;ldquo;&lt;a href="https://privatebank.jpmorgan.com/nam/en/insights/markets-and-investing/tmt/software-shock-ais-broken-logic"&gt;Software Collapse Broadens with Nowhere to Hide&lt;/a&gt;.&amp;rdquo; Jefferies coined the term SaaSpocalypse. It was the worst software stock crash since the dot-com bust.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://fortune.com/2026/02/04/why-saas-stocks-tech-selloff-freefall-like-deepseek-2025-overblown-paradox-irrational/"&gt;Bank of America&amp;rsquo;s Vivek Arya&lt;/a&gt; identified the paradox at the center of this: investors are simultaneously punishing hyperscaler stocks because AI capex might generate weak returns, while destroying software stocks because AI adoption will be so pervasive it renders all existing software obsolete. Both cannot hold simultaneously. If AI tools aren&amp;rsquo;t generating meaningful ROI, they&amp;rsquo;re not replacing enterprise software at scale. If they are replacing enterprise software at scale, the hyperscalers are earning extraordinary returns on their infrastructure investment. &lt;a href="#lightbox-saaspocalypse-paradox-png-1" style="display: block; width: 90%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/saaspocalypse-paradox.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/saaspocalypse-paradox.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/saaspocalypse-paradox.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/saaspocalypse-paradox.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/saaspocalypse-paradox.png 1200w"
sizes="90vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/saaspocalypse-paradox.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/saaspocalypse-paradox.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/saaspocalypse-paradox.png 1440w"
sizes="90vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/saaspocalypse-paradox.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/saaspocalypse-paradox.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/saaspocalypse-paradox.png 2000w"
sizes="90vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/saaspocalypse-paradox.png"
alt="The BofA AI paradox in the 2026 SaaSpocalypse showing two mutually exclusive narratives: AI capex generating weak returns with $670B spend and 4 percent coverage ratio, versus AI destroying all software with 32 percent IGV drawdown and $2 trillion lost despite 17 percent sector earnings growth"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;This paradox can only resolve in one of three ways: AI adoption is real and hyperscaler capex is justified, AI adoption stalls and software incumbents are fine, or the truth is somewhere in between and the market has mispriced both sides. The first two are internally consistent. The market is pricing neither.&lt;/p&gt;
&lt;h2 id="the-bear-case-for-enterprise-software"&gt;The bear case for enterprise software&lt;/h2&gt;
&lt;p&gt;The structural argument against enterprise software is serious and worth stating on its own terms.&lt;/p&gt;
&lt;p&gt;Enterprise software monetizes through per-seat licensing. The SaaS business model depends on a stable correlation between headcount and license count. AI agents break that correlation. If 10 agents do the work of 100 people, the software doesn&amp;rsquo;t get replaced directly, the headcount that justifies the seats does, and CRM seat revenue drops with it. &lt;a href="https://www.tekedia.com/ai-could-destroy-500b-in-enterprise-software-revenue/"&gt;AlixPartners estimates&lt;/a&gt; up to &lt;strong&gt;$500 billion&lt;/strong&gt; in enterprise software revenue could be at risk over time. &lt;a href="https://www.idc.com/resource-center/blog/is-saas-dead-rethinking-the-future-of-software-in-the-age-of-ai/"&gt;IDC predicts&lt;/a&gt; pure seat-based pricing will be obsolete by 2028.&lt;/p&gt;
&lt;p&gt;The moat question is equally uncomfortable. Enterprise software&amp;rsquo;s traditional defense was the trained-user-interface moat: the years of institutional muscle memory that makes switching costs prohibitive. Databricks CEO Ali Ghodsi &lt;a href="https://techcrunch.com/2026/02/09/databricks-ceo-says-saas-isnt-dead-but-ai-will-soon-make-it-irrelevant/"&gt;told TechCrunch&lt;/a&gt; that this moat collapses when the interface becomes natural language. If the value of Salesforce or ServiceNow lived in their UI rather than their data, and the UI can now be replicated by a general-purpose model, then the moat was shallower than anyone thought. VC has &lt;a href="https://www.calcalistech.com/ctechnews/article/hjlvyl7lze"&gt;fled traditional SaaS entirely&lt;/a&gt;; as one investor noted, &amp;ldquo;an entrepreneur approaching a VC fund today with a SaaS startup won&amp;rsquo;t even reach the pitch stage.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;The build-versus-buy equation is inverting in real time. &lt;a href="https://www.klarna.com/international/press/klarna-ai-assistant-handles-two-thirds-of-customer-service-chats-in-its-first-month/"&gt;Klarna&lt;/a&gt; ditched Salesforce and Workday, consolidated onto its own AI-augmented stack, and used an OpenAI-powered bot to handle work that previously required 700 employees. &lt;a href="https://www.saastr.com/the-2026-saas-crash-its-not-what-you-think/"&gt;SaaStr&amp;rsquo;s analysis&lt;/a&gt; of Gartner&amp;rsquo;s &lt;a href="https://www.gartner.com/en/newsroom/press-releases/2026-02-03-gartner-forecasts-worldwide-it-spending-to-grow-10-point-8-percent-in-2026-totaling-6-point-15-trillion-dollars"&gt;$1.43 trillion&lt;/a&gt; 2026 software spending forecast reveals that roughly 9 percentage points of the 14.7% headline growth is price increases on existing software, not net new demand. AI is eating SaaS budgets, redirecting IT spend toward infrastructure while reducing the headcount that generates software seats.&lt;/p&gt;
&lt;p&gt;This is the case priced into the IGV at $80.&lt;/p&gt;
&lt;h2 id="the-bull-case-for-software-stocks"&gt;The bull case for software stocks&lt;/h2&gt;
&lt;p&gt;The structural argument for enterprise software rests on a distinction the current sell-off is ignoring entirely.&lt;/p&gt;
&lt;p&gt;The bear case assumes a shrinking TAM. &lt;a href="https://www.goldmansachs.com/insights/articles/ai-agents-to-boost-productivity-and-size-of-software-market"&gt;Goldman Sachs Research&lt;/a&gt; argues the opposite: the application software market grows to $780 billion by 2030 at a 13% CAGR, with agents accounting for over 60% of the total. The profit pool shifts from SaaS seats to agentic workloads, but the overall market gets larger, not smaller. &lt;a href="https://a16z.com/ai-will-supercharge-modelbusters/"&gt;a16z&amp;rsquo;s Alex Rampell&lt;/a&gt; takes it further: if AI enables software to not just enhance productivity but actually complete work, the addressable market isn&amp;rsquo;t roughly $350 billion in enterprise software spend (about 1% of GDP). It&amp;rsquo;s the &lt;strong&gt;~$6 trillion&lt;/strong&gt; white-collar services market (~20% of GDP), a 20x expansion into work that was never software-addressable before.&lt;/p&gt;
&lt;p&gt;David Friedberg made the sharpest version of this argument on the All-In Podcast: software transitions from helping people do work, to completing work, to doing work humans cannot do. At that point, the SaaS pricing model transitions from per-seat to value-based, and &amp;ldquo;SaaS basically takes over the services economy.&amp;rdquo; His estimate: the combined market cap of software companies could be 4x to 10x higher in five years, but &amp;ldquo;not evenly distributed.&amp;rdquo; &lt;a href="#lightbox-tam-expansion-bull-case-png-3" style="display: block; width: 90%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/tam-expansion-bull-case.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/tam-expansion-bull-case.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/tam-expansion-bull-case.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/tam-expansion-bull-case.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/tam-expansion-bull-case.png 1200w"
sizes="90vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/tam-expansion-bull-case.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/tam-expansion-bull-case.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/tam-expansion-bull-case.png 1440w"
sizes="90vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/tam-expansion-bull-case.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/tam-expansion-bull-case.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/tam-expansion-bull-case.png 2000w"
sizes="90vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/tam-expansion-bull-case.png"
alt="TAM expansion analysis from $350B enterprise software at 1 percent of GDP to Goldman Sachs $780B projection by 2030 with over 60 percent AI agent share, to the a16z thesis of $6 trillion in white-collar services at 20 percent of GDP, a 20x expansion"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;The software vs semiconductor valuation picture strengthens this framing. The sector is delivering 17% aggregate earnings growth in 2026 while trading at November 2022 EV/Sales multiples, back when the Fed was aggressively hiking into recession fears. The Russell 1000 Software subsector now trades at 32.4x forward earnings versus 43.6x for semiconductors. Recurring-revenue businesses with 90%+ gross margins and 95%+ renewal rates trade at a lower multiple than cyclical chipmakers with 40-60% margins and concentrated customer bases. &lt;a href="https://www.cnbc.com/2026/02/10/jpmorgan-says-the-historic-software-selloff-has-gone-far-enough-10-stocks-to-buy-on-sale.html"&gt;Historically that&amp;rsquo;s an inversion&lt;/a&gt; that has not persisted. &lt;a href="#lightbox-earnings-vs-stock-disconnect-png-4" style="display: block; width: 90%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/earnings-vs-stock-disconnect.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/earnings-vs-stock-disconnect.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/earnings-vs-stock-disconnect.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/earnings-vs-stock-disconnect.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/earnings-vs-stock-disconnect.png 1200w"
sizes="90vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/earnings-vs-stock-disconnect.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/earnings-vs-stock-disconnect.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/earnings-vs-stock-disconnect.png 1440w"
sizes="90vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/earnings-vs-stock-disconnect.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/earnings-vs-stock-disconnect.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/earnings-vs-stock-disconnect.png 2000w"
sizes="90vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/earnings-vs-stock-disconnect.png"
alt="Q4 2025 earnings vs stock performance disconnect in the 2026 software sell-off: Palantir plus 70.5 percent revenue growth but minus 11.6 percent stock, ServiceNow plus 21 percent but minus 28 percent, Oracle plus 10 percent but minus 53 percent from peak, sector aggregate plus 17 percent earnings growth versus minus 32 percent IGV drawdown"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/p&gt;
&lt;a href="#lightbox-valuation-inversion-png-5" style="display: block; width: 90%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/valuation-inversion.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/valuation-inversion.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/valuation-inversion.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/valuation-inversion.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/valuation-inversion.png 1200w"
sizes="90vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/valuation-inversion.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/valuation-inversion.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/valuation-inversion.png 1440w"
sizes="90vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/valuation-inversion.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/valuation-inversion.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/valuation-inversion.png 2000w"
sizes="90vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/valuation-inversion.png"
alt="Software vs semiconductor valuation inversion in 2026: Russell 1000 Software at 32.4x forward PE trades below Russell 1000 Semiconductors at 43.6x, an 11.2x multiple gap, with IGV at $79.65 and S&amp;amp;P 500 software weight compressed from 12 percent to 8.4 percent"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;p&gt;This is the case that BofA called a paradox and JP Morgan called a mispricing.&lt;/p&gt;
&lt;h2 id="the-hyperscaler-ai-capex-question-that-connects-both-sides"&gt;The hyperscaler AI capex question that connects both sides&lt;/h2&gt;
&lt;p&gt;There is a number that both cases have to account for, and it&amp;rsquo;s the one that determines which side of the paradox resolves first.&lt;/p&gt;
&lt;p&gt;Combined 2026 capex guidance from Microsoft, Alphabet, Amazon, Meta, and Oracle now approaches &lt;a href="https://www.cnbc.com/2026/02/06/google-microsoft-meta-amazon-ai-cash.html"&gt;&lt;strong&gt;$700 billion&lt;/strong&gt;&lt;/a&gt;, more than doubling from $256 billion in 2024. &lt;a href="https://fortune.com/2026/02/04/why-saas-stocks-tech-selloff-freefall-like-deepseek-2025-overblown-paradox-irrational/"&gt;Bank of America calculates&lt;/a&gt; this consumes 94% of operating cash flows after capital returns. The Big Five raised $108 billion in bonds in 2025. AI-related services generate roughly $25 billion in direct revenue against $400+ billion in annual infrastructure spending, a coverage ratio of about 4%. &lt;a href="#lightbox-hyperscaler-capex-vs-cashflow-png-6" style="display: block; width: 90%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/hyperscaler-capex-vs-cashflow.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/hyperscaler-capex-vs-cashflow.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/hyperscaler-capex-vs-cashflow.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/hyperscaler-capex-vs-cashflow.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/hyperscaler-capex-vs-cashflow.png 1200w"
sizes="90vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/hyperscaler-capex-vs-cashflow.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/hyperscaler-capex-vs-cashflow.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/hyperscaler-capex-vs-cashflow.png 1440w"
sizes="90vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/hyperscaler-capex-vs-cashflow.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/hyperscaler-capex-vs-cashflow.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/hyperscaler-capex-vs-cashflow.png 2000w"
sizes="90vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/hyperscaler-capex-vs-cashflow.png"
alt="FY2026 hyperscaler AI capex vs cash flow: MSFT META GOOGL AMZN and ORCL estimated cash from operations less dividends and buybacks versus guided capital expenditure, with only Microsoft generating a $5B surplus while Meta shows minus $23B, Google minus $20B, Amazon minus $18B, and Oracle minus $30B"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;If the bear case is right and AI agents are replacing enterprise software at scale, this capex should already be generating enormous returns. It isn&amp;rsquo;t. If the bull case is right and AI is expanding the TAM into the services economy, this capex is early-stage infrastructure investment that will compound over a decade. In that reading, $700 billion in annual spend is the foundation of a $6 trillion market, not a write-off. Both interpretations require the same capex figure to mean something fundamentally different. The market hasn&amp;rsquo;t decided which.&lt;/p&gt;
&lt;p&gt;Microsoft is the sharpest illustration of this tension. Quarterly capex went from $1 billion in early 2015 to a record &lt;a href="https://fintool.com/news/microsoft-q2-record-capex-cloud-ai"&gt;$37.5 billion in Q2 FY2026&lt;/a&gt;, with roughly two-thirds going to short-lived GPU/CPU assets. And yet Microsoft is the &lt;a href="https://www.gurufocus.com/news/8591224/microsoft-msft-maintains-resilient-cash-flow-amid-hyperscaler-spending-surge"&gt;only hyperscaler&lt;/a&gt; that can fund this buildout from operating cash flow. Azure grew &lt;a href="https://futurumgroup.com/insights/microsoft-q2-fy-2026-cloud-surpasses-50b-azure-up-38-cc/"&gt;39% in Q2 FY2026&lt;/a&gt;, crossing $50 billion in quarterly cloud revenue for the first time. The company is simultaneously the biggest AI capex spender, the one best positioned to generate returns on that spend, and the company whose products (365, Dynamics, Azure) are supposedly being disrupted by Claude plugins. The market is punishing all three at once. &lt;a href="#lightbox-msft-quarterly-capex-png-7" style="display: block; width: 90%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/msft-quarterly-capex.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/msft-quarterly-capex.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/msft-quarterly-capex.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/msft-quarterly-capex.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/msft-quarterly-capex.png 1200w"
sizes="90vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/msft-quarterly-capex.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/msft-quarterly-capex.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/msft-quarterly-capex.png 1440w"
sizes="90vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/msft-quarterly-capex.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/msft-quarterly-capex.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/msft-quarterly-capex.png 2000w"
sizes="90vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/msft-quarterly-capex.png"
alt="Microsoft quarterly AI capex from FY2015 to FY2026 showing growth from $1 billion to $37.5 billion per quarter, a 2048 percent increase, with recent quarters showing AI infrastructure acceleration"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/p&gt;
&lt;h2 id="bifurcation-not-extinction-the-saaspocalypse-resolved"&gt;Bifurcation, not extinction: the SaaSpocalypse resolved&lt;/h2&gt;
&lt;p&gt;A &lt;a href="https://am.jpmorgan.com/content/dam/jpm-am-aem/global/en/insights/eye-on-the-market/smothering-heights-amv.pdf"&gt;60% recession probability&lt;/a&gt;, a &lt;a href="https://www.cnbc.com/2026/02/02/fridays-jobs-report-will-be-delayed-because-of-the-partial-government-shutdown.html"&gt;partial government shutdown&lt;/a&gt;, &lt;a href="https://www.salesforceben.com/what-do-trumps-tariffs-mean-for-the-tech-sector/"&gt;elevated tariffs&lt;/a&gt;, and a structural pricing transition are being sold as a single story. They aren&amp;rsquo;t. Separating the macro from the structural requires asking which software categories are genuinely at risk and which are being sold by association.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://www.janushenderson.com/en-us/investor/article/how-ai-disruption-is-reshaping-the-software-sector-landscape/"&gt;Janus Henderson makes a useful distinction&lt;/a&gt; between &amp;ldquo;systems of record&amp;rdquo; and &amp;ldquo;systems of engagement.&amp;rdquo; Systems of record are deeply embedded in business processes, require regulatory compliance, and carry enormous switching costs: ERP, core finance, cybersecurity, observability. &lt;a href="https://pitchbook.com/news/articles/is-ais-threat-to-software-overblown-pitchbook-analysis"&gt;PitchBook described&lt;/a&gt; replacing one as &amp;ldquo;effectively open-heart surgery for an enterprise.&amp;rdquo; Systems of engagement are user-facing workflow tools where the interface is the product: content creation, tier-1 support, basic analytics. When the interface becomes natural language, that moat collapses. &lt;a href="#lightbox-software-bifurcation-map-png-8" style="display: block; width: 90%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/software-bifurcation-map.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/software-bifurcation-map.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/software-bifurcation-map.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/software-bifurcation-map.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/software-bifurcation-map.png 1200w"
sizes="90vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/software-bifurcation-map.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/software-bifurcation-map.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/software-bifurcation-map.png 1440w"
sizes="90vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/software-bifurcation-map.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/software-bifurcation-map.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/software-bifurcation-map.png 2000w"
sizes="90vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/software-bifurcation-map.png"
alt="Software bifurcation map by AI disruption risk: ERP cybersecurity and observability at low risk, core CRM and dev tools at medium risk, content creation tier-1 support and basic analytics at high risk, showing the market is pricing every category as if it faces equal threat"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;The bear case is correct about the second category. The bull case is correct about the first. The market is wrong to price them identically. Selling both at the same multiple compression implies that switching costs, regulatory requirements, data gravity, and enterprise procurement cycles have all vanished simultaneously. &lt;a href="https://www.gartner.com/en/newsroom/press-releases/2025-06-25-gartner-predicts-over-40-percent-of-agentic-ai-projects-will-be-canceled-by-end-of-2027"&gt;Gartner predicts&lt;/a&gt; over 40% of agentic AI projects will be cancelled by 2027. Salesforce&amp;rsquo;s Agentforce reached &lt;a href="https://www.salesforceben.com/salesforce-avoids-q3-danger-zone-with-explosive-agentforce-momentum/"&gt;18,500 customers&lt;/a&gt; in its first year, the fastest-adopted organic product in company history. These are not the behaviors of a category that has been disrupted. They are the behaviors of incumbents absorbing a new paradigm.&lt;/p&gt;
&lt;p&gt;Stated precisely: the bear case is a zero-sum repricing where AI agents compress existing software revenue by eliminating seats and commoditizing interfaces. The bull case is a positive-sum expansion where the surviving software companies capture the $6 trillion in white-collar services that was never software-addressable before. The cost of intelligence has fallen &lt;a href="https://a16z.com/ai-will-supercharge-modelbusters/"&gt;99.7% in two years&lt;/a&gt; (Stanford AI Index). Cumulative AI infrastructure investment is expected to exceed $3 trillion by 2030. That kind of capital deployment doesn&amp;rsquo;t produce a world where software shrinks. It produces a world where the definition of &amp;ldquo;software&amp;rdquo; expands to include most of the services economy.&lt;/p&gt;
&lt;p&gt;I wrote &lt;a href="https://philippdubach.com/posts/the-market-can-stay-irrational-longer-than-you-can-stay-solvent/"&gt;recently&lt;/a&gt; about how passive flows create mechanical, price-insensitive selling that overwhelms fundamental buyers. This software sell-off is a textbook case. JP Morgan&amp;rsquo;s Murphy &lt;a href="https://privatebank.jpmorgan.com/nam/en/insights/markets-and-investing/tmt/software-shock-ais-broken-logic"&gt;described&lt;/a&gt; index arbitrage basket selling, programmatic de-grossing, and passive flow liquidity vacuums. The IGV recorded its &lt;a href="https://articles.stockcharts.com/article/the-claude-crash-how-ai-triggered-a-historic-selloff-in-software-stocks/"&gt;highest single-day trading volume&lt;/a&gt; in 25 years. &lt;a href="https://www.cnbc.com/2026/02/10/jpmorgan-says-the-historic-software-selloff-has-gone-far-enough-10-stocks-to-buy-on-sale.html"&gt;JP Morgan&amp;rsquo;s follow-up&lt;/a&gt; argued the sell-off has gone far enough. &lt;a href="https://fortune.com/2026/02/04/why-saas-stocks-tech-selloff-freefall-like-deepseek-2025-overblown-paradox-irrational/"&gt;BofA called it&lt;/a&gt; a paradox that &amp;ldquo;doesn&amp;rsquo;t make any sense.&amp;rdquo; History suggests these kinds of extremes, the 2016 LinkedIn panic, the 2022 rate-shock drawdown, the January 2025 DeepSeek crash, tend to mark inflection points rather than starting points for further decline.&lt;/p&gt;
&lt;p&gt;The hardest trade right now is the one that requires distinguishing between stocks that are cheap because they&amp;rsquo;re broken and stocks that are cheap because the market is broken. The SaaSpocalypse priced into the IGV at $80, with a 30-year-extreme RSI, pricing in an extinction event that operating results don&amp;rsquo;t remotely support, looks a lot more like the latter.&lt;/p&gt;
&lt;aside class="disclaimer" role="note" aria-label="Disclaimer"&gt;
&lt;div class="disclaimer-content"&gt;&lt;p&gt;&lt;strong&gt;Disclaimer:&lt;/strong&gt; All opinions expressed are my own. This is not investment, financial, tax, or legal advice. Past performance does not indicate future results. Do your own research and consult qualified professionals before making financial decisions. No liability accepted for any losses.&lt;/p&gt;&lt;/div&gt;
&lt;/aside&gt;</description></item><item><title>Don't Go Monolithic; The Agent Stack Is Stratifying</title><link>http://philippdubach.com/posts/dont-go-monolithic-the-agent-stack-is-stratifying/</link><pubDate>Tue, 10 Feb 2026 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/dont-go-monolithic-the-agent-stack-is-stratifying/</guid><description>&lt;blockquote&gt;
&lt;p&gt;The defensible asset in enterprise AI is not the model. It&amp;rsquo;s the organizational world model.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Every major compute era decomposes into specialized layers with different winners at each level. Cloud split into IaaS, PaaS, and SaaS. The modern data stack split into ingestion, warehousing, transformation, and BI. Each time, specialists beat the generalists because the layers have fundamentally different economics: different rates of change, different capital requirements, different sources of lock-in.&lt;/p&gt;
&lt;p&gt;The enterprise AI agent stack is doing the same thing right now. Arvind Jain, the CEO of Glean, recently published a &lt;a href="https://x.com/arvind2/status/2020920652950339694"&gt;structural analysis&lt;/a&gt; of the emerging enterprise agent architecture that crystallized something I&amp;rsquo;d been thinking about. His framing describes a stack decomposing into six layers (security, context, models, orchestration, agents, and interfaces) with different defensibility profiles at each level. Glean sits in the context layer so the usual positioning caveats apply, but the structural argument is sound regardless of who makes it.&lt;/p&gt;
&lt;p&gt;I want to take it further. There are three claims embedded in this agentic AI architecture that I think are underappreciated, and together they form a thesis about where durable advantage actually accrues in enterprise AI. &lt;a href="#lightbox-emerging-agent-stack-png-0" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/emerging-agent-stack.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/emerging-agent-stack.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/emerging-agent-stack.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/emerging-agent-stack.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/emerging-agent-stack.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/emerging-agent-stack.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/emerging-agent-stack.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/emerging-agent-stack.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/emerging-agent-stack.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/emerging-agent-stack.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/emerging-agent-stack.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/emerging-agent-stack.png"
alt="Enterprise AI agent stack diagram showing six layers ranked by defensibility: Context scores highest (hardest to rebuild), followed by Orchestration and Security, while Models and Interfaces have the lowest switching costs"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/p&gt;
&lt;h2 id="i-models-are-converging-toward-shared-infrastructure"&gt;I. Models are converging toward shared infrastructure&lt;/h2&gt;
&lt;p&gt;The model layer is the one most people obsess over, and it&amp;rsquo;s also the one converging fastest toward commodity economics. Training costs &lt;a href="https://epoch.ai/blog/how-much-does-it-cost-to-train-frontier-ai-models"&gt;scale roughly 2.4x per year&lt;/a&gt;, with current frontier runs costing hundreds of millions and &lt;a href="https://www.tomshardware.com/tech-industry/artificial-intelligence/ai-models-that-cost-dollar1-billion-to-train-are-in-development-dollar100-billion-models-coming-soon-largest-current-models-take-only-dollar100-million-to-train-anthropic-ceo"&gt;billion-dollar training runs already underway&lt;/a&gt;, according to Anthropic&amp;rsquo;s Dario Amodei. Only a handful of organizations on Earth can operate at this scale: OpenAI, Google DeepMind, Anthropic, Meta, and a few others including xAI and Mistral. This is textbook capital-intensive infrastructure, structurally identical to semiconductor fabs or cloud hyperscalers. The logical conclusion: foundation models become shared utilities, not enterprise moats.&lt;/p&gt;
&lt;p&gt;The industry has already internalized this. &lt;a href="https://a16z.com/ai-enterprise-2025/"&gt;37% of enterprises now use five or more models in production&lt;/a&gt;, up from 29% the prior year. Different tasks demand different models: Claude for code and tool use, GPT for extended reasoning, Gemini Flash for low-latency routing, specialized models for image generation and embeddings. Betting your enterprise stack on a single model provider is the new version of single-cloud risk. Open standards like Anthropic&amp;rsquo;s &lt;a href="https://www.anthropic.com/news/model-context-protocol"&gt;Model Context Protocol&lt;/a&gt;, now &lt;a href="https://www.anthropic.com/news/donating-the-model-context-protocol-and-establishing-of-the-agentic-ai-foundation"&gt;hosted by the Linux Foundation&lt;/a&gt; with 97 million monthly SDK downloads, and Google&amp;rsquo;s &lt;a href="https://developers.googleblog.com/en/a2a-a-new-era-of-agent-interoperability/"&gt;Agent-to-Agent protocol&lt;/a&gt; are making this multi-model enterprise AI architecture practical.&lt;/p&gt;
&lt;p&gt;If models are infrastructure, the differentiation question moves up the stack. And that&amp;rsquo;s where it gets interesting.&lt;/p&gt;
&lt;h2 id="ii-the-enterprise-ai-context-layer-has-two-depths-and-most-people-only-see-the-first"&gt;II. The enterprise AI context layer has two depths, and most people only see the first&lt;/h2&gt;
&lt;p&gt;This is the part of the thesis I find most intellectually compelling, and where I think the conventional understanding falls short.&lt;/p&gt;
&lt;p&gt;Most enterprise AI efforts operate at what I&amp;rsquo;d call Layer 1 context: connecting data sources, indexing content, enforcing permissions, retrieving relevant documents. This is the RAG-era problem set: familiar, well-understood, and increasingly commoditized. Virtually every enterprise AI platform offers connectors, vector stores, and retrieval pipelines. It matters, but it&amp;rsquo;s not where defensibility lives.&lt;/p&gt;
&lt;p&gt;Layer 2 is where the thesis gets genuinely novel: process-level understanding. Most enterprise knowledge systems capture decisions. What ends up in the CRM, the ticketing system, the ERP. But they don&amp;rsquo;t capture &lt;em&gt;how&lt;/em&gt; those decisions were made: the meetings, Slack threads, document iterations, handoffs, and informal coordination that produced the recorded outcome. &lt;a href="#lightbox-context-depth-comparison-png-1" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/context-depth-comparison.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/context-depth-comparison.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/context-depth-comparison.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/context-depth-comparison.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/context-depth-comparison.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/context-depth-comparison.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/context-depth-comparison.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/context-depth-comparison.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/context-depth-comparison.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/context-depth-comparison.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/context-depth-comparison.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/context-depth-comparison.png"
alt="Enterprise AI context layer depth comparison showing what Systems of Record capture (decisions, states, entities, relationships) versus what Context Graphs capture (processes, temporal traces, causal structure, variability), with ML lens annotations mapping to labels versus feature space and trajectory data"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
Through a machine learning lens, the distinction is sharp: systems of record give you labels. Context graphs give you the feature space and trajectory data you&amp;rsquo;d actually need to learn the decision boundary. Consider a concrete example. Your CRM records that Deal X closed at $500K. That&amp;rsquo;s a label. The context graph captures the 14 meetings, 3 stakeholder handoffs, the pricing negotiation pattern, and the competitive displacement sequence that produced that outcome. Those are the features and the trajectory. An agent trained on labels alone can&amp;rsquo;t replicate the process that generated them.&lt;/p&gt;
&lt;p&gt;This is why so many early enterprise AI deployments produce outputs that are technically plausible but operationally useless. The agent has access to the what but not the how. It can retrieve the right documents but can&amp;rsquo;t reconstruct the reasoning process that a human would follow. Closing that gap, building systems that capture and encode process knowledge rather than just decision records, is the highest-value problem in enterprise AI right now.&lt;/p&gt;
&lt;h2 id="iii-context-and-orchestration-form-a-compounding-flywheel"&gt;III. Context and orchestration form a compounding flywheel&lt;/h2&gt;
&lt;p&gt;There&amp;rsquo;s a reinforcement learning analogy here that I think is underappreciated. The orchestrator is the policy. The context graph is the learned world model. Agent traces are the trajectories. Every successful execution reinforces good patterns. Every failure surfaces where context is missing or stale. Over time, the system builds an increasingly accurate representation of how the organization actually operates.&lt;/p&gt;
&lt;p&gt;And this loops back: more deployment produces richer traces, which improve the context graph, which improves agent decisions, which builds trust, which drives more deployment. &lt;a href="#lightbox-compounding-flywheel-png-3" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/compounding-flywheel.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/compounding-flywheel.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/compounding-flywheel.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/compounding-flywheel.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/compounding-flywheel.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/compounding-flywheel.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/compounding-flywheel.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/compounding-flywheel.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/compounding-flywheel.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/compounding-flywheel.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/compounding-flywheel.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/compounding-flywheel.png"
alt="Organizational world model compounding flywheel showing the five-step loop: Agent Executes → Traces Captured → Context Improves → Better Decisions → More Deployment, with ML analogy mapping table showing enterprise concepts mapped to RL primitives (policy rollout, trajectories, world model update, policy improvement, online learning loop)"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
This is the same compounding mechanism that makes recommendation engines and autonomous driving systems improve with scale. Netflix gets better at recommendations because every viewing session generates training signal. Waymo gets better at driving because every mile generates edge cases. The difference here is that the asset being built isn&amp;rsquo;t a product feature. It&amp;rsquo;s an organizational world model, a learned representation of how your specific company works.&lt;/p&gt;
&lt;p&gt;And unlike model weights, which any well-funded lab can approximate, your organization&amp;rsquo;s accumulated process knowledge is genuinely unique. No one else has your meeting patterns, your escalation sequences, your informal decision-making topology. That&amp;rsquo;s a moat.&lt;/p&gt;
&lt;h2 id="where-this-breaks-and-why-the-agentic-ai-failure-rate-will-be-high"&gt;Where this breaks, and why the agentic AI failure rate will be high&lt;/h2&gt;
&lt;p&gt;&lt;a href="https://www.uctoday.com/unified-communications/gartner-predicts-40-of-enterprise-apps-will-feature-ai-agents-by-2026/"&gt;Gartner predicts 40% of enterprise applications will feature task-specific AI agents by 2026&lt;/a&gt;, up from less than 5% in 2025. &lt;a href="https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai"&gt;McKinsey&amp;rsquo;s latest survey shows 23% of organizations are already scaling agentic AI&lt;/a&gt;, with another 39% experimenting. But Gartner also warns that over 40% of agentic AI projects will be canceled by end of 2027 due to escalating costs and unclear business value.&lt;/p&gt;
&lt;p&gt;The gap between ambition and execution is the context problem in disguise. Without process knowledge, agents produce plausible outputs that don&amp;rsquo;t match how the organization actually works. They retrieve the right policy document but apply it without understanding the exceptions your team has developed over years. They draft the right kind of email but miss the relationship dynamics that would change the tone. The failure mode isn&amp;rsquo;t that the model is bad. It&amp;rsquo;s that the context is shallow. &lt;a href="#lightbox-lockin-vs-rebuild-scatter-png-4" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/lockin-vs-rebuild-scatter.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/lockin-vs-rebuild-scatter.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/lockin-vs-rebuild-scatter.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/lockin-vs-rebuild-scatter.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/lockin-vs-rebuild-scatter.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/lockin-vs-rebuild-scatter.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/lockin-vs-rebuild-scatter.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/lockin-vs-rebuild-scatter.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/lockin-vs-rebuild-scatter.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/lockin-vs-rebuild-scatter.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/lockin-vs-rebuild-scatter.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/lockin-vs-rebuild-scatter.png"
alt="Enterprise AI agent stack scatter plot showing six layers plotted by lock-in risk versus rebuild difficulty. Context sits alone in the top-right danger zone with highest lock-in and hardest rebuild. Models, Interfaces, and Agents cluster in the commodity zone at bottom-left. Orchestration and Security occupy the middle."
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
This chart tells the strategic story in one image. Models, interfaces, and agents cluster in the commodity zone: low lock-in, easy to replace. Context sits alone in the danger zone: highest lock-in risk and hardest to rebuild. That&amp;rsquo;s exactly where your due diligence should concentrate.&lt;/p&gt;
&lt;h2 id="what-to-actually-do-about-your-agentic-ai-architecture"&gt;What to actually do about your agentic AI architecture&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Don&amp;rsquo;t go monolithic.&lt;/strong&gt; Each layer evolves at a different rate. Models improve quarterly, context infrastructure evolves over months, security requirements shift with regulation. Coupling them into one vendor&amp;rsquo;s all-in-one platform forces you to upgrade at the speed of the slowest-moving layer. You inherit their architectural bets, their integration timeline, their roadmap priorities. The history of enterprise software is littered with platforms that tried to own every layer and ended up mediocre at all of them.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Insist on interoperability.&lt;/strong&gt; MCP, A2A, open connectors. If your vendor doesn&amp;rsquo;t support open standards, you&amp;rsquo;re absorbing limitations you can&amp;rsquo;t see yet. The pace of AI innovation is faster than any prior technology cycle, and you need the ability to swap in new capabilities the moment they appear without rebuilding your stack. The organizations that locked into single-vendor cloud stacks in 2015 spent years migrating out. Don&amp;rsquo;t repeat that mistake at the agent layer.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Treat context as portable IP.&lt;/strong&gt; Your organizational world model (process knowledge, interaction history, learned workflow patterns) is the hardest-to-rebuild and most valuable asset in the stack. Ensure it is not locked to any single vendor or model provider. The right architecture separates accumulated context from the model layer so you retain your organizational IP regardless of which models or platforms you use tomorrow.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Start the flywheel early.&lt;/strong&gt; The compounding advantage in context accrues with deployment, not with time spent evaluating. Every agent execution generates organizational learning. Companies that wait to &amp;ldquo;see how it plays out&amp;rdquo; forfeit years of compounding to first movers. This isn&amp;rsquo;t speculative. It&amp;rsquo;s the same math that governs every data flywheel business. The question isn&amp;rsquo;t whether to start. It&amp;rsquo;s whether you can afford the cost of starting late.&lt;/p&gt;
&lt;p&gt;The stack will stratify. Specialists will outperform monoliths. Models will converge toward shared infrastructure. The defensible asset in enterprise AI is not the model. It&amp;rsquo;s the organizational world model. The organizations that start building it now, maintaining it carefully, and keeping it portable will compound their lead in the agent era. Everyone else will be buying commodity inference and wondering why their agents don&amp;rsquo;t work.&lt;/p&gt;
&lt;aside class="disclaimer" role="note" aria-label="Disclaimer"&gt;
&lt;div class="disclaimer-content"&gt;&lt;p&gt;&lt;strong&gt;Disclaimer:&lt;/strong&gt; AI capabilities evolve rapidly; information may become outdated. Code and implementations provided as-is without warranty.&lt;/p&gt;&lt;/div&gt;
&lt;/aside&gt;</description></item><item><title>Where Mobile Money Goes Now</title><link>http://philippdubach.com/posts/where-mobile-money-goes-now/</link><pubDate>Sat, 07 Feb 2026 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/where-mobile-money-goes-now/</guid><description>&lt;p&gt;Sensor Tower&amp;rsquo;s &lt;a href="https://sensortower.com/state-of-mobile-2026"&gt;State of Mobile 2026&lt;/a&gt; report confirms what had been building for years: the mobile app economy has permanently shifted. For the first decade of mobile, games made more money than everything else combined. Clash of Clans and Candy Crush built empires on freemium. King went public. Supercell sold for $10 billion. That changed in 2025.&lt;/p&gt;
&lt;h2 id="apps-overtake-games-in-mobile-revenue"&gt;Apps Overtake Games in Mobile Revenue&lt;/h2&gt;
&lt;p&gt;&lt;a href="#lightbox-apps_vs_games_revenue-png-0" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/apps_vs_games_revenue.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/apps_vs_games_revenue.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/apps_vs_games_revenue.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/apps_vs_games_revenue.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/apps_vs_games_revenue.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/apps_vs_games_revenue.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/apps_vs_games_revenue.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/apps_vs_games_revenue.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/apps_vs_games_revenue.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/apps_vs_games_revenue.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/apps_vs_games_revenue.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/apps_vs_games_revenue.png"
alt="Line chart showing apps overtaking games in mobile IAP revenue in 2025, with apps at $85.6B and games at $81.8B, per Sensor Tower&amp;#39;s State of Mobile 2026"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
Non-game applications now generate more in-app purchase revenue than games. Apps crossed $85.6 billion in 2025, up 21% year-over-year. Games managed $81.8 billion, barely moving from the year before.&lt;/p&gt;
&lt;p&gt;Games peaked in 2021 and flatlined. Apps kept compounding. Subscriptions, which seemed like a novelty in 2018, became the dominant mobile monetization model for cloud storage, language learning, and now AI.&lt;/p&gt;
&lt;h2 id="genai-the-35-billion-growth-engine"&gt;GenAI: The $3.5 Billion Growth Engine&lt;/h2&gt;
&lt;p&gt;&lt;a href="#lightbox-genai_revenue_growth-png-1" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/genai_revenue_growth.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/genai_revenue_growth.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/genai_revenue_growth.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/genai_revenue_growth.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/genai_revenue_growth.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/genai_revenue_growth.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/genai_revenue_growth.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/genai_revenue_growth.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/genai_revenue_growth.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/genai_revenue_growth.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/genai_revenue_growth.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/genai_revenue_growth.png"
alt="Horizontal bar chart showing GenAI led mobile app revenue growth in 2025 with $3.5B added, more than any other category"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
Generative AI was the biggest contributor to consumer spending on mobile apps. The category added $3.5 billion in IAP revenue in 2025, more than Movies &amp;amp; TV ($2.2B) or Social Media ($2.1B). It went from near-zero in 2022 to the top growth category in three years. &lt;a href="#lightbox-genai_rise-png-2" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/genai_rise.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/genai_rise.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/genai_rise.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/genai_rise.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/genai_rise.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/genai_rise.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/genai_rise.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/genai_rise.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/genai_rise.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/genai_rise.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/genai_rise.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/genai_rise.png"
alt="Combined bar and line chart showing GenAI app downloads rising from 0.05B in 2021 to 1.45B in 2024, with revenue hitting $1.25B"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
GenAI apps went from 50 million downloads in 2021 to 1.45 billion in 2024. Revenue jumped from essentially nothing to $1.25 billion. ChatGPT alone accounts for 40% of the category&amp;rsquo;s consumer spend. This is just in-app purchases and does not count subscriptions billed outside the app store or enterprise contracts.&lt;/p&gt;
&lt;h2 id="who-actually-uses-ai-apps"&gt;Who Actually Uses AI Apps&lt;/h2&gt;
&lt;p&gt;The demographics are interesting: AI app users look nothing like the broader internet population. &lt;a href="#lightbox-genai_demographics-png-3" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/genai_demographics.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/genai_demographics.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/genai_demographics.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/genai_demographics.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/genai_demographics.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/genai_demographics.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/genai_demographics.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/genai_demographics.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/genai_demographics.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/genai_demographics.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/genai_demographics.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/genai_demographics.png"
alt="Scatter plot showing GenAI user demographics cluster with Reddit and X (young, male-skewing), not Instagram or Pinterest"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
GenAI users cluster with Reddit and X. Young, male, tech-adjacent. They look nothing like Instagram (young women) or Pinterest (older women) or even Facebook (everyone&amp;rsquo;s parents). The AI audience is still a niche, even as GenAI app revenue scales.&lt;/p&gt;
&lt;h2 id="the-ai-advertising-playbook"&gt;The AI Advertising Playbook&lt;/h2&gt;
&lt;p&gt;This explains where AI companies advertise: &lt;a href="#lightbox-ai_advertising_skew-png-4" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/ai_advertising_skew.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/ai_advertising_skew.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/ai_advertising_skew.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/ai_advertising_skew.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/ai_advertising_skew.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/ai_advertising_skew.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/ai_advertising_skew.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/ai_advertising_skew.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/ai_advertising_skew.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/ai_advertising_skew.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/ai_advertising_skew.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/ai_advertising_skew.png"
alt="Horizontal bar chart showing AI companies over-index on LinkedIn (&amp;#43;45%) and under-index on Pinterest (-13%) and YouTube (-9%) for ad demographics"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
LinkedIn gets 45% more GenAI ad impressions than its share of the general population would suggest. Pinterest and YouTube get less. The AI advertising playbook is simple: find professionals, not consumers.&lt;/p&gt;
&lt;h2 id="ai-driven-retail-referral-traffic"&gt;AI-Driven Retail Referral Traffic&lt;/h2&gt;
&lt;p&gt;One place where AI has found consumers: shopping. &lt;a href="#lightbox-ai_retail_referrals-png-6" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/ai_retail_referrals.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/ai_retail_referrals.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/ai_retail_referrals.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/ai_retail_referrals.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/ai_retail_referrals.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/ai_retail_referrals.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/ai_retail_referrals.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/ai_retail_referrals.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/ai_retail_referrals.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/ai_retail_referrals.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/ai_retail_referrals.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/ai_retail_referrals.png"
alt="Stacked area chart showing GenAI referral traffic to major retailers growing from ~$5M to ~$51M between Oct 2024 and Dec 2025"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
Referral traffic from AI tools to major retailers grew roughly 7x between October 2024 and December 2025. People are asking ChatGPT what to buy, and then buying it. Amazon captures the largest share, but Walmart, Target, and Home Depot have all seen triple-digit percentage growth in AI-driven traffic. Still less than 1% of total retail traffic. But growing fast.&lt;/p&gt;
&lt;h2 id="youtubes-cross-generational-dominance"&gt;YouTube&amp;rsquo;s Cross-Generational Dominance&lt;/h2&gt;
&lt;p&gt;One pattern stands out: &lt;a href="#lightbox-youtube_dominance-png-7" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/youtube_dominance.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/youtube_dominance.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/youtube_dominance.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/youtube_dominance.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/youtube_dominance.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/youtube_dominance.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/youtube_dominance.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/youtube_dominance.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/youtube_dominance.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/youtube_dominance.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/youtube_dominance.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/youtube_dominance.png"
alt="Table showing YouTube is the #1 app across every age group in the US (18-24, 25-34, 35-44, 45&amp;#43;) per Sensor Tower"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
YouTube is the top app across every age demographic. Every single one. 18-24, 25-34, 35-44, 45+. No other app has achieved this. Not TikTok (appears for youngest and oldest, vanishes in the middle). Not Instagram (fades with age). Not Facebook (rises with age). YouTube alone spans generations.&lt;/p&gt;
&lt;h2 id="waymos-quiet-expansion"&gt;Waymo&amp;rsquo;s Quiet Expansion&lt;/h2&gt;
&lt;p&gt;Finally, Waymo: &lt;a href="#lightbox-waymo_penetration-png-9" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/waymo_penetration.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/waymo_penetration.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/waymo_penetration.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/waymo_penetration.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/waymo_penetration.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/waymo_penetration.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/waymo_penetration.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/waymo_penetration.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/waymo_penetration.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/waymo_penetration.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/waymo_penetration.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/waymo_penetration.png"
alt="Line chart showing Waymo&amp;#39;s autonomous ride-hailing penetration of Lyft and Uber users rising to ~4% and ~3% respectively by Q4 2025"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
Waymo accounts for about 4% of Lyft users and 3% of Uber users nationally, despite operating in only a handful of cities. In its active markets (San Francisco, Phoenix), market share is closer to 15%. The company has driven 127 million autonomous miles and tripled its ride volume to 15 million trips in 2025.&lt;/p&gt;
&lt;p&gt;Mobile is no longer a platform question. It is a distribution question. The app economy winners so far: AI companies targeting professionals, YouTube serving everyone, and autonomous vehicles growing quietly in the background.&lt;/p&gt;</description></item><item><title>Variance Tax</title><link>http://philippdubach.com/posts/variance-tax/</link><pubDate>Fri, 06 Feb 2026 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/variance-tax/</guid><description>&lt;p&gt;Let&amp;rsquo;s say your portfolio returned +60% in 2024, then fell 40% in 2025. That&amp;rsquo;s an annualized average return of +10%. Actual return after two years: minus 4% (i.e $100 * 1.6 * 0.6 = $96).&lt;/p&gt;
&lt;p&gt;That 14-point gap is what we call the variance tax aka &lt;a href="https://www.bogleheads.org/wiki/Variance_drain"&gt;variance drain&lt;/a&gt; or volatility drag and it&amp;rsquo;s one of the least intuitive forces in investing.&lt;/p&gt;
&lt;p&gt;Take any series of returns with arithmetic mean μ and volatility σ. The compound growth rate, the one that actually determines your wealth, is approximately:&lt;/p&gt;
$$G ≈ μ − ½σ²$$&lt;p&gt;This comes from a &lt;a href="https://en.wikipedia.org/wiki/Taylor%27s_theorem#Example"&gt;second-order Taylor expansion&lt;/a&gt; of ln(1+r). Take expectations, and the mean log return equals the arithmetic mean minus half the variance. Everything else drops out. Half the variance. That is the tax. The same correction term appears when you solve &lt;a href="https://en.wikipedia.org/wiki/Geometric_Brownian_motion"&gt;geometric Brownian motion&lt;/a&gt; via &lt;a href="https://en.wikipedia.org/wiki/It%C3%B4%27s_lemma"&gt;Itô&amp;rsquo;s lemma&lt;/a&gt; (the drift of log(S) is μ − σ²/2, not μ) so whether you come at it from discrete compounding or continuous-time stochastic calculus, you land in the same place. And because it is quadratic, doubling volatility does not double the cost. It quadruples it. And what we learned during covid, if anything at all, is that we generally &lt;a href="https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0242839"&gt;have a hard time to mentally abstract exponential growth&lt;/a&gt; rates.&lt;a href="#lightbox-variance_drain_by_vol-png-0" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/variance_drain_by_vol.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/variance_drain_by_vol.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/variance_drain_by_vol.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/variance_drain_by_vol.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/variance_drain_by_vol.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/variance_drain_by_vol.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/variance_drain_by_vol.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/variance_drain_by_vol.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/variance_drain_by_vol.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/variance_drain_by_vol.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/variance_drain_by_vol.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/variance_drain_by_vol.png"
alt="Chart showing the variance tax as a quadratic curve ½σ², with labeled data points for Bonds (5% vol, 0.1% drain), S&amp;amp;P 500 (16%, 1.3%), Nasdaq (22%, 2.4%), Emerging Markets (25%, 3.1%), 2x Leveraged S&amp;amp;P (32%, 5.1%), 3x Leveraged S&amp;amp;P (48%, 11.5%), and Bitcoin (60%, 18%)"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
Treasury bonds at 5% vol pay about 0.1% per year in variance drain. Barely noticeable. The S&amp;amp;P 500 at 16% vol pays 1.3%. A 3x leveraged ETF at 48% vol pays 11.5%. &lt;a href="https://people.bu.edu/jacquier/papers/geom.faj0312.pdf"&gt;Jacquier, Kane, and Marcus (2003)&lt;/a&gt; studied S&amp;amp;P 500 returns from 1926 to 2001: arithmetic mean 12.49%, geometric mean 10.51%. The gap is 1.98 percentage points. The formula predicts ½ × 0.203² = 2.06%. &lt;a href="#lightbox-variance_table-png-1" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/variance_table.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/variance_table.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/variance_table.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/variance_table.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/variance_table.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/variance_table.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/variance_table.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/variance_table.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/variance_table.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/variance_table.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/variance_table.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/variance_table.png"
alt="Table showing variance drain by asset class: US Bonds (5% vol, 0.1% drain), S&amp;amp;P 500 (16% vol, 1.3% drain), Nasdaq (22% vol, 2.4% drain), Emerging Markets (25% vol, 3.1% drain), 2x Leveraged S&amp;amp;P (32% vol, 5.1% drain), 3x Leveraged S&amp;amp;P (48% vol, 11.5% drain)"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
Looking at the last row, we see that tripling leverage triples the arithmetic return but delivers nearly the same compound return as 2x. The linear gain gets eaten by the quadratic penalty. &lt;a href="#lightbox-compound_wealth_growth-png-2" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/compound_wealth_growth.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/compound_wealth_growth.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/compound_wealth_growth.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/compound_wealth_growth.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/compound_wealth_growth.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/compound_wealth_growth.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/compound_wealth_growth.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/compound_wealth_growth.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/compound_wealth_growth.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/compound_wealth_growth.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/compound_wealth_growth.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/compound_wealth_growth.png"
alt="Line chart showing $100 invested at 10% arithmetic return over 30 years at four volatility levels: 0% vol reaches $1,745, 15% vol reaches $1,280, 30% vol reaches $498, and 50% vol loses most of the original investment"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
Same 10% arithmetic return, different volatility. After 30 years, the zero-volatility path reaches $1,745. At 15% vol, $1,280. At 30%, $498. At 50% vol you have lost more than half your money despite averaging +10% per year.&lt;/p&gt;
&lt;p&gt;Now apply leverage. If you lever an asset by factor L, the arithmetic return scales linearly (Lμ) but the variance drain scales quadratically (½L²σ²). The compound return becomes:&lt;/p&gt;
$$G(L) ≈ r + L(μ − r) − ½L²σ²$$&lt;p&gt;Take the derivative, set to zero. The leverage that maximizes compound wealth:&lt;/p&gt;
$$L^{\ast} = (μ − r) / σ²$$
&lt;p&gt;For the S&amp;amp;P 500 with roughly 7% excess return and 16% vol, L* comes out to about 2.7x.
&lt;a href="#lightbox-leverage_curve-png-4" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/leverage_curve.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/leverage_curve.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/leverage_curve.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/leverage_curve.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/leverage_curve.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/leverage_curve.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/leverage_curve.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/leverage_curve.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/leverage_curve.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/leverage_curve.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/leverage_curve.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/leverage_curve.png"
alt="The leverage curve for S&amp;amp;P 500 parameters showing compound return peaking at Kelly optimal leverage L*=2.7x, with labeled points at 1x, 2x, and 3x leverage. Returns decline beyond the Kelly optimum and eventually turn negative"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
This is the &lt;a href="https://en.wikipedia.org/wiki/Kelly_criterion"&gt;Kelly criterion&lt;/a&gt; (&lt;em&gt;which you might know from utility theory or gambling heuristics but in fact, as we see here, it falls straight out of the variance tax formula.&lt;/em&gt;) Beyond Kelly, every dollar of additional leverage costs more in variance drain than it earns in expected return. The curve bends over and eventually goes negative. In practice, most practitioners use &amp;ldquo;half-Kelly&amp;rdquo; — sizing positions at L*/2 — because the formula assumes you know μ and σ precisely, and you don&amp;rsquo;t. Estimation error in either parameter can push you past the peak and onto the losing side of the curve. Half-Kelly sacrifices roughly 25% of the theoretical growth rate but dramatically reduces drawdown risk.
&lt;a href="#lightbox-UPRO_factsheet-png-5" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/UPRO_factsheet.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/UPRO_factsheet.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/UPRO_factsheet.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/UPRO_factsheet.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/UPRO_factsheet.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/UPRO_factsheet.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/UPRO_factsheet.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/UPRO_factsheet.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/UPRO_factsheet.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/UPRO_factsheet.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/UPRO_factsheet.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/UPRO_factsheet.png"
alt="Extract of ProShares UltraPro S&amp;amp;P 500 Factsheet Total Return"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
You can see this play out in practice. &lt;a href="https://www.proshares.com/our-etfs/leveraged-and-inverse/upro"&gt;ProShares UPRO&lt;/a&gt;, the 3x S&amp;amp;P 500 ETF, has returned roughly 28% annualized over the past decade during one of the strongest bull markets in history. The S&amp;amp;P 500 compounded at about 10% over the same period. Linear 3x leverage would imply roughly 30%. Variance drain accounts for the gap, and that was in a favorable environment. In 2022, when the S&amp;amp;P fell about 19%, UPRO dropped 70%. The effect is even starker in higher-volatility underlyings: &lt;a href="https://www.proshares.com/our-etfs/leveraged-and-inverse/tqqq"&gt;ProShares TQQQ&lt;/a&gt;, the 3x Nasdaq-100 ETF, sat roughly flat from its 2021 highs through early 2025 while the unlevered QQQ had long since recovered — a textbook case of variance drain overwhelming the leverage premium in a choppy market.&lt;/p&gt;
&lt;p&gt;The same half-sigma-squared shows up across finance. It is why stock prices follow &lt;a href="https://en.wikipedia.org/wiki/Log-normal_distribution"&gt;log-normal distributions&lt;/a&gt;, not normal ones. Why put options cost more than equidistant calls. Why the &lt;a href="https://en.wikipedia.org/wiki/Black%E2%80%93Scholes_model"&gt;Black-Scholes&lt;/a&gt; d₁ and d₂ terms carry a ½σ²t adjustment. Why a $100 stock&amp;rsquo;s true geometric midpoint between $150 up and $50 down is not $100 but $86.60, because ln(150/100) = ln(100/66.67). Wherever returns compound and volatility is nonzero, the variance tax is being collected.&lt;/p&gt;</description></item><item><title>Claude Opus 4.6: Anthropic's New Flagship AI Model for Agentic Coding</title><link>http://philippdubach.com/posts/claude-opus-4.6-anthropics-new-flagship-ai-model-for-agentic-coding/</link><pubDate>Thu, 05 Feb 2026 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/claude-opus-4.6-anthropics-new-flagship-ai-model-for-agentic-coding/</guid><description>&lt;p&gt;Anthropic just released Claude Opus 4.6, the latest frontier AI model in the Claude family. It&amp;rsquo;s a big upgrade over Opus 4.5 and probably the most agentic-focused LLM release from any lab this year.&lt;/p&gt;
&lt;p&gt;Key upgrades: better agentic AI coding capabilities (plans more carefully, sustains longer tasks, catches its own mistakes), a 1M token context window (a first for Opus-class models), and 128K output tokens. Pricing holds at $5/$25 per million tokens.&lt;a href="#lightbox-claude46-png-0" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/claude46.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/claude46.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/claude46.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/claude46.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/claude46.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/claude46.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/claude46.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/claude46.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/claude46.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/claude46.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/claude46.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/claude46.png"
alt="Claude Opus 4.6 release announcement on claude.ai showing the new flagship model from Anthropic"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/p&gt;
&lt;h3 id="llm-benchmark-results-how-claude-opus-46-compares"&gt;LLM Benchmark Results: How Claude Opus 4.6 Compares&lt;/h3&gt;
&lt;p&gt;The benchmark numbers are strong across the board. Opus 4.6 hits state-of-the-art on Terminal-Bench 2.0 (65.4% for agentic coding in the terminal), Humanity&amp;rsquo;s Last Exam (complex multidisciplinary reasoning), and BrowseComp (agentic web search). It beats GPT-5.2 by roughly 144 Elo points on GDPval-AA, the benchmark that measures real-world knowledge work across 44 professional occupations.&lt;a href="#lightbox-opus46-elo-png-1" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/opus46-elo.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/opus46-elo.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/opus46-elo.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/opus46-elo.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/opus46-elo.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/opus46-elo.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/opus46-elo.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/opus46-elo.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/opus46-elo.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/opus46-elo.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/opus46-elo.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/opus46-elo.png"
alt="GDPval-AA Elo benchmark comparison chart: Claude Opus 4.6 at 1,606 Elo vs GPT-5.2 at 1,462 Elo vs Claude Opus 4.5 at 1,416 Elo for real-world knowledge work"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
The standout is ARC-AGI-2, which tests abstract reasoning on problems easy for humans but hard for AI. Opus 4.6 scores 68.8%, a dramatic leap from Opus 4.5&amp;rsquo;s 37.6%. For comparison, GPT-5.2 scores 54.2% and Gemini 3 Pro hits 45.1%. That gap matters because ARC-AGI-2 resists memorization — it measures whether models can actually generalize.&lt;/p&gt;
&lt;p&gt;On coding-specific evaluations, Terminal Bench 2.0 rises to 65.4% (from 59.8% for Opus 4.5), and OSWorld for agentic computer use jumps from 66.3% to 72.7%, putting Opus ahead of both GPT-5.2 and Gemini 3 Pro on those particular tests. SWE-bench Verified shows a small regression — worth watching, though the model excels on the benchmarks that better reflect real production work.&lt;a href="#lightbox-opus46-benchmarks-png-2" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/opus46-benchmarks.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/opus46-benchmarks.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/opus46-benchmarks.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/opus46-benchmarks.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/opus46-benchmarks.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/opus46-benchmarks.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/opus46-benchmarks.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/opus46-benchmarks.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/opus46-benchmarks.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/opus46-benchmarks.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/opus46-benchmarks.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/opus46-benchmarks.png"
alt="Claude Opus 4.6 LLM benchmark comparison: SOTA on Terminal-Bench 2.0, Humanity&amp;#39;s Last Exam, BrowseComp, and GDPval-AA with 90.2% on BigLaw Bench"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/p&gt;
&lt;h3 id="what-can-you-do-with-a-1-million-token-context-window"&gt;What Can You Do With a 1 Million Token Context Window?&lt;/h3&gt;
&lt;p&gt;The 1M context window paired with the new context compaction feature is the upgrade that matters most in practice. To put it in perspective: 1M tokens covers roughly 750 novels, an entire enterprise codebase of several thousand files, or a full legal discovery set — processed in a single prompt.&lt;/p&gt;
&lt;p&gt;Compaction automatically summarizes older context when approaching limits, which means agents can theoretically run indefinitely without hitting the wall that&amp;rsquo;s plagued long-running AI tasks. Combined with the model&amp;rsquo;s improved ability to catch its own mistakes through better code review and debugging, you&amp;rsquo;re looking at agents that can actually finish what they start.&lt;/p&gt;
&lt;p&gt;The long-context retrieval jump tells the story. On MRCR v2, which tests whether a model can find and reason over specific facts buried in massive prompts, Opus 4.6 scores 76% compared to Sonnet 4.5&amp;rsquo;s 18.5%. That&amp;rsquo;s not an incremental improvement — it&amp;rsquo;s a different capability class.&lt;a href="#lightbox-opus46-context-png-3" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/opus46-context.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/opus46-context.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/opus46-context.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/opus46-context.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/opus46-context.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/opus46-context.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/opus46-context.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/opus46-context.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/opus46-context.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/opus46-context.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/opus46-context.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/opus46-context.png"
alt="Long-context retrieval benchmark: Claude Opus 4.6 scores 76% vs Claude Sonnet 4.5 at 18.5% on MRCR v2 needle-in-a-haystack reasoning test"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
That said, bigger context doesn&amp;rsquo;t automatically mean better. Research from Factory.ai and others shows attention degrades across very long sequences, and prefill latency at 1M tokens can exceed two minutes before you get your first output token. The premium pricing tier for prompts exceeding 200K tokens ($10/$37.50) reflects this cost — Anthropic isn&amp;rsquo;t subsidizing power users anymore. The real question for enterprise deployments is whether stuffing your entire codebase into context beats a well-designed RAG pipeline. The answer, as usual, depends on the use case.&lt;/p&gt;
&lt;h3 id="agentic-ai-coding-agent-teams-and-claude-code-updates"&gt;Agentic AI Coding: Agent Teams and Claude Code Updates&lt;/h3&gt;
&lt;p&gt;The headline numbers impress, but the real story is the agentic focus. Anthropic isn&amp;rsquo;t just making Claude smarter. They&amp;rsquo;re making it more useful for the actual work people want AI to do: sustained, multi-step tasks in large codebases.&lt;/p&gt;
&lt;p&gt;New API features reinforce this direction: adaptive thinking lets the model decide when to reason deeper based on contextual cues, effort controls give developers fine-grained tradeoffs between intelligence, speed, and cost (low/medium/high/max), and context compaction keeps long-running agents within limits without manual intervention.&lt;/p&gt;
&lt;p&gt;Claude Code gets the headline feature: &lt;strong&gt;Agent Teams&lt;/strong&gt; that work in parallel. Multiple subagents can coordinate autonomously on read-heavy work like codebase reviews, with each agent handling a different branch via git worktrees before merging back. This ships as a research preview, but it&amp;rsquo;s clearly aimed at the production workflows where agentic coding tools like Cursor, GitHub Copilot, and OpenAI&amp;rsquo;s Codex are competing hard. The timing isn&amp;rsquo;t accidental — Apple just announced Xcode 26.3 with native support for Claude Agent and OpenAI&amp;rsquo;s Codex via MCP (Model Context Protocol), making agentic coding a standard part of the developer toolchain rather than an experiment.&lt;/p&gt;
&lt;h3 id="enterprise-deployment-why-gdpval-aa-matters"&gt;Enterprise Deployment: Why GDPval-AA Matters&lt;/h3&gt;
&lt;p&gt;The GDPval-AA benchmark matters because it measures performance on real-world knowledge work — not toy problems or academic puzzles. Beating GPT-5.2 by 144 Elo points (and Opus 4.5 by 190) suggests meaningful improvements in the tasks that matter for enterprise AI adoption: financial analysis, legal reasoning, and multi-step professional workflows.&lt;/p&gt;
&lt;p&gt;The product expansions signal where Anthropic sees the market going. Claude in Excel now handles long-running tasks and unstructured data. Claude in PowerPoint reads layouts and slide masters for brand consistency. These aren&amp;rsquo;t research demos — they&amp;rsquo;re enterprise-ready integrations designed for knowledge workers who need AI that fits into existing toolchains.&lt;/p&gt;
&lt;p&gt;For teams evaluating which frontier model to standardize on, the picture is nuanced. Claude Opus 4.6 leads on agentic coding and enterprise knowledge work. GPT-5.2 still holds advantages in abstract reasoning (ARC-AGI-2, though the gap narrowed significantly) and math. Gemini 3 Pro offers the best cost efficiency and multimodal processing with its own 1M context window. The multi-model workflow trend is real — the smartest enterprise teams aren&amp;rsquo;t picking one model; they&amp;rsquo;re routing tasks to whichever model handles them best.&lt;/p&gt;
&lt;h3 id="safety-profile-and-the-zero-day-question"&gt;Safety Profile and the Zero-Day Question&lt;/h3&gt;
&lt;p&gt;One detail worth noting: the safety profile. Anthropic claims Opus 4.6 is &amp;ldquo;just as well-aligned as Opus 4.5, which was the most-aligned frontier model to date.&amp;rdquo; Given the enhanced cybersecurity capabilities — Opus 4.6 independently discovered over 500 previously unknown zero-day vulnerabilities in open-source code during Anthropic&amp;rsquo;s pre-release testing — they developed six new detection probes specifically for this release.&lt;/p&gt;
&lt;p&gt;Whether that&amp;rsquo;s reassuring or concerning depends on your priors about AI capabilities research. The vulnerabilities ranged from system-crashing bugs to memory corruption flaws in widely-used tools like GhostScript and OpenSC. As Logan Graham, head of Anthropic&amp;rsquo;s frontier red team, put it: it&amp;rsquo;s a race between defenders and attackers, and Anthropic wants defenders to have the tools first.&lt;/p&gt;
&lt;h3 id="what-this-means-for-the-competitive-landscape"&gt;What This Means for the Competitive Landscape&lt;/h3&gt;
&lt;p&gt;The competitive picture just got more interesting. GPT-5.2 and Gemini 3 Pro now have a new benchmark to chase, and Anthropic has clearly staked its claim on agentic coding as the primary battleground. With pricing unchanged at $5/$25 per million tokens — significantly more expensive than GPT-5.2 at $2/$10 but competitive for the performance tier — the value proposition comes down to whether the agentic improvements translate to fewer retries, less hand-holding, and faster task completion in your specific workflow.&lt;/p&gt;
&lt;p&gt;For developers, the move is straightforward: swap in &lt;code&gt;claude-opus-4-6&lt;/code&gt; via the API and test it on your hardest tasks. For enterprise decision makers, the GDPval-AA results and Agent Teams feature are worth a serious evaluation cycle. The model is available now on claude.ai, the API, and all major cloud platforms (AWS Bedrock, Azure Foundry, GCP Vertex AI).&lt;/p&gt;</description></item><item><title>Buying the Haystack Might Not Work This Year</title><link>http://philippdubach.com/posts/buying-the-haystack-might-not-work-this-year/</link><pubDate>Sat, 31 Jan 2026 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/buying-the-haystack-might-not-work-this-year/</guid><description>&lt;p&gt;I&amp;rsquo;ve been reading the January 2026 state of markets reports from &lt;a href="https://docs.google.com/presentation/d/e/2PACX-1vQXsMMv5ZCWm77za7oXJcz1X-Th5Mz15g5nYBxbUjnomStVcjn8lXPjE5LzAlvc_hg4yHKgwASWLo5a/pub?start=false&amp;amp;loop=false&amp;amp;delayms=3000&amp;amp;slide=id.g3b6e2578ab2_8_4858"&gt;Andreessen Horowitz&lt;/a&gt; and &lt;a href="https://www.aqr.com/Insights/Research/Alternative-Thinking/2026-Capital-Market-Assumptions-for-Major-Asset-Classes"&gt;AQR&lt;/a&gt;, and their conclusions on the AI bubble question in 2026 are almost impossible to reconcile.&lt;/p&gt;
&lt;p&gt;The a16z view is straightforward: AI fundamentals are real, and current prices reflect that reality. Their evidence is compelling. The top 50 private AI companies now generate &lt;strong&gt;$40.6 billion in annual revenue&lt;/strong&gt;. Companies like ElevenLabs and Cursor are hitting $100 million ARR faster than Slack or Twilio ever did. GPUs are running at &lt;strong&gt;80% utilization&lt;/strong&gt;, compared to the 7% utilization rate for fiber optic cables during the dotcom bubble. This isn&amp;rsquo;t speculation, they argue. It&amp;rsquo;s demand exceeding supply.&lt;a href="#lightbox-a16z-gpu-utilization-vs-fiber-png-0" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/a16z-gpu-utilization-vs-fiber.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/a16z-gpu-utilization-vs-fiber.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/a16z-gpu-utilization-vs-fiber.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/a16z-gpu-utilization-vs-fiber.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/a16z-gpu-utilization-vs-fiber.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/a16z-gpu-utilization-vs-fiber.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/a16z-gpu-utilization-vs-fiber.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/a16z-gpu-utilization-vs-fiber.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/a16z-gpu-utilization-vs-fiber.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/a16z-gpu-utilization-vs-fiber.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/a16z-gpu-utilization-vs-fiber.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/a16z-gpu-utilization-vs-fiber.png"
alt="GPU utilization at 80% in AI datacenters compared to just 7% fiber optic cable utilization during the early 2000s dotcom bubble"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
AQR looks at the same market and sees something else entirely. Their capital market assumptions put the U.S. CAPE ratio at the &lt;strong&gt;96th percentile since 1980&lt;/strong&gt;. Expected real returns for U.S. large cap equities over the next 5-10 years? &lt;strong&gt;3.9%&lt;/strong&gt;. For a global 60/40 portfolio, just &lt;strong&gt;3.4%&lt;/strong&gt;, well below the long-term average of roughly 5% since 1900. Risk premia, in their framework, are compressed across nearly every asset class. The narrative doesn&amp;rsquo;t enter their models.&lt;a href="#lightbox-aqr-expected-returns-summary-png-1" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/aqr-expected-returns-summary.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/aqr-expected-returns-summary.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/aqr-expected-returns-summary.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/aqr-expected-returns-summary.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/aqr-expected-returns-summary.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/aqr-expected-returns-summary.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/aqr-expected-returns-summary.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/aqr-expected-returns-summary.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/aqr-expected-returns-summary.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/aqr-expected-returns-summary.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/aqr-expected-returns-summary.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/aqr-expected-returns-summary.png"
alt="AQR medium-term expected real returns summary showing U.S. equities at 3.9%, non-U.S. developed at 5.3%, and global 60/40 at 3.4%"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
a16z points to earnings growth. The market rally hasn&amp;rsquo;t been driven by multiple expansion, they note, but by actual EPS growth. Tech P/E multiples sit around 30-35x, elevated but nowhere near the 70-80x of 2000. Tech margins have &amp;ldquo;lapped the field&amp;rdquo; at 25%+ compared to 5-8% for the rest of the S&amp;amp;P 500. The fundamentals, they insist, are doing the work.&lt;a href="#lightbox-a16z-pe-multiples-vs-dotcom-png-2" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/a16z-pe-multiples-vs-dotcom.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/a16z-pe-multiples-vs-dotcom.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/a16z-pe-multiples-vs-dotcom.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/a16z-pe-multiples-vs-dotcom.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/a16z-pe-multiples-vs-dotcom.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/a16z-pe-multiples-vs-dotcom.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/a16z-pe-multiples-vs-dotcom.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/a16z-pe-multiples-vs-dotcom.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/a16z-pe-multiples-vs-dotcom.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/a16z-pe-multiples-vs-dotcom.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/a16z-pe-multiples-vs-dotcom.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/a16z-pe-multiples-vs-dotcom.png"
alt="Earnings multiples are high but nowhere near dotcom levels: large cap tech trailing P/E around 30-35x today versus 70-80x in 2000"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;a href="#lightbox-a16z-tech-margins-png-3" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/a16z-tech-margins.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/a16z-tech-margins.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/a16z-tech-margins.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/a16z-tech-margins.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/a16z-tech-margins.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/a16z-tech-margins.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/a16z-tech-margins.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/a16z-tech-margins.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/a16z-tech-margins.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/a16z-tech-margins.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/a16z-tech-margins.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/a16z-tech-margins.png"
alt="Tech margins have lapped the field: Tech and Interactive Media at 25%&amp;#43; compared to 5-8% for the rest of the S&amp;amp;P 500"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
AQR&amp;rsquo;s response would be that fundamentals always look good near peaks. Their research shows a &lt;strong&gt;50% probability&lt;/strong&gt; that realized equity returns will miss estimates by more than 3 percentage points annually over the next decade. Compressed premia don&amp;rsquo;t announce themselves with blaring headlines. They just quietly erode returns until investors notice they&amp;rsquo;ve been running in place.&lt;/p&gt;
&lt;p&gt;Cumulative hyperscaler capex is projected to reach &lt;strong&gt;$4.8 trillion by 2030&lt;/strong&gt;. To achieve a 10% hurdle rate on that investment, AI revenue needs to hit roughly &lt;strong&gt;$1 trillion annually by 2030&lt;/strong&gt;, about 1% of global GDP excluding China. &lt;a href="https://fortune.com/2025/11/17/is-ai-a-bubble-goldman-sachs-market-already-priced-in-19-trillion/"&gt;Goldman Sachs estimates&lt;/a&gt; that $9 trillion in revenue could flow from the AI buildout, which at 20% margins and a 22x P/E multiple would create $35 trillion in new market cap. Only about $24 trillion has been pulled forward so far, leaving $11 trillion &amp;ldquo;on the table.&amp;ldquo;&lt;a href="#lightbox-a16z-ai-revenue-capex-targets-png-4" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/a16z-ai-revenue-capex-targets.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/a16z-ai-revenue-capex-targets.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/a16z-ai-revenue-capex-targets.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/a16z-ai-revenue-capex-targets.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/a16z-ai-revenue-capex-targets.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/a16z-ai-revenue-capex-targets.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/a16z-ai-revenue-capex-targets.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/a16z-ai-revenue-capex-targets.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/a16z-ai-revenue-capex-targets.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/a16z-ai-revenue-capex-targets.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/a16z-ai-revenue-capex-targets.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/a16z-ai-revenue-capex-targets.png"
alt="Required AI-enabled revenue to meet return on capital targets: cumulative AI investment reaching $4.8 trillion by 2030 requires roughly $1 trillion in annual AI revenue"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
Or not. AQR would point out that the expected return for U.S. buyouts, private equity&amp;rsquo;s bread and butter, is now &lt;strong&gt;4.2%&lt;/strong&gt;. That&amp;rsquo;s barely above the 3.9% for public large caps. The illiquidity premium has essentially vanished. If sophisticated PE firms can&amp;rsquo;t find excess returns, why should AI capex be different?&lt;/p&gt;
&lt;p&gt;I find myself uncertain, which feels like the more honest position. Neither source is disinterested. a16z manages billions in venture capital and growth equity; bullish AI narratives support their portfolio valuations and fundraising. AQR runs systematic strategies that benefit when investors diversify away from concentrated U.S. tech exposure toward international equities and alternatives. Both are talking their book, which doesn&amp;rsquo;t make either wrong, but it&amp;rsquo;s worth noting.&lt;/p&gt;
&lt;p&gt;The a16z data on utilization and revenue growth is hard to dismiss. 80% GPU utilization isn&amp;rsquo;t vaporware. Harvey users nearly tripled their time on the platform in nine months. Navan&amp;rsquo;s AI handles half of all customer interactions at satisfaction levels matching human agents. These are real products generating real engagement. But AQR&amp;rsquo;s valuation work has a longer track record. Their models don&amp;rsquo;t care about narratives, and historically that discipline has been valuable. When they say U.S. equities offer the lowest expected returns among major markets, that&amp;rsquo;s not pessimism. It&amp;rsquo;s arithmetic.&lt;/p&gt;
&lt;p&gt;The reconciliation might be this: AI winners could thrive spectacularly while broad market indices disappoint. a16z&amp;rsquo;s portfolio companies operate in a different universe than the average S&amp;amp;P 500 constituent. Compressed risk premia can coexist with individual companies generating enormous returns. The question is whether you&amp;rsquo;re buying the index or picking the winners.&lt;/p&gt;
&lt;p&gt;Non-U.S. developed markets, by the way, offer expected returns of around 5%, versus 3.9% for U.S. large caps. The valuation gap is real even if the AI story is true. &lt;a href="#lightbox-aqr-expected-returns-equities-png-6" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/aqr-expected-returns-equities.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/aqr-expected-returns-equities.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/aqr-expected-returns-equities.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/aqr-expected-returns-equities.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/aqr-expected-returns-equities.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/aqr-expected-returns-equities.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/aqr-expected-returns-equities.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/aqr-expected-returns-equities.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/aqr-expected-returns-equities.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/aqr-expected-returns-equities.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/aqr-expected-returns-equities.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/aqr-expected-returns-equities.png"
alt="AQR expected local real returns for equities: U.S. Large at 3.9%, Eurozone at 5.0%, UK at 4.9%, Japan at 4.9%"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/p&gt;
&lt;aside class="disclaimer" role="note" aria-label="Disclaimer"&gt;
&lt;div class="disclaimer-content"&gt;&lt;p&gt;&lt;strong&gt;Disclaimer:&lt;/strong&gt; All opinions expressed are my own. This is not investment, financial, tax, or legal advice. Past performance does not indicate future results. Do your own research and consult qualified professionals before making financial decisions. No liability accepted for any losses.&lt;/p&gt;&lt;/div&gt;
&lt;/aside&gt;</description></item><item><title>Bandits and Agents: Netflix and Spotify Recommender Stacks in 2026</title><link>http://philippdubach.com/posts/bandits-and-agents-netflix-and-spotify-recommender-stacks-in-2026/</link><pubDate>Fri, 30 Jan 2026 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/bandits-and-agents-netflix-and-spotify-recommender-stacks-in-2026/</guid><description>&lt;p&gt;Hyperscalers spent over &lt;a href="https://www.goldmansachs.com/insights/articles/why-ai-companies-may-invest-more-than-500-billion-in-2026"&gt;$350 billion on AI infrastructure&lt;/a&gt; in 2025 alone, with projections exceeding $500 billion in 2026. The trillion-dollar question is not whether machines can reason, but whether anyone can afford to let them. Hybrid recommender systems sit at the center of this tension. Large Language Models promised to transform how Netflix suggests your next show or how Spotify curates your morning playlist. Instead, the industry has split into two parallel universes, divided not by capability but by cost.&lt;/p&gt;
&lt;p&gt;On one side sits what engineers call the &amp;ldquo;classical stack&amp;rdquo;: matrix factorization, two-tower embedding models, and contextual bandits. These methods respond in microseconds, scale linearly with users, and run on nothing more complicated than dot products. A query costs a fraction of a cent. On the other side is the &amp;ldquo;agentic stack&amp;rdquo;: LLM-based reasoning engines that can handle requests like &amp;ldquo;find me a sci-fi movie that feels like Blade Runner but was made in the 90s.&amp;rdquo; This second approach consumes thousands of tokens per recommendation. The cost difference is not incremental; it is &lt;a href="https://www.softwareseni.com/understanding-inference-economics-and-why-ai-costs-spiral-beyond-proof-of-concept/"&gt;orders of magnitude&lt;/a&gt;. LLM inference cost economics, more than any algorithmic breakthrough, is now the dominant force shaping recommender architecture.&lt;/p&gt;
&lt;p&gt;The 2026 consensus is a hybrid architecture: use the cheap, fast models for candidate generation from millions of items, then invoke the expensive reasoning layer only for the final dozen items a user actually sees. This &amp;ldquo;funnel&amp;rdquo; pattern — retrieval, then ranking, then re-ranking — is the only way to make the economics work. The smartest model is reserved for the fewest items.&lt;/p&gt;
&lt;p&gt;What makes this work in practice goes back to a formalism from &lt;a href="https://www.jstor.org/stable/2332286"&gt;1933&lt;/a&gt;: the multi-armed bandit. Imagine a gambler facing a row of slot machines, each with an unknown payout rate. She wants to maximize her winnings over a night of play. If she always pulls the arm with the highest observed payout, she might miss a better machine she never tried. If she explores too much, she wastes money on losers. The mathematics of this exploration–exploitation tradeoff define &lt;em&gt;regret&lt;/em&gt;:&lt;/p&gt;
$$
R(T) = \mu^* \cdot T - \sum_{t=1}^{T} \mu(a_t)
$$&lt;p&gt;Here μ* is the best possible average reward, and μ(aₜ) is the reward from whatever arm she actually pulled at time t. Total regret is how much she left on the table by not knowing the optimal choice in advance. The goal of every multi-armed bandit algorithm in recommender systems is to drive this quantity sublinear in T — to learn fast enough that the cost of exploration vanishes relative to the horizon. &lt;a href="#lightbox-slide10-png-0" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/slide10.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/slide10.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/slide10.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/slide10.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/slide10.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/slide10.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/slide10.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/slide10.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/slide10.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/slide10.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/slide10.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/slide10.png"
alt="Multi-armed bandit recommender system diagram: a Learner taking Actions and receiving Rewards from an Environment, with the goal to maximize cumulative reward or minimize cumulative regret"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
The three main exploration strategies each take a different approach: epsilon-greedy adds random noise to avoid getting stuck; Upper Confidence Bound (UCB) prefers actions with uncertain values; Thompson Sampling selects actions according to the probability they are optimal. In practice, Thompson Sampling tends to outperform the others because its exploration is guided by posterior uncertainty rather than arbitrary randomness — it explores where it matters most. &lt;a href="#lightbox-slide12-png-1" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/slide12.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/slide12.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/slide12.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/slide12.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/slide12.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/slide12.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/slide12.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/slide12.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/slide12.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/slide12.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/slide12.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/slide12.png"
alt="Principles of Exploration in recommender systems: Naive Exploration (ε-greedy), Optimism in the Face of Uncertainty (UCB), and Probability Matching (Thompson Sampling)"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
Every recommendation you see on &lt;a href="https://research.netflix.com/publication/lessons-learnt-from-consolidating-ml-models-in-a-large-scale-recommendation"&gt;Netflix&amp;rsquo;s homepage&lt;/a&gt; is the output of an algorithm trying to minimize exactly this quantity, whether it realizes it or not.&lt;/p&gt;
&lt;p&gt;Netflix&amp;rsquo;s recommendation algorithm architecture runs this optimization across &lt;a href="https://www.slideshare.net/slideshow/a-multiarmed-bandit-framework-for-recommendations-at-netflix/102629078"&gt;three computation layers&lt;/a&gt;. Offline systems crunch terabytes of viewing history to train deep collaborative filtering models, a process that takes hours and happens on a schedule. Nearline systems update user embeddings seconds after a click, keeping the recommendations fresh without the cost of full retraining. Online systems respond to each page load in milliseconds, combining the precomputed signals with real-time context like time of day and device type. The architecture is a &lt;a href="https://netflixtechblog.com/post-training-generative-recommenders-with-advantage-weighted-supervised-finetuning-61a538d717a9"&gt;latency-cost tradeoff&lt;/a&gt;: deep analysis happens in batch, while the user-facing layer stays fast. &lt;a href="#lightbox-slide28-png-3" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/slide28.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/slide28.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/slide28.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/slide28.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/slide28.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/slide28.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/slide28.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/slide28.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/slide28.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/slide28.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/slide28.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/slide28.png"
alt="Netflix recommendation algorithm architecture: Member Activity and Contextual Information flow through an Offline System for model training, then to an Online System where the Multi-Armed Bandit produces recommendations"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
What Netflix learned from a decade of experimentation is counterintuitive. The goal is not to recommend what users will definitely watch, but what they would not have found on their own. They call this &amp;ldquo;incrementality.&amp;rdquo; A greedy algorithm that always surfaces the highest-probability titles just confirms what users already knew — it exploits without exploring, and in doing so collapses the discovery space. A better approach is to measure the &lt;em&gt;causal effect&lt;/em&gt; of the recommendation: how much does showing this thumbnail increase the probability of a play compared to not showing it? Some titles have low baseline interest but high incrementality. Those are the ones worth featuring. This is the exploration–exploitation tradeoff made concrete: the value of a recommendation is not its predicted rating, but its marginal contribution to discovery. &lt;a href="#lightbox-slide41-png-4" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/slide41.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/slide41.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/slide41.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/slide41.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/slide41.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/slide41.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/slide41.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/slide41.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/slide41.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/slide41.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/slide41.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/slide41.png"
alt="Netflix incrementality analysis: scatter plot showing incremental probability vs baseline probability, where Title A has low baseline but high incremental lift, while Title C has high baseline but less benefit from featuring"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
Spotify&amp;rsquo;s AI DJ recommender system takes a different approach to the same problem. Their &amp;ldquo;&lt;a href="https://research.atspotify.com/2025/9/you-say-search-i-say-recs-a-scalable-agentic-approach-to-query-understanding"&gt;AI DJ&lt;/a&gt;&amp;rdquo; feature uses what engineers internally call the &amp;ldquo;agentic router.&amp;rdquo; When you ask for &amp;ldquo;music for a rainy reading session in 1990s Seattle,&amp;rdquo; the router decides whether to invoke the expensive LLM reasoning layer or just fall back to keyword matching against collaborative filtering embeddings. Complex queries get the big model; simple ones get the fast path. This router is the economic governor of the entire system — an inference cost optimizer disguised as a product feature. Underneath the DJ&amp;rsquo;s personality, built on Spotify&amp;rsquo;s Sonantic voice synthesis and LLM-generated contextual narratives, sits a bandit framework called BaRT (Bandits for Recommendations as Treatments) that quietly balances what you know you like against what you might not yet know you need.&lt;/p&gt;
&lt;p&gt;Not everyone is convinced the algorithms are making us better off. My own &lt;a href="https://philippdubach.com/posts/social-media-success-prediction-bert-models-for-post-titles/"&gt;analysis of social media success prediction&lt;/a&gt; found that sophisticated language models often just memorize temporal patterns rather than learning what actually makes content good. They learn the news cycle, not the news.&lt;/p&gt;
&lt;p&gt;The risk is that we build hybrid recommender systems that are technically brilliant but experientially hollow, engineering away the serendipity that made discovery meaningful in the first place. The recommender is becoming a curator, and the curator is becoming an agent. The architecture will keep evolving — foundation models for recommendations, reinforcement learning from human feedback applied to discovery, inference costs that continue their &lt;a href="https://a16z.com/llmflation-llm-inference-cost/"&gt;10× annual decline&lt;/a&gt; — but the open question for 2026 is whether we want to be the curators of our own lives, or merely consumers of an optimized feed.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Slides courtesy of &amp;ldquo;&lt;a href="https://www.slideshare.net/slideshow/a-multiarmed-bandit-framework-for-recommendations-at-netflix/102629078"&gt;A Multi-Armed Bandit Framework for Recommendations at Netflix&lt;/a&gt;&amp;rdquo; by Jaya Kawale, Netflix.&lt;/em&gt;&lt;/p&gt;</description></item><item><title>Is Private Equity Just Beta With a Lockup?</title><link>http://philippdubach.com/posts/is-private-equity-just-beta-with-a-lockup/</link><pubDate>Thu, 29 Jan 2026 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/is-private-equity-just-beta-with-a-lockup/</guid><description>&lt;p&gt;The pitch used to be simple: accept illiquidity, get rewarded. Lock up your capital for seven years, tolerate capital calls and J-curves, and in exchange you&amp;rsquo;d earn returns that public markets couldn&amp;rsquo;t touch. It was the defining bargain of institutional investing for two decades.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://www.aqr.com/Insights/Research/Alternative-Thinking/2026-Capital-Market-Assumptions-for-Major-Asset-Classes"&gt;AQR&amp;rsquo;s latest capital market assumptions&lt;/a&gt; make for uncomfortable reading if you&amp;rsquo;re an allocator to private markets. Their expected real return for U.S. buyouts over the next 5-10 years is &lt;strong&gt;4.2%&lt;/strong&gt;. For U.S. large cap public equities, it&amp;rsquo;s &lt;strong&gt;3.9%&lt;/strong&gt;. That&amp;rsquo;s a 30 basis point premium for accepting years of lockup, unpredictable capital calls, limited transparency, and the very real risk of picking the wrong manager.&lt;a href="#lightbox-aqr-expected-returns-private-assets-png-0" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/aqr-expected-returns-private-assets.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/aqr-expected-returns-private-assets.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/aqr-expected-returns-private-assets.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/aqr-expected-returns-private-assets.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/aqr-expected-returns-private-assets.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/aqr-expected-returns-private-assets.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/aqr-expected-returns-private-assets.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/aqr-expected-returns-private-assets.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/aqr-expected-returns-private-assets.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/aqr-expected-returns-private-assets.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/aqr-expected-returns-private-assets.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/aqr-expected-returns-private-assets.png"
alt="AQR Exhibit 6: Expected real returns for private assets showing U.S. Buyouts at 4.2%, U.S. Real Estate at 3.1%, and U.S. Private Credit at 2.6%"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
Private credit looks even worse. Expected returns dropped &lt;strong&gt;0.5 percentage points&lt;/strong&gt; year over year as spreads narrowed and base rates came down. The asset class that was supposed to be the sensible alternative to stretched equity valuations now offers less compensation than it did twelve months ago.&lt;/p&gt;
&lt;p&gt;This isn&amp;rsquo;t a temporary dislocation. It&amp;rsquo;s the logical endpoint of too much capital chasing the same opportunities. When every pension fund, endowment, and sovereign wealth fund decides they need &lt;a href="https://www.cbh.com/insights/reports/u.s.-alternative-investment-industry-report-2025"&gt;20-30% allocation to alternatives&lt;/a&gt;, the returns that made alternatives attractive get arbitraged away. The money didn&amp;rsquo;t find alpha. It became beta (with a lockup).&lt;/p&gt;
&lt;p&gt;I read more reports and the &lt;a href="https://docs.google.com/presentation/d/e/2PACX-1vQXsMMv5ZCWm77za7oXJcz1X-Th5Mz15g5nYBxbUjnomStVcjn8lXPjE5LzAlvc_hg4yHKgwASWLo5a/pub?start=false&amp;amp;loop=false&amp;amp;delayms=3000&amp;amp;slide=id.g3b6e2578ab2_8_4858"&gt;a16z State of the Markets 2026&lt;/a&gt; isn&amp;rsquo;t less interesting. The dispersion numbers tell an interesting story. In venture capital, top decile managers generate &lt;strong&gt;31.7% IRR&lt;/strong&gt; while bottom decile managers return &lt;strong&gt;negative 7%&lt;/strong&gt;. The spread between winners and losers is enormous. But that spread is precisely why average returns have compressed. Access to top-tier funds has always been limited, and everyone else is fighting over what&amp;rsquo;s left.&lt;a href="#lightbox-a16z-irr-dispersion-by-strategy-png-1" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/a16z-irr-dispersion-by-strategy.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/a16z-irr-dispersion-by-strategy.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/a16z-irr-dispersion-by-strategy.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/a16z-irr-dispersion-by-strategy.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/a16z-irr-dispersion-by-strategy.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/a16z-irr-dispersion-by-strategy.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/a16z-irr-dispersion-by-strategy.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/a16z-irr-dispersion-by-strategy.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/a16z-irr-dispersion-by-strategy.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/a16z-irr-dispersion-by-strategy.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/a16z-irr-dispersion-by-strategy.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/a16z-irr-dispersion-by-strategy.png"
alt="Net IRR dispersion by strategy for 2002-2019 vintages showing venture capital with top decile at 31.7% and bottom decile at negative 7%"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
AQR&amp;rsquo;s framework suggests something that few allocators want to hear: the illiquidity premium might be negative for most investors. If you&amp;rsquo;re not in the top quartile of manager selection, you&amp;rsquo;re accepting lockup risk for returns you could approximate in public markets with better liquidity and lower fees.&lt;/p&gt;
&lt;p&gt;The counterargument, and it&amp;rsquo;s a reasonable one, is that private markets offer exposure to companies you simply can&amp;rsquo;t access in public markets anymore. This part is true. &lt;strong&gt;&lt;a href="https://www.apolloacademy.com/many-more-private-firms-in-the-us/"&gt;87% of U.S. companies with more than $100 million in revenue are now private&lt;/a&gt;&lt;/strong&gt;. The top 10 private companies represent 38% of total unicorn valuation, and that share has nearly doubled since 2020. SpaceX, OpenAI, Anthropic, Databricks, Stripe: these are category-defining businesses, and they&amp;rsquo;re not on any exchange.&lt;a href="#lightbox-a16z-companies-public-vs-private-png-3" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/a16z-companies-public-vs-private.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/a16z-companies-public-vs-private.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/a16z-companies-public-vs-private.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/a16z-companies-public-vs-private.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/a16z-companies-public-vs-private.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/a16z-companies-public-vs-private.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/a16z-companies-public-vs-private.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/a16z-companies-public-vs-private.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/a16z-companies-public-vs-private.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/a16z-companies-public-vs-private.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/a16z-companies-public-vs-private.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/a16z-companies-public-vs-private.png"
alt="Share of U.S. companies with annual revenue greater than $100M showing private companies dominate"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;a href="#lightbox-a16z-top-10-private-companies-png-4" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/a16z-top-10-private-companies.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/a16z-top-10-private-companies.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/a16z-top-10-private-companies.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/a16z-top-10-private-companies.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/a16z-top-10-private-companies.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/a16z-top-10-private-companies.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/a16z-top-10-private-companies.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/a16z-top-10-private-companies.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/a16z-top-10-private-companies.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/a16z-top-10-private-companies.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/a16z-top-10-private-companies.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/a16z-top-10-private-companies.png"
alt="Top 10 private companies represent 38% of total unicorn valuation in 2025, including SpaceX, OpenAI, Anthropic, Databricks, and Stripe"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
But access isn&amp;rsquo;t the same as returns. You can have exposure to the most exciting companies in the world and still underperform a boring index fund if you pay too much or pick the wrong vintage. The S&amp;amp;P 500 minimum market cap eligibility has &lt;a href="https://press.spglobal.com/2025-07-01-S-P-Dow-Jones-Indices-Announces-Update-to-S-P-Composite-1500-Market-Cap-Guidelines"&gt;tripled since 2019 to $22.7 billion&lt;/a&gt;. Companies are staying private longer, which means more value creation happens before public investors get a chance. It also means private investors are paying up for that privilege.&lt;/p&gt;
&lt;p&gt;Value creation has moved earlier in the company lifecycle. For IPOs between 2014-2019, only &lt;strong&gt;12% of median value&lt;/strong&gt; was created in private markets. For 2020-2023 IPOs, that number jumped to &lt;strong&gt;55%&lt;/strong&gt;. If you want to capture returns from the next generation of important companies, you probably need private market exposure.&lt;a href="#lightbox-a16z-value-creation-shift-private-png-5" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/a16z-value-creation-shift-private.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/a16z-value-creation-shift-private.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/a16z-value-creation-shift-private.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/a16z-value-creation-shift-private.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/a16z-value-creation-shift-private.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/a16z-value-creation-shift-private.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/a16z-value-creation-shift-private.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/a16z-value-creation-shift-private.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/a16z-value-creation-shift-private.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/a16z-value-creation-shift-private.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/a16z-value-creation-shift-private.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/a16z-value-creation-shift-private.png"
alt="Return potential has shifted to private markets: median value created in private markets went from 12% for 2014-2019 IPOs to 55% for 2020-2023 IPOs"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
The question really is what you&amp;rsquo;re paying for it.At 4.2% expected returns versus 3.9% for public equities, you&amp;rsquo;re paying in liquidity and flexibility for almost nothing in expected return. The premium that justified the allocation model has been competed away. If you&amp;rsquo;re in the top 5% of venture funds earning 60%+ IRR, none of this applies. For everyone else, the world has moved on.&lt;/p&gt;
&lt;aside class="disclaimer" role="note" aria-label="Disclaimer"&gt;
&lt;div class="disclaimer-content"&gt;&lt;p&gt;&lt;strong&gt;Disclaimer:&lt;/strong&gt; All opinions expressed are my own. This is not investment, financial, tax, or legal advice. Past performance does not indicate future results. Do your own research and consult qualified professionals before making financial decisions. No liability accepted for any losses.&lt;/p&gt;&lt;/div&gt;
&lt;/aside&gt;</description></item><item><title>Britain's Strategic Limbo</title><link>http://philippdubach.com/posts/britains-strategic-limbo/</link><pubDate>Wed, 28 Jan 2026 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/britains-strategic-limbo/</guid><description>&lt;p&gt;The UK is the country with no bloc.&lt;/p&gt;
&lt;p&gt;At Davos, Britain &lt;a href="https://www.washingtonpost.com/world/2026/01/22/trump-board-peace-davos-countries-involved/"&gt;refused to join Trump&amp;rsquo;s Board of Peace&lt;/a&gt;, citing commitment to international law and rejection of the &amp;ldquo;pay-to-play&amp;rdquo; model. France, Germany, Sweden, Norway made the same choice. The difference is that those countries have somewhere else to go. Britain doesn&amp;rsquo;t.&lt;/p&gt;
&lt;p&gt;The &lt;a href="https://ukandeu.ac.uk/explainers/explainer-security-action-for-europe-safe/"&gt;SAFE instrument&lt;/a&gt;, the EU&amp;rsquo;s €150 billion fund for joint defense procurement, is designed explicitly for strategic autonomy. Strict &amp;ldquo;Buy European&amp;rdquo; provisions limit non-EU subcontractors to 15-35% of contract value, phased out within two years. Canada, remarkably, &lt;a href="https://www.pm.gc.ca/en/news/news-releases/2025/12/01/prime-minister-carney-secures-canadas-participation-european-unions"&gt;negotiated access&lt;/a&gt; and now has preferential treatment on par with EU firms. The UK &lt;a href="https://behorizon.org/safe-mechanism-reshaping-eu-defence-integration/"&gt;remains excluded&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Talks broke down in late 2025. London viewed the EU&amp;rsquo;s requirements for third-country participation as an infringement on sovereignty. The same sovereignty concerns that drove Brexit now lock Britain out of the emerging European defense architecture. The &amp;ldquo;mid-Atlantic bridge&amp;rdquo; was always a metaphor. Britain positioned itself as the hinge between American power and European integration, useful to both, dependent on neither. That positioning assumed both poles wanted a bridge. Now the US treats allies as protection rackets and the EU is building walls around its defense industrial base. The bridge has nowhere to land.&lt;/p&gt;
&lt;p&gt;What does the Starmer government do? The choices were supposed to be theoretical. Align with Washington and accept the transactional terms of the &lt;a href="https://en.wikipedia.org/wiki/Donroe_Doctrine"&gt;Donroe Doctrine&lt;/a&gt;. Align with Brussels and accept the sovereignty constraints of SAFE participation. Or go it alone, with a defense budget that can&amp;rsquo;t sustain independent capability against peer competitors.&lt;/p&gt;
&lt;p&gt;The &lt;a href="https://www.iiss.org/research-paper/2025/12/the-safe-regulation-and-its-implications-for-non-eu-defence-suppliers/"&gt;IISS analysis&lt;/a&gt; of SAFE&amp;rsquo;s implications for non-EU suppliers is blunt: firms outside the bloc face structural disadvantages that compound over time. Procurement cycles last decades. If British defense firms are locked out of European contracts now, the gap widens with each passing year. The industrial base erodes.&lt;/p&gt;
&lt;p&gt;&amp;ldquo;Global Britain&amp;rdquo; was the slogan after Brexit, a vision of nimble bilateral relationships unconstrained by Brussels bureaucracy. The reality is that global influence requires either hard power or bloc membership. Britain has neither the military budget for the former nor the political will for the latter.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://philippdubach.com/posts/the-rise-of-middle-power-realism/"&gt;Canada&amp;rsquo;s pivot&lt;/a&gt; is instructive. Facing similar pressure from Washington, Carney diversified, joining SAFE, negotiating with Beijing, building horizontal coalitions with other middle powers. Britain has done none of this. It refused the Board of Peace on principle but hasn&amp;rsquo;t found an alternative structure to join on pragmatism.&lt;/p&gt;
&lt;p&gt;Principles without alternatives is just isolation. The UK is learning what it means to be a middle power without a coalition, morally opposed to the new American order but structurally excluded from the European one.&lt;/p&gt;</description></item><item><title>The Rise of Middle Power Realism</title><link>http://philippdubach.com/posts/the-rise-of-middle-power-realism/</link><pubDate>Tue, 27 Jan 2026 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/the-rise-of-middle-power-realism/</guid><description>&lt;p&gt;At Davos 2026, Canadian Prime Minister &lt;a href="https://en.wikipedia.org/wiki/Mark_Carney"&gt;Mark Carney&lt;/a&gt; delivered a speech that received something rare at these gatherings: a standing ovation. Carney told the assembled elites what they already knew but hadn&amp;rsquo;t said aloud: &lt;a href="https://www.weforum.org/stories/2026/01/davos-2026-special-address-by-mark-carney-prime-minister-of-canada/"&gt;the world is not in a &amp;ldquo;transition&amp;rdquo; but a &amp;ldquo;rupture.&amp;rdquo;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;The speech drew on Václav Havel&amp;rsquo;s 1978 essay &lt;em&gt;The Power of the Powerless&lt;/em&gt;, specifically &lt;a href="https://www.nonviolent-conflict.org/resource/the-power-of-the-powerless/"&gt;the parable of the greengrocer&lt;/a&gt; who displays the slogan &amp;ldquo;Workers of the World, Unite!&amp;rdquo; in his shop window. The grocer doesn&amp;rsquo;t believe the slogan. He displays it to signal submission, to live in harmony with the regime. Carney&amp;rsquo;s application was pointed: for years, US allies have displayed the signs of the liberal international order, pretending the partnership was mutual, that rules mattered, that values were shared. Even as reality diverged.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&amp;ldquo;It is time for companies and countries to take their signs down.&amp;rdquo;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;What followed the speech was more interesting than the speech itself. Days later, Canada became the first non-European G7 nation to join the EU&amp;rsquo;s &lt;a href="https://www.pm.gc.ca/en/news/news-releases/2025/12/01/prime-minister-carney-secures-canadas-participation-european-unions"&gt;SAFE defense initiative&lt;/a&gt;, a €150 billion fund for joint European defense procurement. Canadian firms now have &lt;a href="https://www.squirepattonboggs.com/insights/publications/canadian-companies-to-be-allowed-preferential-access-under-eu-safe-defence-investment-program/"&gt;preferential access&lt;/a&gt; to the European defense market, treated on par with EU companies. Days before Davos, Carney had traveled to Beijing to secure a &lt;a href="https://www.china-briefing.com/news/china-canada-trade-deal-preliminary-agreement/"&gt;preliminary trade agreement&lt;/a&gt; on electric vehicles, 49,000 units at 6.1% tariff, compared to the 100% tariff the US imposes.&lt;/p&gt;
&lt;p&gt;The intellectual framework Carney articulated has a name now: &amp;ldquo;middle power realism.&amp;rdquo; It&amp;rsquo;s built on three observations.&lt;/p&gt;
&lt;p&gt;(1) The US is no longer a reliable partner. Not because of Trump specifically, but because American politics has shifted in ways that make transactional unilateralism the new baseline. The &lt;a href="https://en.wikipedia.org/wiki/Donroe_Doctrine"&gt;&amp;ldquo;Donroe Doctrine&amp;rdquo;&lt;/a&gt;, a portmanteau of &amp;ldquo;Donald&amp;rdquo; and &amp;ldquo;Monroe&amp;rdquo;, asserts American hegemony over the Western Hemisphere with a resource-driven, security-focused twist. It treats allies as protection rackets and international law as an impediment.&lt;/p&gt;
&lt;p&gt;(2) Nostalgia is dangerous. The pre-2016 order isn&amp;rsquo;t coming back. Waiting for &amp;ldquo;normal&amp;rdquo; to return is a strategy for decline. Middle powers that don&amp;rsquo;t build domestic strength and horizontal coalitions will find themselves, as Carney put it invoking Thucydides, &lt;a href="https://www.theguardian.com/business/live/2026/jan/20/davos-von-der-leyen-he-macron-carney-wef-greenland-trump-uk-unemployment-business-live-news-updates"&gt;&amp;ldquo;on the menu.&amp;rdquo;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;(3) Sovereignty requires the capacity to say no. That means diversified partnerships, even with rivals. Canada&amp;rsquo;s China deal infuriated Trump, who &lt;a href="https://www.washingtonexaminer.com/news/world/4433043/carney-no-intention-signing-free-trade-deal-china/"&gt;accused Carney of allowing a &amp;ldquo;Trojan Horse&amp;rdquo;&lt;/a&gt; into the continent. But from Ottawa&amp;rsquo;s perspective, the ability to trade with Beijing is precisely what makes Canadian sovereignty credible. You can&amp;rsquo;t negotiate from strength if you have no alternatives.&lt;/p&gt;
&lt;p&gt;The European response follows similar logic. During the Greenland crisis, when Trump threatened tariffs on eight European nations and refused to rule out military force to &amp;ldquo;secure&amp;rdquo; the island, the EU &lt;a href="https://www.theguardian.com/commentisfree/2026/jan/23/europe-trump-climbdown-genuflecting-tacos-greenland"&gt;threatened to deploy its Anti-Coercion Instrument&lt;/a&gt; against the United States. For the first time, the bloc signaled willingness to engage in a trade war with its primary security guarantor to protect the sovereignty of a member state.&lt;/p&gt;
&lt;p&gt;The &lt;a href="https://defence-industry-space.ec.europa.eu/eu-defence-industry/safe-security-action-europe_en"&gt;SAFE instrument&lt;/a&gt; itself is designed for strategic autonomy. Strict &amp;ldquo;Buy European&amp;rdquo; provisions limit subcontractors from non-EU countries to 15-35% of contract value, phased out within two years. The explicit goal is ITAR-free supply chains, defense procurement that doesn&amp;rsquo;t depend on American permission. Meanwhile, the UK, which refused Trump&amp;rsquo;s Board of Peace but remains &lt;a href="https://behorizon.org/safe-mechanism-reshaping-eu-defence-integration/"&gt;excluded from SAFE&lt;/a&gt; due to post-Brexit negotiating failures, finds itself in strategic limbo. Alienated from Washington, locked out of European defense architecture, the &amp;ldquo;mid-Atlantic bridge&amp;rdquo; is collapsing.&lt;/p&gt;
&lt;p&gt;There&amp;rsquo;s a strange inversion happening in the international system. At Davos, &lt;a href="https://timesofindia.indiatimes.com/world/china/its-not-about-gaza-is-un-real-target-of-trumps-board-of-peace-china-emerges-as-unlikely-defender/articleshow/126967201.cms"&gt;China positioned itself as the defender of the UN Charter&lt;/a&gt;, rejecting Trump&amp;rsquo;s &amp;ldquo;Board of Peace&amp;rdquo; as a parallel structure that undermines international law. The authoritarian superpower defending liberal institutions while the democratic superpower seeks to dismantle them. China benefits from a multipolar system with weak enforcement mechanisms. The US benefits from a unipolar system where it makes the rules. Middle powers benefit from rules that constrain the strong, which is why the Global South &lt;a href="https://en.wikipedia.org/wiki/2026_Mark_Carney_speech_at_the_World_Economic_Forum"&gt;found validation in Carney&amp;rsquo;s speech&lt;/a&gt;. The admission that the &amp;ldquo;Rules-Based Order&amp;rdquo; was often cover for Western interests resonated with nations that experienced that hypocrisy firsthand.&lt;/p&gt;
&lt;p&gt;The term &lt;em&gt;&amp;ldquo;middle power&amp;rdquo;&lt;/em&gt; has always been slightly embarrassing, an admission of limits, a confession that you&amp;rsquo;re not at the top table. But there&amp;rsquo;s a realism emerging in these countries that the great powers lack. They can&amp;rsquo;t afford illusions about the international system because they don&amp;rsquo;t control it. They have to see clearly or get crushed.&lt;/p&gt;
&lt;p&gt;Carney&amp;rsquo;s greengrocer metaphor cuts both ways. Yes, taking down the sign exposes the illusion. But it also means operating without the protection the illusion provided. The grocer who removes the slogan faces consequences. So do countries. Canada is betting it can navigate between giants, trading with China, defending alongside Europe, maintaining what leverage it has with Washington. The EU is betting it can build autonomous defense capacity fast enough to matter. Japan, Australia, and others are making similar calculations, hedging relationships that used to be taken for granted.&lt;/p&gt;</description></item><item><title>The Most Expensive Assumption in AI</title><link>http://philippdubach.com/posts/the-most-expensive-assumption-in-ai/</link><pubDate>Mon, 26 Jan 2026 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/the-most-expensive-assumption-in-ai/</guid><description>&lt;p&gt;Sara Hooker&amp;rsquo;s paper arrived with impeccable timing. &lt;em&gt;&lt;a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5877662"&gt;On the slow death of scaling&lt;/a&gt;&lt;/em&gt; dropped just as hyperscalers are committing another $500 billion to GPU infrastructure, bringing total industry deployment into the scaling thesis somewhere north of a trillion dollars. I&amp;rsquo;ve been &lt;a href="http://philippdubach.com/posts/how-ai-is-shaping-my-investment-portfolio-for-2026/"&gt;tracking these capital flows&lt;/a&gt; for my own portfolio. Either Hooker is early to a generational insight or she&amp;rsquo;s about to be very publicly wrong.&lt;a href="#lightbox-hyperscaler_capex2-png-0" style="display: block; width: 100%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/hyperscaler_capex2.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/hyperscaler_capex2.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/hyperscaler_capex2.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/hyperscaler_capex2.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/hyperscaler_capex2.png 1200w"
sizes="100vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/hyperscaler_capex2.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/hyperscaler_capex2.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/hyperscaler_capex2.png 1440w"
sizes="100vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/hyperscaler_capex2.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/hyperscaler_capex2.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/hyperscaler_capex2.png 2000w"
sizes="100vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/hyperscaler_capex2.png"
alt="Hyperscaler AI capital expenditure 2019-2025"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
The core argument is very simple: bigger is not always better. &lt;a href="https://www.tii.ae/news/falcon-2-uaes-technology-innovation-institute-releases-new-ai-model-series-outperforming-metas"&gt;Llama-3 8B outperforms Falcon 180B&lt;/a&gt;. &lt;a href="https://arxiv.org/abs/2211.05100"&gt;Aya 23 8B beats BLOOM 176B&lt;/a&gt; despite having only 4.5% of the parameters. These are not isolated flukes. Hooker plots submissions to the Open LLM Leaderboard over two years and finds a systematic trend where compact models consistently outperform their bloated predecessors. The bitter lesson, as Rich Sutton framed it, was that brute force compute always wins. Hooker&amp;rsquo;s counter is that maybe we&amp;rsquo;ve been held hostage to &amp;ldquo;a painfully simple formula&amp;rdquo; that&amp;rsquo;s now breaking down.&lt;a href="#lightbox-model_size_vs_performance2-png-1" style="display: block; width: 100%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/model_size_vs_performance2.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/model_size_vs_performance2.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/model_size_vs_performance2.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/model_size_vs_performance2.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/model_size_vs_performance2.png 1200w"
sizes="100vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/model_size_vs_performance2.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/model_size_vs_performance2.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/model_size_vs_performance2.png 1440w"
sizes="100vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/model_size_vs_performance2.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/model_size_vs_performance2.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/model_size_vs_performance2.png 2000w"
sizes="100vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/model_size_vs_performance2.png"
alt="Model size vs benchmark performance showing smaller models outperforming larger ones"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
Scaling laws, she notes, only reliably predict pre-training test loss. When you look at actual downstream performance, the results are &amp;ldquo;murky or inconsistent.&amp;rdquo; The term &amp;ldquo;emergent properties&amp;rdquo; gets thrown around to describe capabilities that appear suddenly at scale, but Hooker points out this is really just a fancy way of admitting we have no idea what&amp;rsquo;s coming. If your scaling law can&amp;rsquo;t predict emergence, it&amp;rsquo;s not much of a law.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://en.wikipedia.org/wiki/Gary_Marcus"&gt;Gary Marcus&lt;/a&gt; has been making a related argument from a different angle. The cognitive scientist, whose 2001 book predicted hallucination problems, calls LLMs &amp;ldquo;glorified memorization machines&amp;rdquo; that work because the internet contains answers to most common queries. His framing is less academic and more market-oriented: the jump from GPT-1 to GPT-4 showed obvious qualitative leaps requiring no benchmarks. The jump from GPT-4 to GPT-5? Marginal improvements requiring careful measurement. The textbook definition of diminishing returns.&lt;/p&gt;
&lt;p&gt;The market signals are worth watching. According to &lt;a href="https://www.ft.com/content/a081aa60-eaca-4413-ba15-489762154c57"&gt;Goldman Sachs data&lt;/a&gt;, hedge fund short interest in utilities now sits at the 99th percentile relative to the past five years. Utilities. The bet appears to be that AI data center demand, the premise on which &lt;a href="https://www.reuters.com/business/energy/american-electric-power-signs-265-billion-deal-fuel-cells-2026-01-08/"&gt;American Electric Power trades at $65 billion&lt;/a&gt;, may not materialize as expected. Meanwhile, names like Bloom Energy, Oracle, and various AI-adjacent plays are showing up on heavily-shorted lists. Hedge funds aren&amp;rsquo;t yet betting against Nvidia directly, but they&amp;rsquo;re circling the weaker members of the herd.&lt;/p&gt;
&lt;p&gt;There&amp;rsquo;s a certain irony here that Hooker captures well. Academia was effectively priced out of meaningful AI research by the compute arms race. The explosion in necessary compute &amp;ldquo;marginalized academia from meaningfully participating in AI progress.&amp;rdquo; Industry labs stopped publishing to preserve commercial advantage. Now, as scaling hits diminishing returns, the skills that matter shift back toward algorithmic cleverness, data quality, and architectural innovation. Things that don&amp;rsquo;t require a billion-dollar data center. If you got priced out of the game, the game may be coming back to you. Hooker writes,&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The less reliable gains from compute makes our purview as computer scientists interesting again&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The quiet tell is how frontier labs are actually behaving. Major players are now incorporating classical symbolic tools, things like Python interpreters and code execution, into LLM pipelines. These symbolic components run on CPUs, not GPUs. &lt;a href="https://en.wikipedia.org/wiki/Ilya_Sutskever"&gt;Ilya Sutskever&lt;/a&gt;, coauthor of the 2012 ImageNet paper and OpenAI cofounder, publicly stated that&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;We need to go back to the age of research&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Shorting the scaling thesis has been a widow-maker trade for the better part of three years. Nvidia is up roughly 800% since 2022. As I&amp;rsquo;ve &lt;a href="http://philippdubach.com/posts/the-market-can-stay-irrational-longer-than-you-can-stay-solvent/"&gt;written before&lt;/a&gt;, the market can remain irrational longer than you can remain solvent, and that applies to both directions. OpenAI reportedly burns around $3 billion monthly with a $40 billion funding round implying perhaps 13 months of runway. If the next mega-round prices down or requires distressed terms, that&amp;rsquo;s your signal. Until then, the thesis may be directionally correct on the technical limitations while the timing remains treacherous.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;We can only see a short distance ahead, but we can see plenty there that needs to be done.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;As Alan Turing put it, and Hooker quotes approvingly. The scaling era produced real capabilities alongside real capital misallocation. What comes next is genuinely uncertain. That uncertainty cuts both ways.&lt;/p&gt;</description></item><item><title>Against All Odds: The Mathematics of 'Provably Fair' Casino Games</title><link>http://philippdubach.com/posts/against-all-odds-the-mathematics-of-provably-fair-casino-games/</link><pubDate>Sun, 25 Jan 2026 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/against-all-odds-the-mathematics-of-provably-fair-casino-games/</guid><description>&lt;br&gt;
&lt;blockquote&gt;
&lt;p&gt;Gambling can be harmful and lead to significant losses. Participation is subject to local laws and age restrictions. Always gamble responsibly. Need help? Visit BeGambleAware.org&lt;/p&gt;
&lt;/blockquote&gt;
&lt;br&gt;
&lt;p&gt;Crash games represent a category of online gambling where players place bets on an increasing multiplier that can &lt;em&gt;&amp;lsquo;crash&amp;rsquo;&lt;/em&gt; at any moment. The fundamental mechanic requires players to cash out before the crash occurs; successful cash-outs yield the bet amount multiplied by the current multiplier, while failure results in total loss of the wager.&lt;/p&gt;
&lt;a href="#lightbox-flight-game-gif-0" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/flight-game.gif 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/flight-game.gif 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/flight-game.gif 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/flight-game.gif 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/flight-game.gif 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/flight-game.gif 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/flight-game.gif 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/flight-game.gif 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/flight-game.gif 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/flight-game.gif 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/flight-game.gif 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/flight-game.gif"
alt="Crash game showing an airplane flying with increasing multiplier until it crashes"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;p&gt;The specific game I came across is a variant that employs an aircraft flight metaphor. Let&amp;rsquo;s call it &lt;em&gt;Plane Game&lt;/em&gt;. What intrigued me wasn&amp;rsquo;t the game itself but that it said &amp;ldquo;provably fair&amp;rdquo; on the startup screen, which I assumed to be a typo at first. I stand corrected:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;A provably fair gambling system uses cryptography to let players verify that each outcome was generated from fixed inputs, rather than chosen or altered by the operator after a bet is placed. The casino commits to a hidden &amp;ldquo;server seed&amp;rdquo; via a public hash, combines it with a player-controlled &amp;ldquo;client seed&amp;rdquo; and a per-bet nonce, and later reveals the server seed so anyone can recompute and confirm the result.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The stated Return-to-Player (RTP) of that specific game is 97%, implying a 3% &lt;a href="https://www.investopedia.com/articles/personal-finance/110415/why-does-house-always-win-look-casino-profitability.asp"&gt;house edge&lt;/a&gt;. After watching a few rounds, the perceived probability felt off. And if there&amp;rsquo;s something that gets my attention, it&amp;rsquo;s &lt;a href="http://philippdubach.com/posts/counting-cards-with-computer-vision/"&gt;the combination of games and statistics&lt;/a&gt;. So I did what any reasonable person would do: I watched another 20,000 rounds over six days (112 hours total) and wrote &lt;a href="https://static.philippdubach.com/pdf/202601_PD_DUBACH_The%20Online%20Gambling%20Fairness%20Paradox.pdf"&gt;a paper about it&lt;/a&gt;.&lt;a href="#lightbox-crash_game_stats-png-1" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/crash_game_stats.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/crash_game_stats.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/crash_game_stats.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/crash_game_stats.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/crash_game_stats.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/crash_game_stats.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/crash_game_stats.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/crash_game_stats.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/crash_game_stats.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/crash_game_stats.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/crash_game_stats.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/crash_game_stats.png"
alt="Script recording 20000 rounds over six days (112 hours total)"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;The distribution below shows the classic heavy tail: most rounds crash quickly at low multipliers, while rare events produce 100x or even 1000x payouts. The maximum I observed was 10,000x. This extreme variance creates the illusion of big wins just around the corner while the house edge operates relentlessly over time.&lt;a href="#lightbox-fig_distribution2-png-2" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/fig_distribution2.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/fig_distribution2.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/fig_distribution2.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/fig_distribution2.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/fig_distribution2.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/fig_distribution2.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/fig_distribution2.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/fig_distribution2.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/fig_distribution2.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/fig_distribution2.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/fig_distribution2.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/fig_distribution2.png"
alt="Heavy-tailed distribution of crash multipliers on log-log scale showing most rounds end at low multipliers while rare events exceed 100x or 1000x, with maximum observed at 10,000x"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
For a crash game with RTP = r (where 0 &amp;lt; r &amp;lt; 1), the crash multiplier M follows a specific probability distribution. The survival function is particularly relevant:&lt;/p&gt;
$$P(M \geq m) = \frac{r}{m}$$&lt;p&gt;This means the probability of reaching at least multiplier m before crashing equals r/m. For any cash-out target, the expected value of a unit bet works out to:&lt;/p&gt;
$$E[\text{Profit}] = P(M \geq m) \times m - 1 = \frac{r}{m} \times m - 1 = r - 1 = -0.03$$
&lt;p&gt;This mathematical property makes crash games theoretically &amp;ldquo;strategy-proof&amp;rdquo; in expectation. No cash-out timing strategy should yield better long-term results than another.&lt;a href="#lightbox-fig_survival_annotated-png-4" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/fig_survival_annotated.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/fig_survival_annotated.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/fig_survival_annotated.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/fig_survival_annotated.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/fig_survival_annotated.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/fig_survival_annotated.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/fig_survival_annotated.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/fig_survival_annotated.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/fig_survival_annotated.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/fig_survival_annotated.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/fig_survival_annotated.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/fig_survival_annotated.png"
alt="Survival probability curve on log-log scale showing probability of reaching target multiplier: 2x succeeds 48.5% of the time, 5x at 19.6%, 10x at 9.7%, 50x at 2.0%, and 100x at just 1.1%"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
The empirical data matches theory almost perfectly. A 2x target succeeds about 48.5% of the time. Aiming for 10x? That works only 9.7% of rounds. The close fit between my observations and the theoretical line confirms the stated 97% RTP.&lt;/p&gt;
&lt;p&gt;So is the game fair? My analysis says yes. Using three different statistical methods (log-log regression, maximum likelihood, and the Hill estimator), I estimated the probability density function exponent at α ≈ 1.98, within 2.2% of the theoretical value of 2.0. This contrasts with &lt;a href="https://www.nature.com/articles/s41598-019-50168-2"&gt;Wang and Pleimling&amp;rsquo;s 2019 research&lt;/a&gt; that found exponents of 1.4 to 1.9 for player cashout distributions. The key distinction: their deviations reflect player behavioral biases (probability weighting), not game manipulation. The random number generator produces fair outcomes.&lt;a href="#lightbox-fig_qq_enhanced-png-5" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/fig_qq_enhanced.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/fig_qq_enhanced.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/fig_qq_enhanced.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/fig_qq_enhanced.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/fig_qq_enhanced.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/fig_qq_enhanced.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/fig_qq_enhanced.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/fig_qq_enhanced.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/fig_qq_enhanced.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/fig_qq_enhanced.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/fig_qq_enhanced.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/fig_qq_enhanced.png"
alt="Q-Q plot comparing empirical vs theoretical quantiles with perfect fit line and 10% confidence band, showing close alignment confirming fair random number generation"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
I then ran Monte Carlo simulations of 10,000 betting sessions under four different strategies: conservative 1.5x cashouts, moderate 2.0x, aggressive 3.0x, and high-risk 5.0x targets.&lt;a href="#lightbox-fig_strategies-png-6" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/fig_strategies.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/fig_strategies.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/fig_strategies.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/fig_strategies.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/fig_strategies.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/fig_strategies.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/fig_strategies.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/fig_strategies.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/fig_strategies.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/fig_strategies.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/fig_strategies.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/fig_strategies.png"
alt="Strategy comparison boxplot showing session returns for 100 rounds: 1.5x Conservative averages -2.9%, 2.0x Moderate -2.4%, 3.0x Aggressive -3.3%, and 5.0x High Risk -3.5%, all negative"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
Every single strategy produces negative expected returns. The conservative approach has lower variance but still loses. The aggressive strategies lose faster with higher variance.&lt;a href="#lightbox-fig_trajectories-png-7" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/fig_trajectories.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/fig_trajectories.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/fig_trajectories.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/fig_trajectories.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/fig_trajectories.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/fig_trajectories.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/fig_trajectories.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/fig_trajectories.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/fig_trajectories.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/fig_trajectories.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/fig_trajectories.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/fig_trajectories.png"
alt="Simulated player sessions using 1.5x strategy over 200 rounds showing multiple trajectories trending toward expected loss line of -3% per round"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
The consumer protection angle is what concerns me most. My data revealed 179 rounds per hour with 16-second median intervals. At that pace, with a 3% house edge per round, players face expected losses exceeding 500% of amounts wagered per hour of play. The manual cashout mechanic creates an illusion of control, masking the deterministic nature of losses.&lt;/p&gt;
&lt;p&gt;The game is provably fair in the cryptographic sense. The mathematics check out. But mathematical fairness doesn&amp;rsquo;t ensure consumer safety. The house always wins, and it wins fast.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The only winning strategy is not to play&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The full paper preprint with methodology and statistical details is &lt;a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=6065213"&gt;available on SSRN&lt;/a&gt;. Code and data are on &lt;a href="https://github.com/philippdubach/stats-gambling"&gt;GitHub&lt;/a&gt;.&lt;/p&gt;</description></item><item><title>Enterprise AI Strategy is Backwards</title><link>http://philippdubach.com/posts/enterprise-ai-strategy-is-backwards/</link><pubDate>Thu, 22 Jan 2026 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/enterprise-ai-strategy-is-backwards/</guid><description>&lt;p&gt;That’s the claim made by LinkedIn co-founder &lt;a href="https://en.wikipedia.org/wiki/Reid_Hoffman"&gt;Reid Hoffman&lt;/a&gt;. It’s a bold assertion, so I set out to investigate whether the data supports it.&lt;a href="#lightbox-download_overview-png-0" style="display: block; width: 100%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/download_overview.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/download_overview.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/download_overview.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/download_overview.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/download_overview.png 1200w"
sizes="100vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/download_overview.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/download_overview.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/download_overview.png 1440w"
sizes="100vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/download_overview.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/download_overview.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/download_overview.png 2000w"
sizes="100vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/download_overview.png"
alt="Report Header Overview"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
The result is a comprehensive report, backed by more than 30 sources. You can download &lt;a href="https://static.philippdubach.com/pdf/Enterprise_AI_Strategy2026_philippdubach.pdf"&gt;the full report&lt;/a&gt;
and the &lt;a href="https://static.philippdubach.com/pdf/Enterprise_AI_Strategy2026_Deck_philippdubach.pdf"&gt;accompanying presentation&lt;/a&gt; for free.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;Global AI spending hit $13.8 billion; a six-fold increase since late 2023. Yet 85% of AI projects never reach production. Only 26% of companies can translate pilots into outcomes. The gap between ambition and execution has become so predictable that Gartner now officially places generative AI in the &amp;ldquo;&lt;a href="https://www.snaplogic.com/lp/gartner-magic-quadrant-ipaas-2025?utm_source=GOOG&amp;amp;utm_medium=PS&amp;amp;utm_campaign=Content_AR_Gartner-iPaas-MQ-2025&amp;amp;_bt=778769312143&amp;amp;_bk=gartner%20ipaas%20magic%20quadrant&amp;amp;_utm_term=gartner%20ipaas%20magic%20quadrant&amp;amp;_bm=b&amp;amp;_bn=g&amp;amp;saf_src=google_g&amp;amp;saf_pt=&amp;amp;saf_kw=gartner%20ipaas%20magic%20quadrant&amp;amp;saf_dv=&amp;amp;saf_cam=23125873381&amp;amp;saf_grp=186359808906&amp;amp;saf_ad=778769312143&amp;amp;saf_acc=4847116121&amp;amp;saf_cam_tp=search&amp;amp;gad_source=1&amp;amp;gad_campaignid=23125873381&amp;amp;gbraid=0AAAAAD3MpSl-QdXUDpLVTClnJRS_g2cQ-&amp;amp;gclid=Cj0KCQiA1czLBhDhARIsAIEc7ugOJcXK_OoRuxk2au4MhOAaluMKdTwxFcl3uPdWSMcYdLd0JAogI7QaAvbeEALw_wcB"&gt;trough of disillusionment&lt;/a&gt;.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;There&amp;rsquo;s an economic concept called &lt;a href="https://en.wikipedia.org/wiki/Jevons_paradox"&gt;Jevons paradox&lt;/a&gt; &lt;em&gt;(yes, I &lt;a href="https://notes.philippdubach.com/0005"&gt;referenced this before&lt;/a&gt;)&lt;/em&gt;. When efficiency improves for a resource, consumption increases, not decreases. Coal-efficient steam engines didn&amp;rsquo;t reduce coal usage, they made coal so useful that demand exploded. The same logic applies to organizational communication. Email was supposed to reduce meetings. Slack was supposed to reduce email. AI was supposed to reduce everything.&lt;/p&gt;
&lt;p&gt;Instead, the average employee now spends 57% of their workday on coordination: communicating, updating, aligning. Meetings alone cost the US economy $532 billion per year. This is the coordination layer, where organizations actually run, and where organizations quietly bleed.&lt;/p&gt;
&lt;p&gt;Three observations:&lt;/p&gt;
&lt;p&gt;(1) Only 26% of companies have the maturity to translate AI pilots into outcomes. The rest are layering AI on legacy workflows instead of redesigning them.&lt;br&gt;
(2) Language models bridge the gap between messy human communication and structured data. Transcripts to CRM fields. Teams using these tools report 30% higher win rates and 80% less manual work.&lt;br&gt;
(3) AI gains compound when shareable. A summary helps one person. A system that captures and distributes knowledge helps everyone downstream.&lt;/p&gt;
&lt;p&gt;The coordination layer isn&amp;rsquo;t glamorous. It&amp;rsquo;s transcripts, status updates, action items, CRM entries. It&amp;rsquo;s the administrative exhaust of getting anything done with other people. And it&amp;rsquo;s almost entirely composed of language. We have language models now. Models that extract structured data from messy transcripts, convert meeting notes into CRM fields with 99% accuracy. Sales teams using these tools report 30% higher win rates and 80% less manual work.&lt;/p&gt;
&lt;p&gt;Yet most enterprise AI strategies ignore this entirely. They&amp;rsquo;re focused on chatbots and demos for board presentations. Meanwhile, the language processing that constitutes the primary workload of any modern business remains stuck in the same recursive loops. The winners won&amp;rsquo;t be companies with great AI announcements. They&amp;rsquo;ll be the ones building daily habits early enough for the gains to stack.&lt;/p&gt;</description></item><item><title>Big in Japan</title><link>http://philippdubach.com/posts/big-in-japan/</link><pubDate>Mon, 19 Jan 2026 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/big-in-japan/</guid><description>&lt;p&gt;Japan holds roughly &lt;a href="https://pbs.twimg.com/media/G_j8tfLXEAA1djy?format=jpg&amp;amp;name=medium"&gt;$5 trillion in foreign assets&lt;/a&gt;. The US alone accounts for &lt;a href="https://pbs.twimg.com/media/G_j8tfKWwAAesgX?format=jpg&amp;amp;name=medium"&gt;¥342 trillion&lt;/a&gt; in bonds and equities.&lt;/p&gt;
&lt;p&gt;Japanese &lt;a href="https://pbs.twimg.com/media/G_j8tfQWoAADp2D?format=jpg&amp;amp;name=medium"&gt;30-year yields sat below 1%&lt;/a&gt; from 2019 through early 2024. They&amp;rsquo;re now above 3%. The &lt;a href="https://pbs.twimg.com/media/G_j8tfRXgAAgGPV?format=jpg&amp;amp;name=medium"&gt;yield spread&lt;/a&gt; between developed market bonds and JGBs has collapsed from 400 basis points to roughly 100. The yen carry trade that defined Japanese institutional behavior since the 1990s, borrow cheap at home and invest abroad for yield, suddenly has added friction.&lt;/p&gt;
&lt;p&gt;Japanese life insurers and pension funds have duration-matching obligations. If domestic yields offer adequate returns with lower currency risk, the marginal incentive to hold Treasuries weakens. GPIF, the world&amp;rsquo;s largest pension fund, doesn&amp;rsquo;t need to reach for yield in US credit markets when JGBs pay 3%.&lt;/p&gt;
&lt;p&gt;This doesn&amp;rsquo;t mean Japanese investors dump everything tomorrow. Institutional rebalancing is glacial. Currency hedging costs matter and existing positions have different maturity profiles. Treasury market depth has deteriorated since 2020. Primary dealers hold smaller inventories. Liquidity provision is thinner. A sustained seller of size, which Japanese institutions would be, arrives into a market less equipped to absorb flow than at any point since the GFC.&lt;/p&gt;
&lt;p&gt;The second-order effects compound. Japanese selling pressures Treasury yields higher. Higher yields strengthen the dollar near-term but raise US borrowing costs. If Japan&amp;rsquo;s repatriation triggers broader reserve manager concern about duration exposure, the feedback loop accelerates.&lt;/p&gt;
&lt;p&gt;The consensus view remains that Japan is trapped. Any meaningful tightening implodes JGB markets where the BOJ owns half of outstanding supply. But the data suggests something else. Yields are rising, volatility is elevated, and the market is absorbing it. The trap might be less binding than assumed.&lt;/p&gt;
&lt;p&gt;The yen carry trade unwound violently in August 2024 and the S&amp;amp;P dropped 6% in three days. That was positioning adjustment. Repatriation of actual assets would be slower but larger.&lt;/p&gt;
&lt;p&gt;When a $5 trillion portfolio starts rebalancing toward domestic assets, you don&amp;rsquo;t need to predict the timing. You need to be positioned for a situation where the marginal Treasury buyer becomes a marginal seller. What happens in Japan doesn&amp;rsquo;t stay in Japan.&lt;/p&gt;</description></item><item><title>Ozempic is Reshaping the Fast Food Industry</title><link>http://philippdubach.com/posts/ozempic-is-reshaping-the-fast-food-industry/</link><pubDate>Fri, 16 Jan 2026 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/ozempic-is-reshaping-the-fast-food-industry/</guid><description>&lt;p&gt;Something strange is happening in the food industry. &lt;a href="https://www.wsj.com/health/wellness/us-dietary-food-guidelines-trump-rfk-jr-aaf51714"&gt;New US dietary guidelines call for more protein and less sugar&lt;/a&gt;. Greggs, the UK bakery chain, just warned of &lt;a href="https://www.ft.com/content/7ab5e9b8-45fe-4ba2-97f2-41d417561ce3"&gt;&amp;ldquo;flatlining profits&amp;rdquo;&lt;/a&gt; in the food-to-go market. Food companies are racing to overhaul their brands, ditching artificial dyes and packing protein into products. Earnings calls across the sector blame &amp;ldquo;inflation&amp;rdquo; and &amp;ldquo;subdued consumer confidence.&amp;rdquo; Nobody mentions the elephant in the room: GLP-1 medications.&lt;/p&gt;
&lt;p&gt;New &lt;a href="https://doi.org/10.1177/00222437251412834"&gt;research from Cornell&lt;/a&gt; finally puts numbers to what the food industry doesn&amp;rsquo;t want to discuss. Using transaction data from 150,000 households linked to survey responses on medication adoption, Sylvia Hristakeva, Jūra Liaukonytė, and Leo Feler tracked exactly how Ozempic and Wegovy users change their spending. The results deserve attention from anyone holding food stocks.&lt;/p&gt;
&lt;p&gt;The headline: households with a GLP-1 user cut grocery spending by &lt;strong&gt;5.3%&lt;/strong&gt; within six months. For high-income households, that figure jumps to &lt;strong&gt;8.2%&lt;/strong&gt;. Fast food takes an even harder hit, with spending at limited-service restaurants falling &lt;strong&gt;8.0%&lt;/strong&gt;. These aren&amp;rsquo;t people switching brands or trading down. They&amp;rsquo;re simply eating less.&lt;/p&gt;
&lt;p&gt;The category-level data tells the real story. Savory snacks see the largest decline at &lt;strong&gt;10.1%&lt;/strong&gt;. Sweets, baked goods, cookies, all down. Even staples like meat, eggs, and bread decline. In the entire grocery basket, only one category shows a statistically significant increase: yogurt. Fresh fruit and nutrition bars trend up slightly, but yogurt is the lone winner with statistical confidence.&lt;a href="#lightbox-glp1_category_spending-png-0" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/glp1_category_spending.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/glp1_category_spending.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/glp1_category_spending.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/glp1_category_spending.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/glp1_category_spending.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/glp1_category_spending.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/glp1_category_spending.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/glp1_category_spending.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/glp1_category_spending.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/glp1_category_spending.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/glp1_category_spending.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/glp1_category_spending.png"
alt="Horizontal bar chart showing GLP-1 users&amp;#39; grocery spending changes: savory snacks -10.1%, sweet snacks -6.8%, baked goods -5.4%, with yogurt as only significant increase at &amp;#43;3.4%"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
As of July 2024, &lt;strong&gt;16.3%&lt;/strong&gt; of U.S. households have at least one GLP-1 user. The adoption curve is steepening. Nearly half of adopters report taking the medication specifically for weight loss rather than diabetes management. These weight-loss users tend to be younger, higher income, and more willing to pay out of pocket. They&amp;rsquo;re also the most profitable customers for fast food chains, the ones who don&amp;rsquo;t flinch at price increases.&lt;/p&gt;
&lt;p&gt;This creates what the researchers call a &amp;ldquo;double whammy&amp;rdquo; for the food industry. Companies are losing their highest-margin customers to a biological shift in appetite while being left with a more price-sensitive demographic that actually &lt;em&gt;does&lt;/em&gt; respond to inflation. When McDonald&amp;rsquo;s CEO Chris Kempczinski talks about &lt;a href="https://www.youtube.com/watch?v=srH8f_Fa82A"&gt;losing lower-income customers to home cooking&lt;/a&gt;, he&amp;rsquo;s describing the wrong problem.&lt;a href="#lightbox-glp1_adoption_timeline-png-1" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/glp1_adoption_timeline.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/glp1_adoption_timeline.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/glp1_adoption_timeline.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/glp1_adoption_timeline.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/glp1_adoption_timeline.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/glp1_adoption_timeline.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/glp1_adoption_timeline.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/glp1_adoption_timeline.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/glp1_adoption_timeline.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/glp1_adoption_timeline.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/glp1_adoption_timeline.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/glp1_adoption_timeline.png"
alt="Line chart showing GLP-1 adoption from Jan 2023 to Jul 2024: weight loss users surpassed diabetes control users by July 2023, reaching over 1,200 users by end of period"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
The research also suggests why food executives might be keeping quiet. About &lt;strong&gt;34%&lt;/strong&gt; of GLP-1 users discontinue within the sample period. When they stop, their spending doesn&amp;rsquo;t just return to baseline. It becomes &lt;em&gt;less healthy&lt;/em&gt;. Candy and chocolate purchases rise &lt;strong&gt;11.4%&lt;/strong&gt; above pre-adoption levels after stopping the medication.&lt;/p&gt;
&lt;p&gt;If you&amp;rsquo;re running a snack company, the math might look survivable: lose customers to Ozempic for a year, then welcome them back once they quit. The drugs suppress appetite biologically; they don&amp;rsquo;t teach new habits. When the biology reverts, so does the behavior.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://youtu.be/JTG5uMWDKXk"&gt;Scott Galloway&lt;/a&gt; has called the food industry an &amp;ldquo;obesity index&amp;rdquo; and predicted a &amp;ldquo;tsunami of shareholder destruction.&amp;rdquo; The Cornell data suggests he&amp;rsquo;s directionally right but possibly too aggressive on timing. The industry has a built-in buffer: medication discontinuation. The question is whether that buffer lasts as drugs get cheaper, side effects improve, and insurance coverage expands.&lt;/p&gt;
&lt;p&gt;The deeper issue is about the persistence of dietary change. &lt;a href="https://jn.nutrition.org/article/S0022-3166(25)00647-9/fulltext"&gt;Previous studies found&lt;/a&gt; that even major life events, a diabetes diagnosis, job loss, childbirth, produce only modest and short-lived changes in diet. Information campaigns and price nudges have mixed results at best. GLP-1 medications work differently because they alter the biological reward system directly. Users describe the experience as &amp;ldquo;silencing food noise,&amp;rdquo; a constant background hum of cravings that simply disappears.&lt;/p&gt;
&lt;p&gt;But this biological dependence cuts both ways. The changes don&amp;rsquo;t stick without the drug. Stopping medication means losing both the appetite suppression and whatever habits might have formed during treatment. The Cornell team notes that &amp;ldquo;GLP-1s could complement existing nutritional interventions&amp;rdquo; but cautions that &amp;ldquo;their broader public health relevance ultimately depends on sustained adherence.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;For investors, the practical question is positioning. Companies selling hyperpalatable, calorie-dense products face structural headwinds. Companies selling protein-rich, nutrient-dense foods in smaller portions have tailwinds. The data shows users shifting toward yogurt, fresh fruit, and nutrition bars. Package sizes may need to shrink. Marketing strategies may need to pivot from &amp;ldquo;craveable&amp;rdquo; to &amp;ldquo;satisfying.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;The next few quarters of earnings calls will be interesting. At some point, an analyst will ask the GLP-1 question directly. The honest answer from management would be: we don&amp;rsquo;t know the full impact yet, but 16% of households having a user, 8% declines in fast food spending, and the fastest-growing prescription category in the country is not something we can ignore.&lt;/p&gt;
&lt;aside class="inline-newsletter" aria-label="Newsletter signup"&gt;
&lt;div class="inline-newsletter-content"&gt;
&lt;p class="inline-newsletter-headline"&gt;Enjoy this writing? Get new posts, projects, and articles delivered monthly.&lt;/p&gt;
&lt;form id="inline-newsletter-3-form" class="inline-newsletter-form"&gt;
&lt;label for="inline-newsletter-3-email" class="visually-hidden"&gt;Email address&lt;/label&gt;
&lt;input
type="email"
id="inline-newsletter-3-email"
name="email"
placeholder="your@email.com"
required
class="inline-newsletter-input"
aria-label="Email address"
/&gt;
&lt;button type="submit" class="inline-newsletter-button"&gt;Sign Up&lt;/button&gt;
&lt;/form&gt;
&lt;p id="inline-newsletter-3-privacy" class="inline-newsletter-privacy"&gt;&lt;a href="http://philippdubach.com/posts/building-a-no-tracking-newsletter-from-markdown-to-distribution/"&gt;No tracking&lt;/a&gt;. Unsubscribe anytime.&lt;/p&gt;
&lt;div id="inline-newsletter-3-message" class="inline-newsletter-message" style="display: none;"&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;/aside&gt;
&lt;script&gt;
(function() {
var formId = 'inline-newsletter-3-form';
var messageId = 'inline-newsletter-3-message';
var emailId = 'inline-newsletter-3-email';
var privacyId = 'inline-newsletter-3-privacy';
function init() {
var form = document.getElementById(formId);
var messageDiv = document.getElementById(messageId);
var emailInput = document.getElementById(emailId);
var privacyDiv = document.getElementById(privacyId);
if (privacyDiv &amp;&amp; !privacyDiv.dataset.countLoaded) {
privacyDiv.dataset.countLoaded = 'true';
fetch('https://newsletter-api.philippd.workers.dev/api/subscriber-count')
.then(function(r) { return r.json(); })
.then(function(data) {
if (data.display) {
var countText = document.createTextNode('Join ' + data.display + ' readers. ');
privacyDiv.insertBefore(countText, privacyDiv.firstChild);
}
})
.catch(function() { });
}
if (!form) return;
form.addEventListener('submit', function(e) {
e.preventDefault();
var email = emailInput.value.trim();
if (!email) return;
var emailRegex = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
if (!emailRegex.test(email)) {
showMessage('Please enter a valid email address.', 'error');
return;
}
var submitButton = form.querySelector('button[type="submit"]');
submitButton.disabled = true;
submitButton.textContent = 'Subscribing...';
fetch('https://newsletter-api.philippd.workers.dev/api/subscribe', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ email: email })
})
.then(function(response) { return response.json(); })
.then(function(data) {
if (data.success) {
form.style.display = 'none';
document.querySelector('#' + formId).closest('.inline-newsletter').querySelector('.inline-newsletter-privacy').style.display = 'none';
showMessage('Thanks for subscribing! You\'ll receive the next newsletter in your inbox.', 'success');
} else {
showMessage(data.error || 'Something went wrong. Please try again.', 'error');
submitButton.disabled = false;
submitButton.textContent = 'Sign Up';
}
})
.catch(function() {
showMessage('Something went wrong. Please try again later.', 'error');
submitButton.disabled = false;
submitButton.textContent = 'Sign Up';
});
});
function showMessage(text, type) {
messageDiv.textContent = text;
messageDiv.className = 'inline-newsletter-message inline-newsletter-message-' + type;
messageDiv.style.display = 'block';
}
}
if (document.readyState === 'loading') {
document.addEventListener('DOMContentLoaded', init);
} else {
init();
}
})();
&lt;/script&gt;</description></item><item><title>Does AI mean the demand on labor goes up?</title><link>http://philippdubach.com/posts/does-ai-mean-the-demand-on-labor-goes-up/</link><pubDate>Thu, 15 Jan 2026 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/does-ai-mean-the-demand-on-labor-goes-up/</guid><description>&lt;p&gt;&lt;a href="https://x.com/TheStalwart/status/2011418760813629738"&gt;Joe Weisenthal&lt;/a&gt; from Bloomberg, this week:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;All my shower thoughts now are about designing efficient workflows for synthesizing, collecting, labeling and annotating data.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Same. Since I started building every app and tool I thought would make my life easier, my workflow more efficient, I haven&amp;rsquo;t stopped. Apparently &lt;a href="https://techcrunch.com/2026/01/16/the-rise-of-micro-apps-non-developers-are-writing-apps-instead-of-buying-them/"&gt;non-developers are now writing apps&lt;/a&gt; instead of buying them. This is the AI productivity paradox in miniature: the tools get better and we do more, not less.&lt;/p&gt;
&lt;p&gt;The assumed narrative is still AI displaces jobs, humans collect UBI, society figures out leisure. But the trajectory might be more work, not less. A &lt;a href="https://cepr.org/voxeu/columns/ais-power-grows-so-does-our-workday"&gt;recent NBER study&lt;/a&gt; found that workers in AI-exposed occupations now work roughly 3 extra hours per week—and leisure time has dropped by the same amount. &lt;a href="https://investors.upwork.com/news-releases/news-release-details/upwork-study-finds-employee-workloads-rising-despite-increased-c"&gt;Upwork&amp;rsquo;s research&lt;/a&gt; puts it bluntly: 77% of employees say AI tools have &lt;em&gt;added&lt;/em&gt; to their workload.&lt;/p&gt;
&lt;p&gt;The &lt;a href="https://en.wikipedia.org/wiki/Jevons_paradox"&gt;Jevons paradox&lt;/a&gt; is 160 years old: when James Watt made steam engines more efficient, coal consumption didn&amp;rsquo;t fall. It exploded. Efficiency made coal useful in new ways. Satya Nadella &lt;a href="https://www.npr.org/sections/planet-money/2025/02/04/g-s1-46018/ai-deepseek-economics-jevons-paradox"&gt;referenced this for AI&lt;/a&gt; after DeepSeek rattled the markets. Erik Brynjolfsson argues it applies to AI-augmented occupations—coders, radiologists, translators. Make something more efficient and you find more things to do with it.&lt;/p&gt;
&lt;p&gt;When I can build an app in a weekend that used to take months, I don&amp;rsquo;t build one. I build six. When I can write a report in an hour, I write five. The friction that once protected us from infinite expectations evaporates. This is the Jevons paradox applied not just to markets or coal, but to our own time and cognitive capacity—a kind of psychological rebound effect where internal expectations outrun what&amp;rsquo;s actually sustainable.&lt;/p&gt;
&lt;p&gt;Keynes predicted a &lt;a href="http://www.econ.yale.edu/smith/econ116a/keynes1.pdf"&gt;15-hour work week&lt;/a&gt; by now. We got the productivity gains. We work longer hours than ever. Only &lt;a href="https://hellofuture.orange.com/en/the-ai-productivity-paradox-the-new-tech-may-be-eating-into-your-leisure-time/"&gt;21% of employees&lt;/a&gt; actually use the time AI saves them for personal life. The rest reinvest it right back into work. When capability expands, so does the definition of &amp;ldquo;enough.&amp;rdquo; The bar rises.&lt;/p&gt;
&lt;p&gt;If AI makes me 10x more productive, that&amp;rsquo;s not 10x more free time. That&amp;rsquo;s 10x more I &lt;em&gt;could&lt;/em&gt; be doing. In a competitive environment—founding, climbing, anything with stakes—someone who uses that 10x while I rest will outrun me. The fear was displacement. The reality might be inescapability.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://en.wikipedia.org/wiki/Parkinson%27s_law#First_meaning"&gt;Parkinson&amp;rsquo;s Law&lt;/a&gt;: work expands to fill time available. The AI corollary: work expands to fill capabilities available. More capability means more possibility—and more obligation. We should know where this points.&lt;/p&gt;</description></item><item><title>Repo might be even bigger than we thought</title><link>http://philippdubach.com/posts/repo-might-be-even-bigger-than-we-thought/</link><pubDate>Tue, 13 Jan 2026 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/repo-might-be-even-bigger-than-we-thought/</guid><description>&lt;blockquote&gt;
&lt;p&gt;Finance is anthropological&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;That&amp;rsquo;s &lt;a href="https://en.wikipedia.org/wiki/Zoltan_Pozsar"&gt;Zoltan Pozsar&lt;/a&gt;, the Hungarian-American economist who mapped the plumbing of modern money before most people knew there was plumbing to map. When he said it to Bloomberg in 2019, he was trying to explain why repo markets &lt;em&gt;(&lt;a href="https://en.wikipedia.org/wiki/Repurchase_agreement"&gt;the overnight lending infrastructure that lubricates trillions in daily transactions&lt;/a&gt;)&lt;/em&gt; had just seized up in ways the Federal Reserve didn&amp;rsquo;t anticipate.&lt;/p&gt;
&lt;p&gt;I&amp;rsquo;ve &lt;a href="https://philippdubach.com/posts/pozsars-bretton-woods-iii-the-framework-1/2/"&gt;written about Pozsar&amp;rsquo;s work before&lt;/a&gt;, particularly his &amp;ldquo;Bretton Woods III&amp;rdquo; thesis about the shifting role of the dollar. But his earlier research on shadow banking and repo markets feels increasingly relevant as we enter 2026. In December 2025, the Office of Financial Research &lt;a href="https://www.financialresearch.gov/the-ofr-blog/2025/12/04/sizing-us-repo-market/"&gt;published new data&lt;/a&gt; on the size of the U.S. repo market. The number: $12.6 trillion in average daily exposures. That&amp;rsquo;s roughly $700 billion larger than previous estimates; a measurement error roughly the size of the entire Swiss banking system.&lt;/p&gt;
&lt;p&gt;Where did the extra $700 billion come from? Mostly from what the OFR calls &amp;ldquo;non-centrally cleared bilateral repo,&amp;rdquo; or NCCBR; the segment of the market that doesn&amp;rsquo;t flow through clearinghouses or the tri-party platforms that regulators can easily observe. This bilateral segment alone accounts for $5 trillion. Until the OFR&amp;rsquo;s new transaction-level data collection, which only reached full implementation in July 2025, much of this activity was essentially invisible.&lt;/p&gt;
&lt;p&gt;This matters because repo is not a peripheral market. It is the market through which cash-rich institutions lend to cash-poor ones, every single day, against collateral. Money market funds, hedge funds, broker-dealers, asset managers, banks. When repo works, it&amp;rsquo;s invisible. When it doesn&amp;rsquo;t, as in September 2019, overnight rates spike and the Fed scrambles to inject liquidity.&lt;/p&gt;
&lt;p&gt;On December 31, 2025, eligible financial firms borrowed a record $74.6 billion from the Fed&amp;rsquo;s Standing Repo Facility, which is the highest since its launch in 2021. The Fed had just &lt;a href="https://tellerwindow.newyorkfed.org/2025/12/23/standing-repo-operations-in-the-federal-reserves-monetary-policy-implementation-framework/"&gt;eliminated the $500 billion daily cap&lt;/a&gt; on this facility, a quiet acknowledgment that the ceiling might actually matter. Quantitative tightening officially ended on December 1, 2025. Reserves had fallen to $2.8 trillion, their lowest in four years.&lt;/p&gt;
&lt;p&gt;The plumbing was straining again.&lt;/p&gt;
&lt;p&gt;Pozsar&amp;rsquo;s 2014 OFR paper, &amp;ldquo;Shadow Banking: The Money View,&amp;rdquo; introduced a framework that still haunts anyone who reads it carefully. At its core is a hierarchy of money. Currency sits at the top, the liability of the sovereign. Below that: bank deposits, insured and backstopped by the FDIC. Below that: repo, secured by collateral but not by any explicit government guarantee. Below that: the constant-NAV shares of money market funds, which promise par redemption but rest on layers of private credit puts, reputational commitments, and the fragile assumption that nothing will go wrong simultaneously.&lt;/p&gt;
&lt;p&gt;The key insight is that what counts as &amp;ldquo;money&amp;rdquo; depends on where you sit in this hierarchy. For a retail depositor, money is an insured bank balance. For a corporate treasurer managing $50 billion in cash, money begins where M2 ends—in repo, in money fund shares, in instruments that offer some semblance of safety at scale but lack the explicit backstops that smaller depositors take for granted.&lt;/p&gt;
&lt;p&gt;Pozsar called these institutions &amp;ldquo;cash pools&amp;rdquo;—the corporate treasuries, sovereign wealth funds, and asset managers whose cash balances are too large to fit within the insured deposit system. They need money-like instruments, but the supply of truly safe assets (Treasury bills, insured deposits) is inelastic. So they reach for the next best thing: shadow money claims backed by private collateral and private liquidity puts.&lt;/p&gt;
&lt;p&gt;Now, the new OFR data reveals that $5 trillion of daily repo activity, roughly 40% of the market, occurs in bilateral arrangements that, until recently, were largely opaque to regulators. The collateral backing this activity is 61.8% Treasuries, but that leaves substantial room for corporate bonds, agency MBS, and other assets that can gap in value during stress.&lt;/p&gt;
&lt;p&gt;Pozsar&amp;rsquo;s 2019 Global Money Notes described the repo market as a hierarchy with dealers at the center and the Fed at the top, operating as a &amp;ldquo;dealer of last resort&amp;rdquo; when private balance sheets reach their limits. The Standing Repo Facility was supposed to institutionalize this role, providing a ceiling on overnight rates by offering funding at a known price.&lt;/p&gt;
&lt;p&gt;The facility sat unused for years while reserves were abundant. Now, as reserves decline, usage is spiking at quarter-ends and year-ends, exactly when balance sheet constraints bind hardest. The question Pozsar raised in 2019 remains unanswered: can the Fed operate a standing repo facility that polices the top of its target range without losing control over its balance sheet size? Or will it be forced, eventually, to monetize excess collateral on a scale that looks a lot like QE by another name?&lt;/p&gt;
&lt;p&gt;There&amp;rsquo;s a concept in infrastructure studies called &amp;ldquo;seamful design&amp;rdquo;: the idea that making the seams of a system visible can improve rather than degrade the user experience. GPS, for instance, became more useful when designers surfaced uncertainty estimates rather than hiding them.&lt;/p&gt;
&lt;p&gt;The repo market is the opposite: seamless by design, invisible until it fails. The OFR&amp;rsquo;s new data collection is, in some sense, an attempt to add seams, to make visible what was hidden, to understand the shape of the beast before the next crisis. But measurement is not control. Knowing the market is $12.6 trillion doesn&amp;rsquo;t tell you what happens when a major counterparty fails, or when a category of collateral suddenly trades at distressed prices, or when the behavioral assumptions embedded in banks&amp;rsquo; liquidity models turn out to be wrong.&lt;/p&gt;
&lt;p&gt;Pozsar understood this intuitively. His famous &lt;a href="https://www.newyorkfed.org/medialibrary/media/research/economists/adrian/1306adri_map.pdf"&gt;map of the shadow banking system&lt;/a&gt;, posted in the New York Fed&amp;rsquo;s briefing room, required zooming in seven or eight times to read any detail. Colleagues who didn&amp;rsquo;t take the time to study it, he warned, were looking at &amp;ldquo;10% of the picture.&amp;rdquo;&lt;/p&gt;</description></item><item><title>The Market Can Stay Irrational Longer Than You Can Stay Solvent</title><link>http://philippdubach.com/posts/the-market-can-stay-irrational-longer-than-you-can-stay-solvent/</link><pubDate>Sun, 11 Jan 2026 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/the-market-can-stay-irrational-longer-than-you-can-stay-solvent/</guid><description>&lt;p&gt;A friend recently recommended &lt;a href="https://en.wikipedia.org/wiki/Steve_Eisman"&gt;Steve Eisman&lt;/a&gt;&amp;rsquo;s podcast to me. Eisman, you might recall, is the hedge fund manager portrayed in The Big Short who famously bet against subprime mortgages before the 2008 crisis. In his &lt;a href="https://www.youtube.com/@RealEismanPlaybook"&gt;most recent episode&lt;/a&gt;, Eisman laid out a thesis for something that made me uncomfortable ever since the &lt;a href="https://en.wikipedia.org/wiki/2020_stock_market_crash"&gt;Covid-19 stock market crash&lt;/a&gt; recovery: the U.S. equity market has structurally decoupled from everyday economic reality.&lt;/p&gt;
&lt;p&gt;I&amp;rsquo;ve written &lt;a href="https://philippdubach.com/posts/how-ai-is-shaping-my-investment-portfolio-for-2026/"&gt;about market concentration&lt;/a&gt; in my 2026 portfolio allocation. But Eisman&amp;rsquo;s point isn&amp;rsquo;t just about concentration. It&amp;rsquo;s about what this concentration means for everyone else. Consider what happens to consumer-exposed sectors. Combined, healthcare, consumer discretionary, and consumer staples have fallen from 38% of the index in 2015 to just 25% today. This matters because roughly &lt;a href="https://fred.stlouisfed.org/series/DPCERE1Q156NBEA"&gt;70% of U.S. GDP is consumer-driven&lt;/a&gt;. The traditional logic was simple: consumer spending drives the economy, consumer stocks reflect that spending, and therefore the stock market reflects economic health. That relationship has broken down.&lt;/p&gt;
&lt;p&gt;The disconnect shows up in daily American life. Healthcare costs continue rising, housing remains unaffordable for many, and grocery prices have yet to normalize. These are real pressures on real households. Yet the S&amp;amp;P 500 gained 16% in 2025, with the Nasdaq up 21%. The market doesn&amp;rsquo;t care about rent or insurance premiums because the companies reflecting those costs barely register in the index anymore. As Eisman puts it:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The market has become unmoored from everyday life.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This creates a structural problem for active managers that compounds over time. When &lt;a href="https://www.slickcharts.com/sp500"&gt;NVIDIA alone represents 7.7% of the S&amp;amp;P 500&lt;/a&gt;, Apple 6.8%, and Microsoft 6.1%, most institutional mandates physically prevent managers from holding proportional positions. Risk limits cap initial positions at perhaps 5% of assets under management. Sector allocation rules require diversification across all eleven sectors. The result is systematic underweighting of the fastest-growing names. Meanwhile, the bottom five sectors combined represent just 14% of the index. Real estate, with 31 constituents, accounts for barely 2%. Why dedicate research resources to an entire sector that can only marginally move your portfolio?&lt;/p&gt;
&lt;p&gt;The rise of passive investing amplifies all of this. Index funds now control roughly 60% of flows versus 40% for active managers. When money enters an index fund, it buys stocks in proportion to their existing market cap. Large positions grow larger. There&amp;rsquo;s no portfolio manager deciding NVIDIA looks expensive. The buying is mechanical, price-insensitive, and self-reinforcing. This doesn&amp;rsquo;t eliminate price discovery entirely.&lt;/p&gt;
&lt;p&gt;Eisman points to Oracle&amp;rsquo;s Q3 2025 experience: shares surged after reporting a massive backlog, then corrected below pre-earnings levels once investors realized the backlog concentrated in a single customer with questionable financing. Active managers still matter. They just matter less.&lt;/p&gt;
&lt;p&gt;In a normal correction, sellers meet buyers who evaluate whether prices have become attractive. In a passive-dominated market, redemptions trigger mechanical selling. Index funds don&amp;rsquo;t decide that a 20% drawdown makes stocks compelling. They sell what they own in proportion to what they own. If active managers control only 40% of flows, the stabilizing bid may prove insufficient. The &lt;a href="https://www.reuters.com/markets/"&gt;February-April 2025 correction&lt;/a&gt; saw the S&amp;amp;P fall 19% peak-to-trough. Eisman&amp;rsquo;s assessment: if an actual recession materializes, or if AI spending disappoints expectations,&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;the decline will almost certainly be steeper. It will be fast and very ugly.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;There&amp;rsquo;s also a tax dimension creating behavioral lock-in. Years of technology outperformance have embedded massive unrealized capital gains in both retail and institutional portfolios. Selling NVIDIA means realizing those gains and paying taxes on them. Investors avoid this until forced by margin calls, redemptions, or actual fundamental collapse. This creates asymmetric liquidity: plenty of buyers on the way up, scarce ones on the way down.&lt;/p&gt;
&lt;p&gt;What does this mean for portfolio construction? First, understand that traditional cap-weighted benchmarks now represent a concentrated bet on technology and AI capital expenditure. Second, active management faces structural headwinds that have nothing to do with manager skill. Third, liquidity assumptions that held in previous corrections may not hold in the next one. And fourth, consumer welfare can deteriorate materially without meaningfully impacting index returns. The K-shaped economy produces a K-shaped market, where the experience of median households and the experience of median stock index performance have genuinely diverged.&lt;/p&gt;
&lt;aside class="disclaimer" role="note" aria-label="Disclaimer"&gt;
&lt;div class="disclaimer-content"&gt;&lt;p&gt;&lt;strong&gt;Disclaimer:&lt;/strong&gt; All opinions expressed are my own. This is not investment, financial, tax, or legal advice. Past performance does not indicate future results. Do your own research and consult qualified professionals before making financial decisions. No liability accepted for any losses.&lt;/p&gt;&lt;/div&gt;
&lt;/aside&gt;</description></item><item><title>Social Media Success Prediction: BERT Models for Post Titles</title><link>http://philippdubach.com/posts/social-media-success-prediction-bert-models-for-post-titles/</link><pubDate>Sat, 10 Jan 2026 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/social-media-success-prediction-bert-models-for-post-titles/</guid><description>&lt;p&gt;Last week I published a &lt;a href="https://philippdubach.com/standalone/hn-sentiment/"&gt;Hacker News title sentiment analysis&lt;/a&gt; based on the &lt;a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5910263"&gt;Attention Dynamics in Online Communities&lt;/a&gt; paper I have been working on. The &lt;a href="https://news.ycombinator.com/item?id=46512881"&gt;discussion on Hacker News&lt;/a&gt; raised the obvious question: can you actually predict what will do well here?&lt;a href="#lightbox-https%3a--static-philippdubach-com-hn_post_frontpage2-png-0" style="display: block; width: 70%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/https://static.philippdubach.com/hn_post_frontpage2.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/https://static.philippdubach.com/hn_post_frontpage2.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/https://static.philippdubach.com/hn_post_frontpage2.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/https://static.philippdubach.com/hn_post_frontpage2.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/https://static.philippdubach.com/hn_post_frontpage2.png 1200w"
sizes="70vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/https://static.philippdubach.com/hn_post_frontpage2.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/https://static.philippdubach.com/hn_post_frontpage2.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/https://static.philippdubach.com/hn_post_frontpage2.png 1440w"
sizes="70vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/https://static.philippdubach.com/hn_post_frontpage2.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/https://static.philippdubach.com/hn_post_frontpage2.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/https://static.philippdubach.com/hn_post_frontpage2.png 2000w"
sizes="70vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/https://static.philippdubach.com/hn_post_frontpage2.png"
alt="Hacker News Frontpage"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
The honest answer is: partially. Timing matters. News cycles matter. Who submits matters. Weekend versus Monday morning matters. Most of these factors aren&amp;rsquo;t in the title. But titles aren&amp;rsquo;t nothing either. &amp;ldquo;Show HN&amp;rdquo; signals something. So does phrasing, length, and topic selection. The question becomes: how much signal can you extract from 80 characters?&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;a href="https://news.ycombinator.com/news"&gt;Hacker News&lt;/a&gt; (HN) is a social news website focusing on computer science and entrepreneurship. It is run by the investment fund and startup incubator &lt;a href="https://www.ycombinator.com"&gt;Y Combinator&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This isn&amp;rsquo;t new territory. &lt;a href="https://minimaxir.com/2017/06/reddit-deep-learning/"&gt;Max Woolf built a Reddit submission predictor&lt;/a&gt; back in 2017, and &lt;a href="https://ontology2.com/essays/ClassifyingHackerNewsArticles/"&gt;ontology2 trained an HN classifier&lt;/a&gt; using logistic regression on title words. Both found similar ceilings; around 0.76 AUC with classical approaches. I wanted to see what modern transformers could add.&lt;/p&gt;
&lt;p&gt;The baseline was DistilBERT, fine-tuned on 90,000 HN posts. ROC AUC of 0.654, trained in about 20 minutes on a T4 GPU. Not bad for something that only sees titles. Then RoBERTa with label smoothing pushed it to 0.692. Progress felt easy.&lt;a href="#lightbox-03_roc_curve-png-2" style="display: block; width: 70%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/03_roc_curve.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/03_roc_curve.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/03_roc_curve.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/03_roc_curve.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/03_roc_curve.png 1200w"
sizes="70vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/03_roc_curve.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/03_roc_curve.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/03_roc_curve.png 1440w"
sizes="70vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/03_roc_curve.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/03_roc_curve.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/03_roc_curve.png 2000w"
sizes="70vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/03_roc_curve.png"
alt="ROC curve comparing model versions"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
What if sentence embeddings captured something classification heads missed? I built an ensemble: &lt;a href="https://www.sbert.net/"&gt;SBERT&lt;/a&gt; for semantic features, RoBERTa for discrimination, weighted average at the end. The validation AUC jumped to 0.714.&lt;/p&gt;
&lt;p&gt;The problem was hiding in the train/test split. I&amp;rsquo;d used random sampling. HN has strong temporal correlations: topics cluster, writing styles evolve, news cycles create duplicates. A random split let the model see the future. SBERT&amp;rsquo;s semantic embeddings matched near-duplicate posts across the split perfectly.&lt;/p&gt;
&lt;p&gt;When I switched to a strict temporal split, training on 2022-early 2024 and testing on late 2024 onward, the ensemble dropped to 0.693. More revealing: the optimal SBERT weight went from 0.35 to 0.10. SBERT was contributing almost nothing. The model had memorized temporal patterns, not learned to predict.&lt;a href="#lightbox-02_calibration-png-3" style="display: block; width: 70%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/02_calibration.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/02_calibration.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/02_calibration.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/02_calibration.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/02_calibration.png 1200w"
sizes="70vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/02_calibration.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/02_calibration.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/02_calibration.png 1440w"
sizes="70vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/02_calibration.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/02_calibration.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/02_calibration.png 2000w"
sizes="70vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/02_calibration.png"
alt="Calibration plot showing predicted vs actual probabilities"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
I kept RoBERTa, added more regularization, dropped from 0.1 to 0.2 dropout, weight decay from 0.01 to 0.05, froze the lower six transformer layers. The model got worse at fitting training data. Train AUC dropped from 0.803 to 0.727.&lt;/p&gt;
&lt;p&gt;But the train-test gap collapsed from 0.109 to 0.042. That&amp;rsquo;s a 61% reduction in overfitting. Test AUC of 0.685 versus the ensemble&amp;rsquo;s 0.693, a difference that vanishes once you account for confidence intervals. And now inference runs on a single model, half the latency, no SBERT dependency, 500MB instead of 900MB.&lt;a href="#lightbox-table_version_comparison-png-4" style="display: block; width: 70%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/table_version_comparison.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/table_version_comparison.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/table_version_comparison.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/table_version_comparison.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/table_version_comparison.png 1200w"
sizes="70vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/table_version_comparison.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/table_version_comparison.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/table_version_comparison.png 1440w"
sizes="70vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/table_version_comparison.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/table_version_comparison.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/table_version_comparison.png 2000w"
sizes="70vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/table_version_comparison.png"
alt="Model version comparison showing evolution from V1 to V7"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;a href="#lightbox-06_score_by_category-png-5" style="display: block; width: 70%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/06_score_by_category.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/06_score_by_category.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/06_score_by_category.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/06_score_by_category.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/06_score_by_category.png 1200w"
sizes="70vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/06_score_by_category.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/06_score_by_category.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/06_score_by_category.png 1440w"
sizes="70vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/06_score_by_category.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/06_score_by_category.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/06_score_by_category.png 2000w"
sizes="70vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/06_score_by_category.png"
alt="Prediction scores by content category"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
The other lesson was calibration. A model that says 0.8 probability should mean &amp;ldquo;70% of posts I give this score actually hit 100 points.&amp;rdquo; Neural networks trained on cross-entropy don&amp;rsquo;t do this naturally. They&amp;rsquo;re overconfident. I used &lt;a href="https://scikit-learn.org/stable/modules/isotonic.html"&gt;isotonic regression&lt;/a&gt; on the validation set to fix the mapping. Expected calibration error (ECE) measures this gap:&lt;/p&gt;
$$ECE = \sum_{b=1}^{B} \frac{n_b}{N} \left| \text{acc}(b) - \text{conf}(b) \right|$$&lt;p&gt;where you bin predictions by confidence, then measure how far off the actual accuracy is from the predicted confidence in each bin. ECE went from 0.089 to 0.043. Now when the model says 0.4, it&amp;rsquo;s telling the truth.&lt;/p&gt;
&lt;p&gt;In practice, the model provides meaningful lift. If you only look at the top 10% of predictions by score, 62% of them are actual hits, roughly 1.9x better than random selection:&lt;a href="#lightbox-table_lift_analysis-png-6" style="display: block; width: 50%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/table_lift_analysis.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/table_lift_analysis.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/table_lift_analysis.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/table_lift_analysis.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/table_lift_analysis.png 1200w"
sizes="50vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/table_lift_analysis.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/table_lift_analysis.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/table_lift_analysis.png 1440w"
sizes="50vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/table_lift_analysis.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/table_lift_analysis.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/table_lift_analysis.png 2000w"
sizes="50vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/table_lift_analysis.png"
alt="Lift analysis showing precision at different thresholds"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;a href="#lightbox-08_calibration_error-png-7" style="display: block; width: 70%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/08_calibration_error.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/08_calibration_error.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/08_calibration_error.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/08_calibration_error.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/08_calibration_error.png 1200w"
sizes="70vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/08_calibration_error.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/08_calibration_error.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/08_calibration_error.png 1440w"
sizes="70vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/08_calibration_error.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/08_calibration_error.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/08_calibration_error.png 2000w"
sizes="70vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/08_calibration_error.png"
alt="Calibration error distribution"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
About training speed: I used the &lt;a href="https://www.nvidia.com/en-us/data-center/h100/"&gt;NVIDIA H100 GPU&lt;/a&gt;, which runs around 18x more expensive than the T4 per hour on hosted (Google Colab) runtimes. A sensible middle ground would be an A100 (40 or 80GB VRAM) or L4, training 3-5x faster than T4, maybe 5-7 minutes instead of 20-30. But watching epochs fly by at ~130 iterations per second after coming from T4&amp;rsquo;s ~3 iterations per second was a different experience. &lt;a href="#lightbox-colab-training-hn-png-8" style="display: block; width: 70%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/colab-training-hn.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/colab-training-hn.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/colab-training-hn.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/colab-training-hn.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/colab-training-hn.png 1200w"
sizes="70vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/colab-training-hn.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/colab-training-hn.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/colab-training-hn.png 1440w"
sizes="70vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/colab-training-hn.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/colab-training-hn.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/colab-training-hn.png 2000w"
sizes="70vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/colab-training-hn.png"
alt="Colab notebook showing H100 training at 130 it/s"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
The model learned some intuitive patterns. &amp;ldquo;Show HN&amp;rdquo; titles score higher. Deep technical dives do well. Generic news aggregation doesn&amp;rsquo;t. Titles between 40-80 characters perform better than very short or very long ones. Some of this probably reflects real engagement patterns. Some of it is noise the model hasn&amp;rsquo;t been sufficiently regularized to ignore.&lt;a href="#lightbox-10_title_length_performance-png-9" style="display: block; width: 70%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/10_title_length_performance.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/10_title_length_performance.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/10_title_length_performance.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/10_title_length_performance.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/10_title_length_performance.png 1200w"
sizes="70vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/10_title_length_performance.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/10_title_length_performance.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/10_title_length_performance.png 1440w"
sizes="70vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/10_title_length_performance.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/10_title_length_performance.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/10_title_length_performance.png 2000w"
sizes="70vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/10_title_length_performance.png"
alt="Model performance by title length"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;Running a few titles through the model shows what it picks up on:&lt;a href="#lightbox-table_title_workshop-png-10" style="display: block; width: 70%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/table_title_workshop.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/table_title_workshop.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/table_title_workshop.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/table_title_workshop.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/table_title_workshop.png 1200w"
sizes="70vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/table_title_workshop.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/table_title_workshop.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/table_title_workshop.png 1440w"
sizes="70vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/table_title_workshop.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/table_title_workshop.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/table_title_workshop.png 2000w"
sizes="70vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/table_title_workshop.png"
alt="Title workshop showing model predictions for different phrasings"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
Vague claims score low. Specificity helps. First-person &amp;ldquo;I built&amp;rdquo; framing does well, which matches what actually gets upvoted. The model isn&amp;rsquo;t learning to game HN; it&amp;rsquo;s learning what HN already rewards.&lt;/p&gt;
&lt;p&gt;The model now runs, scoring articles in an &lt;a href="https://github.com/philippdubach/rss-reader"&gt;RSS reader pipeline&lt;/a&gt; I built. Does it help? Mostly. I still click on things marked low probability. But the high-confidence predictions are usually right. It&amp;rsquo;s a filter, not an oracle.&lt;a href="#lightbox-dashboard-hn-scoring-png-11" style="display: block; width: 70%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/dashboard-hn-scoring.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/dashboard-hn-scoring.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/dashboard-hn-scoring.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/dashboard-hn-scoring.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/dashboard-hn-scoring.png 1200w"
sizes="70vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/dashboard-hn-scoring.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/dashboard-hn-scoring.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/dashboard-hn-scoring.png 1440w"
sizes="70vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/dashboard-hn-scoring.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/dashboard-hn-scoring.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/dashboard-hn-scoring.png 2000w"
sizes="70vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/dashboard-hn-scoring.png"
alt="RSS reader dashboard showing HN prediction scores"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;&lt;a href="https://huggingface.co/philippdubach/hn-success-predictor"&gt;Model on HuggingFace&lt;/a&gt; — Download the weights and run inference locally
&lt;br&gt;
&lt;a href="https://github.com/philippdubach/rss-reader"&gt;RSS Reader Pipeline&lt;/a&gt; — Full scoring pipeline with feed aggregation
&lt;br&gt;
&lt;a href="https://huggingface.co/philippdubach/hn-success-predictor/blob/main/training.ipynb"&gt;Training Notebook&lt;/a&gt; — Colab-ready notebook with the complete training code&lt;/p&gt;
&lt;p&gt;On a side note: The patterns here aren&amp;rsquo;t specific to Hacker News or online communities. Temporal leakage shows up whenever you&amp;rsquo;re predicting something that evolves over time: credit defaults, client churn, market regimes. The fix is the same: validate on future data, not random holdouts. Calibration matters anywhere probabilities drive decisions. A loan approval model that says &amp;ldquo;70% chance of repayment&amp;rdquo; needs that number to mean something. Overfitting to training data is how banks end up with models that look great in backtests and fail in production.&lt;/p&gt;
&lt;p&gt;I&amp;rsquo;ve built &lt;a href="https://philippdubach.com/projects/"&gt;similar systems for other domains&lt;/a&gt;: sentiment-based trading signals, glycemic response prediction, portfolio optimization. The ML fundamentals transfer. What changes is the domain knowledge needed to avoid the obvious mistakes, like training on data that wouldn&amp;rsquo;t have been available at prediction time, or trusting metrics that don&amp;rsquo;t reflect real-world performance.&lt;/p&gt;</description></item><item><title>Beyond Vector Search: Why LLMs Need Episodic Memory</title><link>http://philippdubach.com/posts/beyond-vector-search-why-llms-need-episodic-memory/</link><pubDate>Fri, 09 Jan 2026 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/beyond-vector-search-why-llms-need-episodic-memory/</guid><description>&lt;p&gt;You&amp;rsquo;ve seen this message before. Copilot pausing; In long sessions, it happens often enough that I started wondering what&amp;rsquo;s actually going on in there. Hence this post.&lt;a href="#lightbox-Summarizing_conversation_history-png-0" style="display: block; width: 40%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/Summarizing_conversation_history.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/Summarizing_conversation_history.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/Summarizing_conversation_history.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/Summarizing_conversation_history.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/Summarizing_conversation_history.png 1200w"
sizes="40vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/Summarizing_conversation_history.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/Summarizing_conversation_history.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/Summarizing_conversation_history.png 1440w"
sizes="40vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/Summarizing_conversation_history.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/Summarizing_conversation_history.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/Summarizing_conversation_history.png 2000w"
sizes="40vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/Summarizing_conversation_history.png"
alt="Hierarchical memory architecture for LLM applications"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
The short answer: context windows grew larger. &lt;a href="https://platform.claude.com/docs/en/build-with-claude/context-windows"&gt;Claude handles 200K tokens&lt;/a&gt;, &lt;a href="https://gemini.google/overview/long-context/"&gt;Gemini claims a million&lt;/a&gt;. But bigger windows aren&amp;rsquo;t memory. They&amp;rsquo;re a larger napkin you throw away when dinner&amp;rsquo;s over.&lt;/p&gt;
&lt;p&gt;For som time I was convinced that vector databases would solve this. Embed everything, store it geometrically, retrieve by similarity. Elegant in theory. Try encoding &amp;ldquo;first we did X, then Y happened, which caused Z.&amp;rdquo; Sequences don&amp;rsquo;t live naturally in vector space. Neither do facts that change over time. Your database might confidently tell you Bonn is Germany&amp;rsquo;s capital if you fed it the wrong decade of documents.&lt;/p&gt;
&lt;p&gt;What caught my attention is &lt;a href="https://openreview.net/forum?id=BI2int5SAC"&gt;EM-LLM&lt;/a&gt;. The approach is basically &amp;ldquo;what if we just copied how brains do it?&amp;rdquo; They segment conversation into episodes using surprise detection; when something unexpected happens, that&amp;rsquo;s a boundary. Retrieval pulls not just similar content but temporally adjacent content too. You don&amp;rsquo;t just remember what you&amp;rsquo;re looking for. You remember what happened next. Their event boundaries actually correlate with where humans perceive breaks in experience. Either a coincidence or we&amp;rsquo;re onto something.&lt;a href="#lightbox-llm-memory-architecture2-png-1" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/llm-memory-architecture2.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/llm-memory-architecture2.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/llm-memory-architecture2.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/llm-memory-architecture2.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/llm-memory-architecture2.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/llm-memory-architecture2.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/llm-memory-architecture2.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/llm-memory-architecture2.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/llm-memory-architecture2.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/llm-memory-architecture2.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/llm-memory-architecture2.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/llm-memory-architecture2.png"
alt="Hierarchical memory architecture for LLM applications"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
Knowledge graphs are the other path. &lt;a href="https://github.com/saxenauts/persona"&gt;Persona Graph&lt;/a&gt; treats memory as user-owned, with concepts as nodes. The connection between &amp;ldquo;volatility surface&amp;rdquo; and &amp;ldquo;Lightning McQueen&amp;rdquo; exists in my head (for some reason) but probably not yours. A flat embedding can&amp;rsquo;t capture that your graph is different from mine. &lt;a href="https://github.com/HawkinsRAG/HawkinsDB"&gt;HawkinsDB&lt;/a&gt; pulls from Thousand Brains theory. &lt;a href="https://docs.letta.com/"&gt;Letta&lt;/a&gt; just ships, production-ready blocks you can use today. &lt;a href="https://github.com/CaviraOSS/OpenMemory"&gt;OpenMemory&lt;/a&gt; goes further, separating emotional memory from procedural from episodic, with actual decay curves instead of hard timeouts. &lt;a href="https://mem0.ai/blog/llm-chat-history-summarization"&gt;Mem0&lt;/a&gt; reports 80-90% token cost reduction while quality goes up 26%. I can&amp;rsquo;t validate the claim, but if it holds, that&amp;rsquo;s more than optimization.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/FYYFU/HeadKV/"&gt;HeadKV&lt;/a&gt; figured out that attention heads aren&amp;rsquo;t created equal: some matter for memory, most don&amp;rsquo;t. Throw away 98.5% of your key-value cache, keep the important heads, lose almost nothing. &lt;a href="https://arxiv.org/abs/2410.13346"&gt;Sakana AI&lt;/a&gt; went weirder: tiny neural networks that decide per-token whether to remember or forget, evolved rather than trained. Sounds like it shouldn&amp;rsquo;t work. Apparently works great.&lt;/p&gt;
&lt;p&gt;Here&amp;rsquo;s what I keep coming back to: in any mature system, most of the graph will be memories of memories. You ask me my favorite restaurants, I think about it, answer, and now &amp;ldquo;that list I made&amp;rdquo; becomes its own retrievable thing. Next time someone asks about dinner plans, I don&amp;rsquo;t re-derive preferences from first principles. I remember what I concluded last time. Psychologists say &lt;a href="https://www.taylorfrancis.com/books/mono/10.4324/9781315755854/working-memory-pierre-barrouillet-val%C3%A9rie-camos"&gt;this is how human recall actually works&lt;/a&gt;; you&amp;rsquo;re not accessing the original, you&amp;rsquo;re accessing the last retrieval. Gets a little distorted each time.&lt;/p&gt;
&lt;p&gt;Should the model control its own memory? Give it a &amp;ldquo;remember this&amp;rdquo; tool? I don&amp;rsquo;t think so, not yet. &lt;a href="https://arxiv.org/abs/2505.02151"&gt;These things are overconfident&lt;/a&gt;. Maybe that changes. For now, memory probably needs to happen around the model, not through it. Eventually some learned architecture will make all this scaffolding obsolete. Train memory into the weights directly. I have no idea what that looks like. Sparse mixture of experts with overnight updates? Some forgotten recurrent trick? Right now it&amp;rsquo;s all duct tape and cognitive science papers.&lt;aside class="inline-newsletter" aria-label="Newsletter signup"&gt;
&lt;div class="inline-newsletter-content"&gt;
&lt;p class="inline-newsletter-headline"&gt;Enjoy this writing? Get new posts, projects, and articles delivered monthly.&lt;/p&gt;
&lt;form id="inline-newsletter-3-form" class="inline-newsletter-form"&gt;
&lt;label for="inline-newsletter-3-email" class="visually-hidden"&gt;Email address&lt;/label&gt;
&lt;input
type="email"
id="inline-newsletter-3-email"
name="email"
placeholder="your@email.com"
required
class="inline-newsletter-input"
aria-label="Email address"
/&gt;
&lt;button type="submit" class="inline-newsletter-button"&gt;Sign Up&lt;/button&gt;
&lt;/form&gt;
&lt;p id="inline-newsletter-3-privacy" class="inline-newsletter-privacy"&gt;&lt;a href="http://philippdubach.com/posts/building-a-no-tracking-newsletter-from-markdown-to-distribution/"&gt;No tracking&lt;/a&gt;. Unsubscribe anytime.&lt;/p&gt;
&lt;div id="inline-newsletter-3-message" class="inline-newsletter-message" style="display: none;"&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;/aside&gt;
&lt;script&gt;
(function() {
var formId = 'inline-newsletter-3-form';
var messageId = 'inline-newsletter-3-message';
var emailId = 'inline-newsletter-3-email';
var privacyId = 'inline-newsletter-3-privacy';
function init() {
var form = document.getElementById(formId);
var messageDiv = document.getElementById(messageId);
var emailInput = document.getElementById(emailId);
var privacyDiv = document.getElementById(privacyId);
if (privacyDiv &amp;&amp; !privacyDiv.dataset.countLoaded) {
privacyDiv.dataset.countLoaded = 'true';
fetch('https://newsletter-api.philippd.workers.dev/api/subscriber-count')
.then(function(r) { return r.json(); })
.then(function(data) {
if (data.display) {
var countText = document.createTextNode('Join ' + data.display + ' readers. ');
privacyDiv.insertBefore(countText, privacyDiv.firstChild);
}
})
.catch(function() { });
}
if (!form) return;
form.addEventListener('submit', function(e) {
e.preventDefault();
var email = emailInput.value.trim();
if (!email) return;
var emailRegex = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
if (!emailRegex.test(email)) {
showMessage('Please enter a valid email address.', 'error');
return;
}
var submitButton = form.querySelector('button[type="submit"]');
submitButton.disabled = true;
submitButton.textContent = 'Subscribing...';
fetch('https://newsletter-api.philippd.workers.dev/api/subscribe', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ email: email })
})
.then(function(response) { return response.json(); })
.then(function(data) {
if (data.success) {
form.style.display = 'none';
document.querySelector('#' + formId).closest('.inline-newsletter').querySelector('.inline-newsletter-privacy').style.display = 'none';
showMessage('Thanks for subscribing! You\'ll receive the next newsletter in your inbox.', 'success');
} else {
showMessage(data.error || 'Something went wrong. Please try again.', 'error');
submitButton.disabled = false;
submitButton.textContent = 'Sign Up';
}
})
.catch(function() {
showMessage('Something went wrong. Please try again later.', 'error');
submitButton.disabled = false;
submitButton.textContent = 'Sign Up';
});
});
function showMessage(text, type) {
messageDiv.textContent = text;
messageDiv.className = 'inline-newsletter-message inline-newsletter-message-' + type;
messageDiv.style.display = 'block';
}
}
if (document.readyState === 'loading') {
document.addEventListener('DOMContentLoaded', init);
} else {
init();
}
})();
&lt;/script&gt;
&lt;/p&gt;</description></item><item><title>65% of Hacker News Posts Have Negative Sentiment, and They Outperform</title><link>http://philippdubach.com/posts/65-of-hacker-news-posts-have-negative-sentiment-and-they-outperform/</link><pubDate>Wed, 07 Jan 2026 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/65-of-hacker-news-posts-have-negative-sentiment-and-they-outperform/</guid><description>&lt;h2 id="negativity-bias-and-engagement-on-hacker-news"&gt;Negativity Bias and Engagement on Hacker News&lt;/h2&gt;
&lt;p&gt;This Hacker News sentiment analysis began with a simple observation: posts with negative sentiment average 35.6 points on &lt;a href="https://news.ycombinator.com"&gt;Hacker News&lt;/a&gt;. The overall average is 28 points. That&amp;rsquo;s a 27% performance premium for negativity.&lt;/p&gt;
&lt;a href="#lightbox-hn-sentiment-png-0" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/hn-sentiment.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/hn-sentiment.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/hn-sentiment.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/hn-sentiment.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/hn-sentiment.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/hn-sentiment.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/hn-sentiment.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/hn-sentiment.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/hn-sentiment.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/hn-sentiment.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/hn-sentiment.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/hn-sentiment.png"
alt="Hacker News sentiment analysis distribution across 32,000 posts showing negative skew"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;p&gt;This finding comes from an empirical study I&amp;rsquo;ve been running on HN attention dynamics, covering decay curves, preferential attachment, survival probability, and early-engagement prediction. The preprint is &lt;a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5910263"&gt;available on SSRN&lt;/a&gt;. I already had a gut feeling. Across 32,000 posts and 340,000 comments, nearly 65% register as negative. This might be a feature of my classifier being miscalibrated toward negativity; yet the pattern holds across six different models.&lt;/p&gt;
&lt;h2 id="six-model-sentiment-comparison-transformers-vs-llms"&gt;Six-Model Sentiment Comparison: Transformers vs LLMs&lt;/h2&gt;
&lt;a href="#lightbox-sentiment_models_comparison_6models-png-1" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/sentiment_models_comparison_6models.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/sentiment_models_comparison_6models.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/sentiment_models_comparison_6models.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/sentiment_models_comparison_6models.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/sentiment_models_comparison_6models.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/sentiment_models_comparison_6models.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/sentiment_models_comparison_6models.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/sentiment_models_comparison_6models.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/sentiment_models_comparison_6models.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/sentiment_models_comparison_6models.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/sentiment_models_comparison_6models.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/sentiment_models_comparison_6models.png"
alt="Sentiment classification comparison across six NLP models: DistilBERT, BERT Multi, RoBERTa, Llama 3.1 8B, Mistral 3.1 24B, and Gemma 3 12B"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;p&gt;I tested three transformer-based classifiers (DistilBERT, BERT Multi, RoBERTa) and three LLMs (Llama 3.1 8B, Mistral 3.1 24B, Gemma 3 12B). The distributions vary, but the negative skew persists across all of them (inverted scale for 2-6). The results I use in my dashboard are from DistilBERT because it runs efficiently in my Cloudflare-based pipeline.&lt;/p&gt;
&lt;p&gt;What counts as &amp;ldquo;negative&amp;rdquo; here? Criticism of technology, skepticism toward announcements, complaints about industry practices, frustration with APIs. The usual. It&amp;rsquo;s worth noting that technical critique reads differently than personal attacks; most HN negativity is substantive rather than toxic. But, does negativity cause engagement, or does controversial content attract both negative framing and attention? Probably some of both.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id="hackerbook-dataset-cross-validation-with-22gb-of-hacker-news-data"&gt;HackerBook Dataset: Cross-Validation With 22GB of Hacker News Data&lt;/h2&gt;
&lt;p&gt;Related to this, I also saw &lt;a href="https://news.ycombinator.com/item?id=46435308"&gt;this Show HN&lt;/a&gt;: 22GB of Hacker News in SQLite, served via WASM shards. Downloaded the &lt;a href="https://github.com/DOSAYGO-STUDIO/HackerBook"&gt;HackerBook&lt;/a&gt; export and ran a subset of my paper&amp;rsquo;s analytics on it.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Caveat: HackerBook is a single static snapshot (no time-series data). Therefore I could not analyze lifecycle analysis, early-velocity prediction, or decay fitting. What can be computed: distributional statistics, inequality metrics, circadian patterns.&lt;/em&gt;&lt;/p&gt;
&lt;a href="#lightbox-hackerbook_stats_table2-png-2" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/hackerbook_stats_table2.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/hackerbook_stats_table2.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/hackerbook_stats_table2.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/hackerbook_stats_table2.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/hackerbook_stats_table2.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/hackerbook_stats_table2.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/hackerbook_stats_table2.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/hackerbook_stats_table2.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/hackerbook_stats_table2.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/hackerbook_stats_table2.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/hackerbook_stats_table2.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/hackerbook_stats_table2.png"
alt="Summary statistics table for HackerBook Hacker News data sample"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;h3 id="score-distribution-and-power-law-fit"&gt;Score Distribution and Power-Law Fit&lt;/h3&gt;
&lt;a href="#lightbox-score_power_law_hackerbook2-png-3" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/score_power_law_hackerbook2.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/score_power_law_hackerbook2.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/score_power_law_hackerbook2.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/score_power_law_hackerbook2.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/score_power_law_hackerbook2.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/score_power_law_hackerbook2.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/score_power_law_hackerbook2.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/score_power_law_hackerbook2.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/score_power_law_hackerbook2.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/score_power_law_hackerbook2.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/score_power_law_hackerbook2.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/score_power_law_hackerbook2.png"
alt="Hacker News score distribution CCDF with power-law fit showing heavy-tailed engagement"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;h3 id="attention-inequality-lorenz-curve-and-gini-coefficient"&gt;Attention Inequality: Lorenz Curve and Gini Coefficient&lt;/h3&gt;
&lt;a href="#lightbox-attention_inequality_hackerbook2-png-4" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/attention_inequality_hackerbook2.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/attention_inequality_hackerbook2.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/attention_inequality_hackerbook2.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/attention_inequality_hackerbook2.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/attention_inequality_hackerbook2.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/attention_inequality_hackerbook2.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/attention_inequality_hackerbook2.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/attention_inequality_hackerbook2.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/attention_inequality_hackerbook2.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/attention_inequality_hackerbook2.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/attention_inequality_hackerbook2.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/attention_inequality_hackerbook2.png"
alt="Lorenz curve of Hacker News story scores measuring attention inequality with Gini coefficient"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;h3 id="circadian-posting-patterns"&gt;Circadian Posting Patterns&lt;/h3&gt;
&lt;a href="#lightbox-circadian_patterns_hackerbook2-png-6" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/circadian_patterns_hackerbook2.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/circadian_patterns_hackerbook2.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/circadian_patterns_hackerbook2.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/circadian_patterns_hackerbook2.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/circadian_patterns_hackerbook2.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/circadian_patterns_hackerbook2.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/circadian_patterns_hackerbook2.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/circadian_patterns_hackerbook2.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/circadian_patterns_hackerbook2.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/circadian_patterns_hackerbook2.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/circadian_patterns_hackerbook2.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/circadian_patterns_hackerbook2.png"
alt="Hacker News circadian posting patterns in UTC showing volume versus mean score by hour"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;h3 id="score-vs-comment-engagement"&gt;Score vs Comment Engagement&lt;/h3&gt;
&lt;a href="#lightbox-score_vs_direct_comments_hackerbook2-png-7" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/score_vs_direct_comments_hackerbook2.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/score_vs_direct_comments_hackerbook2.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/score_vs_direct_comments_hackerbook2.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/score_vs_direct_comments_hackerbook2.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/score_vs_direct_comments_hackerbook2.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/score_vs_direct_comments_hackerbook2.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/score_vs_direct_comments_hackerbook2.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/score_vs_direct_comments_hackerbook2.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/score_vs_direct_comments_hackerbook2.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/score_vs_direct_comments_hackerbook2.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/score_vs_direct_comments_hackerbook2.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/score_vs_direct_comments_hackerbook2.png"
alt="Hacker News score versus direct comments log-log scatter plot"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;a href="#lightbox-direct_comments_ccdf_hackerbook2-png-8" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/direct_comments_ccdf_hackerbook2.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/direct_comments_ccdf_hackerbook2.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/direct_comments_ccdf_hackerbook2.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/direct_comments_ccdf_hackerbook2.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/direct_comments_ccdf_hackerbook2.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/direct_comments_ccdf_hackerbook2.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/direct_comments_ccdf_hackerbook2.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/direct_comments_ccdf_hackerbook2.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/direct_comments_ccdf_hackerbook2.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/direct_comments_ccdf_hackerbook2.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/direct_comments_ccdf_hackerbook2.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/direct_comments_ccdf_hackerbook2.png"
alt="Direct comments distribution CCDF on Hacker News showing power-law tail"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;a href="#lightbox-mean_score_vs_direct_comments_binned_hackerbook2-png-9" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/mean_score_vs_direct_comments_binned_hackerbook2.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/mean_score_vs_direct_comments_binned_hackerbook2.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/mean_score_vs_direct_comments_binned_hackerbook2.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/mean_score_vs_direct_comments_binned_hackerbook2.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/mean_score_vs_direct_comments_binned_hackerbook2.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/mean_score_vs_direct_comments_binned_hackerbook2.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/mean_score_vs_direct_comments_binned_hackerbook2.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/mean_score_vs_direct_comments_binned_hackerbook2.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/mean_score_vs_direct_comments_binned_hackerbook2.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/mean_score_vs_direct_comments_binned_hackerbook2.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/mean_score_vs_direct_comments_binned_hackerbook2.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/mean_score_vs_direct_comments_binned_hackerbook2.png"
alt="Mean score versus direct comments on Hacker News binned in log-spaced buckets"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;</description></item><item><title>Praise by Name, Criticize by Category: Warren Buffett Retires at 95</title><link>http://philippdubach.com/posts/praise-by-name-criticize-by-category-warren-buffett-retires-at-95/</link><pubDate>Tue, 06 Jan 2026 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/praise-by-name-criticize-by-category-warren-buffett-retires-at-95/</guid><description>&lt;p&gt;&lt;a href="https://en.wikipedia.org/wiki/Warren_Buffett"&gt;Warren Buffett&lt;/a&gt; has stepped down as CEO at 95. &lt;a href="https://en.wikipedia.org/wiki/Greg_Abel"&gt;Greg Abel&lt;/a&gt; inherits a company that paid &lt;a href="https://www.berkshirehathaway.com/letters/2024ltr.pdf"&gt;$26.8 billion in federal income taxes&lt;/a&gt; last year, roughly 5% of what all of corporate America paid combined. I do not have &lt;a href="http://philippdubach.com/posts/damodaran-on-golds-2025-surge/"&gt;much in common with Buffett&lt;/a&gt;, but I will miss his shareholder letters. Berkshire&amp;rsquo;s archive is a rare case of a public company explaining decisions candidly to its owners.&lt;/p&gt;
&lt;p&gt;In the &lt;a href="https://www.berkshirehathaway.com/letters/2024ltr.pdf"&gt;2024 letter&lt;/a&gt; Buffett repeats Tom Murphy&amp;rsquo;s rule: &amp;ldquo;Praise by name, criticize by category.&amp;rdquo; Murphy gave him this advice 60 years ago. The letter closes with another line worth keeping: &amp;ldquo;Kindness is costless but priceless.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;Three behaviors from those letters matter. (1) Communication discipline: Between 2019 and 2023, Buffett used &amp;ldquo;mistake&amp;rdquo; or &amp;ldquo;error&amp;rdquo; 16 times in his letters. Many Fortune 500 companies never used either word once. Amazon made &amp;ldquo;brutally candid observations&amp;rdquo; in its 2021 letter. Elsewhere, it has been happy talk and pictures. (2) Patience as allocation strategy: In the &lt;a href="https://www.berkshirehathaway.com/letters/1999htm.html"&gt;1999 letter&lt;/a&gt; he told shareholders that &amp;ldquo;truly large superiorities&amp;rdquo; over the index were past because Berkshire&amp;rsquo;s size constrains opportunity. That reads today like the core constraint of the Abel era, a point I reflected on when writing about &lt;a href="http://philippdubach.com/posts/how-ai-is-shaping-my-investment-portfolio-for-2026/"&gt;portfolio limits in 2026&lt;/a&gt;. (3) The non-theatrical life: Coverage of the retirement keeps returning to the same facts: still &lt;a href="https://www.cnbc.com/2023/03/03/warren-buffett-lives-in-the-same-home-he-bought-in-1958.html"&gt;living in the Omaha home he bought in 1958&lt;/a&gt;, still driving 7 minutes to &lt;a href="https://fortune.com/2025/12/30/warren-buffett-steps-down-ceo-keeps-coming-to-office-berkshire-hathaway/"&gt;work every day&lt;/a&gt; and stopping at a drive-through for &lt;a href="https://www.cnbc.com/2018/04/18/warren-buffett-buys-breakfast-from-mcdonalds-for-under-3-point-17.html"&gt;McDonald&amp;rsquo;s breakfast&lt;/a&gt;. The man is a true American hero.&lt;/p&gt;
&lt;p&gt;Read &lt;a href="https://www.berkshirehathaway.com/letters/letters.html"&gt;the letters chronologically&lt;/a&gt; and you see Berkshire become a system rather than a portfolio. The early articulation is there in &lt;a href="https://berkshirehathaway.com/letters/1983.html"&gt;the 1983 letter&lt;/a&gt;: partnership mentality, per-share intrinsic value, and a preference for businesses that generate cash and earn strong returns on tangible equity.&lt;/p&gt;
&lt;p&gt;The engine is insurance float. Property-casualty insurers collect premiums upfront and pay claims years or decades later. That gap creates investable capital at zero or negative cost. As Buffett puts it in the 2024 letter: &amp;ldquo;When writing P/C insurance, we receive payment upfront and much later learn what our product has cost us, sometimes a moment of truth that is delayed as much as 30 or more years.&amp;rdquo; In 2024, Berkshire&amp;rsquo;s insurance operations generated $9 billion in underwriting profit and $13.7 billion in investment income. GEICO, &amp;ldquo;repolished&amp;rdquo; over five years by Todd Combs, had what the letter calls a &amp;ldquo;spectacular&amp;rdquo; year.&lt;/p&gt;
&lt;p&gt;On the question of where Buffett was right: if you ban the word &amp;ldquo;mistake,&amp;rdquo; risk does not vanish; it goes off balance sheet until it detonates. In the &lt;a href="https://www.berkshirehathaway.com/letters/2024ltr.pdf"&gt;2024 letter&lt;/a&gt; he writes:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;I have also been a director of large public companies at which &amp;lsquo;mistake&amp;rsquo; or &amp;lsquo;wrong&amp;rsquo; were forbidden words at board meetings or analyst calls. That taboo, implying managerial perfection, always made me nervous.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;He was also right about scale. In 2024, 53% of Berkshire&amp;rsquo;s 189 operating businesses reported declining earnings, yet the company posted $47.4 billion in operating profit. That is diversification at scale, but also its constraint. The next decade hinges on a handful of large moves. Markets understood this as the &lt;a href="https://www.cnbc.com/2026/01/02/berkshire-hathaway-shares-dip-as-warren-buffett-exits-and-greg-abel-era-begins.html"&gt;Abel era officially began&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;On a more critical note: Berkshire is an insurance-anchored allocator with operating companies plus a concentrated equity book. The 2024 marketable equity portfolio stood at $272 billion, down from $354 billion after significant Apple sales. At its peak, Apple represented roughly 40-50% of Berkshire&amp;rsquo;s public equity holdings. A single stock, bought mostly between 2016 and 2018, drove a substantial portion of portfolio returns over the past decade. The rest is insurance. This is not a criticism of the Apple thesis (it was correct), but the Buffett track record includes one very large, very right bet on a technology company he famously avoided for most of his career. His letters do not pretend error is rare; they treat delay as the sin. He has been candid about blind spots, discussing lessons from &lt;a href="https://www.cnbc.com/2023/10/01/buffett-learned-from-investment-mistakes-including-ibm-and-airlines.html"&gt;IBM and airlines&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Berkshire&amp;rsquo;s businesses also face difficult labor dynamics. BNSF has drawn &lt;a href="https://www.trains.com/trn/news-reviews/news-wire/union-ad-takes-aim-at-bnsf-up-leadership-ahead-of-stockholder-meetings/"&gt;union criticism&lt;/a&gt; over attendance policies and &lt;a href="https://www.employmentlawgroup.com/in-the-news/whistleblower-law-blog/osha-finds-bnsf-rail-company-owned-by-berkshire-hathaway-liable-in-three-retaliation-cases-and-awards-damages-to-employees/"&gt;OSHA findings&lt;/a&gt; in retaliation cases. The railroad earned $5 billion in 2024, flat with 2023.&lt;/p&gt;
&lt;p&gt;So what now? Abel inherits roughly $300 billion in cash and Treasury bills. The letter explains this is not a preference for cash: &amp;ldquo;Berkshire will never prefer ownership of cash-equivalent assets over the ownership of good businesses, whether controlled or only partially owned.&amp;rdquo; The cash is a byproduct of not finding anything worth buying at current prices. The first large capital move will tell us more than any profile can, which is why &lt;a href="https://www.wsj.com/finance/investing/berkshire-hathaway-greg-abel-cash-warren-buffett-73695061"&gt;coverage keeps circling back&lt;/a&gt; to the cash pile and the question of how much &amp;ldquo;Buffett premium&amp;rdquo; was embedded in Berkshire shares.&lt;/p&gt;
&lt;p&gt;If Buffett truly goes quiet, I hope he gets to experience not working at all, or at least the version that suits a man who prefers thinking to talking. The compounding may continue just fine.&lt;/p&gt;
&lt;aside class="disclaimer" role="note" aria-label="Disclaimer"&gt;
&lt;div class="disclaimer-content"&gt;&lt;p&gt;&lt;strong&gt;Disclaimer:&lt;/strong&gt; All opinions expressed are my own. This is not investment, financial, tax, or legal advice. Past performance does not indicate future results. Do your own research and consult qualified professionals before making financial decisions. No liability accepted for any losses.&lt;/p&gt;&lt;/div&gt;
&lt;/aside&gt;</description></item><item><title>RSS Swipr: Find Blogs Like You Find Your Dates</title><link>http://philippdubach.com/posts/rss-swipr-find-blogs-like-you-find-your-dates/</link><pubDate>Mon, 05 Jan 2026 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/rss-swipr-find-blogs-like-you-find-your-dates/</guid><description>&lt;p&gt;&lt;a href="#lightbox-rss-tinder-demo2-gif-0" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/rss-tinder-demo2.gif 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/rss-tinder-demo2.gif 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/rss-tinder-demo2.gif 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/rss-tinder-demo2.gif 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/rss-tinder-demo2.gif 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/rss-tinder-demo2.gif 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/rss-tinder-demo2.gif 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/rss-tinder-demo2.gif 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/rss-tinder-demo2.gif 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/rss-tinder-demo2.gif 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/rss-tinder-demo2.gif 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/rss-tinder-demo2.gif"
alt="GIF with interactive demo of the RSS Tinder App"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
Algorithmic timelines are everywhere now. But I still prefer the control of RSS. Readers are good at aggregating content but bad at filtering it. What I wanted was something borrowed from dating apps: instead of an infinite list, give me cards. Swipe right to like, left to dislike. Then train a model to surface what I actually want to read. So I built &lt;em&gt;RSS Swipr&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;The frontend is vanilla JavaScript—no React, no build steps, just DOM manipulation and CSS transitions. You drag a card, it follows your finger, and snaps away with a satisfying animation. Behind the scenes, the app tracks everything: votes (like/neutral/dislike), time spent viewing each card, and whether you actually opened the link. If I swipe right but don&amp;rsquo;t click through, that&amp;rsquo;s a signal. If I spend 0.3 seconds on a card before swiping left, that&amp;rsquo;s a signal too.&lt;a href="#lightbox-screenshot_feed_import1-png-1" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/screenshot_feed_import1.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/screenshot_feed_import1.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/screenshot_feed_import1.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/screenshot_feed_import1.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/screenshot_feed_import1.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/screenshot_feed_import1.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/screenshot_feed_import1.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/screenshot_feed_import1.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/screenshot_feed_import1.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/screenshot_feed_import1.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/screenshot_feed_import1.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/screenshot_feed_import1.png"
alt="Feed management interface showing 1084 imported RSS feeds with 9327 total entries"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
Feed management happens through a simple CSV import. Paste a list of &lt;code&gt;name,url&lt;/code&gt; pairs, click refresh, and the fetcher pulls articles with proper HTTP caching (ETag/Last-Modified) to avoid hammering servers. You can use your own feed list or load a predefined list. Thanks to Manuel Moreale who created &lt;a href="https://blogroll.org/"&gt;blogroll&lt;/a&gt; I was able to get an OPML export and load all curated RSS feeds directly. Something similar works with &lt;a href="https://minifeed.net/global"&gt;minifeed&lt;/a&gt; or &lt;a href="https://kagi.com/api/v1/smallweb/feed"&gt;Kagi&amp;rsquo;s smallweb&lt;/a&gt;. Or you use one of the &lt;a href="https://hnrss.github.io"&gt;Hacker News RSS&lt;/a&gt; feeds. If that feels too adventurous, I created &lt;a href="https://rss-aggregator.philippd.workers.dev"&gt;curated feeds&lt;/a&gt; for the most popular HN bloggers.&lt;/p&gt;
&lt;p&gt;Building the model, I started with XGBoost and some hand-engineered features (title length, word count, time of day, feed source). Decent—around 66% ROC-AUC. It learned that I dislike short, clickbaity titles. But it didn&amp;rsquo;t understand context.&lt;/p&gt;
&lt;p&gt;The upgrade was MPNet (&lt;code&gt;all-mpnet-base-v2&lt;/code&gt; from sentence-transformers) to generate 768-dimensional embeddings for every article&amp;rsquo;s title and description. Combined with engineered features—feed preferences, temporal patterns, text statistics—this gets fed into a Hybrid Random Forest.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt; 1
&lt;/span&gt;&lt;span class="lnt"&gt; 2
&lt;/span&gt;&lt;span class="lnt"&gt; 3
&lt;/span&gt;&lt;span class="lnt"&gt; 4
&lt;/span&gt;&lt;span class="lnt"&gt; 5
&lt;/span&gt;&lt;span class="lnt"&gt; 6
&lt;/span&gt;&lt;span class="lnt"&gt; 7
&lt;/span&gt;&lt;span class="lnt"&gt; 8
&lt;/span&gt;&lt;span class="lnt"&gt; 9
&lt;/span&gt;&lt;span class="lnt"&gt;10
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-python" data-lang="python"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;predict_preference&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;article&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="c1"&gt;# Generate semantic embeddings (768 dims)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;embeddings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mpnet&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;article&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;article&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="c1"&gt;# Extract behavioral + text features&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;features&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;feature_pipeline&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;transform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;article&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="c1"&gt;# Predict with Hybrid RF&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;hstack&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;features&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;predict_proba&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;Training happens on Google Colab (free T4 GPU or even faster with H100 or A100 on a subscription). Upload your training CSV, run the notebook, download a &lt;code&gt;.pkl&lt;/code&gt; file.&lt;a href="#lightbox-screenshot_colab_head1-png-2" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/screenshot_colab_head1.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/screenshot_colab_head1.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/screenshot_colab_head1.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/screenshot_colab_head1.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/screenshot_colab_head1.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/screenshot_colab_head1.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/screenshot_colab_head1.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/screenshot_colab_head1.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/screenshot_colab_head1.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/screenshot_colab_head1.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/screenshot_colab_head1.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/screenshot_colab_head1.png"
alt="Google Colab notebook showing model training setup with GPU configuration"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
The notebook handles everything: installing sentence-transformers, downloading the feature engineering pipeline, checking GPU availability, and running 5-fold cross-validation.&lt;a href="#lightbox-screenshot_colab_results1-png-3" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/screenshot_colab_results1.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/screenshot_colab_results1.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/screenshot_colab_results1.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/screenshot_colab_results1.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/screenshot_colab_results1.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/screenshot_colab_results1.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/screenshot_colab_results1.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/screenshot_colab_results1.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/screenshot_colab_results1.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/screenshot_colab_results1.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/screenshot_colab_results1.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/screenshot_colab_results1.png"
alt="Training results showing ROC-AUC of 0.7537 across 5-fold cross-validation"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
With ~1400 training samples, the model achieves &lt;em&gt;75.4% ROC-AUC (± 0.019 std)&lt;/em&gt;. Not state-of-the-art, but enough to noticeably improve my reading experience. The model now understands that I like systems programming and ML papers, but skip most crypto and generic startup advice.&lt;/p&gt;
&lt;p&gt;The problem with transformer models is latency. Generating MPNet embeddings takes ~1 second per article. In a swipe interface, that lag is unbearable. The next best thing is a preload queue. While you&amp;rsquo;re reading the current card, the backend is scoring and fetching the next 3-5 articles in the background. By the time you swipe, the next card is already waiting.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;span class="lnt"&gt;2
&lt;/span&gt;&lt;span class="lnt"&gt;3
&lt;/span&gt;&lt;span class="lnt"&gt;4
&lt;/span&gt;&lt;span class="lnt"&gt;5
&lt;/span&gt;&lt;span class="lnt"&gt;6
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-javascript" data-lang="javascript"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="kr"&gt;async&lt;/span&gt; &lt;span class="nx"&gt;loadNextBatch&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="kr"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;excludeIds&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cardQueue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;c&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nx"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;,&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="kr"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kr"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sb"&gt;`/api/posts/batch?count=3&amp;amp;exclude=&lt;/span&gt;&lt;span class="si"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;excludeIds&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sb"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="kr"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kr"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;json&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cardQueue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;push&lt;/span&gt;&lt;span class="p"&gt;(...&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;posts&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;Article selection uses Thompson Sampling: 80% of the time it shows what the model thinks you&amp;rsquo;ll like (exploit), 20% it throws in something unexpected (explore). This prevents the filter bubble problem and lets the model discover if your tastes have changed.&lt;/p&gt;
&lt;p&gt;The whole system is designed as a closed loop:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Swipe&lt;/strong&gt; → votes get stored in SQLite&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Export&lt;/strong&gt; → download training CSV with votes + engagement data&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Train&lt;/strong&gt; → run Colab notebook, get new model&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Upload&lt;/strong&gt; → drag-drop the &lt;code&gt;.pkl&lt;/code&gt; file back into the app&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;a href="#lightbox-screenshot_export1-png-5" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/screenshot_export1.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/screenshot_export1.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/screenshot_export1.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/screenshot_export1.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/screenshot_export1.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/screenshot_export1.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/screenshot_export1.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/screenshot_export1.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/screenshot_export1.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/screenshot_export1.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/screenshot_export1.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/screenshot_export1.png"
alt="Export interface showing 1421 votes with breakdown: 583 likes, 193 neutral, 645 dislikes"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
The export includes everything the model needs: article text, feed metadata, your votes, link opens, and time spent. You can also &lt;strong&gt;import&lt;/strong&gt; a previous training CSV to restore your voting history on a fresh install—useful if you want to clone the repo on a new machine without losing your data.&lt;a href="#lightbox-screenshot_model_selection1-png-6" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/screenshot_model_selection1.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/screenshot_model_selection1.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/screenshot_model_selection1.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/screenshot_model_selection1.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/screenshot_model_selection1.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/screenshot_model_selection1.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/screenshot_model_selection1.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/screenshot_model_selection1.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/screenshot_model_selection1.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/screenshot_model_selection1.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/screenshot_model_selection1.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/screenshot_model_selection1.png"
alt="Model management interface showing active hybrid_rf model with ROC-AUC 0.7537"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;Uploaded models show their ROC-AUC score so you can compare performance across training runs. Activate whichever one works best.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Backend&lt;/strong&gt;: Python, Flask, SQLite
&lt;strong&gt;Frontend&lt;/strong&gt;: Vanilla JS, CSS variables
&lt;strong&gt;ML&lt;/strong&gt;: scikit-learn, XGBoost, sentence-transformers (MPNet)
&lt;strong&gt;Training&lt;/strong&gt;: Google Colab (free GPU tier)&lt;/p&gt;
&lt;p&gt;Total infrastructure cost: zero. Everything runs locally. No accounts, no cloud dependencies, no tracking.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;span class="lnt"&gt;2
&lt;/span&gt;&lt;span class="lnt"&gt;3
&lt;/span&gt;&lt;span class="lnt"&gt;4
&lt;/span&gt;&lt;span class="lnt"&gt;5
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;git clone https://github.com/philippdubach/rss-swipr.git
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nb"&gt;cd&lt;/span&gt; rss-swipr
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;python -m venv .venv &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;source&lt;/span&gt; .venv/bin/activate
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;pip install -r requirements.txt
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;python app.py
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;The &lt;a href="https://github.com/philippdubach/rss-swipr"&gt;full source&lt;/a&gt; and &lt;a href="https://colab.research.google.com/drive/1XjnAuwF3naPElKH9yZ3UEdslzN7qAUrQ?usp=sharing"&gt;Colab notebook&lt;/a&gt; are available on GitHub.&lt;/p&gt;</description></item><item><title>Apple's AI Bet: Playing the Long Game or Missing the Moment?</title><link>http://philippdubach.com/posts/apples-ai-bet-playing-the-long-game-or-missing-the-moment/</link><pubDate>Tue, 30 Dec 2025 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/apples-ai-bet-playing-the-long-game-or-missing-the-moment/</guid><description>&lt;p&gt;&lt;a href="https://www.theinformation.com/articles/2026-predictions-apple-will-reverse-ai-slump"&gt;The Information&lt;/a&gt; published a piece today arguing that Apple&amp;rsquo;s restrained AI approach may finally pay off in 2026. The thesis: while OpenAI, Google, and Meta pour hundreds of billions into data centers and model training, Apple has kept its powder dry, sitting on &lt;a href="https://www.apple.com/newsroom/2025/10/apple-reports-fourth-quarter-results/"&gt;$157 billion in cash and marketable securities&lt;/a&gt; as of Q4 2025. If the AI spending bubble deflates, Apple&amp;rsquo;s position looks rather clever. This piqued my interest, from a strategy point of view: Apple hasn&amp;rsquo;t been absent from AI. They&amp;rsquo;ve been making a specific bet that large language models will commoditize, and that value will flow to distribution and customer relationships rather than to whoever has the best model. The revamped Siri expected in spring 2026 will reportedly be powered by &lt;a href="https://www.bloomberg.com/news/articles/2025-11-05/apple-plans-to-use-1-2-trillion-parameter-google-gemini-model-to-power-new-siri"&gt;Google&amp;rsquo;s Gemini through a deal worth $1 billion annually&lt;/a&gt;. The custom Gemini model will run on Apple&amp;rsquo;s Private Cloud Compute servers.
&lt;a href="#lightbox-ai_capex_comparison-png-0" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/ai_capex_comparison.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/ai_capex_comparison.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/ai_capex_comparison.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/ai_capex_comparison.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/ai_capex_comparison.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/ai_capex_comparison.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/ai_capex_comparison.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/ai_capex_comparison.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/ai_capex_comparison.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/ai_capex_comparison.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/ai_capex_comparison.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/ai_capex_comparison.png"
alt="Big Tech AI Capital Expenditure 2023-2025"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
This is consistent with Apple&amp;rsquo;s history. They didn&amp;rsquo;t build their own search engine. They took Google&amp;rsquo;s money to be the default on Safari. &lt;a href="https://www.apple.com/newsroom/2025/12/john-giannandrea-to-retire-from-apple/"&gt;John Giannandrea&amp;rsquo;s retirement&lt;/a&gt; earlier this month, with Siri now under Mike Rockwell, signals internal recognition that something had to change.&lt;/p&gt;
&lt;p&gt;The iPhone distribution advantage is underappreciated. Apple can push AI features through software updates to &lt;a href="https://www.macrumors.com/2025/01/30/apple-1q-2025-earnings/"&gt;over 2.3 billion active devices&lt;/a&gt;. When Apple Intelligence features ship, they just appear. This is the same advantage that made Apple Music competitive against Spotify, or keeps Safari relevant despite Chrome&amp;rsquo;s benchmarks.&lt;/p&gt;
&lt;p&gt;The commoditization evidence is suggestive. I&amp;rsquo;ve &lt;a href="http://philippdubach.com/posts/is-ai-really-eating-the-world-1/2/"&gt;written before&lt;/a&gt; about these dynamics. GPT-4 launched with a substantial lead; within months, &lt;a href="https://www.anthropic.com/news/claude-3-family"&gt;Claude and Gemini were comparable&lt;/a&gt;. &lt;a href="https://newsletter.semianalysis.com/p/deepseek-debates"&gt;DeepSeek proved frontier models can be built for a fraction of OpenAI&amp;rsquo;s cost&lt;/a&gt;. API pricing has &lt;a href="https://openai.com/api/pricing/"&gt;dropped 97% since GPT-3&amp;rsquo;s launch&lt;/a&gt;. The hyperscalers are spending &lt;a href="https://www.wsj.com/tech/ai/big-tech-ai-spending-7b6c8d8a"&gt;$400 billion collectively on AI infrastructure in 2025&lt;/a&gt;, more than global telecom capex. The question isn&amp;rsquo;t whether this produces capable models. It&amp;rsquo;s whether it produces defensible advantages.
&lt;a href="#lightbox-ai_api_pricing-png-1" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/ai_api_pricing.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/ai_api_pricing.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/ai_api_pricing.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/ai_api_pricing.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/ai_api_pricing.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/ai_api_pricing.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/ai_api_pricing.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/ai_api_pricing.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/ai_api_pricing.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/ai_api_pricing.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/ai_api_pricing.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/ai_api_pricing.png"
alt="AI API Pricing Collapse 2020-2025"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;a href="https://www.bloomberg.com/news/newsletters/2025-11-02/apple-s-nearly-140-billion-quarter-when-ios-26-1-will-be-out-ipad-mini-revamp-mhhpy1ax"&gt;Mark Gurman&amp;rsquo;s Bloomberg reporting&lt;/a&gt; suggests Apple views LLMs as commodities not worth proprietary development costs. The counterargument is obvious: what if the next capability jump makes current models look like toys?&lt;/p&gt;
&lt;p&gt;But the AI investment boom resembles previous cycles. Enormous capital flowing into a sector where barriers keep falling. That pattern often ends with winners who have distribution and customer relationships, not winners who spent the most on R&amp;amp;D. Apple&amp;rsquo;s bet isn&amp;rsquo;t guaranteed to be correct, but it&amp;rsquo;s defensible.&lt;/p&gt;
&lt;p&gt;The spring Siri update will matter. Reports that &lt;a href="https://9to5mac.com/2025/10/19/apple-employees-concerned-by-early-ios-26-4-apple-intelligence-sir-version/"&gt;Apple employees have concerns about performance in early iOS 26.4 builds&lt;/a&gt; aren&amp;rsquo;t encouraging. But Apple delayed the launch multiple times, suggesting they&amp;rsquo;re trying to get it right rather than shipping half-baked.&lt;/p&gt;
&lt;p&gt;Apple&amp;rsquo;s $157 billion cash pile provides optionality. If AI startups face a funding crunch, Apple can acquire capability. If someone achieves a breakthrough, Apple has resources to respond. Apple has preserved its options.&lt;/p&gt;</description></item><item><title>Building a No-Tracking Newsletter from Markdown to Distribution</title><link>http://philippdubach.com/posts/building-a-no-tracking-newsletter-from-markdown-to-distribution/</link><pubDate>Wed, 24 Dec 2025 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/building-a-no-tracking-newsletter-from-markdown-to-distribution/</guid><description>&lt;p&gt;&lt;a href="#lightbox-Newsletter_Overview2-jpg-0" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/Newsletter_Overview2.jpg 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/Newsletter_Overview2.jpg 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/Newsletter_Overview2.jpg 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/Newsletter_Overview2.jpg 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/Newsletter_Overview2.jpg 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/Newsletter_Overview2.jpg 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/Newsletter_Overview2.jpg 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/Newsletter_Overview2.jpg 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/Newsletter_Overview2.jpg 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/Newsletter_Overview2.jpg 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/Newsletter_Overview2.jpg 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/Newsletter_Overview2.jpg"
alt="Screenshot of rendered newsletter showing article preview cards with images and descriptions"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
Friends have been asking how they can stay up to date with what I&amp;rsquo;m working on and keep track of the things I read, write, and share. RSS feeds don&amp;rsquo;t seem to be en vogue anymore, apparently. So I built a mailing list. What else would you do over the Christmas break?&lt;/p&gt;
&lt;p&gt;From a previous marketing job I knew Mailchimp. Also, every newsletter I unsubscribe from is Mailchimp. I no longer wish to receive these emails.
&lt;a href="#lightbox-unsubscribe2-png-1" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/unsubscribe2.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/unsubscribe2.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/unsubscribe2.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/unsubscribe2.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/unsubscribe2.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/unsubscribe2.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/unsubscribe2.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/unsubscribe2.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/unsubscribe2.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/unsubscribe2.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/unsubscribe2.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/unsubscribe2.png"
alt="Unsubscribe confirmation from Mailchimp newsletters"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
Or obviously Substack. I read &lt;a href="https://simonw.substack.com"&gt;Simon Willison&amp;rsquo;s Newsletter&lt;/a&gt; sometimes. And obviously &lt;a href="http://philippdubach.com/posts/michael-burrys-379-newsletter/"&gt;Michael Burry&amp;rsquo;s $379 Substack&lt;/a&gt;. Those are solid options, but I had a clear picture in mind of what I wanted. I wanted only HTML, no tracking (also why I use &lt;a href="https://www.goatcounter.com/"&gt;GoatCounter&lt;/a&gt; on my site and not Google Analytics), and full control of the creation and distribution chain from end to end. So I sat down and drew into my notebook, what I always do when I have an idea after a long walk or a hot shower.
&lt;a href="#lightbox-newsletter_scetch3-jpg-2" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/newsletter_scetch3.jpg 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/newsletter_scetch3.jpg 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/newsletter_scetch3.jpg 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/newsletter_scetch3.jpg 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/newsletter_scetch3.jpg 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/newsletter_scetch3.jpg 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/newsletter_scetch3.jpg 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/newsletter_scetch3.jpg 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/newsletter_scetch3.jpg 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/newsletter_scetch3.jpg 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/newsletter_scetch3.jpg 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/newsletter_scetch3.jpg"
alt="Hand-drawn notebook sketch of newsletter architecture showing markdown to HTML to distribution flow"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
I then went over to Illustrator (actually &lt;a href="https://affinity.serif.com/en-us/designer/"&gt;Affinity Designer&lt;/a&gt;, which I have been happily using since my Creative Cloud subscription ran out, sorry Adobe) and built a quick mockup of my drawing. I fed the mockup to Claude to generate pure HTML. After a few iterations it more or less looked like I wanted it to be.&lt;/p&gt;
&lt;p&gt;The architecture: write the newsletter in Markdown (as I do for all of &lt;a href="http://philippdubach.com/about"&gt;my blog&lt;/a&gt;). Render it as HTML. Fetch OpenGraph images from my Cloudflare CDN at the lowest feasible resolution and pull descriptions automatically. Format links with preview cards. Keep some space for freetext at the top and bottom.
&lt;a href="#lightbox-newsletter_architecture2-png-3" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/newsletter_architecture2.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/newsletter_architecture2.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/newsletter_architecture2.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/newsletter_architecture2.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/newsletter_architecture2.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/newsletter_architecture2.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/newsletter_architecture2.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/newsletter_architecture2.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/newsletter_architecture2.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/newsletter_architecture2.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/newsletter_architecture2.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/newsletter_architecture2.png"
alt="Flowchart showing newsletter pipeline: Write Markdown, Render HTML, Host on R2, Fetch KV for subscribers, Send via Resend API"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
I built a &lt;a href="https://github.com/philippdubach/newsletter-generator"&gt;Python engine&lt;/a&gt; that renders my &lt;code&gt;.md&lt;/code&gt; files to email-safe HTML. The script handles several things automatically: (1) It fetches OpenGraph metadata for every link using Beautiful Soup, caching results to avoid repeated requests. (2) optimizes images using Cloudflare&amp;rsquo;s image transformation service. For email, I use 240px width (2x the display size of 120px for retina displays). (3) It generates LinkedIn-style preview cards with images on the left and text on the right. The output is table-based HTML because email clients from 2003 still exist and they&amp;rsquo;re apparently immortal.
&lt;a href="#lightbox-Newsletter_Overview-png-4" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/Newsletter_Overview.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/Newsletter_Overview.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/Newsletter_Overview.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/Newsletter_Overview.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/Newsletter_Overview.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/Newsletter_Overview.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/Newsletter_Overview.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/Newsletter_Overview.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/Newsletter_Overview.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/Newsletter_Overview.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/Newsletter_Overview.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/Newsletter_Overview.png"
alt="Screenshot of rendered newsletter showing article preview cards with images and descriptions"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
Originally I intended to manually copy-paste the HTML into an email and send it out since I did not expect many subscribers at first (or at all). But I had another challenge at hand: how do people sign up?&lt;/p&gt;
&lt;p&gt;Since I had already been using &lt;a href="https://developers.cloudflare.com/kv/"&gt;Cloudflare Workers KV&lt;/a&gt; to build an API with historic values of my temperature and humidity sensor at home, I resorted to that. The API is simple. POST to &lt;code&gt;/api/subscribe&lt;/code&gt; with an email address, and it gets stored in KV with a timestamp and some metadata.&lt;/p&gt;
&lt;p&gt;After some Copilot iterations (I&amp;rsquo;m not a security guy, so not sure how I feel about handing all the security and testing to an agent, please reach out if you can help) the Worker includes rate limiting, honeypot fields for spam protection, proper CORS headers, and RFC-compliant email validation.&lt;/p&gt;
&lt;p&gt;I then wanted to get a confirmation email every time someone signed up. Since SMTP sending over my domain did not work reliably at first, I had to look for other options. Even though I wanted everything self-hosted, I ended up using the &lt;a href="https://resend.com/"&gt;Resend API&lt;/a&gt;. The API is straightforward:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt; 1
&lt;/span&gt;&lt;span class="lnt"&gt; 2
&lt;/span&gt;&lt;span class="lnt"&gt; 3
&lt;/span&gt;&lt;span class="lnt"&gt; 4
&lt;/span&gt;&lt;span class="lnt"&gt; 5
&lt;/span&gt;&lt;span class="lnt"&gt; 6
&lt;/span&gt;&lt;span class="lnt"&gt; 7
&lt;/span&gt;&lt;span class="lnt"&gt; 8
&lt;/span&gt;&lt;span class="lnt"&gt; 9
&lt;/span&gt;&lt;span class="lnt"&gt;10
&lt;/span&gt;&lt;span class="lnt"&gt;11
&lt;/span&gt;&lt;span class="lnt"&gt;12
&lt;/span&gt;&lt;span class="lnt"&gt;13
&lt;/span&gt;&lt;span class="lnt"&gt;14
&lt;/span&gt;&lt;span class="lnt"&gt;15
&lt;/span&gt;&lt;span class="lnt"&gt;16
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-typescript" data-lang="typescript"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="kr"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nx"&gt;sendWelcomeEmail&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;subscriberEmail&lt;/span&gt;: &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;env&lt;/span&gt;: &lt;span class="kt"&gt;Env&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="kr"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;https://api.resend.com/emails&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nx"&gt;method&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;POST&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nx"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="s1"&gt;&amp;#39;Authorization&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="sb"&gt;`Bearer &lt;/span&gt;&lt;span class="si"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;RESEND_API_KEY&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sb"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="s1"&gt;&amp;#39;Content-Type&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;application/json&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;},&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nx"&gt;body&lt;/span&gt;: &lt;span class="kt"&gt;JSON.stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="kr"&gt;from&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;Philipp Dubach &amp;lt;noreply@notifications.philippdubach.com&amp;gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nx"&gt;to&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;subscriberEmail&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nx"&gt;subject&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;Welcome to the Newsletter&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nx"&gt;html&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="sb"&gt;`&amp;lt;p&amp;gt;Thanks for subscribing!&amp;lt;/p&amp;gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;}),&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ok&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;br&gt;
&lt;p&gt;After implementing this, I figured: why not send a confirmation to the subscriber and a copy to me? Why not use Resend for the whole distribution? (This is not a paid advertisement.) The HTML newsletter I generate goes straight into the email body. No images hosted elsewhere (except for the optimized preview thumbnails). No tracking pixels. No click tracking. The email is just HTML.&lt;/p&gt;
&lt;p&gt;I also looked at &lt;a href="https://www.mailgun.com/"&gt;Mailgun&lt;/a&gt; and &lt;a href="https://sendgrid.com/"&gt;SendGrid&lt;/a&gt; before settling on Resend. Mailgun has better deliverability monitoring but a more complex API. SendGrid has more features but felt overengineered for what I needed. Resend&amp;rsquo;s free tier and simple API won. If you have strong opinions on email APIs, I&amp;rsquo;m curious to hear them.&lt;/p&gt;
&lt;p&gt;The total cost of running this: zero. Cloudflare Workers has a generous free tier. Cloudflare R2 (where the HTML newsletters are hosted) has 10GB free storage. Resend gives 3,000 emails per month. The Python script runs locally or on my Azure instance.&lt;/p&gt;
&lt;p&gt;You can find &lt;a href="https://static.philippdubach.com/newsletter/newsletter-2025-12.html"&gt;my first newsletter here&lt;/a&gt;. The full code for both the &lt;a href="https://github.com/philippdubach/newsletter-generator"&gt;newsletter generator&lt;/a&gt; and the &lt;a href="https://github.com/philippdubach/newsletter-api"&gt;subscriber API&lt;/a&gt; is on GitHub.&lt;/p&gt;</description></item><item><title>How AI is Shaping My Investment Portfolio for 2026</title><link>http://philippdubach.com/posts/how-ai-is-shaping-my-investment-portfolio-for-2026/</link><pubDate>Fri, 12 Dec 2025 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/how-ai-is-shaping-my-investment-portfolio-for-2026/</guid><description>&lt;p&gt;I have two portfolios: (a) long-term, diversified, low-cost ETFs, and (b) collecting diamonds in front of bulldozers, short-term option plays, and some individual stocks I find interesting. Here, we will only look at (a). This essay is structured along five themes I believe to be true for 2026:&lt;/p&gt;
&lt;p&gt;&lt;a href="#section1"&gt;(1)&lt;/a&gt; Market Concentration and High Valuations&lt;br&gt;
&lt;a href="#section2"&gt;(2)&lt;/a&gt; US Dollar Depreciation Expected Despite Continued Dominance&lt;br&gt;
&lt;a href="#section3"&gt;(3)&lt;/a&gt; AI Investment Remains Central But Requires Scrutiny&lt;br&gt;
&lt;a href="#section4"&gt;(4)&lt;/a&gt; European Fiscal Revolution Creates Investment Opportunities&lt;br&gt;
&lt;a href="#section5"&gt;(5) &lt;/a&gt;Fixed Income Offers Best Prospects Since Global Financial Crisis&lt;/p&gt;
&lt;p&gt;Let&amp;rsquo;s start with the conclusion. Here&amp;rsquo;s how I will rebalance my portfolio going into 2026:
&lt;a href="#lightbox-allocation_composite3-png-0" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/allocation_composite3.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/allocation_composite3.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/allocation_composite3.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/allocation_composite3.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/allocation_composite3.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/allocation_composite3.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/allocation_composite3.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/allocation_composite3.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/allocation_composite3.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/allocation_composite3.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/allocation_composite3.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/allocation_composite3.png"
alt="2026 portfolio allocation composite showing asset class breakdown: 28% US Equities, 18% Europe, 12% Asia EM, 14% Fixed Income, 5% Gold, 4.5% Crypto"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
What changed and why?: US Equities (-10%): S&amp;amp;P 500 valuations at 23× forward P/E reflect peak optimism; Nasdaq&amp;rsquo;s 30× trailing P/E is unsustainable. I reduce large-cap exposure from 33% to 23% to avoid dual headwinds: equity mean reversion and USD depreciation. I redeploy 5% into US small-cap stocks, which offer better valuations and risk-adjusted returns. Europe (+5%): European equities trade at a 22% discount to global peers and benefit from Germany&amp;rsquo;s €1T+ infrastructure commitment and structural reforms. This compounds three tailwinds: improving fundamentals, valuation re-rating, and EUR stability vs CHF. I increase the allocation from 8% to 13%. Fixed Income (+4%): Global yields remain near post-GFC highs with 10-year UST around 4.2% in December 2025. I reallocate from 10% equities-focused bonds to 14% fixed income (CHF Corporates +5%, EUR Govt Bonds +3.5%, US Treasuries +2%), establishing duration exposure and counter-cyclical protection ahead of Fed rate cuts. Japan (Unchanged): Japan&amp;rsquo;s structural reforms and BOJ stimulus remain supportive, but a 3% Asia Developed exposure is adequate. JPY strength vs USD and CHF acts as a tailwind on existing holdings. Asia EM (+0.5%): Increase from 10% to 10.5% to capture Chinese stimulus and attractive tech valuations. CNY appreciation vs USD provides diversification and a natural hedge against dollar weakness. Alternatives (+0.5%): Increase Listed PE/Alt from 1.5% to 2%, maintaining access to Swiss private markets and uncorrelated returns with currency-matched positioning. Gold (+1%): I increase from 4% to 5%. Gold serves as a hedge against USD weakness while benefiting from record central bank reserve diversification. This measured increase captures structural de-dollarization demand without chasing 2025&amp;rsquo;s roughly +58% performance. Crypto (+0.5%): I increase from 4% to 4.5% as a diversification component. At the &lt;a href="#end"&gt;end of this article&lt;/a&gt;, I have included a short, more technical insight into how I structured my research process.
&lt;a href="#lightbox-portfolio_performance-png-1" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/portfolio_performance.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/portfolio_performance.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/portfolio_performance.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/portfolio_performance.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/portfolio_performance.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/portfolio_performance.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/portfolio_performance.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/portfolio_performance.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/portfolio_performance.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/portfolio_performance.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/portfolio_performance.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/portfolio_performance.png"
alt="2025 portfolio performance chart comparing total portfolio return vs S&amp;amp;P 500 and 60/40 benchmark, showing CHF-hedged strategy outperformance"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
Before we dive into the detailed rationale for my 2026 asset allocation, let&amp;rsquo;s quickly look at how this portfolio performed this year. The portfolio delivered solid returns in 2025, outperforming the 60/40 benchmark, primarily driven by the CHF-hedged S&amp;amp;P 500 allocation and domestic dividend stocks, with CHF-hedged gold providing additional diversification gains. The 11.5% USDCHF depreciation proved to be the defining factor this year, while the S&amp;amp;P 500 returned +18% in USD. Unhedged USD exposure translated to just ~5% in CHF terms, vindicating the currency-hedged core equity strategy. Detractors included the unhedged MSCI World SRI ETF, also hit by FX exposure.&lt;/p&gt;
&lt;p&gt;Now if you are still with me, let&amp;rsquo;s dive into the five themes that I believe to be important going into the next year.&lt;/p&gt;
&lt;section id="section1"&gt;&lt;/section&gt;
&lt;p&gt;• &lt;strong&gt;Market Concentration and High Valuations&lt;/strong&gt;: The S&amp;amp;P 500 has become dangerously concentrated. As of December 2025, the top 10 companies represent approximately 45% of the index&amp;rsquo;s value, a historic concentration level not seen since the dot-com bubble. Nvidia alone accounts for over 7%. What was once a diversified investment across 500 companies is now heavily weighted toward a handful of tech giants, most betting heavily on AI.
&lt;a href="#lightbox-treemap_fixed4-png-2" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/treemap_fixed4.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/treemap_fixed4.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/treemap_fixed4.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/treemap_fixed4.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/treemap_fixed4.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/treemap_fixed4.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/treemap_fixed4.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/treemap_fixed4.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/treemap_fixed4.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/treemap_fixed4.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/treemap_fixed4.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/treemap_fixed4.png"
alt="S&amp;amp;P 500 treemap visualization showing market cap concentration: Nvidia 7.2%, Apple 6.6%, Microsoft 5.9%, with top 10 stocks representing 40% of index"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The top 10 US companies dominate the world equity market: Top 5 US tech firms alone have a collective value ($17.6) that exceeds the combined GDP of the Japan, India, UK, France, and Italy ($17.1).&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;a href="#lightbox-market_cap_comparison-png-3" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/market_cap_comparison.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/market_cap_comparison.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/market_cap_comparison.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/market_cap_comparison.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/market_cap_comparison.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/market_cap_comparison.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/market_cap_comparison.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/market_cap_comparison.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/market_cap_comparison.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/market_cap_comparison.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/market_cap_comparison.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/market_cap_comparison.png"
alt="Bar chart showing top US companies dominate global equity markets, with top 5 tech firms valued at $17.6 trillion, exceeding most national GDPs"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
Even though there might be a consensus view among analysts that elevated valuations are supported by earnings growth rather than multiple expansion alone,&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Valuations are especially high in the US. The S&amp;amp;P500 trades at 23 times forward earnings, near the top of its historical range. While the Nasdaq&amp;rsquo;s 30× trailing P/E is well below the dotcom bubble peak, it still reflects significant optimism. Outside the US, valuations are more moderate: European and Chinese equities are 10% and 7% above their 20-year average valuations, respectively, and Japan&amp;rsquo;s index trades at a discount to its long-term average. &lt;em&gt;via &lt;a href="https://www.ubs.com/global/en/wealthmanagement/insights/year-ahead-registration.html"&gt;UBS Year Ahead&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The &lt;a href="https://www.multpl.com/shiller-pe"&gt;Shiller CAPE ratio&lt;/a&gt; sits at 40.5 as of early December 2025, more than double its historical mean of 17.3 and approaching levels last seen, again, during the dot-com peak.
&lt;a href="#lightbox-cape_ratio_chart-png-4" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/cape_ratio_chart.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/cape_ratio_chart.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/cape_ratio_chart.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/cape_ratio_chart.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/cape_ratio_chart.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/cape_ratio_chart.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/cape_ratio_chart.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/cape_ratio_chart.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/cape_ratio_chart.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/cape_ratio_chart.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/cape_ratio_chart.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/cape_ratio_chart.png"
alt="S&amp;amp;P 500 Index Shiller CAPE Ratio (1871-2025)"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
Under these circumstances, I think small-cap and international equities offer more attractive entry points than US large-cap indices. This creates an opportunity to shift part of the US equity holding out of the S&amp;amp;P 500 into a mix of: S&amp;amp;P 500 value index funds (excluding high P/E ratio stocks), mid-cap stocks, international index funds (for geographic diversification), and small-cap stocks (which have more normal valuations and haven&amp;rsquo;t experienced the same speculative growth). This also aligns with my view that &lt;a href="http://philippdubach.com/posts/is-ai-really-eating-the-world-1/2/"&gt;the AI boom might not end with a winner-takes-all situation for the hyperscalers&lt;/a&gt;.&lt;/p&gt;
&lt;section id="section2"&gt;&lt;/section&gt;
&lt;p&gt;• &lt;strong&gt;US Dollar Depreciation Expected Despite Continued Dominance&lt;/strong&gt;: There is growing consensus among analysts for continued dollar weakness, with JP Morgan estimating the currency remains roughly 10% overvalued and Goldman Sachs projecting 4% depreciation over the coming year. But, dollar dominance in global finance will erode only slowly over decades through structural shifts in trade and GDP share, while dollar valuation can decline much faster due to less exceptional US economic performance and difficulty attracting unhedged capital flows. The key driver is the US&amp;rsquo;s shrinking share of global trade and persistent fiscal deficits, not an imminent collapse of reserve currency status. This aligns with the points we outlined in our previous review of &lt;a href="http://philippdubach.com/posts/pozsars-bretton-woods-iii-three-years-later-2/2/"&gt;Pozsar&amp;rsquo;s Bretton Woods III&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This distinction is clearly visible in historical data: according to &lt;a href="https://data.imf.org/?sk=E6A5F467-C14B-4AA8-9F6D-5A09EC4E62A4"&gt;IMF COFER data&lt;/a&gt;, the dollar&amp;rsquo;s share of global reserves has declined gradually from 71% in Q1 1999 to 56% by Q2 2025, a structural erosion occurring over 25 years. In contrast, the trade-weighted dollar index has experienced far more volatile swings, fluctuating between 95 and 130 over the same period, with particularly sharp movements during crisis periods (2008 financial crisis, 2020 pandemic). The dollar can lose 15-20% of its value in just a few years while maintaining its reserve currency dominance. Recent strength to 130 in 2022-2024 appears unsustainable given widening fiscal deficits and declining US share of global trade, suggesting room for significant near-term depreciation even as the dollar&amp;rsquo;s reserve status erodes only gradually.
&lt;a href="#lightbox-dollar_reserve_status-png-5" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/dollar_reserve_status.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/dollar_reserve_status.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/dollar_reserve_status.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/dollar_reserve_status.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/dollar_reserve_status.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/dollar_reserve_status.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/dollar_reserve_status.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/dollar_reserve_status.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/dollar_reserve_status.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/dollar_reserve_status.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/dollar_reserve_status.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/dollar_reserve_status.png"
alt="Dual-axis chart showing USD share of global reserves declining from 71% (1999) to 56% (2025) alongside trade-weighted dollar index fluctuations between 95-130"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
The divergence between structural dominance (slow decline) and cyclical valuation (rapid fluctuations) shows that dollar depreciation can occur independently of reserve currency status changes. Gold and the dollar typically move in opposite directions, and when the dollar weakens, gold becomes more attractive as an alternative store of value, driving its price higher. The outlook for gold in 2026 reflects a convergence of supportive factors beyond simple dollar weakness. Gold has already broken above $4,000/oz for the first time, driven by persistent inflation volatility and increasing demand from both investors and central banks. The structural case strengthens as central banks accelerate reserve diversification, with official sector gold purchases reaching record levels in 2023-2024 as institutions reduce dollar concentration.
&lt;a href="#lightbox-gold_dollar_chart2-png-6" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/gold_dollar_chart2.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/gold_dollar_chart2.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/gold_dollar_chart2.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/gold_dollar_chart2.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/gold_dollar_chart2.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/gold_dollar_chart2.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/gold_dollar_chart2.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/gold_dollar_chart2.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/gold_dollar_chart2.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/gold_dollar_chart2.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/gold_dollar_chart2.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/gold_dollar_chart2.png"
alt="Chart comparing Gold prices and Dollar Index from 1985-2024, showing gold&amp;#39;s rise from $300 to $2,700 and the dollar&amp;#39;s cyclical fluctuations, illustrating their inverse relationship"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
I modestly increased my FX-hedged gold position from 4.0% to 5.0%, reflecting upgraded return forecasts but keeping the allocation measured, given gold&amp;rsquo;s exceptional performance: up roughly 58% year-to-date through December 2025. This increase captures institutional conviction without chasing momentum.&lt;/p&gt;
&lt;section id="section3"&gt;&lt;/section&gt;
&lt;p&gt;• &lt;strong&gt;AI Investment Remains Central But Requires Scrutiny&lt;/strong&gt;: Almost all investment reports I read over the past weeks position AI as the dominant investment catalyst, with capex projected to reach $571 billion in 2026 (UBS) and potentially $1.3 trillion by 2030. The five largest hyperscalers now account for ~27% of S&amp;amp;P 500 capital expenditure.
&lt;a href="#lightbox-ai_capex_historical2-png-8" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/ai_capex_historical2.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/ai_capex_historical2.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/ai_capex_historical2.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/ai_capex_historical2.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/ai_capex_historical2.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/ai_capex_historical2.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/ai_capex_historical2.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/ai_capex_historical2.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/ai_capex_historical2.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/ai_capex_historical2.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/ai_capex_historical2.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/ai_capex_historical2.png"
alt="Bar chart of infrastructure investment peaks as % of US GDP. AI capex projections for 2026 (1.9%, $571B) and 2030 (3.8%, $1.3T) would surpass all historical infrastructure booms including Broadband 2000 (1.15%) and Electricity 1949 (0.98%). Asterisks denote projected values from UBS"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
AI capital expenditure is projected to reach $1.3 trillion by 2030 (3.8% of US GDP), which would exceed all previous infrastructure booms including broadband (1.15%), electricity (0.98%), and the Apollo program (0.74%). However, as UBS notes,&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;no investment boom has ever seen capital spending perfectly match future demand.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I personally view the current AI boom as potentially speculative at least in terms of the current valuations, with many top S&amp;amp;P companies having inflated price-to-earnings ratios. &lt;a href="http://philippdubach.com/posts/is-ai-really-eating-the-world-1/2/"&gt;I&amp;rsquo;m also not convinced that AGI is imminent or that AI model providers will capture most of the economic value, believing AI may become a competitive commodity&lt;/a&gt; where value flows to companies using AI rather than those providing it. For portfolio allocation purposes, the actions derived are consistent with what we outlined in &lt;a href="#section1"&gt;(1)&lt;/a&gt; Market Concentration and Active Management Opportunity.&lt;/p&gt;
&lt;section id="section4"&gt;&lt;/section&gt;
&lt;p&gt;• &lt;strong&gt;European Fiscal Revolution Creates Investment Opportunities&lt;/strong&gt;: Germany&amp;rsquo;s historic abandonment of its debt brake policy, committing over €1 trillion to infrastructure, defense, and security spending (with an additional €600 billion in private sector commitments), represents a structural break from decades of fiscal conservatism.
&lt;a href="#lightbox-germany_fiscal_pivot-png-9" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/germany_fiscal_pivot.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/germany_fiscal_pivot.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/germany_fiscal_pivot.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/germany_fiscal_pivot.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/germany_fiscal_pivot.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/germany_fiscal_pivot.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/germany_fiscal_pivot.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/germany_fiscal_pivot.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/germany_fiscal_pivot.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/germany_fiscal_pivot.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/germany_fiscal_pivot.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/germany_fiscal_pivot.png"
alt="Bar chart of Germany&amp;#39;s fiscal spending breakdown: €500B infrastructure fund, €400B defense spending, plus €600B private sector commitments totaling over €1.5 trillion"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
JP Morgan upgrades eurozone growth to 1.5% and Goldman Sachs identifies a structural shift focused on defense independence, energy security, and reindustrialization. This fiscal activism is expected to narrow the US-Europe growth differential from 60bps to 30bps, making European equities, currently trading at a 22% discount to global peers, increasingly attractive despite elevated valuations elsewhere.
&lt;a href="#lightbox-european_valuation_discount-png-10" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/european_valuation_discount.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/european_valuation_discount.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/european_valuation_discount.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/european_valuation_discount.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/european_valuation_discount.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/european_valuation_discount.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/european_valuation_discount.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/european_valuation_discount.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/european_valuation_discount.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/european_valuation_discount.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/european_valuation_discount.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/european_valuation_discount.png"
alt="Bar chart comparing regional equity valuations: US at 23x forward P/E versus Europe at 14x, showing 22% European discount to global peers"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/p&gt;
&lt;section id="section5"&gt;&lt;/section&gt;
&lt;p&gt;• &lt;strong&gt;Fixed Income Offers Best Prospects Since Global Financial Crisis&lt;/strong&gt;: Higher starting yields and steeper curves have dramatically improved bond return potential. As of early December 2025, 10-year US Treasuries yield around &lt;a href="https://www.cnbc.com/quotes/US10Y"&gt;4.2%&lt;/a&gt;, with medium-duration quality bonds expected to generate mid-single-digit returns. All major research houses project 2-3 additional Fed rate cuts in 2026, while the ECB is expected to hold steady and the Bank of Japan to continue hiking. As usual it is to be expected that front-end yields are more sensitive to central bank policy and offer strong counter-cyclical properties, while fiscal concerns drive term-risk premia higher at the long end, benefiting strategic curve positioning.
&lt;a href="#lightbox-historic_crisis_returns-png-11" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/historic_crisis_returns.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/historic_crisis_returns.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/historic_crisis_returns.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/historic_crisis_returns.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/historic_crisis_returns.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/historic_crisis_returns.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/historic_crisis_returns.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/historic_crisis_returns.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/historic_crisis_returns.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/historic_crisis_returns.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/historic_crisis_returns.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/historic_crisis_returns.png"
alt="Bar chart showing 60/40 portfolio returns minus cash at 1-year and 3-year intervals after major crises (1990-2022). Average returns: 7% at 1-year, 22% at 3-year"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
Michael Cembalest, Chairman of Market and Investment Strategy, J.P. Morgan Asset &amp;amp; Wealth Management on the impact of geopolitical events:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;It is shocking how little geopolitics actually matters to markets unless it gets truly terrible.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;• &lt;strong&gt;Other points to consider&lt;/strong&gt;: (1) Global growth remains resilient, with the US expected around 1.8% and global growth near 2.5%. Consensus points to America&amp;rsquo;s economic outperformance becoming &amp;ldquo;less exceptional&amp;rdquo; relative to other regions. (2) Expect elevated inflation volatility and sticky pricing pressures. Fed easing cycles are underway, but the path remains uncertain with tariffs adding to price pressures. (3) Europe&amp;rsquo;s fiscal pivot is the big story. Germany&amp;rsquo;s €1 trillion spending bill marks a historic shift, with broader European infrastructure investment accelerating. Fiscal deficits globally may weigh on currencies. (4) Economic nationalism is reshaping global dynamics. US effective tariff rates have reached levels not seen since 1934, creating a new trade order that markets must price in. (5) China&amp;rsquo;s Tech sector remains a top global opportunity despite tensions. Stimulus measures are supporting equities, and yuan appreciation is expected as growth stabilizes. (6) Attractive entry point for quality bonds. 10-year UST yields around 4.2% in December 2025 offer compelling returns, with better starting valuations than recent years. (7) Elevated geopolitical risks persist: Russia-Ukraine, Middle East tensions, and broader great power competition remain market-moving factors. (8) Bitcoin with institutional adoption accelerating ETF inflows continue and corporate treasury allocations are expanding. Regulatory clarity improving in the US, though enforcement actions remain a wildcard. Leverage buildup in derivatives markets. Watch for Bitcoin halving aftermath effects and macro liquidity conditions as primary drivers.&lt;/p&gt;
&lt;section id="end"&gt;&lt;/section&gt;
&lt;p&gt;A short note on my analysis process: (1) I combed through insights and data from analyst and research outlooks by &lt;a href="https://am.gs.com/en-hk/advisors/insights/article/investment-outlook"&gt;Goldman Sachs Asset Management&lt;/a&gt;, &lt;a href="https://am.jpmorgan.com/content/dam/jpm-am-aem/global/en/insights/portfolio-insights/ltcma/noindex/ltcma-full-report.pdf"&gt;J.P. Morgan Asset Management&lt;/a&gt;, &lt;a href="https://www.morganstanley.com/insights/articles/stock-market-investment-outlook-2026"&gt;Morgan Stanley&lt;/a&gt;, and &lt;a href="https://www.ubs.com/global/en/wealthmanagement/insights/year-ahead-registration.html"&gt;UBS Investment Research&lt;/a&gt;. (2) I wrote a &lt;a href="https://gist.github.com/philippdubach/0087fdaefb8ca905e5df87176c1a31e3"&gt;script to convert large PDFs to Markdown&lt;/a&gt; and optimize them for LLM processing. (3) I then used Claude agents to look for differences and similarities in the reports, &lt;a href="https://gist.github.com/philippdubach/5ebd726295f2fac39915d535000ce63a"&gt;which created an extensive overview&lt;/a&gt;. (4) This served, together with my own thoughts and opinions, as the basis for my 2026 allocation.&lt;/p&gt;
&lt;aside class="disclaimer" role="note" aria-label="Disclaimer"&gt;
&lt;div class="disclaimer-content"&gt;&lt;p&gt;&lt;strong&gt;Disclaimer:&lt;/strong&gt; All opinions expressed are my own. This is not investment, financial, tax, or legal advice. Past performance does not indicate future results. Do your own research and consult qualified professionals before making financial decisions. No liability accepted for any losses.&lt;/p&gt;&lt;/div&gt;
&lt;/aside&gt;</description></item><item><title>Not Logan Roy: Netflix vs. Paramount's Bidding War</title><link>http://philippdubach.com/posts/not-logan-roy-netflix-vs.-paramounts-bidding-war/</link><pubDate>Tue, 09 Dec 2025 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/not-logan-roy-netflix-vs.-paramounts-bidding-war/</guid><description>&lt;p&gt;In the &lt;a href="https://en.wikipedia.org/wiki/Succession_(TV_series)"&gt;HBO series Succession&lt;/a&gt;, billionaire Logan Roy&amp;rsquo;s children spent four seasons scheming, backstabbing, and making offers to inherit a media empire. This week, the real version played out with more zeros and a $252 billion Oracle stake. Time for a closer look:&lt;/p&gt;
&lt;p&gt;On Friday, Warner Bros. Discovery&amp;rsquo;s board agreed to sell the company to &lt;a href="https://ir.netflix.net/investor-news-and-events/financial-releases/press-release-details/2025/NETFLIX-TO-ACQUIRE-WARNER-BROS--FOLLOWING-THE-SEPARATION-OF-DISCOVERY-GLOBAL-FOR-A-TOTAL-ENTERPRISE-VALUE-OF-82-7-BILLION-Equity-Value-of-72-0-Billion/default.aspx"&gt;Netflix for $72 billion&lt;/a&gt;. By Monday, &lt;a href="https://ir.paramount.com/news-releases/news-release-details/paramount-launches-all-cash-tender-offer-acquire-warner-bros"&gt;Paramount had launched a hostile tender offer&lt;/a&gt; directly to shareholders at $30 per share, all cash. In this post I will be going into the gap between those two numbers, streaming economics, aggregator theory, and hostile deal mechanics.
&lt;a href="#lightbox-wbd_offer_comparison2-png-0" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/wbd_offer_comparison2.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/wbd_offer_comparison2.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/wbd_offer_comparison2.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/wbd_offer_comparison2.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/wbd_offer_comparison2.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/wbd_offer_comparison2.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/wbd_offer_comparison2.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/wbd_offer_comparison2.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/wbd_offer_comparison2.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/wbd_offer_comparison2.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/wbd_offer_comparison2.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/wbd_offer_comparison2.png"
alt="Comparison of Netflix vs Paramount offers for Warner Bros Discovery"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
The Netflix offer breaks down into three pieces: $23.25 per share in cash, $4.50 per share in Netflix stock subject to a collar, and shares in a spun-off entity called Discovery Global containing CNN and the cable networks that Netflix doesn&amp;rsquo;t want. Analysts value that stub somewhere between $2 and $5 per share, which puts the total package at roughly $29.75 to $32.75. Paramount is offering $30 per share in cash for the entire company, including the cable assets. Warner&amp;rsquo;s stock closed Friday at $26.08 and opened Monday around $27.64, which tells you the market expects a bidding war but isn&amp;rsquo;t fully convinced either deal closes.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;We&amp;rsquo;re sitting on Wall Street, where cash is still king. We are offering shareholders $17.6 billion more cash than the deal they currently have signed up with Netflix.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;David Ellison&amp;rsquo;s arithmetically correct. Warner&amp;rsquo;s board took the Netflix deal anyway. This gets at something I learned in a dealmaking class in University: Boards weight speculative ideas of long-term value. They believe the Discovery Global spinoff might be worth $5, that Netflix stock has upside, that strategic fit matters. Shareholders, particularly the arbs and institutional holders who actually vote, prefer certainty. Thirty dollars in cash is just $30 in cash. As &lt;a href="https://www.bloomberg.com/opinion/authors/ARbTQlRLRjE/matthew-s-levine"&gt;Matt Levine noted&lt;/a&gt;, &amp;ldquo;$30 in cash is worth more than, well, again, the stock closed at $26.08 on Friday.&amp;rdquo; The board&amp;rsquo;s job is to maximize long-term shareholder value. The shareholders would like their value now, please.
&lt;a href="#lightbox-streaming_market_cap2-png-1" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/streaming_market_cap2.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/streaming_market_cap2.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/streaming_market_cap2.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/streaming_market_cap2.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/streaming_market_cap2.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/streaming_market_cap2.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/streaming_market_cap2.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/streaming_market_cap2.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/streaming_market_cap2.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/streaming_market_cap2.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/streaming_market_cap2.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/streaming_market_cap2.png"
alt="Market capitalization of major streaming and media companies 2020-2025"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
The strategic logic behind Netflix&amp;rsquo;s offer deserves examination. Both companies began as distributors. The Warner brothers &lt;a href="https://en.wikipedia.org/wiki/Warner_Bros.#Founding"&gt;opened a movie theater in Pennsylvania in 1907&lt;/a&gt; before moving into film production; Netflix mailed DVDs before becoming a streaming giant. The crucial difference: physical distribution has capacity constraints, while internet distribution has none. A theater seat that goes unsold is lost revenue forever. The marginal costs of Netflix to deliver to one more subscriber, whether that subscriber is in Zurich or Tokio, are essentially zero. This asymmetry explains why Netflix is worth $425 billion and the combined legacy studios are worth a fraction of that. Consider what Netflix does to content it doesn&amp;rsquo;t own. Drive to Survive transformed Formula. &lt;a href="https://f1miamigp.com/news/press-release/apple-secures-f1-broadcast-deal-for-the-us/#:~:text=Formula%201%20has%20announced%20Apple,passion%20for%20innovation%20and%20entertainment."&gt;Apple is now paying $150 million annually&lt;/a&gt; for F1 broadcast rights that ESPN once carried for free. NBCUniversal&amp;rsquo;s Suits sat dormant on Peacock until Netflix licensed it and turned it into a streaming phenomenon. In each case, Netflix created enormous value but captured little of it. The logical next step: own the IP instead of renting it.&lt;/p&gt;
&lt;p&gt;This is what Ben Thompson calls &lt;a href="https://stratechery.com/2017/defining-aggregators/"&gt;aggregation economics&lt;/a&gt;. Hollywood executives spent years insisting that content was king, and for decades they were right. When distribution required owning theaters, securing broadcast licenses, or negotiating cable carriage, the studios held leverage. The internet eliminated those bottlenecks. Now the scarce resource isn&amp;rsquo;t access to content but attention, and the companies that own the customer relationship capture most of the value. Netflix grasped this early; the legacy studios chased streaming without understanding why Netflix was winning. The result: Netflix commands a market cap of $425 billion, while Paramount&amp;rsquo;s standalone value sits around $15 billion.
&lt;a href="#lightbox-financing_structure3-png-3" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/financing_structure3.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/financing_structure3.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/financing_structure3.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/financing_structure3.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/financing_structure3.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/financing_structure3.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/financing_structure3.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/financing_structure3.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/financing_structure3.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/financing_structure3.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/financing_structure3.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/financing_structure3.png"
alt="Paramount financing structure showing $54B debt, $40B Ellison equity, and Gulf SWF contributions"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
Paramount&amp;rsquo;s financing structure is worth looking at. The &lt;a href="https://www.sec.gov/Archives/edgar/data/1437107/000119312525310708/d92876dex99a1a.htm"&gt;tender offer filing&lt;/a&gt; is backed by $54 billion in debt commitments from Bank of America, Citigroup, and Apollo, plus a $40.4 billion equity backstop from Larry Ellison&amp;rsquo;s trust. That trust holds approximately 1.16 billion Oracle shares worth around $252 billion at current prices. Additional equity comes from Saudi Arabia&amp;rsquo;s Public Investment Fund, Abu Dhabi&amp;rsquo;s L&amp;rsquo;imad Holding, Qatar Investment Authority, and Affinity Partners. To avoid CFIUS jurisdiction, the foreign investors have waived all governance and voting rights.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Hostile Tender Mechanics&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;In a friendly deal, the target&amp;rsquo;s board negotiates terms and recommends shareholders accept. In a hostile tender, the acquirer goes directly to shareholders with a public offer, bypassing the board. Warner&amp;rsquo;s board has 10 business days to respond with a recommendation. Defense mechanisms exist (poison pills, enhanced breakup fees) but all invite litigation. The best defense is usually more money from the preferred bidder.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The antitrust arguments on both sides are instructive. Ellison argues that combining Netflix (#1 in streaming) with HBO Max (#3) is anticompetitive: &amp;ldquo;It&amp;rsquo;s like saying Coke could buy Pepsi because Budweiser sells a lot of beer.&amp;rdquo; Netflix counters by pointing to &lt;a href="https://www.nielsen.com/news-center/2025/streaming-claims-record-high-share-of-tv-viewing/"&gt;Nielsen&amp;rsquo;s TV viewing data&lt;/a&gt;, which shows Netflix at 8% of total TV usage, slightly below Paramount&amp;rsquo;s 8.2%. By that measure, Netflix ranks sixth overall, with YouTube at #1 and Disney at #2. The relevant market definition will determine whether this deal survives regulatory review.
&lt;a href="#lightbox-tv_viewing_share2-png-4" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/tv_viewing_share2.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/tv_viewing_share2.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/tv_viewing_share2.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/tv_viewing_share2.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/tv_viewing_share2.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/tv_viewing_share2.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/tv_viewing_share2.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/tv_viewing_share2.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/tv_viewing_share2.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/tv_viewing_share2.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/tv_viewing_share2.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/tv_viewing_share2.png"
alt="Nielsen TV viewing share by platform"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
If regulators define the market narrowly as &amp;ldquo;subscription video on demand,&amp;rdquo; combining Netflix with HBO Max looks troubling. If they define it as &amp;ldquo;all video consumption,&amp;rdquo; Netflix is one player among many, competing against YouTube&amp;rsquo;s bottomless catalog of free content. This framing matters because YouTube already exceeds Netflix in total viewing time. The existential threat facing Hollywood isn&amp;rsquo;t consolidation among paid streamers. It&amp;rsquo;s the democratization of content creation itself. Every teenager with a smartphone is a potential competitor for audience attention. The hours flowing to TikTok and YouTube creators don&amp;rsquo;t flow to HBO. From this vantage point, Netflix absorbing Warner Bros. looks less like monopolization than like circling the wagons. Ellison offered his counter-narrative on Monday: the Netflix deal means &amp;ldquo;the death of the theatrical movie business in Hollywood.&amp;rdquo; He promised to put 30 movies a year in theaters exclusively and to combine CBS News with CNN into what he called a news service &amp;ldquo;in the trust business, the truth business&amp;rdquo; that &amp;ldquo;speaks to the 70% of Americans that are in the middle.&amp;rdquo; Whether you find this vision compelling probably depends on your priors about theatrical distribution and centrist news. The deal timeline matters. Netflix&amp;rsquo;s offer is expected to take 12-18 months to close, driven by antitrust review. Paramount claims its offer has a faster path to regulatory approval. If Warner&amp;rsquo;s shareholders ultimately take Paramount&amp;rsquo;s offer, Warner owes Netflix a $2.8 billion breakup fee. If Netflix&amp;rsquo;s deal collapses after the review period, Netflix owes Warner $5.8 billion, one of the largest breakup fees on record.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Breakup Fee Economics&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Breakup fees serve two functions: compensating the jilted bidder for deal expenses and transaction costs, and creating a hurdle for competing offers. A $5.8 billion reverse breakup fee equals roughly $2 per Warner share, meaning any competing bid needs to clear that hurdle to be economically equivalent. The size of Netflix&amp;rsquo;s fee signals both confidence and a willingness to pay for optionality.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Warner&amp;rsquo;s stock trading below both offers reflects the compounded uncertainties: antitrust risk, timeline risk, financing risk, and the possibility that both deals fall apart. The 12-18 month window creates a lot of room for things to change. Interest rates could move. The administration&amp;rsquo;s antitrust priorities could shift. Netflix&amp;rsquo;s stock could fall further, reducing the value of the stock component. Paramount&amp;rsquo;s financing consortium could develop cold feet.&lt;/p&gt;
&lt;p&gt;What happens next is procedurally straightforward. Warner&amp;rsquo;s board will respond to Paramount&amp;rsquo;s tender offer within 10 business days. Netflix will likely raise its bid; Ellison signaled Monday that $30 &amp;ldquo;wasn&amp;rsquo;t best and final.&amp;rdquo; The arbs will push for whichever deal offers better risk-adjusted value. Whoever wins will spend the next year in &lt;a href="https://www.ftc.gov/legal-library/browse/statutes/hart-scott-rodino-antitrust-improvements-act-1976"&gt;antitrust review&lt;/a&gt; while the other side&amp;rsquo;s lawyers look for grounds to challenge.&lt;/p&gt;
&lt;p&gt;Hollywood&amp;rsquo;s century-old industrial structure is unwinding regardless of which bid prevails. The studio system emerged when controlling both production and distribution created durable advantages. The internet dissolved those advantages by making distribution essentially free and universally accessible. Warner Bros. spent a century building an integrated media empire; Netflix spent two decades proving that owning the customer relationship matters more than owning the soundstages. The question isn&amp;rsquo;t whether legacy media consolidates into tech platforms. It&amp;rsquo;s which platform, at what price, and whether inherited wealth can rewrite the outcome. I doubt it. On the internet, aggregators tend to win, and Netflix is the aggregator in video.&lt;/p&gt;</description></item><item><title>Nike's Crisis and the Economics of Brand Decay</title><link>http://philippdubach.com/posts/nikes-crisis-and-the-economics-of-brand-decay/</link><pubDate>Tue, 02 Dec 2025 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/nikes-crisis-and-the-economics-of-brand-decay/</guid><description>&lt;h2 id="nikes-28-billion-value-destruction"&gt;Nike&amp;rsquo;s $28 Billion Value Destruction&lt;/h2&gt;
&lt;blockquote&gt;
&lt;p&gt;What it sounds like is that the CEO has the wrong people making the wrong decisions across the strongest brand or one of the strongest brands in consumer history.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This quote by &lt;a href="https://www.youtube.com/watch?v=vRNe_aJEUqA"&gt;Scott Galloway on his podcast&lt;/a&gt; is from July 2024. In March 2025, Nike reported its worst revenue decline in nearly five years: an &lt;a href="https://www.reuters.com/business/retail-consumer/nike-post-worst-revenue-fall-5-years-stagnant-demand-2025-03-19/"&gt;11.5% drop to $11.01 billion. Digital sales fell 20%, app downloads decreased 35%, and store foot traffic declined 11%&lt;/a&gt;. Nike&amp;rsquo;s crisis reveals how competitive advantages work, and how quickly they can disappear when the company that once captured roughly half of the US athletic footwear market systematically weakens its own foundations.&lt;/p&gt;
&lt;p&gt;Nike spent decades building dominance through complementary assets: product development, athlete partnerships, and marketing that reinforced premium positioning. These three worked together to create what Porter would call a &lt;a href="https://hbr.org/1996/11/what-is-strategy"&gt;sustainable competitive advantage&lt;/a&gt;. When all three are strong, you can charge premium prices and maintain gross margins above 40%. When you systematically weaken each pillar simultaneously, the advantage collapses.
&lt;a href="#lightbox-nike_final-png-0" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/nike_final.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/nike_final.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/nike_final.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/nike_final.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/nike_final.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/nike_final.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/nike_final.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/nike_final.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/nike_final.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/nike_final.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/nike_final.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/nike_final.png"
alt="Nike stock price decline vs On Holdings and Deckers (Hoka) performance 2024-2025, showing NKE underperformance against athletic footwear competitors"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/p&gt;
&lt;h2 id="the-direct-to-consumer-strategy-failure"&gt;The Direct-to-Consumer Strategy Failure&lt;/h2&gt;
&lt;p&gt;Nike&amp;rsquo;s gross margins peaked around 45% in the mid-2010s. By fiscal 2025, &lt;a href="https://investors.nike.com/investors/news-events-and-reports/investor-news/investor-news-details/2025/NIKE-Inc--Reports-Fiscal-2025-Fourth-Quarter-and-Full-Year-Results/default.aspx"&gt;margins had compressed to 42.7%&lt;/a&gt;, a decline of 190 basis points, primarily due to higher discounts and unfavorable sales channel mix. The direct-to-consumer shift that was supposed to improve margins actually made things worse because it reduced retail presence at exactly the wrong moment. Competitors filled the shelf space Nike vacated.&lt;/p&gt;
&lt;p&gt;The most significant strategic shift came in 2020 when Nike hired &lt;a href="https://en.wikipedia.org/wiki/John_Donahoe"&gt;John Donahoe&lt;/a&gt;, a former Bain consultant and eBay CEO, to replace &lt;a href="https://en.wikipedia.org/wiki/Mark_Parker"&gt;Mark Parker&lt;/a&gt;. Donahoe accelerated the direct-to-consumer transition, terminating hundreds of wholesale accounts. The theory was sound: wholesale margins are 30-35% after retailer markups, while direct sales can reach 50% or higher. But retail shelf space is a zero-sum game. When Nike pulled out, competitors immediately filled the void. Running brands like On and Hoka, which had been developing new sole technology and cushioning systems, suddenly had access to prime retail real estate. &lt;a href="https://www.on.com/en-ch/explore/technology/cloudtec?srsltid=AfmBOoomVyhraDqH3pGlVf7yD58YPNIajIRWR02jOmbaop1oR1i5y941"&gt;On&amp;rsquo;s CloudTec&lt;/a&gt; and &lt;a href="https://www.hoka.com/en/us/maximum-cushion/"&gt;Hoka&amp;rsquo;s maximalist cushioning&lt;/a&gt; gained visibility precisely when Nike was reducing its retail presence. Nike improved its direct-to-consumer margins but reduced its total addressable market. The company assumed consumers would follow it online, but many didn&amp;rsquo;t. Instead, they discovered alternatives in physical stores.&lt;/p&gt;
&lt;h2 id="product-innovation-decline-and-organizational-restructuring"&gt;Product Innovation Decline and Organizational Restructuring&lt;/h2&gt;
&lt;p&gt;The product development problem was structural. Donahoe reorganized Nike from sport-specific teams to general categories: Men&amp;rsquo;s, Women&amp;rsquo;s, and Kids&amp;rsquo;. This destroyed the organizational capabilities that had driven product development. The running team understood biomechanics and materials science. The basketball team understood court dynamics and performance requirements. When you collapse these into general categories, you lose the specialized knowledge that creates functional differentiation. Senior designers and executives left for competitors. The company began relying more heavily on retro basketball shoes, which worked during the pandemic. But when trends shifted toward lower-profile shoes like Adidas Sambas, Nike was left holding excess inventory. Inventory turnover deteriorated, and gross margins compressed further.&lt;/p&gt;
&lt;h2 id="how-on-and-hoka-captured-nikes-athletic-footwear-market-share"&gt;How On and Hoka Captured Nike&amp;rsquo;s Athletic Footwear Market Share&lt;/h2&gt;
&lt;p&gt;The combination of reduced retail presence and weaker product development created a gap that competitors exploited. On&amp;rsquo;s revenue grew from &lt;a href="https://investors.on-running.com/financials-and-filings/financial-releases/news-details/2025/On-Announces-Fourth-Quarter-and-Full-Year-Results-and-the-Filing-of-its-Annual-Report-on-Form-20-F-for-2024/default.aspx"&gt;$330 million in 2020 to $1.8 billion by 2025&lt;/a&gt;. Hoka&amp;rsquo;s parent company, Deckers, saw Hoka revenue increase from &lt;a href="https://www.deckers.com/investors/financial-information/annual-reports"&gt;$352 million in 2020 to $1.4 billion&lt;/a&gt;. Moody&amp;rsquo;s has noted that emerging brands like On and Hoka have intensified competition, contributing to Nike&amp;rsquo;s 10% revenue drop and &lt;a href="https://www.reuters.com/business/moodys-downgrades-nikes-debt-ratings-cost-pressures-2025-11-13/"&gt;42% decline in EBIT&lt;/a&gt;.
&lt;a href="#lightbox-nike_revenue_comparison-png-1" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/nike_revenue_comparison.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/nike_revenue_comparison.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/nike_revenue_comparison.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/nike_revenue_comparison.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/nike_revenue_comparison.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/nike_revenue_comparison.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/nike_revenue_comparison.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/nike_revenue_comparison.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/nike_revenue_comparison.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/nike_revenue_comparison.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/nike_revenue_comparison.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/nike_revenue_comparison.png"
alt="Nike revenue decline compared to On Running and Hoka revenue growth 2020-2025, showing athletic footwear market share shift"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
The athlete partnership strategy collapsed. Roger Federer left for On, which he partially owns. Harry Kane signed with Skechers. Simone Biles went to Athleta. Josh Allen moved to New Balance. Tiger Woods left to start his own brand. These departures matter because athlete partnerships aren&amp;rsquo;t just marketing expenses. They&amp;rsquo;re product development inputs and distribution channels. When Michael Jordan worked with Nike in the 1980s, the collaboration produced the Air Jordan line, which generated over &lt;a href="https://investors.nike.com/investors/news-events-and-reports/investor-news/default.aspx"&gt;$5 billion in annual revenue&lt;/a&gt; by 2023. The real problem isn&amp;rsquo;t that athlete deals are more expensive today. It&amp;rsquo;s that &lt;a href="https://www.youtube.com/watch?v=txwDt595-Yo&amp;amp;t=199"&gt;Nike lost athletes&lt;/a&gt; because it was no longer the clear leader in product development. Federer left because On was developing better running shoes.&lt;/p&gt;
&lt;h2 id="nikes-marketing-strategy-failure-and-brand-erosion"&gt;Nike&amp;rsquo;s Marketing Strategy Failure and Brand Erosion&lt;/h2&gt;
&lt;p&gt;The marketing shift was perhaps the most visible change, and the most damaging to brand positioning. Nike&amp;rsquo;s historical advertising was unapologetically about winning. The tagline &amp;ldquo;&lt;a href="https://www.reddit.com/media?url=https%3A%2F%2Fpreview.redd.it%2Fe3lkj2o4lml31.jpg%3Fwidth%3D1080%26crop%3Dsmart%26auto%3Dwebp%26s%3Dfdc5352ed7f85228e7a90d594462e82c7089b774"&gt;Just Do It&lt;/a&gt;&amp;rdquo; was aggressive. The tone was confident, sometimes arrogant. This worked because it matched the product and the athletes. When you have the best products and the best athletes, you can be arrogant. When you don&amp;rsquo;t, arrogance becomes empty posturing. Under Donahoe, the messaging softened. Ads became more whimsical, focused on participation rather than victory, and emphasized social issues. A strategy that worked for many brands but for Nike, this represented a departure from what had made the brand distinctive. More importantly, it didn&amp;rsquo;t match the product. Without new products and top athletes, the softer messaging couldn&amp;rsquo;t sustain premium positioning.&lt;/p&gt;
&lt;p&gt;Then came the tariffs. In early 2025, the Trump administration implemented &lt;a href="https://www.whitehouse.gov/wp-content/uploads/2025/04/Annex-I.pdf"&gt;&amp;ldquo;reciprocal tariffs&amp;rdquo;&lt;/a&gt; on imports from various countries. &lt;a href="https://media.about.nike.com/files/be19ec03-e139-46b6-9b6c-d44a5c699110/FY23_Nike_Impact_Report.pdf"&gt;Nike manufactures 95% of its shoes and 60% of its apparel in Southeast Asia&lt;/a&gt;, particularly in Vietnam, China, Indonesia, and Cambodia. While rates were later adjusted downward, Nike still faces an estimated &lt;a href="https://apnews.com/article/f84fe37e11dbf4b439d8655d3533380c"&gt;$1 billion to $1.5 billion in additional tariff costs&lt;/a&gt; over the next few years. Nike has announced plans to reduce China&amp;rsquo;s share of its U.S. footwear imports from 16% to single digits by fiscal 2026, but this requires building new supplier relationships, retooling factories, and establishing new logistics networks. The transition will take years and cost billions.&lt;/p&gt;
&lt;aside class="inline-newsletter" aria-label="Newsletter signup"&gt;
&lt;div class="inline-newsletter-content"&gt;
&lt;p class="inline-newsletter-headline"&gt;Enjoy this writing? Get new posts, projects, and articles delivered monthly.&lt;/p&gt;
&lt;form id="inline-newsletter-2-form" class="inline-newsletter-form"&gt;
&lt;label for="inline-newsletter-2-email" class="visually-hidden"&gt;Email address&lt;/label&gt;
&lt;input
type="email"
id="inline-newsletter-2-email"
name="email"
placeholder="your@email.com"
required
class="inline-newsletter-input"
aria-label="Email address"
/&gt;
&lt;button type="submit" class="inline-newsletter-button"&gt;Sign Up&lt;/button&gt;
&lt;/form&gt;
&lt;p id="inline-newsletter-2-privacy" class="inline-newsletter-privacy"&gt;&lt;a href="http://philippdubach.com/posts/building-a-no-tracking-newsletter-from-markdown-to-distribution/"&gt;No tracking&lt;/a&gt;. Unsubscribe anytime.&lt;/p&gt;
&lt;div id="inline-newsletter-2-message" class="inline-newsletter-message" style="display: none;"&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;/aside&gt;
&lt;script&gt;
(function() {
var formId = 'inline-newsletter-2-form';
var messageId = 'inline-newsletter-2-message';
var emailId = 'inline-newsletter-2-email';
var privacyId = 'inline-newsletter-2-privacy';
function init() {
var form = document.getElementById(formId);
var messageDiv = document.getElementById(messageId);
var emailInput = document.getElementById(emailId);
var privacyDiv = document.getElementById(privacyId);
if (privacyDiv &amp;&amp; !privacyDiv.dataset.countLoaded) {
privacyDiv.dataset.countLoaded = 'true';
fetch('https://newsletter-api.philippd.workers.dev/api/subscriber-count')
.then(function(r) { return r.json(); })
.then(function(data) {
if (data.display) {
var countText = document.createTextNode('Join ' + data.display + ' readers. ');
privacyDiv.insertBefore(countText, privacyDiv.firstChild);
}
})
.catch(function() { });
}
if (!form) return;
form.addEventListener('submit', function(e) {
e.preventDefault();
var email = emailInput.value.trim();
if (!email) return;
var emailRegex = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
if (!emailRegex.test(email)) {
showMessage('Please enter a valid email address.', 'error');
return;
}
var submitButton = form.querySelector('button[type="submit"]');
submitButton.disabled = true;
submitButton.textContent = 'Subscribing...';
fetch('https://newsletter-api.philippd.workers.dev/api/subscribe', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ email: email })
})
.then(function(response) { return response.json(); })
.then(function(data) {
if (data.success) {
form.style.display = 'none';
document.querySelector('#' + formId).closest('.inline-newsletter').querySelector('.inline-newsletter-privacy').style.display = 'none';
showMessage('Thanks for subscribing! You\'ll receive the next newsletter in your inbox.', 'success');
} else {
showMessage(data.error || 'Something went wrong. Please try again.', 'error');
submitButton.disabled = false;
submitButton.textContent = 'Sign Up';
}
})
.catch(function() {
showMessage('Something went wrong. Please try again later.', 'error');
submitButton.disabled = false;
submitButton.textContent = 'Sign Up';
});
});
function showMessage(text, type) {
messageDiv.textContent = text;
messageDiv.className = 'inline-newsletter-message inline-newsletter-message-' + type;
messageDiv.style.display = 'block';
}
}
if (document.readyState === 'loading') {
document.addEventListener('DOMContentLoaded', init);
} else {
init();
}
})();
&lt;/script&gt;
&lt;h2 id="supply-chain-risk-and-path-dependency"&gt;Supply Chain Risk and Path Dependency&lt;/h2&gt;
&lt;p&gt;The tariff situation highlights a broader issue with Nike&amp;rsquo;s supply chain strategy. The company has concentrated manufacturing in a small number of countries to achieve scale economies and cost efficiency. This works well when trade policy is stable, but when trade policy shifts, the concentration becomes a vulnerability. This is what economists call path dependency. Past decisions constrain future options. Nike focused on cost efficiency, but this created exposure to trade policy risk that the company can&amp;rsquo;t easily unwind. When &lt;a href="https://www.reuters.com/business/retail-consumer/nike-supplier-halts-production-3-vietnam-plants-due-covid-19-2021-07-15/"&gt;Vietnam shut down factories during COVID-19&lt;/a&gt;, Nike lost three months of production.&lt;/p&gt;
&lt;h2 id="nikes-turnaround-strategy-under-elliott-hill"&gt;Nike&amp;rsquo;s Turnaround Strategy Under Elliott Hill&lt;/h2&gt;
&lt;p&gt;The company&amp;rsquo;s response has been to return to its original formula. In September 2024, &lt;a href="https://apnews.com/article/6f439a3f5a62e9b2380aaf3e21c7e220"&gt;Nike replaced Donahoe with Elliott Hill&lt;/a&gt;, a 30-year company veteran. Hill is pushing product development as part of &lt;a href="https://investors.nike.com/investors/news-events-and-reports/investor-news/investor-news-details/2025/NIKE-Inc--Reports-Fiscal-2025-Fourth-Quarter-and-Full-Year-Results/default.aspx"&gt;Nike&amp;rsquo;s &amp;ldquo;Sport Offense&amp;rdquo; strategy&lt;/a&gt;. There are &lt;a href="https://www.youtube.com/watch?v=b0Ezn5pZE7o"&gt;early signs&lt;/a&gt; this might work, though Nike has faced multiple consecutive quarters of declining sales as it works to stabilize the business.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The sport offense realignment will focus on driving distinction within key sports, building a complete product portfolio, creating stories to inspire and connect with consumers, and elevating and growing the entire marketplace.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Nike&amp;rsquo;s competitors have used the past few years to build stronger positions. On has established itself in running with new sole technology and Federer&amp;rsquo;s endorsement, growing from $330 million to $1.8 billion in revenue. Hoka has gained market share with its maximalist cushioning, growing from &lt;a href="https://www.deckers.com/investors/financial-information/annual-reports"&gt;$352 million to $1.4 billion&lt;/a&gt;. The competitive environment has fundamentally changed. In the 2010s, Nike could dominate through scale and brand power alone. Today, smaller brands can compete effectively by focusing on specific sports or product categories. Social media and direct-to-consumer platforms have lowered barriers to entry. Nike can&amp;rsquo;t simply return to its old formula and expect to regain dominance because the structural advantages that made the old formula work no longer exist.
&lt;a href="#lightbox-nike_strategic_cascade_v2-png-4" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/nike_strategic_cascade_v2.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/nike_strategic_cascade_v2.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/nike_strategic_cascade_v2.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/nike_strategic_cascade_v2.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/nike_strategic_cascade_v2.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/nike_strategic_cascade_v2.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/nike_strategic_cascade_v2.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/nike_strategic_cascade_v2.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/nike_strategic_cascade_v2.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/nike_strategic_cascade_v2.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/nike_strategic_cascade_v2.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/nike_strategic_cascade_v2.png"
alt="Nike strategic decision cascade showing how DTC pivot, product innovation decline, athlete departures, and marketing strategy failure combined to destroy competitive advantage"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/p&gt;
&lt;h2 id="why-complementary-assets-explain-nikes-brand-decline"&gt;Why Complementary Assets Explain Nike&amp;rsquo;s Brand Decline&lt;/h2&gt;
&lt;p&gt;Nike&amp;rsquo;s crisis wasn&amp;rsquo;t caused by a single mistake or even a series of mistakes. It was caused by a fundamental misunderstanding of how competitive advantages work. The company treated its brand, its products, and its athlete partnerships as separate assets that could be managed independently. But they&amp;rsquo;re not separate. They&amp;rsquo;re complementary assets that create value only when they work together. When you weaken one, you weaken all of them. When you weaken all of them simultaneously, the advantage collapses completely. The direct-to-consumer transition, organizational restructuring, and marketing shift each made sense in isolation. But together, they destroyed the system that created Nike&amp;rsquo;s competitive advantage.&lt;/p&gt;
&lt;p&gt;The company focused on efficiency and margins while ignoring the fact that its real advantage came from being the best at product development, athlete relationships, and brand positioning at the same time. The tariffs added pressure at the worst possible time, but they&amp;rsquo;re not the root cause. Nike systematically weakened its competitive advantages and then tried to maintain premium pricing without the product foundation to support it. The market responded predictably and retailers quickly filled shelf space with other brands. Athletes left for companies that were actually developing new products. Recovery will be harder than the decline because the structural advantages that made Nike dominant might no longer exist.&lt;/p&gt;</description></item><item><title>Deploying to Production with AI Agents: Testing Cursor on Azure</title><link>http://philippdubach.com/posts/deploying-to-production-with-ai-agents-testing-cursor-on-azure/</link><pubDate>Sun, 30 Nov 2025 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/deploying-to-production-with-ai-agents-testing-cursor-on-azure/</guid><description>&lt;p&gt;I&amp;rsquo;ve been curious about &lt;a href="https://cursor.com/features"&gt;Cursor&amp;rsquo;s capabilities&lt;/a&gt; for a while, but never had a good reason to try it. This weekend I decided to host my own URL shortener and deployed &lt;a href="https://yourls.org"&gt;YOURLS&lt;/a&gt;, a free and open-source link shortener, on a fresh Azure VM. It seemed like a solid test case since it involves SSH access, server configuration, database setup, and SSL certificates. If an AI assistant could handle that end-to-end, it would be genuinely useful.&lt;/p&gt;
&lt;p&gt;I was honestly surprised. Cursor didn&amp;rsquo;t just write commands it connected via SSH, navigated the server, installed dependencies, configured Apache virtual hosts, set up MySQL, and handled the SSL certificate setup. It made sensible decisions about file permissions, security settings, and configuration details. When I asked for a custom YOURLS plugin to add date prefixes to short URLs, it built it on the first try. The whole build and deployment took about 15 minutes, which previously took me at least an hour of manual work and troubleshooting.&lt;/p&gt;
&lt;p&gt;The URL shortener is now live and working. You can find this article at &lt;a href="https://pdub.click/2511308"&gt;pdub.click/2511308&lt;/a&gt;. I made the full scrubbed &lt;a href="https://gist.github.com/philippdubach/d913591f906447041e2752729cd406e5"&gt;transcript available&lt;/a&gt; if you want to see exactly how Cursor handled each step. If you want to do this installation yourself, I wrote a &lt;a href="https://philippdubach.com/standalone/yourls-azure-tutorial/"&gt;step-by-step tutorial&lt;/a&gt; covering the entire process, or you might as well let Cursor do it.&lt;/p&gt;
&lt;p&gt;Right after finishing, I closed my laptop and went to clean my bathroom. This reminded me of &lt;a href="https://x.com/AuthorJMac/status/1773679197631701238?lang=en"&gt;Joanna Maciejewska&amp;rsquo;s quote&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;I want AI to do my laundry and dishes so that I can do art and writing, not for AI to do my art and writing so that I can do laundry and dishes.&lt;/p&gt;
&lt;/blockquote&gt;</description></item><item><title>Michael Burry's $379 Newsletter</title><link>http://philippdubach.com/posts/michael-burrys-379-newsletter/</link><pubDate>Fri, 28 Nov 2025 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/michael-burrys-379-newsletter/</guid><description>&lt;p&gt;&lt;a href="https://en.wikipedia.org/wiki/Michael_Burry"&gt;Michael Burry&lt;/a&gt; (who in your head probably looks like &lt;a href="https://www.historyvshollywood.com/reelfaces/big-short/"&gt;Christian Bale thanks to The Big Short&lt;/a&gt;), the investor who famously predicted the 2008 housing crash, has launched a Substack newsletter after &lt;a href="https://www.bloomberg.com/news/articles/2025-11-18/burry-says-he-s-active-in-markets-after-fund-is-deregistered"&gt;deregistering his hedge fund&lt;/a&gt;. The $379 annual subscription capitalizes on the 1.6 million followers he&amp;rsquo;s built on &lt;a href="https://twitter.com/michaeljburry"&gt;X&lt;/a&gt;, offering what he describes as his &amp;ldquo;sole focus&amp;rdquo; going forward.&lt;/p&gt;
&lt;p&gt;The newsletter&amp;rsquo;s &lt;a href="https://michaeljburry.substack.com/p/foundations-my-1999-and-part-of-2000"&gt;inaugural post takes&lt;/a&gt; (which he kindly enough made accessible for free as a Thanksgiving gift today) readers back to 1999, when Burry was a 27-year-old neurology resident at Stanford making $33'000 annually while carrying $150'000 in medical school debt. There he wrote his &lt;a href="https://michaeljburry.substack.com/api/v1/file/a7e6acc6-aeac-460a-a26a-5fbe43e50d19.pdf"&gt;Valuestocks.net article &amp;ldquo;Buffett Revisited&amp;rdquo;&lt;/a&gt;. A fellow resident casually mentioned making $1.5 million on Polycom stock. Physicians crowded around terminals checking stocks while patients waited. In that environment, Burry was writing investment analysis late at night, getting paid $1 per word by MSN Money under the pen name &amp;ldquo;Value Doc.&amp;rdquo; His VSN Fund returned 68.1% in 1999, and by February 2000, the &lt;a href="https://michaeljburry.substack.com/api/v1/file/7e7cf8c7-2cd5-4bc1-8e36-0f7354ae04d6.pdf"&gt;San Francisco Chronicle&lt;/a&gt; noted he had shorted Amazon. Fourteen days after that article appeared, the NASDAQ topped. It was a peak it wouldn&amp;rsquo;t revisit for 15 years.&lt;/p&gt;
&lt;p&gt;Burry&amp;rsquo;s approach today is notably personal and reflective. His analysis of Apple in 1999 exemplifies his contrarian thinking. He bought it for the VSN Portfolio despite pushback, writing that great companies like Coca-Cola, American Express, and Disney had all experienced 11-year periods of negative real returns. Unlike the skeptics who simply dismissed the internet as a fad in 1999, Burry recognized the technology was transformational; he just believed the infrastructure was being overbuilt relative to near-term demand. He&amp;rsquo;s making the same argument about AI today. He believes markets are deep in bubble territory, drawing parallels between the late 1990s tech mania and today&amp;rsquo;s AI boom. His X post&amp;rsquo;s often echoed familiar warnings:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Feb 21, 2000: SF Chronicle says I&amp;rsquo;m short Amazon. Greenspan 2005: &amp;lsquo;bubble in home prices … does not appear likely.&amp;rsquo; Powell &amp;lsquo;25: &amp;lsquo;AI companies actually… are profitable… it&amp;rsquo;s a different thing.&amp;rsquo;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The comparison is deliberate. Burry highlighted then-Fed Chair Alan Greenspan&amp;rsquo;s 2005 insistence that U.S. housing prices showed no signs of a bubble. This was just two years before the subprime implosion validated Burry&amp;rsquo;s famous &amp;ldquo;Big Short.&amp;rdquo; Today, he&amp;rsquo;s openly bearish on AI poster children Nvidia and Palantir, suggesting history is rhyming once again. Shortly after the newsletter was out, &lt;a href="https://x.com/firstadopter/status/1993077524813980131?s=20"&gt;Nvidia circulated a seven-page memo&lt;/a&gt; to Wall Street analysts explicitly naming Burry in its opening, a rare move for a company of Nvidia&amp;rsquo;s stature. The memo sought to refute his claims about stock-based compensation, depreciation schedules, and what he calls &amp;ldquo;circular financing.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;Mainly, Nvidia disputed Burry&amp;rsquo;s depreciation argument. Burry contends that customers overstate GPU useful lives to justify massive capex, claiming the hardware becomes obsolete in two to three years. Nvidia counters that its A100s, released in 2020, continue running at high utilization rates with &amp;ldquo;meaningful economic value&amp;rdquo; well beyond that timeframe, justifying the standard four-to-six-year depreciation schedule. The memo also rejected suggestions of &amp;ldquo;&lt;a href="https://www.youtube.com/watch?v=Q0TpWitfxPk"&gt;circular financing&lt;/a&gt;,&amp;rdquo; noting that Nvidia&amp;rsquo;s strategic investments represent a small fraction of revenue and that AI startups raise capital predominantly from outside investors. Burry responded on Substack:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;I stand by my analysis. I am not claiming Nvidia is Enron. It is clearly Cisco.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;He argues Nvidia now occupies the exact position Cisco held in 1999-2000. It&amp;rsquo;s the key hardware supplier powering a massive capital investment cycle built on optimistic demand forecasts. Just as telecom companies spent tens of billions laying fiber optic cable based on projections that &amp;ldquo;internet traffic doubles every 100 days,&amp;rdquo; today&amp;rsquo;s hyperscalers are promising nearly $3 trillion in AI infrastructure spending over the next three years. The problem? In the early 2000s, less than 5% of U.S. fiber capacity was operational. Burry believes today&amp;rsquo;s AI buildout rests on similarly flawed assumptions about data center power and GPU longevity. &amp;ldquo;And once again there is a Cisco at the center of it all, with the picks and shovels for all and the expansive vision to go with it. Its name is Nvidia,&amp;rdquo; Burry wrote. The analogy might resonate with market observers who remember how that story ended. Cisco&amp;rsquo;s stock peaked above $80 in March 2000. It wouldn&amp;rsquo;t return to that level for nearly 24 years. The company survived and remained profitable, but shareholders who bought at the top experienced a generational loss. One key difference is worth noting: Cisco&amp;rsquo;s forward P/E in 2000 was around 200; Nvidia&amp;rsquo;s is under 40.&lt;/p&gt;
&lt;p&gt;In my opinion the technical argument around depreciation matters more than it might appear. If hyperscalers must depreciate GPUs over three years instead of six, companies like Alphabet would see roughly a 10% hit to net profit. More importantly, it would signal that the economic returns on AI infrastructure spending are weaker than advertised. Alphabet, for example, is currently guiding $90 billion-plus of AI spending this year. Using 5-year straight line depreciation, you get $18 billion per year in expenses. Add $9 billion for a conservative 10% WACC. That&amp;rsquo;s $27 billion, and assuming a 70% blended margin, you need about $40 billion per year in incremental revenue directly attributable to AI to make the infrastructure spending justifiable. Therefore, the question remains: how much of Alphabet&amp;rsquo;s $60 billion annualized revenue increase is actually attributable to AI versus normal growth? Burry closes his first newsletter with characteristic understatement:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;I doubted if I should ever come back. I&amp;rsquo;m back.&lt;/p&gt;
&lt;/blockquote&gt;</description></item><item><title>Is AI Really Eating the World? AGI, Networks, Value [2/2]</title><link>http://philippdubach.com/posts/is-ai-really-eating-the-world-agi-networks-value-2/2/</link><pubDate>Mon, 24 Nov 2025 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/is-ai-really-eating-the-world-agi-networks-value-2/2/</guid><description>&lt;p&gt;&lt;em&gt;Start by reading &lt;a href="http://philippdubach.com/posts/is-ai-really-eating-the-world-1/2/"&gt;Is AI Really Eating the World? What we&amp;rsquo;ve Learned [1/2]&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;All current &lt;a href="https://en.wikipedia.org/wiki/Recommender_system"&gt;recommendation systems&lt;/a&gt; work by capturing and analyzing user behavior at scale. Netflix needs millions of users watching millions of hours to train its recommendation algorithm. Amazon needs billions of purchases. The &lt;a href="https://en.wikipedia.org/wiki/Network_effect"&gt;network effect&lt;/a&gt; comes from data scale. What if LLMs can bypass this? What if an LLM can provide useful recommendations by reasoning about conceptual relationships rather than requiring massive behavioral datasets? If I ask for &amp;ldquo;books like Pirsig&amp;rsquo;s Zen and the Art of Motorcycle Maintenance but more focused on Eastern philosophy,&amp;rdquo; a sufficiently capable LLM might answer well without needing to observe 100 million readers. It understands (or appears to understand) the conceptual space. I&amp;rsquo;m uncertain whether LLMs can do this reliably by the end of 2025. The fundamental question is whether they reason or pattern-match at a very sophisticated level. &lt;a href="https://arxiv.org/abs/2308.03762"&gt;Recent research suggests LLMs may rely more on statistical correlations than true reasoning&lt;/a&gt;. If it&amp;rsquo;s mostly pattern-matching, they still need the massive datasets and we&amp;rsquo;re back to conventional network effects. If they can actually reason over conceptual spaces, that&amp;rsquo;s different. That would unbundle data network effects from recommendation quality. Recommendation quality would depend on model capability, not data scale. And if model capability is commoditizing, then the value in recommendations flows to whoever owns customer relationships and distribution, not to whoever has the most data or the best model. I lean toward thinking LLMs are sophisticated pattern-matchers rather than reasoners, which means traditional network effects still apply. But this is one area where I&amp;rsquo;m genuinely waiting to see more evidence.&lt;/p&gt;
&lt;p&gt;Now, on AGI. The Silicon Valley consensus, articulated by &lt;a href="https://sherwood.news/tech/gi-artificial-general-intelligence-when-predictions/"&gt;Sutskever, Altman, Musk, and others&lt;/a&gt;, is that we&amp;rsquo;re on a clear path to artificial general intelligence in the next few years, possibly by 2027 or 2028. The argument goes: &lt;a href="https://arxiv.org/abs/2001.08361"&gt;scaling laws&lt;/a&gt; continue to hold, we&amp;rsquo;re seeing emergent capabilities at each scale jump, and there&amp;rsquo;s no obvious wall before we reach human-level performance across all cognitive domains. I remain unconvinced. Not because I think AGI is impossible, but because the path from &amp;ldquo;really good at pattern completion and probabilistic next-token prediction&amp;rdquo; to &amp;ldquo;general reasoning and planning capabilities&amp;rdquo; seems less straightforward than the AI CEOs suggest. &lt;a href="https://arxiv.org/abs/2305.00050"&gt;Current LLMs still fail in characteristic ways on tasks requiring actual causal reasoning&lt;/a&gt;, spatial reasoning, or planning over extended horizons. They&amp;rsquo;re getting better, but the improvement curve on these specific capabilities looks different from the improvement curve on language modeling perplexity. That suggests to me that we might need architectural innovations beyond just scaling, and those are harder to predict.&lt;/p&gt;
&lt;p&gt;But let&amp;rsquo;s say I&amp;rsquo;m wrong. Let&amp;rsquo;s say AGI arrives by 2028. Even then, I find it hard to model why this would be tremendously economically beneficial specifically to the companies that control the models. Here&amp;rsquo;s why: we already have multiple competing frontier models (ChatGPT, Claude, Gemini, Microsoft&amp;rsquo;s offerings, and now DeepSeek). If AGI arrives, it likely arrives for multiple players at roughly the same time, given how quickly capabilities diffuse in this space. Multiple competing AGIs means price competition. Price competition in a product with near-zero marginal cost means prices collapse toward marginal cost. Where does economic value flow in that scenario? It flows to the users of AI, not the providers. Engineering firms using AGI for materials development capture value through better materials. Pharmaceutical companies using AGI for drug discovery capture value through better drugs. Retailers using AGI for inventory management capture value through better margins. The AGI providers compete with each other to offer the capability at the lowest price. This is basic microeconomics. You capture value when you have market power, either through monopoly, through differentiation, or through control of a scarce input. If models are commodities or near-commodities, model providers have none of these.&lt;/p&gt;
&lt;p&gt;The counterargument is that one provider achieves escape velocity and reaches AGI first with enough of a lead that they establish dominance before others catch up. This is the OpenAI/Microsoft theory of the case. Maybe. But the evidence so far suggests capability leads are measured in months, not years. &lt;a href="https://openai.com/index/gpt-4-research/"&gt;GPT-4 launched in March 2023&lt;/a&gt; with a substantial lead. Within six months, &lt;a href="https://www.anthropic.com/news/claude-2"&gt;Claude 2 was comparable&lt;/a&gt;. Within a year, multiple models clustered around similar capability. The diffusion is fast. Another counterargument is vertical integration. Maybe the hyperscalers that control cloud infrastructure plus model development plus customer relationships plus application distribution can capture value even if models themselves commoditize. This is more plausible, essentially the AWS playbook. Amazon didn&amp;rsquo;t make money by having the best database. They made money by owning the infrastructure, the customer relationships, and the entire stack from hardware to application platform. Microsoft is clearly pursuing this strategy with &lt;a href="https://www.microsoft.com/en-us/microsoft-365/blog/2023/03/16/introducing-microsoft-365-copilot-a-whole-new-way-to-work/"&gt;Azure plus OpenAI plus Copilot plus Office integration&lt;/a&gt;. Google has Search plus Cloud plus Gemini plus Workspace. This could work, but it&amp;rsquo;s a different thesis than &amp;ldquo;we have the best model.&amp;rdquo; It&amp;rsquo;s &amp;ldquo;we control the distribution and can bundle.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;Evans shows a scatter plot (Slide 34) of model benchmark scores from &lt;a href="https://arxiv.org/abs/2009.03300"&gt;standard evaluations like MMLU and HumanEval&lt;/a&gt;. Leaders change weekly. The gaps are small. Meanwhile, consumer awareness doesn&amp;rsquo;t track model quality. ChatGPT dominates with over &lt;a href="https://openai.com/index/how-people-are-using-chatgpt/"&gt;700 million weekly active users&lt;/a&gt; not because it has the best model anymore, but because it got there first and built brand. If models are commodities, value moves up the stack to product design, distribution, vertical integration, and customer relationships. This is exactly what happened with databases. Oracle didn&amp;rsquo;t win because they had the best database engine. They won through enterprise sales, support contracts, and ecosystem lock-in. Microsoft didn&amp;rsquo;t beat them with a better database. They won by bundling SQL Server with Windows Server and offering acceptable performance at a lower price. The SaaS pattern suggests something similar happens here. The model becomes an input. The applications built on top, the customer relationships, the distribution, those become the valuable assets. Why do I think this pattern applies rather than, say, the search pattern where Google maintained dominance despite no fundamental technical moat? Two reasons: (1) Search had massive data network effects. Every search improved the algorithm, and Google&amp;rsquo;s scale meant they improved faster. LLMs have weaker data network effects because the pretraining data is largely static and publicly available, and fine-tuning data requirements are smaller. (2) Search had winner-take-all dynamics through defaults and single-answer demand. You pick one search engine and use it for everything. AI applications look more diverse. You might use different models for different tasks, or your applications might switch between models transparently based on price and performance. The switching costs are lower.&lt;/p&gt;
&lt;p&gt;So where does this leave us? The technology exists and the underlying capabilities are real. But I think the current evidence points toward a world where value flows to applications and customer relationships, and where the $400 billion the hyperscalers are spending buys them competitive positioning rather than monopoly. The integrators are making money now by helping enterprises navigate uncertainty. Some of that will produce real productivity gains. Much of it is expensive signaling and competitive positioning. The startups unbundling existing software will see mixed results, the ones that succeed will do so by owning distribution or solving really specific problems where switching costs are high, not by having better access to AI. The biggest uncertainty is whether the hyperscalers can use vertical integration to capture value anyway, or whether the applications layer fragments and value flows to thousands of specialized companies. That depends less on AI capabilities and more on competitive dynamics, regulation, and whether enterprises prefer integrated platforms or best-of-breed solutions. My guess is we end up somewhere in between. The hyperscalers maintain strong positions through bundling and infrastructure control. A long tail of specialized applications captures value in specific verticals. The model providers themselves, unless they&amp;rsquo;re also infrastructure providers, struggle to capture value proportional to the capability they&amp;rsquo;re creating. But I&amp;rsquo;m genuinely uncertain, and that uncertainty is where the interesting bets are.&lt;/p&gt;
&lt;p&gt;What makes Evans&amp;rsquo; presentation valuable is precisely what frustrated me about it initially: his refusal to collapse uncertainty prematurely. I&amp;rsquo;ve spent this entire post arguing for a specific view of how value will flow in AI markets, but Evans is right that we&amp;rsquo;re pattern-matching from incomplete data. Every previous platform shift looked obvious in retrospect and uncertain in real time. The PC revolution, the internet boom, mobile, they all had credible skeptics who turned out wrong and credible bulls who were right for the wrong reasons. Evans&amp;rsquo; discipline in laying out the full range of possibilities, from commodity to monopoly to something entirely new, is the intellectually honest position. I&amp;rsquo;ve made specific bets here because that&amp;rsquo;s useful for readers trying to navigate the space, but I&amp;rsquo;m more confident in my framework than in my conclusions.&lt;/p&gt;</description></item><item><title>Is AI Really Eating the World? [1/2]</title><link>http://philippdubach.com/posts/is-ai-really-eating-the-world-1/2/</link><pubDate>Sun, 23 Nov 2025 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/is-ai-really-eating-the-world-1/2/</guid><description>&lt;p&gt;In August 2011, Marc Andreessen wrote &lt;a href="https://a16z.com/why-software-is-eating-the-world/"&gt;&amp;ldquo;Why Software Is Eating the World&amp;rdquo;&lt;/a&gt;, an essay about how software was transforming industries, disrupting traditional businesses, and revolutionizing the global economy. Recently, &lt;a href="https://www.ben-evans.com/benedictevans/2014/1/18/a16z"&gt;Benedict Evans&lt;/a&gt;, a former a16z partner, gave a presentation on the generative AI platform shift three years after ChatGPT&amp;rsquo;s launch. His argument in short:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;we know this matters, but we don&amp;rsquo;t know how.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;In this article I will try to explain why I find his framing fascinating but incomplete, and why the evidence points toward AI model commoditization rather than durable competitive advantages at the model layer. Evans structures technology history in cycles. Every 10-15 years, the industry reorganizes around a new platform: &lt;a href="https://en.wikipedia.org/wiki/Mainframe_computer"&gt;mainframes&lt;/a&gt; (1960s-70s), PCs (1980s), web (1990s), smartphones (2000s-2010s). Each shift pulls all innovation, investment, and company creation into its orbit. Generative AI appears to be the next platform shift, or it could break the cycle entirely. The range of outcomes spans from &amp;ldquo;just more software&amp;rdquo; to a single unified intelligence that handles everything. The pattern recognition is smart, but I think the current evidence points more clearly toward commoditization than Evans suggests, with value flowing up the AI value chain to applications rather than to model providers.&lt;/p&gt;
&lt;p&gt;The hyperscalers are spending historic amounts on AI infrastructure. In 2025, &lt;a href="https://techblog.comsoc.org/2025/11/01/ai-spending-boom-accelerates-big-tech-to-invest-invest-an-aggregate-of-400-billion-in-2025-more-in-2026/"&gt;Microsoft, Google, Amazon, and Meta will invest roughly $400 billion&lt;/a&gt; in AI capex, more than global telecommunications capex. Microsoft now spends over 30% of revenue on capex, double what Verizon spends. What has this produced? Models that are simultaneously more capable and less defensible. When ChatGPT launched in November 2022, OpenAI had a massive quality advantage. Today, dozens of models cluster around similar performance. &lt;a href="https://newsletter.semianalysis.com/p/deepseek-debates"&gt;DeepSeek proved that anyone with $500 million can build a frontier AI model&lt;/a&gt;. LLM pricing has collapsed. &lt;a href="https://techcrunch.com/2025/08/08/openai-priced-gpt-5-so-low-it-may-spark-a-price-war/"&gt;OpenAI&amp;rsquo;s API pricing has dropped by 97% since GPT-3&amp;rsquo;s launch&lt;/a&gt;, and every year brings an order of magnitude decline in inference cost.&lt;/p&gt;
&lt;p&gt;Now, $500 million is still an enormous barrier. Only a few dozen entities globally can deploy that capital with acceptable risk. &lt;a href="https://arxiv.org/abs/2303.08774"&gt;GPT-4&amp;rsquo;s performance on complex reasoning tasks&lt;/a&gt;, &lt;a href="https://www.anthropic.com/news/claude-2-1"&gt;Claude&amp;rsquo;s extended context windows of up to 200,000 tokens&lt;/a&gt;, &lt;a href="https://blog.google/technology/ai/google-gemini-ai/"&gt;Gemini&amp;rsquo;s multimodal capabilities&lt;/a&gt;, these represent genuine breakthroughs. But the economic moat isn&amp;rsquo;t obvious to me (yet). Open-source AI models from Meta and Mistral keep narrowing the gap, and if the model layer commoditizes fully, the competitive advantage shifts to data, distribution, and integration.&lt;/p&gt;
&lt;p&gt;Evans uses an extended metaphor: automation that works disappears. In the 1950s, automatic elevators were AI. Today they&amp;rsquo;re just elevators. As &lt;a href="https://en.wikipedia.org/wiki/Larry_Tesler"&gt;Larry Tesler&lt;/a&gt; noted in 1970,&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;AI is whatever machines can&amp;rsquo;t do yet. Once it works, it&amp;rsquo;s just software.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The question: will LLMs follow this pattern, or is this different?&lt;/p&gt;
&lt;p&gt;Current enterprise AI deployment shows clear winners but also real constraints. Software development has seen massive adoption, with &lt;a href="https://github.blog/news-insights/research/survey-ai-wave-grows/"&gt;GitHub reporting that 92% of developers now use AI coding tools&lt;/a&gt;. Marketing has found immediate uses generating ad assets at scale. Customer support has attracted investment, though with the caveat that LLMs produce plausible answers, not necessarily correct ones. Beyond these areas, the enterprise AI adoption rate looks scattered. &lt;a href="https://www.deloitte.com/us/en/insights/industry/telecommunications/connectivity-mobile-trends-survey.html"&gt;Deloitte surveys from June 2025 show that roughly 20% of U.S. consumers use generative AI chatbots daily&lt;/a&gt;, with another 34% using them weekly or monthly. Enterprise deployment is further behind. &lt;a href="https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai"&gt;McKinsey data shows most AI &amp;ldquo;agents&amp;rdquo; remain in pilot or experimental stages&lt;/a&gt;. A quarter of CIOs have launched something. Forty percent don&amp;rsquo;t expect production deployment until 2026 or later.&lt;/p&gt;
&lt;p&gt;But I think here&amp;rsquo;s where Evans&amp;rsquo; &amp;ldquo;we don&amp;rsquo;t know&amp;rdquo; approach misses something important. Consulting firms are booking billions in AI integration contracts right now. &lt;a href="https://www.crn.com/news/ai/2025/accenture-s-3b-ai-bet-is-paying-off-inside-a-massive-transformation-fueled-by-advanced-ai"&gt;Accenture alone expects $3 billion in GenAI bookings for fiscal 2025&lt;/a&gt;. The revenue isn&amp;rsquo;t coming from the models. It&amp;rsquo;s coming from integration projects, change management, and process redesign. The pitch is simple: your competitors are moving on this, you can&amp;rsquo;t afford to wait. If your competitors are investing and you&amp;rsquo;re not, you risk being left behind. If everyone invests and AI delivers modest gains, you&amp;rsquo;ve maintained relative position. If everyone invests and AI delivers nothing, you&amp;rsquo;ve wasted money but haven&amp;rsquo;t lost competitive ground. Evans notes that cloud adoption took 20 years to reach 30% of enterprise workloads and is still growing. New technology platform cycles always take longer than advocates expect. His most useful analogy is spreadsheets. &lt;a href="https://en.wikipedia.org/wiki/VisiCalc"&gt;VisiCalc&lt;/a&gt; in the late 1970s transformed accounting. If you were an accountant, you had to have it. If you were a lawyer, you thought &amp;ldquo;that&amp;rsquo;s nice for my accountant.&amp;rdquo; ChatGPT today has the same dynamic. Certain people with certain jobs find it immediately essential. Everyone else sees a demo and doesn&amp;rsquo;t know what to do with the blank prompt. This is right, and it suggests we&amp;rsquo;re early. But it doesn&amp;rsquo;t tell us where value will accumulate in the AI value chain.&lt;/p&gt;
&lt;p&gt;The standard pattern for deploying technology goes in stages: (1) Absorb it (make it a feature, automate obvious tasks). (2) Innovate (create new products, unbundle incumbents). (3) Disrupt (redefine what the market is). We&amp;rsquo;re mostly in stage one. Stage two is happening in pockets. &lt;a href="https://www.ycombinator.com/companies"&gt;Y Combinator&amp;rsquo;s recent batches are overwhelmingly AI-focused&lt;/a&gt;, betting on thousands of new companies unbundling existing software (startups are attacking specific enterprise problems like converting COBOL to Java or reconfiguring telco billing systems). Stage three remains speculative. From an economic perspective, there&amp;rsquo;s the automation question: do you do the same work with fewer people, or more work with the same people? This echoes debates about &lt;a href="https://en.wikipedia.org/wiki/Technological_change#Labor-augmenting_technological_change"&gt;labor-augmenting technical change&lt;/a&gt; in economics. Companies whose competitive advantage was &amp;ldquo;we can afford to hire enough people to do this&amp;rdquo; face real pressure. Companies whose advantage was unique data, customer relationships, or distribution may get stronger. This is standard economic analysis of labor-augmenting technical change, and it probably holds here too.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Continue reading &lt;a href="http://philippdubach.com/posts/is-ai-really-eating-the-world-agi-networks-value-2/2/"&gt;Is AI Really Eating the World? AGI, Networks, and Value [2/2]&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;</description></item><item><title>Weather Forecasts Have Improved a Lot</title><link>http://philippdubach.com/posts/weather-forecasts-have-improved-a-lot/</link><pubDate>Sat, 22 Nov 2025 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/weather-forecasts-have-improved-a-lot/</guid><description>&lt;p&gt;Reading the press release for Google DeepMind&amp;rsquo;s &lt;a href="https://deepmind.google/discover/blog/weathernext-2-our-most-advanced-weather-forecasting-model/"&gt;WeatherNext 2&lt;/a&gt;, I wondered: have weather forecasts actually improved over the past years?&lt;/p&gt;
&lt;p&gt;Turns out they have, dramatically. &lt;a href="https://ourworldindata.org/weather-forecasts"&gt;A four-day forecast today matches the accuracy of a one-day forecast from 30 years ago&lt;/a&gt;. Hurricane track errors that once exceeded 400 nautical miles for 72-hour forecasts now sit below 80 miles. The &lt;a href="https://charts.ecmwf.int"&gt;European Centre for Medium-Range Weather Forecasts reports three-day forecasts now reach 97% accuracy&lt;/a&gt;, with seven-day forecasts approaching that threshold.&lt;/p&gt;
&lt;p&gt;Google&amp;rsquo;s new model accelerates this trend. &lt;a href="https://arstechnica.com/science/2025/11/googles-new-weather-model-impressed-during-its-first-hurricane-season/"&gt;The hurricane model performed remarkably well this season when tested against actual paths&lt;/a&gt;. WeatherNext 2 generates forecasts 8 times faster than its predecessor with resolution down to one hour. Each prediction takes under a minute on a single TPU compared to hours on a supercomputer using physics-based models. The speed comes from a smarter training approach. WeatherNext 2 (along with &lt;a href="https://www.nature.com/articles/s41586-024-07744-y"&gt;neuralgcm&lt;/a&gt;) uses a continuous ranked probability score (CRPS) objective rather than the L2 losses common in earlier neural weather models. The method adds random noise to parameters and trains the model to minimize L1 loss while maximizing differences between ensemble members with different noise initializations.&lt;/p&gt;
&lt;p&gt;This matters because L2 losses blur predictions when models roll out autoregressively over multiple time steps. Spatial features degrade and the model truncates extremes. &lt;a href="https://news.ycombinator.com/item?id=45957193"&gt;Models trained with L2 losses struggle to forecast high-impact extreme weather at moderate lead times&lt;/a&gt;. The CRPS objective preserves the sharp spatial features and extreme values needed for cyclone tracking and heat wave prediction. These improvements stem from better satellite and ground station data, faster computers running higher-resolution models, and improved communication through apps and online services. AI systems like WeatherNext 2 and Pangu-Weather (which performs forecasts up to 10,000 times faster than traditional methods) are accelerating progress that has been building for decades.&lt;/p&gt;</description></item><item><title>GLP-1 Receptor Agonists in ASUD Treatment</title><link>http://philippdubach.com/posts/glp-1-receptor-agonists-in-asud-treatment/</link><pubDate>Fri, 21 Nov 2025 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/glp-1-receptor-agonists-in-asud-treatment/</guid><description>&lt;blockquote&gt;
&lt;p&gt;Alcohol and other substance use disorders (ASUDs) are complex, multifaceted, but treatable medical conditions with widespread medical, psychological, and societal consequences. However, treatment options remain limited, therefore the discovery and development of new treatments for ASUDs is critical. Glucagon-like peptide-1 receptor agonists (GLP-1RAs), currently approved for the treatment of type 2 diabetes mellitus, obesity, and obstructive sleep apnea, have recently emerged as potential new pharmacotherapies for ASUDs.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Semaglutide, the GLP-1 receptor agonist marketed as Ozempic and Wegovy, may be the most significant new pharmacotherapy candidate for alcohol use disorder in decades. This development matters most for people struggling with substance use disorders who have few effective treatment options. It also matters for manufacturers like Novo Nordisk facing &lt;a href="https://philippdubach.com/posts/novo-nordisks-post-patent-strategy/"&gt;patent expiration pressures on Ozempic&lt;/a&gt;. The research into GLP-1RAs for addiction treatment is early but notable given the limited pharmacotherapy options currently available for ASUDs. In February 2025, researchers at UNC published results from the first randomized controlled trial of semaglutide for ASUD treatment. The phase 2 trial enrolled 48 non-treatment-seeking adults with AUD and administered low-dose semaglutide &lt;a href="https://jamanetwork.com/journals/jamapsychiatry/fullarticle/2829811"&gt;(0.25 mg/week for 4 weeks, 0.5 mg/week for 4 weeks - standard dosing for weight loss reaches 2.4 mg per week)&lt;/a&gt; over 9 weeks. Participants on semaglutide consumed less alcohol in controlled laboratory settings and reported fewer drinks per drinking day in their normal lives. They also reported less craving for alcohol. Heavy drinking episodes declined more sharply in the semaglutide group compared to placebo over the nine-week trial. The mechanism likely involves GLP-1 receptors in the brain&amp;rsquo;s mesolimbic reward pathway, where semaglutide modulates dopamine signaling to reduce the reinforcing effects of alcohol consumption. Despite the low doses, effect sizes for some drinking outcomes exceeded those typically seen with naltrexone, one of the few FDA-approved medications for alcohol use disorder. A &lt;a href="https://www.nature.com/articles/s41467-024-48780-6"&gt;large real-world study&lt;/a&gt; of 83,825 patients with obesity found semaglutide associated with a 50-56% lower risk of AUD incidence and recurrence compared to other anti-obesity medications. While larger trials are needed to confirm these results, the early evidence suggests GLP-1 may offer a meaningful treatment option for a condition where new therapies have been approved at a rate of roughly one every 25 years. Phase 3 trials evaluating semaglutide for AUD are now underway, and pemvidutide, a GLP-1/glucagon dual receptor agonist, has received FDA Fast Track designation for alcohol use disorder.&lt;/p&gt;
&lt;aside class="disclaimer" role="note" aria-label="Disclaimer"&gt;
&lt;div class="disclaimer-content"&gt;&lt;p&gt;&lt;strong&gt;Disclaimer:&lt;/strong&gt; For informational purposes only, not medical advice. Consult a qualified healthcare provider for any medical questions or conditions.&lt;/p&gt;&lt;/div&gt;
&lt;/aside&gt;</description></item><item><title>The Bicycle Needs Riding to be Understood</title><link>http://philippdubach.com/posts/the-bicycle-needs-riding-to-be-understood/</link><pubDate>Fri, 14 Nov 2025 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/the-bicycle-needs-riding-to-be-understood/</guid><description>&lt;blockquote&gt;
&lt;p&gt;Some concepts are easy to grasp in the abstract. Boiling water: apply heat and wait. Others you really need to try. You only think you understand how a bicycle works, until you learn to ride one.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;You should write an LLM agent—not because they&amp;rsquo;re revolutionary, but because the bicycle needs riding to be understood. Having built agents myself, Ptacek&amp;rsquo;s central insight resonates: the behavior surprises in specific ways, particularly around how models scale effort with complexity before inexplicably retreating.&lt;/p&gt;
&lt;p&gt;Ptacek walks through building a functioning agent in roughly 50 lines of Python, demonstrating how an LLM with ping access autonomously chose multiple Google endpoints without explicit instruction, a moment that crystallizes both promise and unpredictability. His broader point matches my experience: context engineering isn&amp;rsquo;t mystical but straightforward programming—managing token budgets, orchestrating sub-agents, balancing explicit loops against emergent behavior. The open problems in agent design—titrating nondeterminism, connecting to ground truth, allocating tokens—remain remarkably accessible to individual experimentation, each iteration taking minutes rather than requiring institutional resources.&lt;/p&gt;</description></item><item><title>AI Models as Standalone P&amp;Ls</title><link>http://philippdubach.com/posts/ai-models-as-standalone-pls/</link><pubDate>Sun, 09 Nov 2025 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/ai-models-as-standalone-pls/</guid><description>&lt;blockquote&gt;
&lt;p&gt;Microsoft reported earnings for the quarter ended Sept. [&amp;hellip;] buried in its financial filings were a couple of passages suggesting that OpenAI suffered a net loss of $11.5 billion or more during the quarter.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;For every dollar of revenue, they&amp;rsquo;re allegedly spending roughly $5 to deliver the product. These OpenAI losses initially sound like a joke about &amp;ldquo;making it up on volume,&amp;rdquo; but they point to a more fundamental problem facing OpenAI and its competitors. AI companies are locked into continuously releasing more powerful (and expensive) models. If they stop, &lt;a href="https://arxiv.org/abs/2311.16989"&gt;open-source alternatives will catch up&lt;/a&gt; and offer equivalent capabilities at substantially lower costs. This creates an uncomfortable dynamic. If your current model requires spending more than you earn just to fund the next generation, the path to profitability becomes unclear—perhaps impossible.&lt;/p&gt;
&lt;p&gt;Anthropic CEO Dario Amodei (everybody&amp;rsquo;s favorite AI CEO) recently offered a different perspective in a &lt;a href="https://youtu.be/GcqQ1ebBqkc?si=sEDGAVBuZsjtLpZS&amp;amp;t=1016"&gt;conversation with Stripe co-founder John Collison&lt;/a&gt;. He argues that treating each model as an independent business unit reveals a different picture than conventional accounting suggests.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Let&amp;rsquo;s say in 2023, you train a model that costs $100 million, and then you deploy it in 2024 and it makes $200 million of revenue.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;So far, this looks profitable, a solid 2x return on the training investment. But here&amp;rsquo;s where it gets complicated.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Meanwhile, because of the scaling laws, in 2024, you also train a model that costs $1 billion. If you look in a conventional way at the profit and loss of the company you&amp;rsquo;ve lost $100 million the first year, you&amp;rsquo;ve lost $800 million the second year, and you&amp;rsquo;ve lost $8 billion in the third year, so it looks like it&amp;rsquo;s getting worse and worse.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The pattern continues:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;In 2025, you get $2 billion of revenue from that $1 billion model trained the previous year.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Again, viewed in isolation, this model returned 2x its training cost.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;And you spend $10 billion to train the model for the following year.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The losses appear to accelerate dramatically, from $100 million to $800 million to $8 billion.&lt;/p&gt;
&lt;p&gt;This is where Amodei&amp;rsquo;s reframing becomes interesting.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;If you consider each model to be a company, the model that was trained in 2023 was profitable. You paid $100 million and then it made $200 million of revenue.&amp;quot;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;He also acknowledges there are inference costs (the actual computing expenses of running the model for users) but suggests these don&amp;rsquo;t fundamentally change the picture in his simplified example. His core argument:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;If every model was a company, the model in this example is actually profitable. What&amp;rsquo;s going on is that at the same time as you&amp;rsquo;re reaping the benefits from one company, you&amp;rsquo;re founding another company that&amp;rsquo;s much more expensive and requires much more upfront R&amp;amp;D investment.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This is essentially an argument that AI companies are building a portfolio of profitable products, but the accounting makes it look terrible because each successive &amp;ldquo;product&amp;rdquo; costs 10x more than the last to develop. The losses stem from overlapping these profitable cycles while exponentially increasing investment scale. But this framework only works if two critical assumptions hold: (1) Each model consistently returns roughly 2x its training cost in revenue, and (2) The improvements from spending 10x more justify that investment—meaning customers will pay enough more for the better model to maintain that 2x return.&lt;/p&gt;
&lt;p&gt;Amodei outlines two ways this resolves:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;So the way that it&amp;rsquo;s going to shake out is this will keep going up until the numbers go very large and the models can&amp;rsquo;t get larger, and, you know, then it&amp;rsquo;ll be a large, very profitable business.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;In this first scenario, scaling hits physical or practical limits. You&amp;rsquo;ve maxed out available compute, data, or capability improvements. Training costs plateau because you literally can&amp;rsquo;t build a meaningfully larger model. At that point, companies stop needing exponentially larger investments and begin harvesting profits from their final-generation models. The second scenario is less optimistic:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Or at some point the models will stop getting better, right? The march to AGI will be halted for some reason.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;If the improvements stop delivering proportional returns before reaching natural limits, companies face what Amodei calls overhang.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;And then perhaps there&amp;rsquo;ll be some overhang, so there&amp;rsquo;ll be a one-time, &amp;lsquo;Oh man, we spent a lot of money and we didn&amp;rsquo;t get anything for it,&amp;rsquo; and then the business returns to whatever scale it was at.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;What Amodei&amp;rsquo;s framework doesn&amp;rsquo;t directly address is the open-source problem. If training Model C costs $10 billion but open-source alternatives &lt;a href="https://synaptic.com/resources/open-source-ai-2024"&gt;reach comparable performance six months later&lt;/a&gt;, that 2x return window might not materialize. The entire argument depends on maintaining a significant capability lead that customers will pay premium prices for. There&amp;rsquo;s also the question of whether the 2x return assumption holds as models become more expensive. The jump from $100 million to $1 billion to $10 billion in training costs assumes that customers will consistently value the improvements enough to double revenue.&lt;/p&gt;</description></item><item><title>Working with Models</title><link>http://philippdubach.com/posts/working-with-models/</link><pubDate>Sat, 08 Nov 2025 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/working-with-models/</guid><description>&lt;p&gt;There was this &amp;ldquo;&lt;a href="https://us1.discourse-cdn.com/flex001/uploads/ultralytics1/original/1X/45c604467b6f4212858281cf28f71a77083fb45e.jpeg"&gt;I work with Models&lt;/a&gt;&amp;rdquo; joke which I first heard years ago from an analyst working on a valuation model (&lt;a href="http://philippdubach.com/posts/everything-is-a-dcf-model/"&gt;see my previous post&lt;/a&gt;). I guess it has become more relevant than ever:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;This monograph presents the core principles that have guided the development of diffusion models, tracing their origins and showing how diverse formulations arise from shared mathematical ideas. Diffusion modeling starts by defining a forward process that gradually corrupts data into noise, linking the data distribution to a simple prior through a continuum of intermediate distributions.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;If you want to get into this topic in the first place, be sure to check out &lt;a href="https://deepgenerativemodels.github.io"&gt;Stefano Ermon&amp;rsquo;s CS236 Deep Generative Models Course&lt;/a&gt;. Lecture recordings of the full course can also be found on &lt;a href="https://www.youtube.com/playlist?list=PLoROMvodv4rPOWA-omMM6STXaWW4FvJT8"&gt;YouTube&lt;/a&gt;.&lt;/p&gt;</description></item><item><title>Pozsar's Bretton Woods III: Three Years Later [2/2]</title><link>http://philippdubach.com/posts/pozsars-bretton-woods-iii-three-years-later-2/2/</link><pubDate>Sun, 26 Oct 2025 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/pozsars-bretton-woods-iii-three-years-later-2/2/</guid><description>&lt;p&gt;&lt;em&gt;Start by reading &lt;a href="http://philippdubach.com/posts/pozsars-bretton-woods-iii-the-framework-1/2/"&gt;Pozsar&amp;rsquo;s Bretton Woods III: The Framework [1/2]&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;Now, what actually happened in the three years since Pozsar published the Bretton Woods III framework? (1) Dollar reserve diversification is happening, but gradual: &lt;a href="https://www.morganstanley.com/insights/articles/us-dollar-declines"&gt;Foreign central bank Treasury holdings declined from peaks exceeding $7.5 trillion to levels below $7 trillion&lt;/a&gt;. This represents steady diversification away from dollar-denominated assets, though not a dramatic collapse. (2) Gold has performed strongly: From roughly $1'900/oz when Pozsar published his dispatches to peaks above $4'000/oz today, gold has appreciated substantially, consistent with increased central bank gold buying and demand for &amp;ldquo;outside money.&amp;rdquo; (3) Alternative payment systems are developing: Various nations continue building infrastructure for non-dollar trade settlement. While these systems remain in preliminary stages rather than fully operational alternatives to SWIFT, development timelines could speed up following specific triggering events. (4) The dollar itself has remained strong: Perhaps surprisingly given predictions of reserve currency decline, the dollar achieved its best performance against a basket of major currencies since 2015 in 2024. The DXY index (which tracks the dollar against major trading partners) &lt;a href="https://www.morningstar.com/markets/will-dollar-keep-falling#:~:text=In%20the%20first%20half%20of,delivered%20nearly%2040%25%20cumulative%20gains"&gt;fell about 11% this year&lt;/a&gt;, marking the end of this decade-long rally. (5) Commodity collateral is increasingly important: &lt;a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2355674"&gt;Research on commodities as collateral&lt;/a&gt; shows that under capital controls and collateral constraints, investors import commodities and pledge them as collateral. Higher collateral demands increase commodity prices and affect the inventory-convenience yield relationship.&lt;/p&gt;
&lt;h2 id="chinas-strategic-options-in-bretton-woods-iii"&gt;China&amp;rsquo;s Strategic Options in Bretton Woods III&lt;/h2&gt;
&lt;p&gt;One of Pozsar&amp;rsquo;s more provocative arguments concerns China&amp;rsquo;s strategic options. With approximately $3 trillion in foreign exchange reserves heavily weighted toward dollars and Treasuries, China faces the same calculus as any holder of large dollar reserves: what is the risk these could be frozen? Pozsar outlined two theoretical paths for China: (1) Sell Treasuries to purchase commodities directly (especially discounted Russian commodities), thereby converting financial claims into physical resources. (2) Print renminbi to purchase commodities, creating a &amp;ldquo;eurorenminbi&amp;rdquo; market parallel to the eurodollar system.&lt;/p&gt;
&lt;p&gt;The first option provides inflation control for China (securing physical resources) while potentially raising yields in Treasury markets. The second option represents a more fundamental challenge to dollar dominance, the birth of an alternative offshore currency market backed by commodity reserves rather than financial reserves. In practice, we&amp;rsquo;ve seen elements of both. China has increased commodity imports from Russia substantially. The internationalization of the renminbi has progressed, though more slowly than some expected, constrained by China&amp;rsquo;s capital controls and the relative underdevelopment of its financial markets compared to dollar markets.&lt;/p&gt;
&lt;h2 id="durable-insights-from-the-bretton-woods-iii-framework"&gt;Durable Insights from the Bretton Woods III Framework&lt;/h2&gt;
&lt;p&gt;Regardless of whether Bretton Woods III emerges exactly as described, several insights from Pozsar&amp;rsquo;s framework appear durable. (1) Central banks control the nominal domain, not the real domain: Monetary policy can influence demand, manage liquidity, and stabilize financial markets. It cannot conjure physical resources, build supply chains, or speed up energy transitions. This distinction matters most during periods of supply-driven inflation, when rate hikes do little to resolve the underlying commodity shortage. (2) Physical infrastructure matters for financial markets: The number of VLCCs, the capacity of the Suez Canal, the efficiency of port facilities, these real-world constraints bind financial flows. Understanding the infrastructure underlying commodity movements provides insight into funding market dynamics. (3) Collateralization is changing: The trend toward commodity-backed finance, warehouse receipt systems, and physical collateral reflects both technological improvements (better monitoring and verification) and strategic shifts (diversification away from pure financial claims). As the &lt;a href="https://www.fsb.org/uploads/P200223-2.pdf"&gt;FSB noted in 2023&lt;/a&gt;, banks play a vital role in the commodities ecosystem, providing not just credit but clearing services and intermediation between commodity firms and central counterparties. (4) Geopolitical risk affects monetary arrangements: The weaponization of reserve assets, however justified in specific circumstances, changes the risk calculation for all reserve holders. This doesn&amp;rsquo;t mean immediate de-dollarization, but it does mean persistent, gradual reserve diversification.&lt;/p&gt;
&lt;h2 id="practical-implications-for-funding-markets-and-monetary-policy"&gt;Practical Implications for Funding Markets and Monetary Policy&lt;/h2&gt;
&lt;p&gt;So what can we take from this for today: (1) Funding market stresses may be more persistent: If commodity traders require more financing for longer durations due to less efficient trade routes, and if banks face balance sheet constraints from regulatory requirements or QT, term funding premia may remain elevated relative to overnight rates. The FRA-OIS spread, the spread between forward rate agreements and overnight indexed swaps, becomes a window into these dynamics. (2) Cross-currency basis swaps signal more than rate differentials: Persistent deviations from covered interest parity reflect structural factors: global trade reconfiguration, reserve diversification, and the changing geography of dollar funding demand. These aren&amp;rsquo;t temporary anomalies to be arbitraged away but potentially persistent features of the new monetary system. (3) Commodity volatility has monetary policy implications that are difficult to manage: When commodity prices surge due to supply disruptions rather than demand strength, central banks face an ugly tradeoff: tighten policy to control inflation headlines while risking recession, or accommodate the price shock and accept higher inflation. Unlike demand-driven inflation, supply-driven commodity inflation doesn&amp;rsquo;t respond well to rate hikes. (4) Infrastructure bottlenecks matter: Just as G-SIB constraints around year-end affect money market functioning, shipping capacity constraints and logistical bottlenecks affect commodity prices and, through them, inflation. Monitoring the &amp;ldquo;real plumbing,&amp;rdquo; freight rates, port congestion, pipeline capacity, provides early warning signals for inflation pressures.&lt;/p&gt;
&lt;h2 id="bretton-woods-iii-as-an-analytical-framework"&gt;Bretton Woods III as an Analytical Framework&lt;/h2&gt;
&lt;p&gt;Perhaps the most valuable way to engage with Bretton Woods III is not as a prediction to be validated or refuted, but as a framework for thinking about the intersection of geopolitics, commodities, and money. It forces attention to questions that are easy to overlook: (a) How do physical constraints on commodity flows affect financial market plumbing? (b) What risks do reserve holders face that aren&amp;rsquo;t captured in traditional financial risk metrics? (c) Where do central bank powers end and other forms of power, military, diplomatic, infrastructural, begin? (d) How do the &amp;ldquo;real&amp;rdquo; and &amp;ldquo;nominal&amp;rdquo; domains interact during periods of stress?&lt;/p&gt;
&lt;p&gt;The current environment shows elements consistent with the Bretton Woods III framework: gradual reserve diversification, persistent commodity volatility, funding market stresses related to term commodity financing, and increasing focus on supply chain resilience over pure efficiency. It also shows elements inconsistent with it: dollar strength through 2024, the slow pace of alternative payment systems, and the resilience of dollar-based financial infrastructure. What seems clear is that the assumptions underlying Bretton Woods II, that dollar reserves are nearly risk-free, that globalized supply chains should be optimized for cost above all else, that central banks can manage most monetary disturbances, are being questioned in ways they weren&amp;rsquo;t five years ago. Whether that questioning leads to a new monetary order or simply a modified version of the current one remains to be seen. But Pozsar&amp;rsquo;s framework provides a useful lens for watching the process unfold, connecting developments in commodity markets, funding markets, and geopolitical arrangements into a coherent story about how the global financial system actually works.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Pozsar&amp;rsquo;s full Money Notes series is available through &lt;a href="https://exunoplures.hu/global-money-notes"&gt;his website&lt;/a&gt;, and Perry Mehrling&amp;rsquo;s course &lt;a href="https://sites.bu.edu/perry/lectures/mb-lectures/"&gt;Economics of Money and Banking&lt;/a&gt; provides excellent background on the &amp;ldquo;money view&amp;rdquo; that underpins this analysis.&lt;/em&gt;&lt;/p&gt;</description></item><item><title>Pozsar's Bretton Woods III: The Framework [1/2]</title><link>http://philippdubach.com/posts/pozsars-bretton-woods-iii-the-framework-1/2/</link><pubDate>Sat, 25 Oct 2025 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/pozsars-bretton-woods-iii-the-framework-1/2/</guid><description>&lt;p&gt;In March 2022, as Western nations imposed unprecedented sanctions following Russia&amp;rsquo;s invasion of Ukraine, &lt;a href="https://exunoplures.hu/people"&gt;Zoltan Pozsar&lt;/a&gt; published a series of dispatches that would become some of the most discussed pieces in financial markets that year. The core thesis was stark: we were witnessing the birth of &amp;ldquo;Bretton Woods III,&amp;rdquo; a fundamental shift in how the global monetary system operates. Nearly three years later, with more data on de-dollarization trends, commodity market dynamics, and structural changes in global trade, it&amp;rsquo;s worth revisiting this framework.&lt;/p&gt;
&lt;p&gt;I first heard of Pozsar at Credit Suisse during the &lt;a href="https://exunoplures.hu/public/pdf/a-decade-on-money-32.pdf"&gt;2019 repo market disruptions&lt;/a&gt; and the &lt;a href="https://exunoplures.hu/public/pdf/a-decade-on-money-34.pdf"&gt;March 2020 funding crisis&lt;/a&gt;, when his framework explained market dynamics in a way I have never seen it before. Before joining Credit Suisse as a short-term rate strategist, Pozsar spent years at the Federal Reserve (where he created &lt;a href="https://exunoplures.hu/public/pdf/a-decade-on-money-2.pdf"&gt;the map of the shadow banking system&lt;/a&gt;, which prompted the G20 to initiate regulatory measures in this area) and the U.S. Treasury. His work focuses on what he calls the &amp;ldquo;plumbing&amp;rdquo; of financial markets, the often-overlooked mechanisms through which money actually flows through the system. His intellectual approach draws heavily from Perry Mehrling&amp;rsquo;s &amp;ldquo;money view,&amp;rdquo; which treats money as having four distinct prices rather than being a simple unit of account.&lt;/p&gt;
&lt;h2 id="inside-money-outside-money-and-the-reserve-currency-shift"&gt;Inside Money, Outside Money, and the Reserve Currency Shift&lt;/h2&gt;
&lt;p&gt;&lt;a href="https://exunoplures.hu/public/pdf/a-decade-on-money-39.pdf"&gt;Pozsar&amp;rsquo;s Bretton Woods III framework&lt;/a&gt; rests on a straightforward distinction. &amp;ldquo;Inside money&amp;rdquo; refers to claims on institutions: Treasury securities, bank deposits, central bank reserves. &amp;ldquo;Outside money&amp;rdquo; refers to commodities like gold, oil, wheat, metals that have intrinsic value independent of any institution&amp;rsquo;s promise.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://en.wikipedia.org/wiki/Bretton_Woods_system"&gt;Bretton Woods I (1944-1971)&lt;/a&gt; was backed by gold, outside money. The U.S. dollar was convertible to gold at a fixed rate, and other currencies were pegged to the dollar. When this system collapsed in 1971, Bretton Woods II emerged: a system where dollars were backed by U.S. Treasury securities, inside money. Countries accumulated dollar reserves, primarily in the form of Treasuries, to support their currencies and facilitate international trade.&lt;/p&gt;
&lt;p&gt;Pozsar&amp;rsquo;s argument: the moment Western nations froze Russian foreign exchange reserves, the assumed risk-free nature of these dollar holdings changed fundamentally. What had been viewed as having negligible credit risk suddenly carried confiscation risk. For any country potentially facing future sanctions, the calculus of holding large dollar reserve positions shifted. Hence Bretton Woods III: a system where countries increasingly prefer holding reserves in the form of commodities and gold, outside money that cannot be frozen by another government&amp;rsquo;s decision.&lt;/p&gt;
&lt;h2 id="perry-mehrlings-four-prices-of-money"&gt;Perry Mehrling&amp;rsquo;s Four Prices of Money&lt;/h2&gt;
&lt;p&gt;To understand Pozsar&amp;rsquo;s analysis, we need to understand his analytical framework. Perry Mehrling teaches that money has four prices: (1) Par: The one-for-one exchangeability of different types of money. Your bank deposit should convert to cash at par. Money market fund shares should trade at $1. When par breaks, as it did in 2008 when money market funds &amp;ldquo;broke the buck,&amp;rdquo; the payments system itself is threatened. (2) Interest: The price of future money versus money today. This is the domain of overnight rates, term funding rates, and the various &amp;ldquo;bases&amp;rdquo; (spreads) between different funding markets. When covered interest parity breaks down and cross-currency basis swaps widen, it signals stress in the ability to transform one currency into another over time. (3) Exchange rate: The price of foreign money. How many yen or euros does a dollar buy? Fixed exchange rate regimes can collapse when countries lack sufficient reserves, as happened across Southeast Asia in 1997. (4) Price level: The price of commodities in terms of money. How much does oil, wheat, or copper cost? This determines not just headline inflation but feeds through into the price of virtually everything in the economy.&lt;/p&gt;
&lt;p&gt;Central banks have powerful tools for managing the first three prices. They can provide liquidity to preserve par, influence interest rates through policy, and intervene in foreign exchange markets. But the fourth price, the price level, particularly when driven by commodity supply shocks, is far harder to control. As Pozsar puts it: &amp;ldquo;You can print money, but not oil to heat or wheat to eat.&amp;rdquo;&lt;/p&gt;
&lt;h2 id="commodity-plumbing-from-financial-domain-to-real-domain"&gt;Commodity Plumbing: From Financial Domain to Real Domain&lt;/h2&gt;
&lt;p&gt;Pozsar&amp;rsquo;s contribution was to extend Mehrling&amp;rsquo;s framework into what he calls the &amp;ldquo;real domain,&amp;rdquo; the physical infrastructure underlying commodity flows. For each of the three non-commodity prices of money, there&amp;rsquo;s a parallel in commodity markets: (1) Foreign exchange ↔ Foreign cargo: Just as you exchange currencies, you exchange dollars for foreign-sourced commodities. (2) Interest (time value of money) ↔ Shipping: Just as lending has a time dimension, moving commodities from port A to port B takes time and requires financing. (3) Par (stability) ↔ Protection: Just as central banks protect the convertibility of different money forms, military and diplomatic power protects commodity shipping routes.&lt;/p&gt;
&lt;p&gt;This mapping reveals something important: commodity markets have their own &amp;ldquo;plumbing&amp;rdquo; that works parallel to financial plumbing. And when this real infrastructure gets disrupted, it creates stresses that purely monetary policy cannot resolve.&lt;/p&gt;
&lt;h2 id="sanctions-shipping-and-the-commodity-financing-bottleneck"&gt;Sanctions, Shipping, and the Commodity Financing Bottleneck&lt;/h2&gt;
&lt;p&gt;One of the most concrete examples in Pozsar&amp;rsquo;s March 2022 dispatches illustrates this intersection between finance and physical reality. Consider what happens when Russian oil exports to Europe are disrupted and must be rerouted to Asia. Previously, Russian oil traveled roughly 1-2 weeks from Baltic ports to European refineries on Aframax carriers (ships carrying about 600,000 barrels). The financing required was relatively short-term, a week or two. Post-sanctions, the same oil must travel to Asian buyers. But the Baltic ports can&amp;rsquo;t accommodate Very Large Crude Carriers (VLCCs), which carry 2 million barrels. So the oil must first be loaded onto Aframax vessels, sailed to a transfer point, transferred ship-to-ship to VLCCs, then shipped to Asia, a journey of roughly four months.&lt;/p&gt;
&lt;p&gt;The same volume of oil, moved the same distance globally, now requires: (a) More ships (Aframax vessels for initial transport plus VLCCs for long-haul). (b) More time (4 months instead of 1-2 weeks). (c) More financing (commodity traders must borrow for much longer terms). (d) More capital tied up by banks (longer-duration loans against volatile commodities).&lt;/p&gt;
&lt;p&gt;Pozsar estimated this rerouting alone would encumber approximately 80 VLCCs, roughly 10% of global VLCC capacity, in permanent use. The financial implication: banks&amp;rsquo; &lt;a href="https://www.investopedia.com/terms/l/liquidity-coverage-ratio.asp"&gt;liquidity coverage ratios (LCRs)&lt;/a&gt; increase because they&amp;rsquo;re extending more term credit to finance these longer shipping durations. When commodity trading requires more financing for longer durations, it competes with other demands for bank balance sheet. If this happens simultaneously with quantitative tightening (QT), when the central bank is draining reserves from the system, funding stresses become more likely. As Pozsar noted: &amp;ldquo;In 2019, o/n repo rates popped because banks got to LCR and they stopped lending reserves. In 2022, term credit to commodity traders may dry up because QT will soon begin in an environment where banks&amp;rsquo; LCR needs are going up, not down.&amp;rdquo;&lt;/p&gt;
&lt;h2 id="dollar-funding-vulnerabilities-for-non-us-banks"&gt;Dollar Funding Vulnerabilities for Non-U.S. Banks&lt;/h2&gt;
&lt;p&gt;One aspect of the framework that deserves more attention relates to dollar funding for non-U.S. banks. According to recent Dallas Fed research, &lt;a href="https://www.dallasfed.org/-/media/Images/research/economics/2025/0930/dfe0930c1.png"&gt;banks headquartered outside the United States hold approximately $16 trillion in U.S. dollar assets&lt;/a&gt;, comparable in magnitude to the $22 trillion held by U.S.-based institutions. The critical difference: U.S. banks have access to the Federal Reserve&amp;rsquo;s emergency liquidity facilities during periods of stress. Foreign banks do not have a U.S. dollar lender of last resort. During the COVID-19 crisis, the &lt;a href="https://www.dallasfed.org/research/economics/2024/0521"&gt;Fed expanded dollar swap lines to foreign central banks&lt;/a&gt; precisely to address this vulnerability, about $450 billion, roughly one-sixth of the Fed&amp;rsquo;s balance sheet expansion in early 2020. The structural dependency on dollar funding creates ongoing vulnerabilities. When dollars become scarce globally, whether due to Fed policy tightening, shifts in risk sentiment, or disruptions in commodity financing, foreign banks face balance sheet pressures that can amplify stress. The covered interest parity violations that Pozsar frequently discusses reflect these frictions: direct dollar borrowing and synthetic dollar borrowing through FX swaps theoretically should cost the same, but in practice, significant basis spreads persist.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Continue reading &lt;a href="http://philippdubach.com/posts/pozsars-bretton-woods-iii-three-years-later-2/2/"&gt;Pozsar&amp;rsquo;s Bretton Woods III: Three Years Later [2/2]&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;</description></item><item><title>Everything is a DCF Model</title><link>http://philippdubach.com/posts/everything-is-a-dcf-model/</link><pubDate>Sun, 19 Oct 2025 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/everything-is-a-dcf-model/</guid><description>&lt;p&gt;A brilliant piece of writing from &lt;a href="https://www.morganstanley.com/im/en-us/individual-investor/about-us/people-and-teams/investment-professionals/michael-mauboussin.html"&gt;Michael Mauboussin&lt;/a&gt; and &lt;a href="https://www.morganstanley.com/im/en-us/individual-investor/about-us/people-and-teams/investment-professionals/dan-callahan.html"&gt;Dan Callahan&lt;/a&gt; at Morgan Stanley that was formative in what I personally believe when it comes to valuation.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;[…] we want to suggest the mantra &amp;ldquo;everything is a DCF model.&amp;rdquo; The point is that whenever investors value a stake in a cash-generating asset, they should recognize that they are using a discounted cash flow (DCF) model. […] The value of those businesses is the present value of the cash they can distribute to their owners. This suggests a mindset that is very different from that of a speculator, who buys a stock in anticipation that it will go up without reference to its value. Investors and speculators have always coexisted in markets, and the behavior of many market participants is a blend of the two.&lt;/p&gt;
&lt;/blockquote&gt;</description></item><item><title>Agent-based Systems for Modeling Wealth Distribution</title><link>http://philippdubach.com/posts/agent-based-systems-for-modeling-wealth-distribution/</link><pubDate>Sat, 30 Aug 2025 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/agent-based-systems-for-modeling-wealth-distribution/</guid><description>&lt;p&gt;A question &lt;a href="https://www.youtube.com/garyseconomics"&gt;Gary Stevenson&lt;/a&gt;, the self-proclaimed &lt;a href="https://on.ft.com/4n7z5jD"&gt;best trader in the world&lt;/a&gt;, has been asking for some time is &lt;a href="https://uclrethinkingeconomics.com/2025/06/25/gary-stevenson-can-a-wealth-tax-fix-britains-economy/"&gt;if a wealth tax can fix Britain&amp;rsquo;s economy&lt;/a&gt;.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;[&amp;hellip;] he believed the continued parlous state of the economy would halt any interest rate hikes. The reason? Because when ordinary people receive money, they spend it, stimulating the economy, while the wealthy tend to save it. But our economic model promotes the concentration of wealth among a select few at the expense of everybody else&amp;rsquo;s living standards.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;em&gt;Owen Jones on Gary Stevenson for &lt;a href="https://www.theguardian.com/commentisfree/2022/jan/13/super-rich-spend-2m-on-whisky-wealth-tax-pandemic"&gt;The Guardian&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;Something I generally find very useful and appealing is visualizing systems, models and complexities. Any wealth distribution model is by nature a complex system, and agent-based simulation is one of the best ways to make that complexity visible. The &lt;a href="https://arxiv.org/abs/1604.02370"&gt;Affine Wealth Model&lt;/a&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;a stochastic, agent-based, binary-transaction Asset-Exchange Model (AEM) for wealth distribution that allows for agents with negative wealth&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;elegantly demonstrates how random transactions inevitably lead to Pareto distributions without intervention. In this &lt;a href="https://notebooks.manganiello.tech/fabio/wealth-inequality.ipynb"&gt;Jupyter Notebook Fabio Manganiello&lt;/a&gt; provides great visualizations of the wealth model. He shows how wealth distributes in an open market where a set of agents trades without any mechanisms in place to prevent a situation of extreme inequality.
&lt;a href="#lightbox-wealth-dist-0-tp5-tax-gif-0" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/wealth-dist-0-tp5-tax.gif 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/wealth-dist-0-tp5-tax.gif 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/wealth-dist-0-tp5-tax.gif 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/wealth-dist-0-tp5-tax.gif 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/wealth-dist-0-tp5-tax.gif 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/wealth-dist-0-tp5-tax.gif 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/wealth-dist-0-tp5-tax.gif 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/wealth-dist-0-tp5-tax.gif 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/wealth-dist-0-tp5-tax.gif 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/wealth-dist-0-tp5-tax.gif 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/wealth-dist-0-tp5-tax.gif 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/wealth-dist-0-tp5-tax.gif"
alt="Animation of two side-by-side histogram charts showing wealth distribution. Left chart titled Wealth Distribution (wealth tax: 0%) Right chart titled Wealth Distribution (wealth tax: 5%)"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
It can be seen in the left graph that with no tax wealth quickly stashes up in the pockets of a very small group of agents, while most of the other agents end up piling up in the lowest bucket. As we introduce a wealth tax of 25%, then 5% (right graph) and 1% we can see how the distribution becomes more even and therefore more desirable from the perspective of wealth equality, and also very stable over time, with the agents in the highest buckets quickly having at most 3-4x of their initial amount.&lt;/p&gt;
&lt;p&gt;As with any model, the paper as well as the simulation have it&amp;rsquo;s &lt;a href="https://notebooks.manganiello.tech/fabio/wealth-inequality.ipynb#Limitations"&gt;limitations&lt;/a&gt;, but again my interest is more in the way a few lines of code can visualize a economic relationships elegantly. It would be interesting to further investigate: (1) How sensitive are these equilibrium distributions to the transaction constraint (max_exchanged_share)? Does allowing larger transfers accelerate concentration or fundamentally alter the &lt;a href="https://en.wikipedia.org/wiki/Gini_coefficient"&gt;steady-state Gini coefficient&lt;/a&gt;? (2) The above wealth tax implementation taxes the sender - but what happens if we model progressive taxation on received amounts above median wealth instead? Does the locus of taxation matter for distributional outcomes?&lt;/p&gt;</description></item><item><title>Visualizing Gradients with PyTorch</title><link>http://philippdubach.com/posts/visualizing-gradients-with-pytorch/</link><pubDate>Sat, 23 Aug 2025 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/visualizing-gradients-with-pytorch/</guid><description>&lt;p&gt;&lt;a href="https://en.wikipedia.org/wiki/Gradient"&gt;Gradients&lt;/a&gt; are one of the most important concepts in calculus and machine learning, but it&amp;rsquo;s often poorly understood. Trying to understand them better myself, I wanted to build a visualization tool that helps me develop the correct mental picture of what the gradient of a function is. I came across &lt;a href="https://github.com/GistNoesis/VisualizeGradient"&gt;GistNoesis/VisualizeGradient&lt;/a&gt;, so I went on from there to write my own iteration. This mental model generalizes beautifully to higher dimensions and is the foundation for understanding optimization algorithms like gradient descent.
&lt;a href="#lightbox-torch-gradients_Figure_2-png-0" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/torch-gradients_Figure_2.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/torch-gradients_Figure_2.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/torch-gradients_Figure_2.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/torch-gradients_Figure_2.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/torch-gradients_Figure_2.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/torch-gradients_Figure_2.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/torch-gradients_Figure_2.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/torch-gradients_Figure_2.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/torch-gradients_Figure_2.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/torch-gradients_Figure_2.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/torch-gradients_Figure_2.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/torch-gradients_Figure_2.png"
alt="2D Gradient Plot: The colored surface shows function values. Black arrows show gradient vectors in the input plane (x-y space), pointing toward the direction of steepest ascent."
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;em&gt;The colored surface shows function values. Black arrows show gradient vectors in the input plane (x-y space), pointing toward the direction of steepest ascent.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;If you are interested in having a closer look or replicating my approach, the full project can be found on my &lt;a href="https://github.com/philippdubach/torch-gradients/"&gt;GitHub&lt;/a&gt;. I&amp;rsquo;m also looking forward to doing something similar on the &lt;a href="https://blog.foletta.net/post/2025-07-14-clt/"&gt;Central Limit Theorem&lt;/a&gt; as well as doing a short tutorial on &lt;a href="https://static.philippdubach.com/opt_vol_surface_plot_fig1.png"&gt;plotting options volatility surfaces with python&lt;/a&gt;, a project I have been waiting to finish for some time now.&lt;/p&gt;</description></item><item><title>Sentiment Trading Revisited</title><link>http://philippdubach.com/posts/sentiment-trading-revisited/</link><pubDate>Mon, 07 Jul 2025 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/sentiment-trading-revisited/</guid><description>&lt;p&gt;Interesting new paper on news sentiment embeddings for stock price forecasting that builds on many of the ideas &lt;a href="http://philippdubach.com/posts/trading-on-market-sentiment/"&gt;I explored in this project&lt;/a&gt;. The research, by Ayaan Qayyum, an &lt;a href="https://soe.rutgers.edu/news/ayaan-qayyum-electrical-and-computer-engineering"&gt;Undergraduate Research Scholar at Rutgers&lt;/a&gt;, shows that the core concept of using advanced language models for sentiment trading is not only viable but highly effective. The study takes a similar but more advanced approach. Instead of using a model like GPT-3.5 to generate a simple sentiment score, it uses &lt;a href="https://platform.openai.com/docs/guides/embeddings/embedding-models"&gt;OpenAI&amp;rsquo;s embedding models&lt;/a&gt; to convert news headlines into rich, high-dimensional vectors. By training a &lt;a href="https://arxiv.org/html/2507.01970v1/extracted/6556003/diagrams/model_comb_diagram.png"&gt;battery of neural networks&lt;/a&gt; including&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Gated Recurrent Units (GRU), Hidden Markov Model (HMM), Long Short-Term Memory (LSTM), Temporal Convolutional Networks (TCN), and a Feed-Forward Neural Network (FFNN). All were implemented using PyTorch.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;on these embeddings alongside economic data, the study found it could &lt;a href="https://arxiv.org/html/2507.01970v1/extracted/6556003/diagrams/models_ranked_smape.png"&gt;reduce prediction errors by up to 40%&lt;/a&gt; compared to models without the news data.&lt;/p&gt;
&lt;p&gt;The most surprising insight to me, and one that directly addresses the challenge of temporal drift I discussed, was that Qayyum&amp;rsquo;s time-independent models performed just as well, if not better, than the time-dependent ones. By shuffling the data, the models were forced to learn the pure semantic impact of a headline, independent of its specific place in time. This suggests that the market reacts to the substance of news in consistent ways, even if the narratives themselves change.&lt;/p&gt;</description></item><item><title>Counting Cards with Computer Vision</title><link>http://philippdubach.com/posts/counting-cards-with-computer-vision/</link><pubDate>Sun, 06 Jul 2025 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/counting-cards-with-computer-vision/</guid><description>&lt;p&gt;After installing &lt;a href="https://www.anthropic.com/claude-code"&gt;Claude Code&lt;/a&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;the agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster through natural language commands&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I was looking for a task to test its abilities. Fairly quickly we wrote &lt;a href="https://gist.github.com/philippdubach/741cbd56498e43375892966ca691b9c2"&gt;less than 200 lines of python code predicting blackjack odds&lt;/a&gt; using Monte Carlo simulation. When I went on to test this little tool on &lt;a href="https://games.washingtonpost.com/games/blackjack"&gt;Washington Post&amp;rsquo;s&lt;/a&gt; online blackjack (I also didn&amp;rsquo;t know that existed!) I quickly noticed how impractical it was to manually input all the card values on the table. What if the tool could also handle blackjack card detection automatically and calculate the odds from it? I have never done anything with computer vision so this seemed like a good challenge.
&lt;a href="#lightbox-classification-gif-0" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/classification.gif 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/classification.gif 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/classification.gif 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/classification.gif 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/classification.gif 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/classification.gif 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/classification.gif 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/classification.gif 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/classification.gif 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/classification.gif 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/classification.gif 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/classification.gif"
alt="alt text here"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
To get to any reasonable result we have to start with classification where we &amp;ldquo;teach&amp;rdquo; the model to categorize data by showing them lots of examples with correct labels. But where do the labels come from? I manually annotated &lt;a href="https://universe.roboflow.com/cards-agurd/playing_card_classification"&gt;409 playing cards across 117 images&lt;/a&gt; using Roboflow Annotate (at first I only did half as much - why this wasn&amp;rsquo;t a good idea we&amp;rsquo;ll see in a minute). Once enough screenshots of cards were annotated we can train the model to recognize the cards and predict card values on tables it has never seen before. I was able to use a &lt;a href="https://www.nvidia.com/en-us/data-center/tesla-t4/"&gt;NVIDIA T4 GPU&lt;/a&gt; inside Google Colab which offers some GPU time for free when capacity is available.
&lt;a href="#lightbox-gpu_setup_colab-png-1" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/gpu_setup_colab.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/gpu_setup_colab.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/gpu_setup_colab.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/gpu_setup_colab.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/gpu_setup_colab.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/gpu_setup_colab.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/gpu_setup_colab.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/gpu_setup_colab.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/gpu_setup_colab.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/gpu_setup_colab.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/gpu_setup_colab.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/gpu_setup_colab.png"
alt="alt text here"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
During training, the algorithm learns patterns from this example data, adjusting its internal parameters millions of times until it gets really good at recognizing the differences between categories (in this case different cards). Once trained, the model can then make predictions on new, unseen data by applying the patterns it learned. With the annotated dataset ready, it was time to implement the actual computer vision model. I chose to run inference on &lt;a href="https://docs.ultralytics.com/de/models/yolo11/"&gt;Ultralytics&amp;rsquo; YOLOv11&lt;/a&gt; pre-trained model, a leading object detection algorithm. I set up the environment in Google Colab following the &lt;a href="https://colab.research.google.com/github/roboflow-ai/notebooks/blob/main/notebooks/train-yolo11-object-detection-on-custom-dataset.ipynb"&gt;&amp;ldquo;How to Train YOLO11 Object Detection on a Custom Dataset&amp;rdquo;&lt;/a&gt; notebook. After extracting the annotated dataset from Roboflow, I began training the model using the pre-trained YOLOv11s weights as a starting point. This approach, called &lt;a href="https://en.wikipedia.org/wiki/Transfer_learning"&gt;transfer learning&lt;/a&gt;, allows the model to reuse patterns already learned from millions of general images and adapt them to this specific task.
I initially set it up to &lt;a href="https://docs.ultralytics.com/guides/model-training-tips/#other-techniques-to-consider-when-handling-a-large-dataset"&gt;run for 350 epochs&lt;/a&gt;, though the model&amp;rsquo;s built-in early stopping mechanism kicked in after 242 epochs when no improvement was observed for 100 consecutive epochs. The best results were achieved at epoch 142, taking around 13 minutes to complete on the Tesla T4 GPU.
The initial results were quite promising, with an overall mean Average Precision (mAP) of 80.5% at IoU threshold 0.5. Most individual card classes achieved good precision and recall scores, with only a few cards like the 6 and Queen showing slightly lower precision values.
&lt;a href="#lightbox-run1_results-png-2" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/run1_results.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/run1_results.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/run1_results.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/run1_results.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/run1_results.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/run1_results.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/run1_results.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/run1_results.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/run1_results.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/run1_results.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/run1_results.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/run1_results.png"
alt="Training results showing confusion matrix and loss curves"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
However, looking at the confusion matrix and loss curves revealed some interesting patterns. While the model was learning effectively (as shown by the steadily decreasing loss), there were still some misclassifications between similar cards, particularly among the numbered cards. This highlighted exactly why I mentioned earlier that annotating only half the amount of data initially &amp;ldquo;wasn&amp;rsquo;t a good idea&amp;rdquo; - more training examples would likely improve these edge cases and reduce confusion between similar-looking cards. My first attempt at solving the remaining accuracy issues was to add another layer to the workflow by sending the detected cards to Anthropic&amp;rsquo;s Claude API for additional OCR processing.
&lt;a href="#lightbox-claude_vision_workflow_results-png-3" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/claude_vision_workflow_results.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/claude_vision_workflow_results.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/claude_vision_workflow_results.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/claude_vision_workflow_results.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/claude_vision_workflow_results.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/claude_vision_workflow_results.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/claude_vision_workflow_results.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/claude_vision_workflow_results.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/claude_vision_workflow_results.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/claude_vision_workflow_results.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/claude_vision_workflow_results.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/claude_vision_workflow_results.png"
alt="Roboflow workflow with Claude API integration"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
This hybrid approach was very effective - the combination of YOLO&amp;rsquo;s object detection to dynamically crop down the Black Jack table to individual cards with Claude&amp;rsquo;s advanced vision capabilities yielded 99.9% accuracy on the predicted cards. However, this solution came with a significant drawback: the additional API layer consumed valuable time and the large model&amp;rsquo;s processing overhead, making it impractical for real-time gameplay.&lt;/p&gt;
&lt;p&gt;Seeking a faster solution, I implemented the same workflow &lt;a href="https://github.com/JaidedAI/EasyOCR"&gt;locally using easyOCR&lt;/a&gt; instead. EasyOCR seems to be really good at extracting black text on white background but &lt;a href="https://stackoverflow.com/questions/68261703/how-to-improve-accuracy-prediction-for-easyocr"&gt;might struggle with everything else&lt;/a&gt;. While it was able to correctly identify the card numbers when it detected them, it struggled to recognize around half of the cards in the first place - even when fed pre-cropped card images directly from the YOLO model. This inconsistency made it unreliable for the application.
Rather than continue band-aid solutions, I decided to go back and improve my dataset. I doubled the training data by adding another 60 screenshots with the same train/test split as before. More importantly, I went through all the previous annotations and fixed many of the bounding polygons. I noticed that several misidentifications were caused by the model detecting face-down dealer cards as valid cards, which happened because some annotations for face-up cards inadvertently included parts of the card backs next to them. The improved dataset and cleaned annotations delivered what I was hoping for: The confusion matrix now shows a much cleaner diagonal pattern, indicating that the model now correctly identifies most cards without the cross-contamination issues we saw earlier.
&lt;a href="#lightbox-run_best-png-5" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/run_best.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/run_best.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/run_best.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/run_best.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/run_best.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/run_best.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/run_best.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/run_best.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/run_best.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/run_best.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/run_best.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/run_best.png"
alt="Final training results with improved dataset"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
Both the training and validation losses converge smoothly without signs of overfitting, while the precision and recall metrics climb steadily to plateau near perfect scores. The mAP@50 reaches an impressive 99.5%. Most significantly, the confusion matrix now shows that the model has virtually eliminated false positives with background elements. The &amp;ldquo;background&amp;rdquo; column (rightmost) in the confusion matrix is now much cleaner, with only minimal misclassifications of actual cards as background noise.
&lt;a href="#lightbox-local_run_interference_visual-png-6" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/local_run_interference_visual.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/local_run_interference_visual.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/local_run_interference_visual.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/local_run_interference_visual.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/local_run_interference_visual.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/local_run_interference_visual.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/local_run_interference_visual.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/local_run_interference_visual.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/local_run_interference_visual.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/local_run_interference_visual.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/local_run_interference_visual.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/local_run_interference_visual.png"
alt="Real-time blackjack card detection and odds calculation"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
With the model trained and performing, it was time to deploy it and play some blackjack. Initially, I tested the system using &lt;a href="https://docs.roboflow.com/deploy/serverless-hosted-api-v2"&gt;Roboflow&amp;rsquo;s hosted API&lt;/a&gt;, which took around 4 seconds per inference - far too slow for practical gameplay. However, running the model locally on my laptop dramatically improved performance, achieving inference times of less than 0.1 seconds per image (1.3ms preprocess, 45.5ms inference, 0.4ms postprocess per image). I then &lt;a href="https://python-mss.readthedocs.io/"&gt;integrated the model with MSS&lt;/a&gt; to capture a real-time feed of my browser window. The system automatically overlays the detected cards with their predicted values and confidence scores
&lt;a href="#lightbox-black_jack_odds_demo-gif-7" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/black_jack_odds_demo.gif 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/black_jack_odds_demo.gif 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/black_jack_odds_demo.gif 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/black_jack_odds_demo.gif 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/black_jack_odds_demo.gif 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/black_jack_odds_demo.gif 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/black_jack_odds_demo.gif 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/black_jack_odds_demo.gif 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/black_jack_odds_demo.gif 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/black_jack_odds_demo.gif 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/black_jack_odds_demo.gif 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/black_jack_odds_demo.gif"
alt="Overview of selected fitted curves"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
The final implementation successfully combines the pieces: the computer vision model detects and identifies cards in real-time, feeds this information to the Monte Carlo simulation, and displays both the card recognition results and the calculated odds directly on screen - do not try this at your local (online) casino!&lt;/p&gt;</description></item><item><title>Novo Nordisk's Post-Patent Strategy</title><link>http://philippdubach.com/posts/novo-nordisks-post-patent-strategy/</link><pubDate>Sun, 29 Jun 2025 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/novo-nordisks-post-patent-strategy/</guid><description>&lt;p&gt;Novo Nordisk, a long time member of my &amp;ldquo;regrets&amp;rdquo; stock list, has become &lt;a href="https://finance.yahoo.com/quote/NVO/chart/"&gt;reasonably affordable lately (-48% yoy)&lt;/a&gt;. Part of the reason being that they currently sit atop a ~$20 billion Ozempic/Wegovy franchise that faces &lt;a href="https://journals.library.columbia.edu/index.php/stlr/blog/view/653"&gt;patent expiration in 2031&lt;/a&gt;. That&amp;rsquo;s roughly seven years to replace their blockbuster drug. We revisit them today, since per &lt;a href="https://www.thelancet.com/journals/lancet/article/PIIS0140-6736(25)01185-7/fulltext"&gt;newly published Lancet data&lt;/a&gt;, Novo&amp;rsquo;s lead replacement candidate—amycretin—just posted some genuinely impressive Phase 1 results. The injectable version delivered &lt;a href="https://www.thelancet.com/cms/10.1016/S0140-6736(25)01185-7/asset/6f4ec048-c12e-4185-a860-a2dc988746c4/main.assets/gr3_lrg.jpg"&gt;24.3% average weight loss versus 1.1% for placebo&lt;/a&gt;, beating both current market leaders (Wegovy at 15% and Lilly&amp;rsquo;s Zepbound at 22.5%). Even the oral version hit 13.1% weight loss in just 12 weeks, with patients still losing weight when the trial ended.&lt;/p&gt;
&lt;p&gt;Amycretin is very elegantly designed: It combines semaglutide (the active ingredient in Ozempic/Wegovy) with amylin, creating what&amp;rsquo;s essentially a dual-pathway satiety signal. Semaglutide activates GLP-1 receptors to slow gastric emptying and reduce appetite centrally, while amylin works through complementary mechanisms to enhance fullness signals. This way both your stomach and your brain&amp;rsquo;s &amp;ldquo;appetite control center&amp;rdquo; are getting the &amp;ldquo;stop eating&amp;rdquo; message simultaneously. One concern raised by &lt;a href="https://www.statnews.com/staff/elaine-chen/"&gt;Elaine Chen at STAT&lt;/a&gt; is that the &lt;a href="https://www.statnews.com/2025/06/20/novo-nordisk-weight-loss-drug-amylin-hormone-injection-effective-but-side-effects-an-issue/"&gt;results of a Phase 1/2 study include unusual findings around dosage&lt;/a&gt;. The full text article is behind a paywall unfortunately, so I did not have access. However, looking at the actual data from the study, I am assuming she is referring to Parts C, D, and E, which tested maintenance doses of 20 mg, 5 mg, and 1.25 mg respectively. The weight loss results were:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Part C (20 mg): -22.0% weight loss at 36 weeks &lt;br&gt;
Part D (5 mg): -16.2% weight loss at 28 weeks &lt;br&gt;
Part E (1.25 mg): -9.7% weight loss at 20 weeks&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;While there is a dose-response relationship, what&amp;rsquo;s notable is the curves in &lt;a href="https://www.thelancet.com/cms/10.1016/S0140-6736(25)01185-7/asset/6f4ec048-c12e-4185-a860-a2dc988746c4/main.assets/gr3_lrg.jpg"&gt;Figure 3&lt;/a&gt; show relatively similar trajectories during the overlapping time periods. Typically in drug development, researchers would expect clear separation between dose groups (with higher doses producing proportionally greater effects). When weight-loss curves overlap significantly (which they do in this case), it suggests the doses may be producing similar effects despite different drug concentrations. If lower doses produce similar weight loss with potentially fewer side effects, this could favor using the lower, better-tolerated dose. Further, it might indicate that amycretin reaches maximum effect at relatively low doses. This should probably influence how future Phase 3 trials are designed, potentially focusing on the optimal dose rather than the maximum tolerated dose. Given that gastrointestinal side effects were dose-dependent but efficacy curves overlapped, this supports using the lowest effective dose. How that might be a bad thing I have yet to find out.&lt;/p&gt;
&lt;p&gt;From a financial perspective, &lt;a href="https://www.novonordisk.com/science-and-technology/r-d-pipeline.html"&gt;Novo Nordisk&amp;rsquo;s pipeline&lt;/a&gt; is very interesting: Amycretin&amp;rsquo;s injectable version is currently in Phase 2, suggesting Phase 3 trials around 2026-2027, with potential approval by 2031; basically right as the Ozempic patents expire. But Novo isn&amp;rsquo;t betting everything on amycretin. They&amp;rsquo;re running what appears to be a diversified pipeline strategy with multiple shots on goal: &lt;a href="https://www.novonordisk-trials.com/trials-conditions/all-trials-v2/NN9541-4919.html"&gt;NNC-0519&lt;/a&gt; (another next-gen GLP-1), &lt;a href="https://www.novonordisk-trials.com/trials-conditions/all-trials-v2/NN9662-7694.html"&gt;NNC-0662&lt;/a&gt; (details kept confidential), and cagrilintide combinations. This makes sense: you want multiple candidates because the &lt;a href="https://pmc.ncbi.nlm.nih.gov/articles/PMC9293739/"&gt;failure rate in drug development&lt;/a&gt; makes even the most promising compounds statistically likely to fail. Eli Lilly&amp;rsquo;s tirzepatide (Mounjaro/Zepbound) &lt;a href="https://mounjaro.lilly.com/hcp/how-mounjaro-works"&gt;works through a different mechanism&lt;/a&gt;—GLP-1 plus GIP receptor activation—and appears to be gaining market share. &lt;a href="https://investor.lilly.com/news-releases/news-release-details/lillys-oral-glp-1-orforglipron-demonstrated-statistically"&gt;Lilly&amp;rsquo;s orforglipron, an oral GLP-1 that hit 14.7% weight loss in Phase 2&lt;/a&gt;, represents another competitive threat. Judging by &lt;a href="https://finance.yahoo.com/quote/LLY/chart/"&gt;LLY&amp;rsquo;s price development&lt;/a&gt;, investors currently seem to think that Lilly is doing a better job at architecting a portfolio than Novo (or at least providing more disclosure about their pipeline). Yet, the overall competitive landscape might actually benefit both companies. The &amp;ldquo;war&amp;rdquo; between Novo and Lilly is expanding the overall market for obesity treatments, potentially growing the pie faster than either company is losing share. Also, to analyze the financial impact of the expiring Ozempic patents, we have to look further than just Novo&amp;rsquo;s research pipeline. Manufacturing these GLP-1 compounds and their &lt;a href="https://yds.ypsomed.com/files/media/03_Documents/12_Articles/%23171_2025_AprMay_Sustainability_Ypsomed.pdf"&gt;delivery devices&lt;/a&gt; is &amp;ldquo;pretty tough.&amp;rdquo; Complex peptides requiring &lt;a href="https://www.bachem.com/articles/commercial-apis/glucagon-like-peptide-1-glp-1/"&gt;specialized manufacturing capabilities&lt;/a&gt;, plus the injection devices themselves are patent-protected. This creates what we would call &lt;a href="https://www.morganstanley.com/im/publication/insights/articles/article_measuringthemoat.pdf"&gt;a capacity constraint moat&lt;/a&gt; in corporate strategy. Novo&amp;rsquo;s manufacturing capabilities/partnerships and injectable device patents are a key competitive advantage. Even when semaglutide goes generic in 2031, the entire generic pharmaceutical industry would essentially need to coordinate to build sufficient manufacturing capacity to meaningfully dent Novo&amp;rsquo;s market share. Meanwhile, Novo could potentially defend by lowering prices while maintaining manufacturing advantages in a monopoly-to-oligopoly transition.&lt;/p&gt;
&lt;p&gt;The other day I came across &lt;a href="https://github.com/martinshkreli/models/blob/main/NOVOB.xlsx"&gt;Martin Shkreli&amp;rsquo;s NOVO model&lt;/a&gt;. Conservatively, it puts Novo&amp;rsquo;s fair value around 705 DKK (21% upside from ~585 DKK), while a failure scenario drops valuation to 385 DKK. The range reflects what you&amp;rsquo;d expect for a large-cap pharmaceutical company;the market has already incorporated most knowable information about pipeline risks and patent timelines. This also underscores the point that manufacturing capabilities and continuous innovation pipelines can potentially maintain quasi-monopolistic positions longer than traditional patent protection would suggest. Shkreli&amp;rsquo;s analysis suggests Novo Nordisk is reasonably valued with modest upside potential, contingent on successful pipeline execution. Novo Nordisk is at a critical juncture, with substantial franchise value dependent on successful pipeline execution over the next 7-8 years. While the current valuation appears reasonable, the binary nature of drug development success creates both upside potential and significant downside risk.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;&lt;h6&gt;This article is for informational purposes only, you should not consider any information or other material on this site as investment, financial, or other advice. There are risks associated with investing.&lt;/h6&gt;&lt;/em&gt;&lt;/p&gt;
&lt;aside class="disclaimer" role="note" aria-label="Disclaimer"&gt;
&lt;div class="disclaimer-content"&gt;&lt;p&gt;&lt;strong&gt;Disclaimer:&lt;/strong&gt; For informational purposes only, not medical advice. Consult a qualified healthcare provider for any medical questions or conditions.&lt;/p&gt;&lt;/div&gt;
&lt;/aside&gt;</description></item><item><title>Behavioral Economics &amp; Transit Policy</title><link>http://philippdubach.com/posts/behavioral-economics-transit-policy/</link><pubDate>Sun, 22 Jun 2025 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/behavioral-economics-transit-policy/</guid><description>&lt;p&gt;Over the weekend a &lt;a href="https://www.wsj.com/opinion/new-yorks-choice-cuomo-or-socialism-election-mayor-race-vote-mamdani-ede84c75"&gt;WSJ editorial on the 2025 New York City mayoral election&lt;/a&gt; called one of the potential Democratic candidates Zohran Mamdani &amp;ldquo;a literal socialist&amp;rdquo; for - among other things - running on the promise of &lt;a href="https://www.thenation.com/article/society/new-york-city-bus-free-fare/"&gt;free bus rides for all&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Zohran won New York&amp;rsquo;s first fare-free bus pilot on five lines across the city. As Mayor, he&amp;rsquo;ll permanently eliminate the fare on every city bus [&amp;hellip;] Fast and free buses will not only make buses reliable and accessible but will improve safety for riders and operators – creating the world-class service New Yorkers deserve.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Free public transit seems to be a &lt;a href="https://en.wikipedia.org/wiki/Free_public_transport#List_of_towns_and_cities_with_area-wide_zero-fare_transport"&gt;recurring idea&lt;/a&gt; among politicians: For some reason, making it free feels revolutionary in a way that making it cheaper never could. There&amp;rsquo;s actually some solid behavioral economics behind this intuition: &amp;ldquo;Zero as a Special Price: The True Value of Free Products.&amp;rdquo; (Yes, before the &lt;a href="https://www.youtube.com/watch?v=Q3tSG8h_O3A&amp;amp;pp=ygUPZGFuIEFyaWVseSBmYWtl"&gt;fabricated data scandals&lt;/a&gt;, Ariely did write research that has replicated consistently.) The basic finding: people don&amp;rsquo;t treat &amp;ldquo;free&amp;rdquo; as just another very low price. When you price something at zero, it gets a special psychological boost that makes people value it way more than they should based on pure cost-benefit analysis: Give people a choice between a Hershey&amp;rsquo;s Kiss for 1¢ and a Lindt truffle for 15¢. Most people choose the obviously superior Lindt. Now make it Hershey&amp;rsquo;s for free versus Lindt for 14¢—keeping the price difference exactly the same—and suddenly everyone wants the Hershey&amp;rsquo;s. Free doesn&amp;rsquo;t just eliminate cost; it creates additional perceived value. The mechanism is pure affect. &amp;ldquo;Free&amp;rdquo; makes people feel good in a way that &amp;ldquo;1¢&amp;rdquo; doesn&amp;rsquo;t, even though the economic difference is trivial. When you force people to think analytically about the trade-offs, the effect disappears. But in normal decision-making, that warm fuzzy feeling of getting something for nothing dominates rational calculation.&lt;/p&gt;
&lt;p&gt;The difference between a $2.75 bus fare and $0 isn&amp;rsquo;t meaningfully different from the difference between $2.75 and $0.75 for most riders&amp;rsquo; budgets. But psychologically? Free transit feels like a gift from the city. Cheap transit feels like commerce. The first activates social norms (gratitude, civic participation, shared ownership). The second activates market norms (cost-benefit analysis, value-for-money calculations, consumer complaints when service is bad). On the other side, any positive price, no matter how small, forces people into analytical mode. People start thinking about trade-offs, evaluating whether the service is worth it, considering alternatives.
This is why congestion pricing works so well. A $5 charge to drive in Manhattan (&lt;a href="https://en.wikipedia.org/wiki/Congestion_pricing"&gt;or Singapore, London, Stockholm, Milan, Gothenburg&lt;/a&gt;) isn&amp;rsquo;t going to bankrupt anyone who can afford to drive in Manhattan. But it makes people think about each trip in a way they never did when driving felt &amp;ldquo;free&amp;rdquo; (ignoring gas, parking, insurance, etc.). Once you&amp;rsquo;re thinking analytically rather than just following habit, you&amp;rsquo;re much more likely to take the subway.&lt;/p&gt;
&lt;p&gt;But!! Free transit might actually make it easier to cut transit funding, not harder. Right now, when transit agencies face budget cuts, fare-paying riders get angry. They&amp;rsquo;re customers! They paid for service! They demand value for money! This creates a natural constituency defending transit budgets. Make transit free, and you&amp;rsquo;ve eliminated that market relationship. Riders become passive beneficiaries rather than paying customers. When service gets worse, they can&amp;rsquo;t complain about not getting their money&amp;rsquo;s worth; they&amp;rsquo;re getting exactly what they paid for.
If I were a politician looking to slash subsidies without political blowback, step one would be eliminating fares. Tell everyone it&amp;rsquo;s about equity and access. Then, once people stop thinking of themselves as customers, start the real cuts. No more late-night service; hey, it&amp;rsquo;s free! Longer waits, dirtier stations, broken escalators; what did you expect for nothing? The behavioral economics is clear: when something is free, people have lower expectations and less standing to complain. The zero-price effect works both ways. None of this means transit shouldn&amp;rsquo;t be affordable!&lt;/p&gt;</description></item><item><title>It Just Ain’t So</title><link>http://philippdubach.com/posts/it-just-aint-so/</link><pubDate>Sun, 15 Jun 2025 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/it-just-aint-so/</guid><description>&lt;blockquote&gt;
&lt;p&gt;It ain&amp;rsquo;t what you don&amp;rsquo;t know that gets you into trouble. It&amp;rsquo;s what you know for sure that just ain&amp;rsquo;t so.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This (not actually) Mark Twain quote from &lt;a href="https://en.wikipedia.org/wiki/The_Big_Short_(film)"&gt;The Big Short&lt;/a&gt; captures the sentiment of realizing that some foundational assumptions might be empirically wrong.&lt;/p&gt;
&lt;p&gt;A recent article by &lt;a href="https://antonvorobets.substack.com"&gt;Anton Vorobets&lt;/a&gt; that I came across in &lt;a href="https://www.bloomberg.com/authors/AQ0Te4IePFE/justina-lee"&gt;Justina Lee&lt;/a&gt;&amp;rsquo;s Quant Newsletter presents compelling evidence that challenges one of the field&amp;rsquo;s fundamental statistical assumptions, that asset returns follow normal distributions. Using 26 years of data from 10 US equity indices, he ran formal normality tests (Shapiro-Wilk, D&amp;rsquo;Agostino&amp;rsquo;s K², Anderson-Darling) and found that the normal distribution hypothesis gets rejected in most cases. The supposed &amp;ldquo;Aggregational Gaussianity&amp;rdquo; that academics invoke through Central Limit Theorem arguments? It&amp;rsquo;s mostly wishful thinking enabled by small sample sizes. As Vorobets observes:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Finance and economics academia is unfortunately driven by several convenient myths, i.e., claims that are taken for granted and spread among university academics despite their poor empirical support.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The article highlights significant practical consequences for portfolio management and risk assessment. Portfolio optimization based on normal distribution assumptions ignores fat left tails—exactly the kind of extreme downside events that can wipe out portfolios. This misspecification can lead to inadequate risk management and suboptimal asset allocation decisions. Vorobets suggests &lt;a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4034316"&gt;alternative approaches, including Monte Carlo simulations combined with Conditional Value-at-Risk (CVaR) optimization&lt;/a&gt;, which better accommodate the complex distributional properties observed in financial data. While computationally more demanding, these methods offer improved alignment with empirical reality.&lt;/p&gt;
&lt;p&gt;Reading this piece gave me a few ideas for extensions I might want to explore in an upcoming personal project:
(1) While Vorobets focuses on US equity indices, similar analysis across fixed income, commodities, currencies, and alternative assets would provide a more comprehensive view of distributional properties across financial markets. Each asset class exhibits distinct market microstructure characteristics that may influence distributional behavior.
(2) Global Market Coverage: Extending the geographic scope to include developed, emerging, and frontier markets would illuminate whether the documented deviations from normality represent universal phenomena or are specific to US market structures. Cross-regional analysis could reveal important insights about market development, regulatory frameworks, and institutional differences.
(3) Building upon Vorobets&amp;rsquo; foundation, there are opportunities to incorporate multivariate normality testing, regime-dependent analysis, and time-varying parameter models. Additionally, investigating the power and robustness of different statistical tests across various market conditions would strengthen the methodological contribution.
(4) Examining different time horizons, market regimes (pre- and post-financial crisis, COVID period), and potentially higher-frequency data could provide deeper insights into when and why distributional assumptions break down.&lt;/p&gt;</description></item><item><title>Not All AI Skeptics Think Alike</title><link>http://philippdubach.com/posts/not-all-ai-skeptics-think-alike/</link><pubDate>Thu, 12 Jun 2025 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/not-all-ai-skeptics-think-alike/</guid><description>&lt;p&gt;Apple&amp;rsquo;s recent paper &amp;ldquo;The Illusion of Thinking&amp;rdquo; has been widely understood to demonstrate that reasoning models don&amp;rsquo;t &amp;lsquo;actually&amp;rsquo; reason. Using controllable puzzle environments instead of contaminated math benchmarks, they discovered something fascinating: there are three distinct performance regimes when it comes to AI reasoning complexity. For simple problems, standard models actually outperform reasoning models while being more token-efficient. At medium complexity, reasoning models show their advantage. But at high complexity? Both collapse completely.
Here&amp;rsquo;s the kicker: reasoning models exhibit counterintuitive scaling behavior—their thinking effort increases with problem complexity up to a point, then declines despite having adequate token budget. It&amp;rsquo;s like watching a student give up mid-exam when the questions get too hard, even though they have plenty of time left.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;We observe that reasoning models initially increase their thinking tokens proportionally with problem complexity. However, upon approaching a critical threshold—which closely corresponds to their accuracy collapse point—models counterintuitively begin to reduce their reasoning effort despite increasing problem difficulty.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The researchers found something even more surprising: even when they provided explicit algorithms—essentially giving the models the answers—performance didn&amp;rsquo;t improve. The collapse happened at roughly the same complexity threshold.&lt;/p&gt;
&lt;p&gt;On the other hand, &lt;a href="https://www.seangoedecke.com/illusion-of-thinking/"&gt;Sean Goedecke&lt;/a&gt; is not buying Apple&amp;rsquo;s methodology: His core objection? Puzzles &amp;ldquo;require computer-like algorithm-following more than they require the kind of reasoning you need to solve math problems.&amp;rdquo;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;You can&amp;rsquo;t compare eight-disk to ten-disk Tower of Hanoi, because you&amp;rsquo;re comparing &amp;ldquo;can the model work through the algorithm&amp;rdquo; to &amp;ldquo;can the model invent a solution that avoids having to work through the algorithm&amp;rdquo;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;From his own testing, models &amp;ldquo;decide early on that hundreds of algorithmic steps are too many to even attempt, so they refuse to even start.&amp;rdquo; That&amp;rsquo;s strategic behavior, not reasoning failure. This matters because it shows how evaluation methodology shapes our understanding of AI capabilities. Goedecke argues Tower of Hanoi puzzles aren&amp;rsquo;t useful for determining reasoning ability, and that the complexity threshold of reasoning models may not be fixed.&lt;/p&gt;</description></item><item><title>Gambling vs. Investing</title><link>http://philippdubach.com/posts/gambling-vs.-investing/</link><pubDate>Fri, 30 May 2025 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/gambling-vs.-investing/</guid><description>&lt;p&gt;&lt;a href="https://kalshi.com/"&gt;Kalshi&lt;/a&gt;, a prediction market startup, is using its federal financial license to offer sports betting nationwide, even in states where it&amp;rsquo;s not legal. The move has earned them cease-and-desist letters from state gaming regulators, but CEO Tarek Mansour isn&amp;rsquo;t backing down:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;We can go one by one for every financial market and it would fall under the definition of gambling. So what&amp;rsquo;s the difference?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;It&amp;rsquo;s a question that cuts to the heart of modern finance. The founders argue that Wall Street blurred the line between investing and gambling long ago, and casting Kalshi as the latter is inconsistent at best. They have a point—if you can bet on oil futures, Nvidia&amp;rsquo;s stock price, or interest rate movements, why is wagering on NFL touchdowns more objectionable?&lt;/p&gt;
&lt;p&gt;Benefiting from the Trump administration&amp;rsquo;s hands-off regulatory approach, with the CFTC dropping its legal challenge to their election contracts, the odds might be in their favor. Even better, a Kalshi board member is awaiting confirmation to lead the very agency that was previously their biggest antagonist.&lt;/p&gt;
&lt;p&gt;The technical distinction matters: Kalshi operates as an exchange between traders rather than a house taking bets against customers. But functionally, with 79% of their recent trading volume being sports-related, they&amp;rsquo;re forcing us to confront an uncomfortable reality about risk, speculation, and what we choose to call &amp;ldquo;investing.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;Whether you call it innovation or regulatory arbitrage, Kalshi is exposing the arbitrary nature of the lines we&amp;rsquo;ve drawn around acceptable financial speculation.&lt;/p&gt;
&lt;p&gt;&lt;br&gt;_ _&lt;/p&gt;
&lt;p&gt;&lt;em&gt;(17/06/2025) Update: Matt Levine - one of the finance columnists I enjoy reading most - just published a long piece &lt;a href="https://www.bloomberg.com/opinion/newsletters/2025-06-17/it-s-not-gambling-it-s-predicting"&gt;&amp;ldquo;It&amp;rsquo;s Not Gambling, It&amp;rsquo;s Predicting&amp;rdquo;&lt;/a&gt; in his newsletter on exactly this issue:&lt;/em&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Kalshi offers a prediction market where you can bet on sports. No! Sorry! Wrong! It offers a prediction market where you can predict which team will win a sports game, and if you predict correctly you make money, and if you predict incorrectly you lose money. Not &amp;ldquo;bet on sports.&amp;rdquo; &amp;ldquo;Predict sports outcomes for money.&amp;rdquo; Completely different.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;</description></item><item><title>Modeling Glycemic Response with XGBoost</title><link>http://philippdubach.com/posts/modeling-glycemic-response-with-xgboost/</link><pubDate>Fri, 30 May 2025 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/modeling-glycemic-response-with-xgboost/</guid><description>&lt;br&gt;
&lt;p&gt;Earlier this year I wrote how &lt;a href="http://philippdubach.com/posts/i-built-a-cgm-data-reader/"&gt;I built a CGM data reader&lt;/a&gt; after wearing a continuous glucose monitor myself. Since I was already logging my macronutrients and learning more about molecular biology in an &lt;a href="https://ocw.mit.edu/courses/res-7-008-7-28x-molecular-biology/"&gt;MIT MOOC&lt;/a&gt;, I became curious: given a meal&amp;rsquo;s macronutrients (carbs, protein, fat) and some basic individual characteristics (age, BMI), could a machine learning model predict the shape of my postprandial glucose curve? I came across &lt;a href="https://www.cell.com/cell/fulltext/S0092-8674(15)01481-6"&gt;Zeevi et al.&lt;/a&gt;&amp;rsquo;s paper on Personalized Nutrition by Prediction of Glycemic Responses, which used machine learning to predict individual glycemic responses from meal data. Exactly what I had in mind. Unfortunately, neither the data nor the code were publicly available. So I decided to build my own model. In the process I wrote this &lt;a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5914902"&gt;working paper&lt;/a&gt;.&lt;/p&gt;
&lt;a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5914902"&gt;
&lt;a href="#lightbox-working_paper_overview-jpg-0" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/working_paper_overview.jpg 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/working_paper_overview.jpg 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/working_paper_overview.jpg 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/working_paper_overview.jpg 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/working_paper_overview.jpg 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/working_paper_overview.jpg 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/working_paper_overview.jpg 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/working_paper_overview.jpg 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/working_paper_overview.jpg 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/working_paper_overview.jpg 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/working_paper_overview.jpg 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/working_paper_overview.jpg"
alt="Overview of Working Paper Pages"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;/a&gt;
&lt;p&gt;The paper documents my attempt to build an open, reproducible glucose prediction pipeline, and what I learned about why that is harder than it sounds. The methodologies employed were largely inspired by &lt;a href="https://www.cell.com/cell/fulltext/S0092-8674(15)01481-6?_returnURL=https%3A%2F%2Flinkinghub.elsevier.com%2Fretrieve%2Fpii%2FS0092867415014816%3Fshowall%3Dtrue"&gt;Zeevi et al.&lt;/a&gt;&amp;rsquo;s approach. This matters because the landscape of personalized nutrition is increasingly dominated by proprietary systems. Companies like ZOE, DayTwo, and Ultrahuman all run versions of this pipeline on closed data. Open-source alternatives remain scarce.&lt;/p&gt;
&lt;h2 id="why-not-only-use-my-own-data"&gt;Why not only use my own data?&lt;/h2&gt;
&lt;p&gt;I quickly realized that training a model only on my own CGM data was not going to work. Over several weeks of diligent logging, I collected roughly 40 meal-response pairs. To make matters worse, &lt;a href="https://doi.org/10.1093/ajcn/nqaa198"&gt;Howard, Guo &amp;amp; Hall (2020)&lt;/a&gt; showed that two CGMs worn simultaneously on the same person can give discordant meal rankings for postprandial glucose, meaning some of the variance in the signal is measurement noise, not biology.&lt;/p&gt;
&lt;p&gt;To get enough data, I used the publicly available &lt;a href="https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.2005143"&gt;Hall dataset&lt;/a&gt; containing continuous glucose monitoring data from 57 adults, which I narrowed down to 112 standardized meals from 19 non-diabetic subjects with their respective glucose curve after the meal (full methodology in the paper).&lt;/p&gt;
&lt;a href="#lightbox-cgm-workflow-graph-jpg-1" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/cgm-workflow-graph.jpg 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/cgm-workflow-graph.jpg 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/cgm-workflow-graph.jpg 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/cgm-workflow-graph.jpg 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/cgm-workflow-graph.jpg 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/cgm-workflow-graph.jpg 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/cgm-workflow-graph.jpg 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/cgm-workflow-graph.jpg 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/cgm-workflow-graph.jpg 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/cgm-workflow-graph.jpg 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/cgm-workflow-graph.jpg 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/cgm-workflow-graph.jpg"
alt="Overview of the CGM pipeline workflow"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;h2 id="gaussian-curve-fitting"&gt;Gaussian curve fitting&lt;/h2&gt;
&lt;p&gt;Rather than trying to predict the entire glucose curve, I simplified the problem by fitting each postprandial response to a normalized Gaussian function. This gave me three key parameters to predict: amplitude (how high glucose rises), time-to-peak (when it peaks), and curve width (how long the response lasts).&lt;/p&gt;
&lt;a href="#lightbox-cgm-fitted-curve-large1-jpg-2" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/cgm-fitted-curve-large1.jpg 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/cgm-fitted-curve-large1.jpg 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/cgm-fitted-curve-large1.jpg 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/cgm-fitted-curve-large1.jpg 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/cgm-fitted-curve-large1.jpg 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/cgm-fitted-curve-large1.jpg 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/cgm-fitted-curve-large1.jpg 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/cgm-fitted-curve-large1.jpg 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/cgm-fitted-curve-large1.jpg 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/cgm-fitted-curve-large1.jpg 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/cgm-fitted-curve-large1.jpg 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/cgm-fitted-curve-large1.jpg"
alt="Overview of single fitted curve of cgm measurements"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;p&gt;The Gaussian approximation worked surprisingly well for characterizing most glucose responses. While some curves fit better than others, the majority of postprandial responses were well-captured, though there is clear variation between individuals and meals. Some responses were high amplitude, narrow width, while others are more gradual and prolonged.&lt;/p&gt;
&lt;a href="#lightbox-example-fitted-cgm-measurements-jpg-3" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/example-fitted-cgm-measurements.jpg 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/example-fitted-cgm-measurements.jpg 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/example-fitted-cgm-measurements.jpg 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/example-fitted-cgm-measurements.jpg 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/example-fitted-cgm-measurements.jpg 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/example-fitted-cgm-measurements.jpg 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/example-fitted-cgm-measurements.jpg 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/example-fitted-cgm-measurements.jpg 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/example-fitted-cgm-measurements.jpg 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/example-fitted-cgm-measurements.jpg 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/example-fitted-cgm-measurements.jpg 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/example-fitted-cgm-measurements.jpg"
alt="Overview of selected fitted curves"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;h2 id="xgboost-pipeline"&gt;XGBoost pipeline&lt;/h2&gt;
&lt;p&gt;I then trained an XGBoost regressor with 27 engineered features including meal composition, participant characteristics, and interaction terms. XGBoost was chosen for its ability to handle mixed data types, built-in feature importance, and strong performance on tabular data. The pipeline included hyperparameter tuning with 5-fold cross-validation to optimize learning rate, tree depth, and regularization parameters. Rather than relying solely on basic meal macronutrients, I engineered features across multiple categories and implemented CGM statistical features calculated over different time windows (24-hour and 4-hour periods), including time-in-range and glucose variability metrics. Architecture-wise, I trained three separate XGBoost regressors, one for each Gaussian parameter.&lt;/p&gt;
&lt;h2 id="results"&gt;Results&lt;/h2&gt;
&lt;p&gt;The model could predict &lt;em&gt;how high&lt;/em&gt; my blood sugar rises after a meal with moderate accuracy (R² = 0.46, correlation = 0.73, p &amp;lt; 0.001). Not good enough for clinical guidance, which typically requires R² &amp;gt; 0.7, but meaningfully better than the multi-linear regression baseline (R² = 0.24).&lt;/p&gt;
&lt;p&gt;The more telling result is what the model could not do. It had no idea &lt;em&gt;when&lt;/em&gt; blood sugar would peak. The time-to-peak prediction was literally worse than guessing the average every time (R² = -0.76, p = 0.896). Curve width prediction was marginally better but still not useful (R² = 0.10). In other words: meal composition tells you something about the magnitude of your glucose spike, but almost nothing about its timing or duration. That is a meaningful finding in itself, consistent with the idea that temporal dynamics are driven by factors like gastric emptying, insulin sensitivity, and gut microbiome composition, none of which were captured in the feature set.&lt;/p&gt;
&lt;p&gt;For context, &lt;a href="https://www.sciencedirect.com/science/article/abs/pii/S1746809423012429"&gt;Cappon et al. (2023)&lt;/a&gt; trained a similar XGBRegressor on 3,296 meals from 927 healthy individuals and achieved a correlation of r = 0.48 for predicting glycemic response magnitude. Their larger dataset did not dramatically improve over my amplitude correlation of 0.73, but they also found systematic bias in predictions, suggesting that XGBoost captures the general direction well while missing individual-level variation. Separately, &lt;a href="https://www.nature.com/articles/s41598-025-01367-7"&gt;Shin et al. (2025)&lt;/a&gt; tried a bidirectional LSTM on 171 healthy adults and achieved r = 0.43, worse than XGBoost on amplitude. Deep learning does not automatically win here, especially at small-to-medium dataset sizes. Data quantity matters more than model complexity.&lt;/p&gt;
&lt;p&gt;A study on &lt;a href="https://www.nature.com/articles/s41522-025-00650-9"&gt;glycemic prediction in pregnant women&lt;/a&gt; found that adding gut microbiome data increased explained variance in glucose peaks from 34% to 42%, underscoring that meal composition alone leaves a lot on the table.&lt;/p&gt;
&lt;p&gt;The complete code, Jupyter notebooks, processed datasets, and supplementary results are available in my &lt;a href="https://github.com/philippdubach/glucose-response-analysis"&gt;GitHub repository&lt;/a&gt;.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;&lt;em&gt;(10/06/2025) Update: Today I came across Marcel Salathé&amp;rsquo;s &lt;a href="https://www.linkedin.com/posts/salathe_myfoodrepo-digitalhealth-precisionnutrition-activity-7337806988082393088-2Lsu?utm_source=share&amp;amp;utm_medium=member_ios&amp;amp;rcm=ACoAADeInT4BJMhtg5DSjxX1jVtIAs5w_KxZm-g"&gt;LinkedIn post&lt;/a&gt; on a publication out of EPFL: &lt;a href="https://www.frontiersin.org/journals/nutrition/articles/10.3389/fnut.2025.1539118/full"&gt;Personalized glucose prediction using in situ data only&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;With data from over 1,000 participants of the Food &amp;amp; You digital cohort, we show that a machine learning model using only food data from myFoodRepo and a glucose monitor can closely track real blood sugar responses to any meal (correlation of 0.71).&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;em&gt;As expected, Singh et al. achieve substantially better predictive performance (R = 0.71 vs R² = 0.46). The most critical difference is sample size: their 1,000+ participants versus my 19 (from the &lt;a href="https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.2005143"&gt;Hall dataset&lt;/a&gt;). They leveraged the &lt;a href="https://pubmed.ncbi.nlm.nih.gov/38033170/"&gt;&amp;ldquo;Food &amp;amp; You&amp;rdquo; study&lt;/a&gt; with high-resolution nutritional intake data from more than 46 million kcal collected across 315,126 dishes, 1,470,030 blood glucose measurements, and 1,024 gut microbiota samples. Both studies use XGBoost, SHAP for interpretability, cross-validation for evaluation, and mathematical approaches to characterize glucose responses (Gaussian curve fitting in my case, incremental AUC in theirs). The methodological overlap is reassuring; what separates the results is data at scale.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;The &lt;a href="https://www.nature.com/articles/s41597-025-05851-7"&gt;CGMacros dataset&lt;/a&gt; (Das et al., Scientific Data, 2025) now provides the first publicly available multimodal dataset with CGM readings, food macronutrients, meal photos, activity data, and microbiome profiles for 45 participants. It even includes an XGBoost example script for predicting postprandial AUC. This is exactly the kind of open resource the field needs more of.&lt;/em&gt;&lt;a id="update"&gt;&lt;/p&gt;
&lt;aside class="disclaimer" role="note" aria-label="Disclaimer"&gt;
&lt;div class="disclaimer-content"&gt;&lt;p&gt;&lt;strong&gt;Disclaimer:&lt;/strong&gt; For informational purposes only, not medical advice. Consult a qualified healthcare provider for any medical questions or conditions.&lt;/p&gt;&lt;/div&gt;
&lt;/aside&gt;</description></item><item><title>The Model Said So</title><link>http://philippdubach.com/posts/the-model-said-so/</link><pubDate>Wed, 28 May 2025 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/the-model-said-so/</guid><description>&lt;p&gt;LLMs make your life easier until they don&amp;rsquo;t.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Their intrinsic complexity and lack of transparency pose significant challenges, especially in the highly regulated financial sector&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Unlike other industries where &amp;ldquo;the model said so&amp;rdquo; might suffice, finance demands audit trails, bias detection,
and explainable decision-making—requirements that sit uncomfortably with neural networks containing billions of parameters.
The research highlights a fundamental tension that&amp;rsquo;s about to reshape fintech:
the same complexity that makes LLMs powerful at parsing market sentiment or generating investment reports also makes them regulatory nightmares
in a sector where you need to explain every decision to examiners.&lt;/p&gt;</description></item><item><title>Dual Mandate Tensions</title><link>http://philippdubach.com/posts/dual-mandate-tensions/</link><pubDate>Wed, 21 May 2025 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/dual-mandate-tensions/</guid><description>&lt;p&gt;Something interesting just happened at the National Bureau of Economic Research NBER&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;We study the optimal monetary policy response to the imposition of tariffs in a model
with imported intermediate inputs. In a simple open-economy framework, we show
that a tariff maps exactly into a cost-push shock in the standard closed-economy New
Keynesian model, shifting the Phillips curve upward. We then characterize optimal
monetary policy, showing that it partially accommodates the shock to smooth the
transition to a more distorted long-run equilibrium—at the cost of higher short-run
inflation.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Here&amp;rsquo;s where it gets interesting for current policy: Werning et. al.
show that &amp;ldquo;optimal&amp;rdquo; monetary policy would actually calls for partial accommodation
of tariff shocks—essentially allowing some inflation to persist to smooth the transition
to what they euphemistically call &amp;ldquo;a more distorted long-run equilibrium.&amp;rdquo;
With core PCE still running above the Fed&amp;rsquo;s 2% target and renewed tariff threats on the horizon,
this research suggests Powell may need to abandon his recent dovish pivot and prepare
for rate hikes that prioritize price stability over employment concerns.
The dual mandate was never meant to be dual when the two mandates point in opposite directions.&lt;/p&gt;</description></item><item><title>Beyond Monte Carlo: Tensor-Based Market Modeling</title><link>http://philippdubach.com/posts/beyond-monte-carlo-tensor-based-market-modeling/</link><pubDate>Sun, 11 May 2025 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/beyond-monte-carlo-tensor-based-market-modeling/</guid><description>&lt;p&gt;A fascinating new paper from Stefano Iabichino at UBS Investment Bank explores what happens when you take the attention mechanisms powering modern AI and apply them to Wall Street&amp;rsquo;s most fundamental pricing problems, tackling what might be quantitative finance&amp;rsquo;s most intractable challenge.&lt;/p&gt;
&lt;p&gt;The problem is elegantly simple yet profound: machine learning models are great at finding patterns in historical data, but financial theory demands that arbitrage-free prices be independent of past information. As the authors put it:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;We contend that a fundamental tension exists between the usage of ML methodologies in risk and pricing and the First Fundamental Theorem of Finance (FFTF). While ML models rely on historical data to identify recurring patterns, the FFTF posits that arbitrage-free market prices are independent of past information.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Their solution? Transition Probability Tensors (TPTs) that function like attention mechanisms in neural networks, dynamically weighting relationships between risk factors while maintaining mathematical rigor. Instead of learning from history, these tensors capture &amp;ldquo;dynamic, context-aware relationships across dimensions&amp;rdquo; in real-time.&lt;/p&gt;
&lt;p&gt;The practical results are impressive: simulating 210 quantitative investment strategies across 100,000 market scenarios in just 70 seconds, while identifying optimal hedging strategies and stress-testing future market conditions. The framework even adapts to different volatility regimes, shifting focus toward tail events during high-volatility periods—exactly like attention mechanisms focusing on relevant context. Whether it scales beyond this impressive proof-of-concept remains to be seen, but it&amp;rsquo;s seems to be a genuine attempt to resolve the fundamental tension between AI&amp;rsquo;s pattern-seeking nature and finance&amp;rsquo;s requirement for arbitrage-free pricing.&lt;/p&gt;</description></item><item><title>Trading on Market Sentiment</title><link>http://philippdubach.com/posts/trading-on-market-sentiment/</link><pubDate>Thu, 20 Feb 2025 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/trading-on-market-sentiment/</guid><description>&lt;p&gt;&lt;em&gt;This post is based in part on a 2022 presentation I gave for the &lt;a href="https://www.ft.com/content/3bd45acd-b323-3c6b-ba98-ac78b456f308"&gt;ICBS Student Investment Fund&lt;/a&gt; and my seminar work at Imperial College London.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;As we were looking for new investment strategies for our Macro Sentiment Trading team, OpenAI had just published their &lt;a href="https://platform.openai.com/docs/models/gpt-3-5-turbo"&gt;GPT-3.5 Model&lt;/a&gt;. After first experiments with the model, we asked ourselves: How would large language models like GPT-3.5 perform in predicting sentiment in financial markets, where the signal-to-noise ratio is notoriously low? And could they potentially even outperform industry benchmarks at interpreting market sentiment from news headlines? The idea wasn&amp;rsquo;t entirely new. &lt;a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3389884"&gt;Studies&lt;/a&gt; &lt;a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1702854"&gt;[2]&lt;/a&gt; &lt;a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=685145"&gt;[3]&lt;/a&gt; have shown that investor sentiment, extracted from news and social media, can forecast market movements. But most approaches rely on traditional NLP models or proprietary systems like &lt;a href="https://www.ravenpack.com"&gt;RavenPack&lt;/a&gt;. With the recent advances in large language models, I wanted to test whether these more sophisticated models could provide a competitive edge in sentiment-based trading. Before looking at model selection, it&amp;rsquo;s worth understanding what makes trading on sentiment so challenging. News headlines present two fundamental problems that any robust system must address.
&lt;a href="#lightbox-news-relevance-timeline-jpg-0" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/news-relevance-timeline.jpg 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/news-relevance-timeline.jpg 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/news-relevance-timeline.jpg 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/news-relevance-timeline.jpg 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/news-relevance-timeline.jpg 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/news-relevance-timeline.jpg 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/news-relevance-timeline.jpg 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/news-relevance-timeline.jpg 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/news-relevance-timeline.jpg 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/news-relevance-timeline.jpg 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/news-relevance-timeline.jpg 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/news-relevance-timeline.jpg"
alt="Relative frequency of monthly Google News Search terms over 5 years. Numbers represent search interest relative to highest point. A value of 100 is the peak popularity for the term."
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
First, headlines are inherently non-stationary. Unlike other data sources, news reflects the constantly shifting landscape of global events, political climates, economic trends, etc. A model trained on COVID-19 vaccine headlines from 2020 might struggle with geopolitical tensions in 2023. This temporal drift means algorithms must be adaptive to maintain relevance.
&lt;a href="#lightbox-headline-market-impact-jpg-1" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/headline-market-impact.jpg 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/headline-market-impact.jpg 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/headline-market-impact.jpg 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/headline-market-impact.jpg 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/headline-market-impact.jpg 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/headline-market-impact.jpg 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/headline-market-impact.jpg 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/headline-market-impact.jpg 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/headline-market-impact.jpg 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/headline-market-impact.jpg 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/headline-market-impact.jpg 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/headline-market-impact.jpg"
alt="Impact of headlines measured by subsequent index move (Data Source: Bloomberg)"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
Second, the relationship between headlines and market impact is far from obvious. Consider these actual headlines from November 2020: &amp;ldquo;Pfizer Vaccine Prevents 90% of COVID Infections&amp;rdquo; drove the S&amp;amp;P 500 up 1.85%, while &amp;ldquo;Pfizer Says Safety Milestone Achieved&amp;rdquo; barely moved the market at -0.05%. The same company, similar positive news, dramatically different market reactions.&lt;/p&gt;
&lt;p&gt;When developing a sentiment-based trading system, you essentially have two conceptual approaches: forward-looking and backward-looking.
Forward-looking models try to predict which news themes will drive markets, often working qualitatively by creating logical frameworks that capture market expectations. This approach is highly adaptable but requires deep domain knowledge and is time-consuming to maintain.
Backward-looking models analyze historical data to understand which headlines have moved markets in the past, then look for similarities in current news. This approach can leverage large datasets and scale efficiently, but suffers from low signal-to-noise ratios and the challenge that past relationships may not hold in the future.
For this project, I chose the backward-looking approach, primarily for its scalability and ability to work with existing datasets.&lt;/p&gt;
&lt;p&gt;Rather than rely on traditional approaches like &lt;a href="https://github.com/ProsusAI/finBERT"&gt;FinBERT&lt;/a&gt; (which only provides discrete positive/neutral/negative classifications), I decided to test OpenAI&amp;rsquo;s GPT-3.5 Turbo model. The key advantage was its ability to provide continuous sentiment scores from -1 to 1, giving much more nuanced signals for trading decisions. I used news headlines from the Dow Jones Newswire covering the 30 DJI companies from 2018-2022, filtering for quality sources like the Wall Street Journal and Bloomberg. After removing duplicates, this yielded 2,072 headlines. I then prompted GPT-3.5 to score sentiment with the instruction: &lt;code&gt;Rate the sentiment of the following news headlines from -1 (very bad) to 1 (very good), with two decimal precision&lt;/code&gt;. To validate the approach, I compared GPT-3.5 scores against RavenPack—the industry&amp;rsquo;s leading commercial sentiment provider.
&lt;a href="#lightbox-score-comparison-openai-rpa-jpg-3" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/score-comparison-openai-rpa.jpg 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/score-comparison-openai-rpa.jpg 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/score-comparison-openai-rpa.jpg 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/score-comparison-openai-rpa.jpg 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/score-comparison-openai-rpa.jpg 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/score-comparison-openai-rpa.jpg 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/score-comparison-openai-rpa.jpg 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/score-comparison-openai-rpa.jpg 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/score-comparison-openai-rpa.jpg 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/score-comparison-openai-rpa.jpg 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/score-comparison-openai-rpa.jpg 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/score-comparison-openai-rpa.jpg"
alt="Sample entries of the combined data set."
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
The correlation was 0.59, indicating the models generally agreed on sentiment direction while providing different granularities of scoring. More interesting was comparing the distribution of the sentiment ratings between the two models. This could have been approximated closer through some fine tuning of the (minimal) prompt used earlier.
&lt;a href="#lightbox-distribution-of-sentiment-openai-rpa-jpg-4" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/distribution-of-sentiment-openai-rpa.jpg 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/distribution-of-sentiment-openai-rpa.jpg 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/distribution-of-sentiment-openai-rpa.jpg 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/distribution-of-sentiment-openai-rpa.jpg 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/distribution-of-sentiment-openai-rpa.jpg 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/distribution-of-sentiment-openai-rpa.jpg 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/distribution-of-sentiment-openai-rpa.jpg 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/distribution-of-sentiment-openai-rpa.jpg 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/distribution-of-sentiment-openai-rpa.jpg 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/distribution-of-sentiment-openai-rpa.jpg 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/distribution-of-sentiment-openai-rpa.jpg 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/distribution-of-sentiment-openai-rpa.jpg"
alt="Comparing the distribution of the sentiment scores generated using the GPT-3.5 model with the benchmark scores from RavenPack."
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
I implemented a simple strategy: go long when sentiment hits the top 5% of scores, close positions at 25% profit (to reduce transaction costs), and maintain a fully invested portfolio with 1% commission per trade.
The results were mixed but promising. Over the full 2018-2022 period, the GPT-3.5 strategy generated 41.02% returns compared to RavenPack&amp;rsquo;s 40.99%—essentially matching the industry benchmark. However, both underperformed a simple buy-and-hold approach (58.13%) during this generally bullish period. Relying on market sentiment when news flow is low can be a tricky strategy. As can be seen from the example of the Salesforce stock performance**,** the strategy remained uninvested over a large period of time due to a (sometimes long-lasting) negative sentiment signal.
&lt;a href="#lightbox-crm-stock-sentiment-jpg-5" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/crm-stock-sentiment.jpg 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/crm-stock-sentiment.jpg 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/crm-stock-sentiment.jpg 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/crm-stock-sentiment.jpg 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/crm-stock-sentiment.jpg 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/crm-stock-sentiment.jpg 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/crm-stock-sentiment.jpg 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/crm-stock-sentiment.jpg 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/crm-stock-sentiment.jpg 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/crm-stock-sentiment.jpg 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/crm-stock-sentiment.jpg 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/crm-stock-sentiment.jpg"
alt="Stock performance of Salesforce (CRM) for 5 years from 2018 with sentiment indicators overlayed."
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
When I tested different timeframes, the sentiment strategy showed its strength during volatile periods. From 2020-2022, it outperformed buy-and-hold (22.83% vs 21.00%). As expected, sentiment-based approaches work better when markets are less directional and more driven by news flow. To evaluate whether the scores generated by our GPT prompt were more accurate than those from the RavenPack benchmark, I calculated returns for different holding windows. The scores generated by our GPT prompt perform significantly better in the short term (1 and 10 days) for positive sentiment and in the long term (90 days) for negative sentiment.
&lt;a href="#lightbox-sentiment-trading-results-jpg-6" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/sentiment-trading-results.jpg 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/sentiment-trading-results.jpg 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/sentiment-trading-results.jpg 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/sentiment-trading-results.jpg 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/sentiment-trading-results.jpg 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/sentiment-trading-results.jpg 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/sentiment-trading-results.jpg 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/sentiment-trading-results.jpg 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/sentiment-trading-results.jpg 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/sentiment-trading-results.jpg 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/sentiment-trading-results.jpg 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/sentiment-trading-results.jpg"
alt="Average 1, 10, 30, and 90-day holding period return for both models."
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
&lt;em&gt;(Note: For lower sentiment, negative returns are desirable since the stock would be shorted)&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;While the model performed well technically, this project highlighted several practical challenges. First, data accessibility remains a major hurdle—getting real-time, high-quality news feeds is expensive and often restricted. Second, the strategy worked better in a more volatile environment, which prompted many individual trades, creating substantial transaction costs that significantly impact returns. Perhaps most importantly, any real-world implementation would need to compete with high-frequency traders who can act on news within milliseconds. The few seconds required for GPT-3.5 to process headlines and generate sentiment scores are far from being competitive. Despite these challenges, the project demonstrated that LLMs can match industry benchmarks for sentiment analysis—and this was using a general-purpose model, not one specifically fine-tuned for financial applications. OpenAI (and others) today offer more powerful models at very low cost as well as fine-tuning capabilities that could further improve performance. The bigger opportunity might be in combining sentiment signals with other factors, using sentiment as one input in a more sophisticated trading system rather than the sole decision criterion. There&amp;rsquo;s also potential in expanding beyond simple long-only strategies to include short positions on negative sentiment, or developing &amp;ldquo;sentiment indices&amp;rdquo; that smooth out individual headline noise.
Market sentiment strategies may not be optimal for long-term investing, but they show clear promise for shorter-term trading in volatile environments. As LLMs continue to improve and become more accessible, this might offer an opportunity to revisit this project.&lt;/p&gt;
&lt;aside class="disclaimer" role="note" aria-label="Disclaimer"&gt;
&lt;div class="disclaimer-content"&gt;&lt;p&gt;&lt;strong&gt;Disclaimer:&lt;/strong&gt; All opinions expressed are my own. This is not investment, financial, tax, or legal advice. Past performance does not indicate future results. Do your own research and consult qualified professionals before making financial decisions. No liability accepted for any losses.&lt;/p&gt;&lt;/div&gt;
&lt;/aside&gt;</description></item><item><title>Passive Investing's Active Problem</title><link>http://philippdubach.com/posts/passive-investings-active-problem/</link><pubDate>Sat, 15 Feb 2025 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/passive-investings-active-problem/</guid><description>&lt;p&gt;(1) A new academic paper suggests the rise of passive investing may be fueling fragile market moves.
(2) According to a study to be published in the American Economic Review, evidence is building that active managers are slow to scoop up stocks en masse when prices move away from their intrinsic worth.
(3) Thanks to this lethargic trading behavior and the relentless boom in benchmark-tracking index funds, the impact of each trade on prices gets amplified, explaining how sell orders can induce broader equity gyrations&lt;/p&gt;
&lt;p&gt;Passive investing, the supposedly boring strategy of buying and holding index funds, might actually be making markets more volatile. A new study set to be published in the American Economic Review finds that active managers are slow to scoop up stocks when prices move away from their intrinsic worth. Meanwhile, the relentless boom in benchmark-tracking index funds means that each trade gets amplified, explaining how sell orders can induce broader equity gyrations.
Justina Lee for Bloomberg writes that this week&amp;rsquo;s AI-fueled market swings perfectly illustrate the phenomenon. Big equity gauges plunged on Monday over fears about an AI model, before swiftly rebounding.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Thanks to this lethargic trading behavior and the relentless boom in benchmark-tracking index funds, the impact of each trade on prices gets amplified.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The researchers from UCLA, Stockholm School of Economics, and University of Minnesota have identified what they call &amp;ldquo;Big Passive&amp;rdquo;—a financial landscape that&amp;rsquo;s proving less dynamic and more volatile. When most investors are on autopilot, the few remaining active traders have disproportionate influence.
This doesn&amp;rsquo;t invalidate passive investing&amp;rsquo;s core benefits—lower costs and better long-term returns for most investors remain compelling. But it does suggest that our increasingly passive financial system has some unintended consequences.&lt;/p&gt;</description></item><item><title>I Built a CGM Data Reader</title><link>http://philippdubach.com/posts/i-built-a-cgm-data-reader/</link><pubDate>Thu, 02 Jan 2025 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/i-built-a-cgm-data-reader/</guid><description>&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;If you&amp;rsquo;re reading this, you might also be interested in: &lt;a href="http://philippdubach.com/posts/modeling-glycemic-response-with-xgboost/"&gt;Modeling Glycemic Response with XGBoost&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Last year I put a Continuous Glucose Monitor (CGM) sensor, specifically the &lt;a href="https://www.freestyle.abbott"&gt;Abbott Freestyle Libre 3&lt;/a&gt;, on my left arm. Why? I wanted to optimize my nutrition for endurance cycling competitions. Where I live, the sensor is easy to get—without any medical prescription—and even easier to use. Unfortunately, Abbott&amp;rsquo;s &lt;a href="https://apps.apple.com/us/app/freestyle-librelink-us/id1325992472"&gt;FreeStyle LibreLink&lt;/a&gt; app is less than optimal (3,250 other people with an average rating of 2.9/5.0 seem to agree). In their defense, the web app LibreView does offer some nice reports which can be generated as PDFs—not very dynamic, but still something! What I had in mind was more in the fashion of the &lt;a href="https://ultrahuman.com/m1"&gt;Ultrahuman M1 dashboard&lt;/a&gt;. Unfortunately, I wasn&amp;rsquo;t allowed to use my Libre sensor (EU firmware) with their app (yes, I spoke to customer service).&lt;/p&gt;
&lt;p&gt;At that point, I wasn&amp;rsquo;t left with much enthusiasm, only a coin-sized sensor in my arm. The LibreView website fortunately lets you download most of your (own) data in a CSV report (&lt;em&gt;there is also a &lt;a href="https://github.com/FokkeZB/libreview-unofficial"&gt;reverse engineered API&lt;/a&gt;&lt;/em&gt;), which is nice. So that&amp;rsquo;s what I did: download the data, &lt;code&gt;pd.read_csv()&lt;/code&gt; it into my notebook, calculate summary statistics, and plot the values.
&lt;a href="#lightbox-libre-measurements-jpg-0" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/libre-measurements.jpg 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/libre-measurements.jpg 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/libre-measurements.jpg 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/libre-measurements.jpg 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/libre-measurements.jpg 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/libre-measurements.jpg 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/libre-measurements.jpg 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/libre-measurements.jpg 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/libre-measurements.jpg 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/libre-measurements.jpg 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/libre-measurements.jpg 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/libre-measurements.jpg"
alt="Visualized CGM Datapoints"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
After some interpolation, I now had the same view as the LibreLink app (which I had rejected earlier) provided. Yet, this setup allowed me to do further analysis and visualizations by adding other datapoints (workouts, sleep, nutrition) I was also collecting at that time:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Blood sugar from &lt;a href="https://www.libreview.com/"&gt;LibreView&lt;/a&gt;: Measurement timestamps + glucose values&lt;/li&gt;
&lt;li&gt;Nutrition from &lt;a href="https://macrofactorapp.com/"&gt;MacroFactor&lt;/a&gt;: Meal timestamps + macronutrients (carbs, protein, and fat)&lt;/li&gt;
&lt;li&gt;Sleep data from &lt;a href="https://sleepcycle.com/"&gt;Sleep Cycle&lt;/a&gt;: Sleep start timestamp + time in bed + time asleep (+ sleep quality, which is a proprietary measure calculated by the app)&lt;/li&gt;
&lt;li&gt;Cardio workouts from &lt;a href="https://connect.garmin.com/"&gt;Garmin&lt;/a&gt;: Workout start timestamp + workout duration&lt;/li&gt;
&lt;li&gt;Strength workouts from &lt;a href="https://www.hevyapp.com/"&gt;Hevy&lt;/a&gt;: Workout start timestamp + workout duration&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;a href="#lightbox-cgm-dashboard-jpg-1" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/cgm-dashboard.jpg 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/cgm-dashboard.jpg 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/cgm-dashboard.jpg 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/cgm-dashboard.jpg 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/cgm-dashboard.jpg 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/cgm-dashboard.jpg 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/cgm-dashboard.jpg 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/cgm-dashboard.jpg 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/cgm-dashboard.jpg 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/cgm-dashboard.jpg 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/cgm-dashboard.jpg 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/cgm-dashboard.jpg"
alt="Final Dashboard"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
After structuring those datapoints in a dataframe and normalizing timestamps, I was able to quickly highlight sleep (blue boxes with callouts for time in bed, time asleep, and sleep quality) and workouts (red traces on glucose measurements for strength workouts, green traces for cardio workouts) by plotting highlighted traces on top of the historic glucose trail for a set period. Furthermore, I was able to add annotations for nutrition events with the respective macronutrients.&lt;/p&gt;
&lt;p&gt;I asked Claude to create some sample data and streamline the functions to reduce dependencies on the specific data sources I used. The resulting notebook is a comprehensive CGM data analysis tool that loads and processes glucose readings alongside lifestyle data (nutrition, workouts, and sleep), then creates an integrated dashboard for visualization. The code handles data preprocessing including interpolation of missing glucose values, timeline synchronization across different data sources, and statistical analysis with key metrics like time-in-range and coefficient of variation. The main output is a day-by-day dashboard that overlays workout periods, nutrition events, and sleep phases onto continuous glucose monitoring data, enabling users to identify patterns and correlations between lifestyle factors and blood sugar responses.&lt;/p&gt;
&lt;p&gt;You can find the complete &lt;a href="https://github.com/philippdubach/glucose-tracker/blob/fd5992961cfb4630dad439c782430190937414a3/notebooks/data_exploration.ipynb"&gt;notebook&lt;/a&gt; as well as the sample data in my &lt;a href="https://github.com/philippdubach/glucose-tracker/"&gt;GitHub repository&lt;/a&gt;.&lt;/p&gt;</description></item><item><title>Crypto Mean Reversion Trading</title><link>http://philippdubach.com/posts/crypto-mean-reversion-trading/</link><pubDate>Mon, 11 Nov 2024 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/crypto-mean-reversion-trading/</guid><description>&lt;p&gt;In late 2021, Lars Kaiser&amp;rsquo;s paper on &lt;a href="https://www.sciencedirect.com/science/article/abs/pii/S1544612318304513"&gt;seasonality in cryptocurrencies&lt;/a&gt; inspired me to use my &lt;a href="https://docs.kraken.com/api/"&gt;Kraken API Key&lt;/a&gt; to try and make some money. A quick summary of the paper: (1) Kaiser analyzes seasonality patterns across 10 cryptocurrencies (Bitcoin, Ethereum, etc.), examining returns, volatility, trading volume, and spreads (2) Finds no consistent calendar effects in cryptocurrency returns, supporting weak-form market efficiency (3) Observes robust patterns in trading activity - lower volume, volatility, and spreads in January, weekends, and summer months (4) Documents significant impact of January 2018 market sell-off on seasonality patterns (5) Reports a &amp;ldquo;reverse Monday effect&amp;rdquo; for Bitcoin (positive Monday returns) and &amp;ldquo;reverse January effect&amp;rdquo; (negative January returns) (6) Trading activity patterns suggest crypto markets are dominated by retail rather than institutional investors.&lt;/p&gt;
&lt;p&gt;The paper&amp;rsquo;s main finding: crypto markets appear efficient in terms of returns but show behavioral patterns in trading.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The efficient-market hypothesis (EMH) is a hypothesis in financial economics that states that asset prices reflect all available information. A direct implication is that it is impossible to &amp;ldquo;beat the market&amp;rdquo; consistently on a risk-adjusted basis since market prices should only react to new information.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The EMH has interesting implications for cryptocurrency markets. While major cryptocurrencies like Bitcoin and Ethereum have gained significant institutional adoption and liquidity, they may still be less efficient than traditional markets due to their relative youth and large audience of retail traders (who might not act as rationally as larger, institutional traders). This inefficiency becomes even more pronounced with smaller altcoins, which often have: (1) Lower trading volumes and liquidity (2) Less institutional participation (3) Higher information asymmetries (and/or greater susceptibility to manipulation). These factors create opportunities for exploiting market inefficiencies, particularly in the short term when prices may overreact to news or technical signals before eventually correcting.&lt;/p&gt;
&lt;p&gt;Unlike Kaiser&amp;rsquo;s seasonality research, I didn&amp;rsquo;t focus on calendar-based anomalies over longer time horizons. After reviewing further research on cryptocurrency market inefficiencies &lt;a href="https://www.sciencedirect.com/science/article/abs/pii/S1544612319306415"&gt;[1]&lt;/a&gt; &lt;a href="https://academic.oup.com/jfec/article-abstract/18/2/233/5133597"&gt;[2]&lt;/a&gt; &lt;a href="https://www.sciencedirect.com/science/article/abs/pii/S1057521921001228"&gt;[3]&lt;/a&gt; &lt;a href="https://onlinelibrary.wiley.com/doi/10.1002/isaf.1488"&gt;[4]&lt;/a&gt;, I was intrigued by predictable patterns in returns following large price movements. This led me to develop a classic mean reversion strategy instead (mean reversion suggests that asset prices tend to revert to their long-term average after extreme movements due to market overreactions and subsequent corrections).
&lt;a href="#lightbox-crypto-changepoint-returns-jpg-1" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/crypto-changepoint-returns.jpg 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/crypto-changepoint-returns.jpg 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/crypto-changepoint-returns.jpg 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/crypto-changepoint-returns.jpg 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/crypto-changepoint-returns.jpg 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/crypto-changepoint-returns.jpg 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/crypto-changepoint-returns.jpg 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/crypto-changepoint-returns.jpg 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/crypto-changepoint-returns.jpg 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/crypto-changepoint-returns.jpg 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/crypto-changepoint-returns.jpg 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/crypto-changepoint-returns.jpg"
alt="Scatter plot showing the relationship between return at time of jump (x-axis, ranging from -0.100 to 0.075) and return after jump (y-axis, ranging from -0.06 to 0.10), with red data points and a fitted regression line showing a slight negative correlation, r = -0.2142, p &amp;lt; 0.0"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
First, I had to find &amp;ldquo;change points.&amp;rdquo; The PELT algorithm efficiently identifies points in ETH/EUR where the statistical properties of the time series change significantly. These changes could indicate market events, trend reversals, or volatility shifts in the cryptocurrency price.
&lt;a href="#lightbox-ETHUSD-pelt-changepoint-png-2" style="display: block; width: 80%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/ETHUSD-pelt-changepoint.png 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/ETHUSD-pelt-changepoint.png 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/ETHUSD-pelt-changepoint.png 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/ETHUSD-pelt-changepoint.png 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/ETHUSD-pelt-changepoint.png 1200w"
sizes="80vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/ETHUSD-pelt-changepoint.png 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/ETHUSD-pelt-changepoint.png 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/ETHUSD-pelt-changepoint.png 1440w"
sizes="80vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/ETHUSD-pelt-changepoint.png 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/ETHUSD-pelt-changepoint.png 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/ETHUSD-pelt-changepoint.png 2000w"
sizes="80vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/ETHUSD-pelt-changepoint.png"
alt="Structural break detection in financial time series using the PELT (Pruned Exact Linear Time) algorithm with RBF kernel. The analysis identifies 12 significant changepoints during June 15-29, 2021, using a penalty parameter of 35. Vertical dashed lines indicate detected regime changes in the price dynamics."
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
I then implemented an automated mean reversion trading strategy following this logical flow: &lt;em&gt;Continuous monitoring → Signal detection → Buy execution → Hold period → Sell execution&lt;/em&gt;. The script continuously monitored prices for certain cryptocurrencies on Kraken exchange. It executed buy orders when the price moved more than four standard deviations over a 2-hour period, then automatically sold after exactly 2 hours regardless of price movement. The strategy used fixed position sizes and limit orders to minimize fees. It assumed that large price drops represent temporary market overreactions that will reverse within the holding period.&lt;/p&gt;
&lt;p&gt;This little script earned some good change, but then again, it was 2021.&lt;/p&gt;
&lt;aside class="disclaimer" role="note" aria-label="Disclaimer"&gt;
&lt;div class="disclaimer-content"&gt;&lt;p&gt;&lt;strong&gt;Disclaimer:&lt;/strong&gt; All opinions expressed are my own. This is not investment, financial, tax, or legal advice. Past performance does not indicate future results. Do your own research and consult qualified professionals before making financial decisions. No liability accepted for any losses.&lt;/p&gt;&lt;/div&gt;
&lt;/aside&gt;</description></item><item><title>AlphaFold 3: Free for Science</title><link>http://philippdubach.com/posts/alphafold-3-free-for-science/</link><pubDate>Sun, 12 May 2024 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/alphafold-3-free-for-science/</guid><description>&lt;p&gt;Nothing says &amp;ldquo;we&amp;rsquo;re serious about dominating a market&amp;rdquo; quite like giving away breakthrough technology for free. Google&amp;rsquo;s latest move with AlphaFold 3 might be their most audacious version of this strategy yet.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&amp;ldquo;AlphaFold 3 can predict the structure and interactions of all of life&amp;rsquo;s molecules with unprecedented accuracy&amp;rdquo;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This isn&amp;rsquo;t just an incremental improvement - While previous versions of AlphaFold could predict protein structures, AlphaFold 3 models the interactions between proteins, DNA, RNA, and small molecules. It&amp;rsquo;s the difference between having a parts catalog and understanding how the entire machine works.&lt;/p&gt;
&lt;p&gt;Drug discovery typically costs billions and takes decades. If AlphaFold 3 can meaningfully accelerate that process - even by modest percentages—the value creation is staggering. Yet Google is handing it to researchers for free through the AlphaFold Server, with the predictable caveat of commercial restrictions. Is this Google&amp;rsquo;s cloud strategy playing out in life sciences? Establish the platform, get everyone dependent on your infrastructure, then monetize the ecosystem. The pharmaceutical industry, already grappling with AI disruption, now faces a world where molecular interactions can be predicted with &amp;ldquo;50% better accuracy&amp;rdquo; than existing methods. The real question isn&amp;rsquo;t whether AI will transform drug discovery - it&amp;rsquo;s whether Google will own that transformation.&lt;/p&gt;</description></item><item><title>My First 'Optimal' Portfolio</title><link>http://philippdubach.com/posts/my-first-optimal-portfolio/</link><pubDate>Fri, 15 Mar 2024 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/my-first-optimal-portfolio/</guid><description>&lt;p&gt;My introduction to quantitative portfolio optimization happened during my undergraduate years, inspired by Attilio Meucci&amp;rsquo;s &lt;a href="https://link.springer.com/book/10.1007/978-3-540-27904-4"&gt;Risk and Asset Allocation&lt;/a&gt; and the convex optimization &lt;a href="https://web.stanford.edu/~boyd/teaching.html"&gt;teachings of Diamond and Boyd at Stanford&lt;/a&gt;. With enthusiasm and perhaps more confidence than expertise, I created my first &amp;ldquo;optimal&amp;rdquo; portfolio. What struck me most was the disconnect between theory and accessibility. Modern Portfolio Theory had been established since 1990, yet the optimization tools remained largely locked behind proprietary software.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Nevertheless, only a few comprehensive software models are available publicly to use, study, or modify. We tackle this issue by engineering practical tools for asset allocation and implementing them in the Python programming language.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This gap inspired what would eventually be published as: &lt;a href="https://digitalcollection.zhaw.ch/handle/11475/24351"&gt;A Python integration of practical asset allocation based on modern portfolio theory and its advancements&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;My approach centered on a simple philosophy:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The focus is to keep the tools simple enough for interested practitioners to understand the underlying theory yet provide adequate numerical solutions.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Today, the landscape has evolved dramatically. Projects like &lt;a href="https://github.com/robertmartin8/PyPortfolioOpt"&gt;PyPortfolioOpt&lt;/a&gt; and &lt;a href="https://github.com/dcajasn/Riskfolio-Lib"&gt;Riskfolio-Lib&lt;/a&gt; have established themselves as sophisticated open-source alternatives, far surpassing my early efforts in both scope and sophistication. Despite its limitations, the project yielded several meaningful insights:
&lt;a href="#lightbox-efficient-frontier-jpg-0" style="display: block; width: 70%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/efficient-frontier.jpg 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/efficient-frontier.jpg 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/efficient-frontier.jpg 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/efficient-frontier.jpg 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/efficient-frontier.jpg 1200w"
sizes="70vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/efficient-frontier.jpg 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/efficient-frontier.jpg 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/efficient-frontier.jpg 1440w"
sizes="70vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/efficient-frontier.jpg 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/efficient-frontier.jpg 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/efficient-frontier.jpg 2000w"
sizes="70vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/efficient-frontier.jpg"
alt="Efficient Frontier Visualization"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
First, I set out to visualize Modern Portfolio Theory&amp;rsquo;s fundamental principle—the risk-return tradeoff that drives optimization decisions. This scatter plot showing the efficient frontier demonstrates this core concept.
&lt;a href="#lightbox-results-vs-benchmark-table-jpg-1" style="display: block; width: 70%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/results-vs-benchmark-table.jpg 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/results-vs-benchmark-table.jpg 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/results-vs-benchmark-table.jpg 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/results-vs-benchmark-table.jpg 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/results-vs-benchmark-table.jpg 1200w"
sizes="70vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/results-vs-benchmark-table.jpg 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/results-vs-benchmark-table.jpg 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/results-vs-benchmark-table.jpg 1440w"
sizes="70vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/results-vs-benchmark-table.jpg 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/results-vs-benchmark-table.jpg 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/results-vs-benchmark-table.jpg 2000w"
sizes="70vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/results-vs-benchmark-table.jpg"
alt="Benchmark vs Optimized Results"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
The results of my first optimization: maintaining a 9.386% return while reducing volatility from 14.445% to 5.574%, effectively tripling the Sharpe ratio from 0.650 to 1.684.
&lt;a href="#lightbox-risk-aversion-parameters-jpg-2" style="display: block; width: 70%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/risk-aversion-parameters.jpg 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/risk-aversion-parameters.jpg 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/risk-aversion-parameters.jpg 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/risk-aversion-parameters.jpg 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/risk-aversion-parameters.jpg 1200w"
sizes="70vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/risk-aversion-parameters.jpg 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/risk-aversion-parameters.jpg 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/risk-aversion-parameters.jpg 1440w"
sizes="70vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/risk-aversion-parameters.jpg 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/risk-aversion-parameters.jpg 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/risk-aversion-parameters.jpg 2000w"
sizes="70vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/risk-aversion-parameters.jpg"
alt="Risk Aversion Parameter Effects"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
By varying the risk aversion parameter (gamma), the framework successfully adapted to different investor profiles, demonstrating the flexibility of the optimization approach. This efficient frontier plot with different gamma values illustrates how the optimization framework adapts to different investor risk preferences.
&lt;a href="#lightbox-oos-performance-table-jpg-3" style="display: block; width: 70%; margin: 0 auto; padding: 1.5rem 0; text-decoration: none;"&gt;
&lt;picture class="img-lightbox"&gt;
&lt;source media="(max-width: 768px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=320,quality=80,format=webp/oos-performance-table.jpg 320w,
https://static.philippdubach.com/cdn-cgi/image/width=480,quality=80,format=webp/oos-performance-table.jpg 480w,
https://static.philippdubach.com/cdn-cgi/image/width=640,quality=80,format=webp/oos-performance-table.jpg 640w,
https://static.philippdubach.com/cdn-cgi/image/width=960,quality=80,format=webp/oos-performance-table.jpg 960w,
https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/oos-performance-table.jpg 1200w"
sizes="70vw"&gt;
&lt;source media="(max-width: 1024px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=768,quality=80,format=webp/oos-performance-table.jpg 768w,
https://static.philippdubach.com/cdn-cgi/image/width=1024,quality=80,format=webp/oos-performance-table.jpg 1024w,
https://static.philippdubach.com/cdn-cgi/image/width=1440,quality=80,format=webp/oos-performance-table.jpg 1440w"
sizes="70vw"&gt;
&lt;source media="(min-width: 1025px)"
srcset="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80,format=webp/oos-performance-table.jpg 1200w,
https://static.philippdubach.com/cdn-cgi/image/width=1600,quality=80,format=webp/oos-performance-table.jpg 1600w,
https://static.philippdubach.com/cdn-cgi/image/width=2000,quality=80,format=webp/oos-performance-table.jpg 2000w"
sizes="70vw"&gt;
&lt;img src="https://static.philippdubach.com/cdn-cgi/image/width=1200,quality=80/oos-performance-table.jpg"
alt="Out-of-Sample Performance"
class=""
width="1200"
height="630"
loading="lazy"
decoding="async"
style="width: 100%; height: auto; display: block;"&gt;
&lt;/picture&gt;
&lt;/a&gt;
Perhaps most importantly, out-of-sample testing across diverse market conditions—including the 2018 bear market and 2019 bull market—demonstrated consistent CVaR reduction and improved risk-adjusted returns.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;We demonstrate how even in an environment with high correlation, achieving a competitive return with a lower expected shortfall and lower excess risk than the given benchmark over multiple periods is possible.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Looking back, the project feels embarrassingly naive—and surprisingly foundational. While it earned some recognition at the time, it now serves as a valuable reminder: sometimes the best foundation is built before you know enough to doubt yourself.&lt;/p&gt;</description></item><item><title>The Tech behind this Site</title><link>http://philippdubach.com/posts/the-tech-behind-this-site/</link><pubDate>Mon, 15 Jan 2024 00:00:00 +0000</pubDate><author>me@philippdubach.com (Philipp D. Dubach)</author><guid>http://philippdubach.com/posts/the-tech-behind-this-site/</guid><description>&lt;p&gt;This site runs on Hugo, deployed to GitHub Pages with Cloudflare CDN. Images are hosted on R2 (&lt;code&gt;static.philippdubach.com&lt;/code&gt;) with automatic resizing and WebP conversion.&lt;/p&gt;
&lt;p&gt;The core challenge was responsive images. Standard markdown &lt;code&gt;![alt](url)&lt;/code&gt; doesn&amp;rsquo;t support multiple sizes. I built a &lt;a href="https://gist.github.com/philippdubach/167189c7090c6813c5110c467cb5ebe9"&gt;Hugo shortcode&lt;/a&gt; that generates &lt;code&gt;&amp;lt;picture&amp;gt;&lt;/code&gt; elements with breakpoint-specific sources—upload once at full quality, serve optimized versions (320px mobile to 1600px desktop) automatically.&lt;/p&gt;
&lt;br&gt;
&lt;p&gt;&lt;strong&gt;Updates&lt;/strong&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;March 2026&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;em&gt;Hugo Upgrade&lt;/em&gt; — Upgraded from Hugo v0.128.0 to v0.157.0. Migrated deprecated &lt;code&gt;.Site.AllPages&lt;/code&gt; to &lt;code&gt;.Site.Pages&lt;/code&gt; in the sitemap template and &lt;code&gt;.Site.Data&lt;/code&gt; to &lt;code&gt;site.Data&lt;/code&gt; across navigation, structured data, and research templates. Removed a dead &lt;code&gt;readFile&lt;/code&gt; security config key from &lt;code&gt;hugo.toml&lt;/code&gt;. No breaking changes, zero deprecation warnings.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;February 2026&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;em&gt;Frontmatter Unification&lt;/em&gt; — Converted all 70 YAML frontmatter posts to TOML and added Key Takeaways to all 73 posts. Takeaways render as a visible summary box between the post header and content body, optimized for Generative Engine Optimization (GEO) so AI search engines can extract citation-ready passages.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Design Streamlining&lt;/em&gt; — Unified left-bordered aside components (key takeaways, newsletter CTA, disclaimer) to consistent 3px borders and aligned padding. Established a vertical spacing rhythm across post zones: key takeaways, content body, newsletter CTA, footer divider, related posts. Added breathing room around images (1.5rem padding). Refined key takeaways heading to 0.85rem uppercase label with square bullets.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Homepage Redesign&lt;/em&gt; — Rebuilt the homepage with a tabbed layout (Articles/Projects), year dividers, and thumbnail images served via Cloudflare Image Resizing. Consolidated navigation into a unified sidebar.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Security Headers Worker&lt;/em&gt; — Deployed a dedicated Cloudflare Worker on &lt;code&gt;philippdubach.com/*&lt;/code&gt; that injects HSTS, CSP with &lt;code&gt;frame-ancestors&lt;/code&gt;, COEP, COOP, and &lt;code&gt;Permissions-Policy&lt;/code&gt; headers. GitHub Pages doesn&amp;rsquo;t process &lt;code&gt;_headers&lt;/code&gt; files, so the Worker fills that gap. SHA-pinned all GitHub Actions and added Hugo binary checksum verification in CI.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Machine-Readable Feeds&lt;/em&gt; — Added &lt;a href="http://philippdubach.com/feed.json"&gt;JSON Feed 1.1&lt;/a&gt; alongside RSS, a &lt;a href="http://philippdubach.com/api/posts.json"&gt;Posts API&lt;/a&gt; for programmatic access, and &lt;code&gt;llms.txt&lt;/code&gt;/&lt;code&gt;llms-full.txt&lt;/code&gt; for AI crawler discovery. All output formats configured in &lt;code&gt;hugo.toml&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;GoatCounter &amp;ldquo;Most Read&amp;rdquo; API&lt;/em&gt; — Built a Cloudflare Worker proxy that queries the GoatCounter API for the top 10 posts over the past 7 days. The footer&amp;rsquo;s &amp;ldquo;Most Read&amp;rdquo; section now fetches live data from this worker instead of a static list.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;FAQ Section&lt;/em&gt; — New &lt;code&gt;/faq/&lt;/code&gt; section with per-category pages (Finance, AI, Tech, Economics, Medicine). Each post can define &lt;code&gt;faq&lt;/code&gt; entries in frontmatter; Hugo aggregates them into browsable FAQ pages with &lt;code&gt;FAQPage&lt;/code&gt; structured data for search engines.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Readnext Shortcode&lt;/em&gt; — Inline &amp;ldquo;Related&amp;rdquo; link to another post: &lt;code&gt;{{&amp;lt; readnext slug=&amp;quot;post-slug&amp;quot; &amp;gt;}}&lt;/code&gt;. Links are validated against live permalinks at build time.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;RSS Feed Fixes&lt;/em&gt; — Stripped lightbox overlay elements from full-content RSS to prevent images appearing twice in feed readers. Added XSLT stylesheet for browser-friendly RSS rendering.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Cloudflare Cache Purge&lt;/em&gt; — GitHub Actions deployment now automatically purges the Cloudflare cache after each build.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Research Page&lt;/em&gt; — Dynamic &lt;code&gt;/research/&lt;/code&gt; page pulling publication data from &lt;code&gt;data/research.yaml&lt;/code&gt; with SSRN links, DOIs, and structured data.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;January 2026&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;em&gt;Social Automation &amp;amp; AI Model Upgrade&lt;/em&gt; — Upgraded Workers AI model from Llama 3.1 8B to &lt;strong&gt;Llama 4 Scout 17B&lt;/strong&gt; for better post generation. Added Twitter/X automation worker alongside Bluesky. AI generates neutral, non-clickbait posts with extensive banned word filtering.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Post Composer Enhancements&lt;/em&gt; — Added auto-closing brackets &lt;code&gt;[ ( {&lt;/code&gt; in editor. Updated footer with social links matching main site. Deployed at &lt;a href="https://post-composer.pages.dev"&gt;post-composer.pages.dev&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;UI/UX Polish&lt;/em&gt; — Fixed mobile footer spacing consistency. Increased homepage post spacing (3.75rem). Disclaimers now only display on individual posts, hidden on homepage. Centered related posts heading.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Content Organization&lt;/em&gt; — Taxonomy system with categories (Finance, AI, Medicine, Tech, Economics) and types (Project, Commentary, Essay, Review). Hugo generates browsable &lt;code&gt;/categories/&lt;/code&gt; and &lt;code&gt;/types/&lt;/code&gt; pages.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Disclaimer Shortcode&lt;/em&gt; — Six types: finance, medical, general, AI, research, gambling. Syntax: &lt;code&gt;{{&amp;lt; disclaimer type=&amp;quot;finance&amp;quot; &amp;gt;}}&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;IndexNow Integration&lt;/em&gt; — Automated submissions via GitHub Actions for faster search engine discovery. Only pings recently changed URLs based on &lt;code&gt;lastmod&lt;/code&gt;.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;December 2025&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;em&gt;Code Blocks&lt;/em&gt; — Syntax highlighting via Chroma with line numbers in table layout. GitHub-inspired color theme.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Newsletter System&lt;/em&gt; — &lt;a href="http://philippdubach.com/posts/building-a-no-tracking-newsletter-from-markdown-to-distribution/"&gt;Integrated email subscriptions&lt;/a&gt; via Cloudflare Workers + KV. Welcome emails via Resend.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Security &amp;amp; Performance Audit&lt;/em&gt; — Fixed multiple H1 tags per page. Hardened CSP with &lt;code&gt;frame-ancestors&lt;/code&gt;. Added preconnect hints for external domains. Added &lt;code&gt;seoTitle&lt;/code&gt; frontmatter for long titles.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;November 2025&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;em&gt;Shortcodes&lt;/em&gt; — &lt;a href="https://gist.github.com/philippdubach/b703005536d6030c87e17d21cb0d430b"&gt;HTML table shortcode&lt;/a&gt;. Lightbox support on images.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;June 2025&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;em&gt;SEO &amp;amp; Math&lt;/em&gt; — &lt;a href="https://gist.github.com/philippdubach/39838f8e9e1b9fb085947a6b92062e0a"&gt;Open Graph integration&lt;/a&gt; for social previews. Per-post keyword management. &lt;a href="https://gist.github.com/philippdubach/42ef6e05f5c44b76ef3f66f27a17c41e"&gt;LaTeX rendering via MathJax 3&lt;/a&gt; (conditional loading).&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;May 2025&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;em&gt;Full Rebuild&lt;/em&gt; — Migrated from &lt;a href="https://github.com/hugo-sid/hugo-blog-awesome"&gt;hugo-blog-awesome&lt;/a&gt; fork to fully custom Hugo build.&lt;/p&gt;</description></item></channel></rss>