Passive Investing's Active Problem

(1) A new academic paper suggests the rise of passive investing may be fueling fragile market moves. (2) According to a study to be published in the American Economic Review, evidence is building that active managers are slow to scoop up stocks en masse when prices move away from their intrinsic worth. (3) Thanks to this lethargic trading behavior and the relentless boom in benchmark-tracking index funds, the impact of each trade on prices gets amplified, explaining how sell orders can induce broader equity gyrations.

Passive investing, the supposedly boring strategy of buying and holding index funds, might actually be making markets more volatile. A new study set to be published in the American Economic Review finds that active managers are slow to scoop up stocks when prices move away from their intrinsic worth. Meanwhile, the relentless boom in benchmark-tracking index funds means that each trade gets amplified, explaining how sell orders can induce broader equity gyrations. Writing for Bloomberg, Justina Lee notes that this week’s AI-fueled market swings perfectly illustrate the phenomenon. Big equity gauges plunged on Monday over fears about an AI model, before swiftly rebounding.

Thanks to this lethargic trading behavior and the relentless boom in benchmark-tracking index funds, the impact of each trade on prices gets amplified.

The researchers from UCLA, the Stockholm School of Economics, and the University of Minnesota have identified what they call “Big Passive”—a financial landscape that’s proving less dynamic and more volatile. When most investors are on autopilot, the few remaining active traders have disproportionate influence. This doesn’t invalidate passive investing’s core benefits—lower costs and better long-term returns for most investors remain compelling. But it does suggest that our increasingly passive financial system has some unintended consequences.
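A back-of-the-envelope way to see the amplification (my illustration, with assumed elasticity notation, not the paper's model): the price impact of a given flow scales inversely with the demand elasticity of whoever is left to trade against it.

```latex
% Back-of-the-envelope illustration (my notation, not the paper's model).
% A demand flow f moves prices by roughly the flow over the aggregate
% demand elasticity \varepsilon of all investors:
\[
  \Delta p \approx \frac{f}{\varepsilon}, \qquad
  \varepsilon = (1 - s)\,\varepsilon_{\text{active}} + s\,\varepsilon_{\text{passive}}
\]
% Index trackers trade mechanically with flows, so \varepsilon_{\text{passive}} \approx 0.
% As the passive share s rises, the same flow moves prices more:
\[
  \Delta p \approx \frac{f}{(1 - s)\,\varepsilon_{\text{active}}}
\]
% With sluggish active managers (a small effective \varepsilon_{\text{active}}),
% even modest sell orders can produce outsized swings.
```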

I Built a CGM Data Reader

Last year I put a Continuous Glucose Monitor (CGM) sensor, specifically the Abbott Freestyle Libre 3, on my left arm. Why? I wanted to optimize my nutrition for endurance cycling competitions. Where I live, the sensor is easy to get—without any medical prescription—and even easier to use. Unfortunately, Abbott’s FreeStyle LibreLink app is less than optimal (3,250 other people with an average rating of 2.9/5.0 seem to agree). In their defense, the web app LibreView does offer some nice reports which can be generated as PDFs—not very dynamic, but still something! What I had in mind was more in the fashion of the Ultrahuman M1 dashboard. Unfortunately, I wasn’t allowed to use my Libre sensor (EU firmware) with their app (yes, I spoke to customer service).

At that point, I wasn’t left with much enthusiasm, only a coin-sized sensor in my arm. Fortunately, the LibreView website lets you download most of your (own) data as a CSV report (there is also a reverse-engineered API), which is nice. So that’s what I did: download the data, pd.read_csv() it into my notebook, calculate summary statistics, and plot the values.

[Figure: Visualized CGM datapoints]

After some interpolation, I had the same view the LibreLink app (which I had rejected earlier) provides. Yet this setup allowed me to do further analysis and visualizations by adding other datapoints (workouts, sleep, nutrition) I was also collecting at that time (a minimal loading sketch follows the list below):

  • Blood sugar from LibreView: Measurement timestamps + glucose values
  • Nutrition from MacroFactor: Meal timestamps + macronutrients (carbs, protein, and fat)
  • Sleep data from Sleep Cycle: Sleep start timestamp + time in bed + time asleep (+ sleep quality, which is a proprietary measure calculated by the app)
  • Cardio workouts from Garmin: Workout start timestamp + workout duration
  • Strength workouts from Hevy: Workout start timestamp + workout duration
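For the curious, here is a minimal sketch of the loading step. The column names ("timestamp", "glucose_mgdl") and the file name are simplified assumptions; the real LibreView export uses different headers and a short preamble, so the read would need adjusting:

```python
import pandas as pd

# Minimal loading sketch. Column names ("timestamp", "glucose_mgdl") are
# simplified assumptions; the actual LibreView export has different headers
# plus a preamble row, so adjust skiprows/column names accordingly.
def load_glucose(path: str) -> pd.Series:
    df = pd.read_csv(path, parse_dates=["timestamp"])
    return (
        df.set_index("timestamp")["glucose_mgdl"]
        .sort_index()
        .resample("5min")            # the Libre 3 logs roughly every 5 minutes
        .mean()
        .interpolate(method="time")  # fill short sensor gaps
    )

glucose = load_glucose("libreview_export.csv")
print(glucose.describe())
```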

[Figure: Final dashboard]

After structuring those datapoints in a dataframe and normalizing timestamps, I was able to quickly highlight sleep (blue boxes with callouts for time in bed, time asleep, and sleep quality) and workouts (red traces on glucose measurements for strength workouts, green traces for cardio workouts) by plotting highlighted traces on top of the historic glucose trail for a set period. Furthermore, I was able to add annotations for nutrition events with the respective macronutrients.
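The overlay logic itself is straightforward matplotlib. A self-contained sketch with toy event frames (in the notebook these come from the exports listed above; all names here are my own):

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Illustration of the overlay logic with toy data; in the real notebook,
# the sleep/workout/meal frames come from the Sleep Cycle, Garmin/Hevy,
# and MacroFactor exports, and all names here are my own assumptions.
idx = pd.date_range("2024-05-01", periods=288, freq="5min")   # one day of CGM
glucose = pd.Series(110 + 25 * np.sin(np.arange(288) / 12), index=idx)
sleep = pd.DataFrame({"start": [pd.Timestamp("2024-05-01 00:00")],
                      "end":   [pd.Timestamp("2024-05-01 07:00")]})
workouts = pd.DataFrame({"start": [pd.Timestamp("2024-05-01 17:00")],
                         "end":   [pd.Timestamp("2024-05-01 18:00")],
                         "kind":  ["cardio"]})
meals = pd.DataFrame({"time": [pd.Timestamp("2024-05-01 12:30")],
                      "carbs": [75.0]})

fig, ax = plt.subplots(figsize=(12, 4))
ax.plot(glucose.index, glucose.values, color="grey", lw=1)

for _, s in sleep.iterrows():                 # blue boxes for sleep windows
    ax.axvspan(s["start"], s["end"], color="tab:blue", alpha=0.15)

for _, w in workouts.iterrows():              # red = strength, green = cardio
    seg = glucose.loc[w["start"]:w["end"]]
    color = "tab:red" if w["kind"] == "strength" else "tab:green"
    ax.plot(seg.index, seg.values, color=color, lw=2.5)

for _, m in meals.iterrows():                 # nutrition annotations
    ax.annotate(f"{m['carbs']:.0f} g carbs",
                xy=(m["time"], glucose.asof(m["time"])),
                xytext=(0, 25), textcoords="offset points",
                arrowprops={"arrowstyle": "->"}, fontsize=8)

ax.set_ylabel("glucose [mg/dL]")
plt.show()
```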

I asked Claude to create some sample data and streamline the functions to reduce dependencies on the specific data sources I used. The resulting notebook is a comprehensive CGM data analysis tool that loads and processes glucose readings alongside lifestyle data (nutrition, workouts, and sleep), then creates an integrated dashboard for visualization. The code handles data preprocessing including interpolation of missing glucose values, timeline synchronization across different data sources, and statistical analysis with key metrics like time-in-range and coefficient of variation. The main output is a day-by-day dashboard that overlays workout periods, nutrition events, and sleep phases onto continuous glucose monitoring data, enabling users to identify patterns and correlations between lifestyle factors and blood sugar responses.
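Two of the metrics mentioned, time-in-range and coefficient of variation, are simple enough to sketch directly. The 70-180 mg/dL range below is the common consensus default, not necessarily the bounds the notebook uses:

```python
import numpy as np
import pandas as pd

# Two standard CGM summary metrics, sketched with common defaults.
def time_in_range(glucose: pd.Series, low: float = 70, high: float = 180) -> float:
    """Share of readings inside the target range (default 70-180 mg/dL)."""
    values = glucose.dropna()
    return float(((values >= low) & (values <= high)).mean())

def coefficient_of_variation(glucose: pd.Series) -> float:
    """CV = std / mean; below roughly 36% is commonly considered stable."""
    values = glucose.dropna()
    return float(values.std() / values.mean())

# Toy demo series (random walk around 110 mg/dL):
rng = np.random.default_rng(1)
toy = pd.Series(110 + rng.normal(0, 5, 500).cumsum() * 0.1)
print(f"time in range: {time_in_range(toy):.1%}")
print(f"CV: {coefficient_of_variation(toy):.1%}")
```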

You can find the complete notebook as well as the sample data in my GitHub repository.

The Green Bond Commitment Premium

The difference between green finance that works and green finance that doesn’t work seems to be commitment: Using a Difference-in-Differences model analyzing 2013-2023 bond data, researchers found no significant correlation between green bond issuance and CO2 emissions after net-zero policies were adopted. That’s the disappointing part. On the upside: companies issuing only green bonds showed higher ESG ratings, lower CO2 emissions, and lower financing costs, achieving substantial environmental benefits and economic advantages. Meanwhile, entities issuing both conventional and green bonds showed no environmental benefits, raising concerns about potential greenwashing.
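For readers unfamiliar with the method, a difference-in-differences specification of this kind can be sketched as follows; the variable names and toy panel are mine, not the study's:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hedged sketch of a difference-in-differences setup like the one described
# above; the toy panel and column names are mine, not the study's data.
rng = np.random.default_rng(0)
n_issuers, years = 200, range(2013, 2024)
df = pd.DataFrame(
    [(i, y, i % 2, int(y >= 2019)) for i in range(n_issuers) for y in years],
    columns=["issuer", "year", "green", "post"],  # green issuer? post net-zero?
)
# Simulated outcome: emissions fall for green issuers after policy adoption.
df["co2"] = 10 - 0.3 * df["green"] * df["post"] + rng.normal(0, 1, len(df))

# The DiD effect is the coefficient on the interaction term, with standard
# errors clustered by issuer:
model = smf.ols("co2 ~ green + post + green:post", data=df).fit(
    cov_type="cluster", cov_kwds={"groups": df["issuer"]}
)
print(model.params["green:post"], model.pvalues["green:post"])
```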

Those issuing only green bonds tend to have higher ESG ratings, lower CO2 emissions, and lower financing costs.

This could be called the commitment premium: Companies that go all-in on green finance see real results – both environmental and financial. Those trying to have it both ways? They’re essentially paying green bond premiums for conventional bond performance while fooling nobody about their environmental impact. What are the implications for investors? We should favor pure-play green issuers, and regulators need standards that discourage this mixed-portfolio greenwashing. The study suggests current carbon reduction policies haven’t created sufficient pressure on bond issuers, but perhaps the market is already creating its own incentives.

Crypto Mean Reversion Trading

In late 2021, Lars Kaiser’s paper on seasonality in cryptocurrencies inspired me to use my Kraken API key to try and make some money. A quick summary of the paper: (1) Kaiser analyzes seasonality patterns across 10 cryptocurrencies (Bitcoin, Ethereum, etc.), examining returns, volatility, trading volume, and spreads (2) Finds no consistent calendar effects in cryptocurrency returns, supporting weak-form market efficiency (3) Observes robust patterns in trading activity - lower volume, volatility, and spreads in January, weekends, and summer months (4) Documents significant impact of the January 2018 market sell-off on seasonality patterns (5) Reports a “reverse Monday effect” for Bitcoin (positive Monday returns) and a “reverse January effect” (negative January returns) (6) Trading activity patterns suggest crypto markets are dominated by retail rather than institutional investors.

The paper’s main finding: crypto markets appear efficient in terms of returns but show behavioral patterns in trading.

The efficient-market hypothesis (EMH) is a hypothesis in financial economics that states that asset prices reflect all available information. A direct implication is that it is impossible to “beat the market” consistently on a risk-adjusted basis since market prices should only react to new information.
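In textbook notation (my addition, not from the article), weak-form efficiency is often stated as a martingale condition: once risk premia are set aside, conditioning on today's information buys you nothing.

```latex
% Standard textbook formulation (my addition, not from the article).
% With \mathcal{I}_t the information set at time t and, for simplicity,
% ignoring risk premia, discounted prices follow a martingale:
\[
  \mathbb{E}\left[ P_{t+1} \mid \mathcal{I}_t \right] = (1 + r_f)\, P_t
  \quad\Longleftrightarrow\quad
  \mathbb{E}\left[ r_{t+1} - r_f \mid \mathcal{I}_t \right] = 0
\]
% Any trading rule that conditions only on \mathcal{I}_t (past prices,
% public news) therefore has zero expected edge on a risk-adjusted basis.
```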

The EMH has interesting implications for cryptocurrency markets. While major cryptocurrencies like Bitcoin and Ethereum have gained significant institutional adoption and liquidity, they may still be less efficient than traditional markets due to their relative youth and large audience of retail traders (who might not act as rationally as larger, institutional traders). This inefficiency becomes even more pronounced with smaller altcoins, which often have: (1) Lower trading volumes and liquidity (2) Less institutional participation (3) Higher information asymmetries (and/or greater susceptibility to manipulation). These factors create opportunities for exploiting market inefficiencies, particularly in the short term when prices may overreact to news or technical signals before eventually correcting.

Unlike Kaiser’s seasonality research, I didn’t focus on calendar-based anomalies over longer time horizons. After reviewing further research on cryptocurrency market inefficiencies [1] [2] [3] [4], I was intrigued by predictable patterns in returns following large price movements. This led me to develop a classic mean reversion strategy instead (mean reversion suggests that asset prices tend to revert to their long-term average after extreme movements due to market overreactions and subsequent corrections).

[Figure: Scatter plot of return at time of jump vs. return after jump, with a fitted regression line showing a slight negative correlation, r = -0.2142]

First, I had to find “change points.” The PELT algorithm efficiently identifies points in ETH/EUR where the statistical properties of the time series change significantly. These changes could indicate market events, trend reversals, or volatility shifts in the cryptocurrency price.

[Figure: Structural break detection in ETH/EUR using the PELT (Pruned Exact Linear Time) algorithm with an RBF kernel; 12 significant changepoints detected during June 15-29, 2021, with a penalty parameter of 35. Vertical dashed lines indicate detected regime changes.]

I then implemented an automated mean reversion trading strategy following this logical flow: continuous monitoring → signal detection → buy execution → hold period → sell execution. The script continuously monitored prices for certain cryptocurrencies on the Kraken exchange. It executed buy orders when the price moved more than four standard deviations over a 2-hour period, then automatically sold after exactly 2 hours regardless of price movement. The strategy used fixed position sizes and limit orders to minimize fees. It assumed that large price drops represent temporary market overreactions that will reverse within the holding period. A sketch of both steps follows below.
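Here is a compact sketch of both steps using the ruptures library. The thresholds mirror the text (PELT penalty 35, four standard deviations, 2-hour window), but the toy data and implementation details are my assumptions, not the original script:

```python
import numpy as np
import pandas as pd
import ruptures as rpt

# Compact sketch of both building blocks on simulated data. Thresholds
# mirror the text (PELT penalty 35, 4 sigma, 2-hour window); everything
# else, including the toy ETH/EUR series, is my assumption.
rng = np.random.default_rng(42)
idx = pd.date_range("2021-06-15", "2021-06-29", freq="15min")
prices = pd.Series(2000 + rng.normal(0, 5, len(idx)).cumsum(), index=idx)

# 1) Changepoint detection with PELT (RBF kernel), as in the figure above.
changepoints = rpt.Pelt(model="rbf").fit(
    prices.to_numpy().reshape(-1, 1)
).predict(pen=35)

# 2) Mean-reversion entry signal: flag a buy when the 2-hour return falls
#    more than four standard deviations below its rolling mean.
ret_2h = prices.pct_change(periods=8)          # 8 x 15min = 2 hours
z = (ret_2h - ret_2h.rolling("7d").mean()) / ret_2h.rolling("7d").std()
buy_signals = z < -4                           # large drop -> assumed overreaction
# Exit rule in the live script: sell exactly 2 hours after each entry,
# regardless of price, using limit orders and fixed position sizes.
print(f"{len(changepoints)} changepoints, {int(buy_signals.sum())} buy signals")
```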

This little script earned some good change—but then again, it was 2021.

AI Learns Economics Like Undergrads

This cuts to the heart of how LLMs actually work: Testing Large Language Models on economics problems reveals that these supposedly sophisticated systems don’t just learn correct reasoning—they absorb our misconceptions too. The study found LLMs performing reasonably well on undergraduate economics questions (around 65% accuracy) but falling flat on graduate-level problems (35% accuracy). More tellingly, the specific errors weren’t random failures but systematic mistakes that mirror exactly what human students get wrong.

“Interestingly, the errors made by LLMs often mirror those made by human students, suggesting that these models may have learned not just correct economic reasoning but also common misconceptions.”

Which kind of makes sense when we understand how language models actually work: They’re not reasoning through economic principles—they’re pattern-matching against their training data, which includes millions of wrong answers, confused explanations, and half-understood concepts scattered across the internet.

What are the practical implications? If you’re using AI for financial analysis or economic modeling, you’re essentially getting a very confident undergraduate who’s memorized a lot of material but fundamentally doesn’t understand when to apply which concepts. The models particularly struggled with dynamic optimization and game theory—exactly the areas where getting it wrong costs real money. Perhaps most unsettling: chain-of-thought prompting barely helped. Even when asked to show their work, the models maintained their confident confusion, just with more elaborate explanations of why 2+2 equals 5.

Note: From the paper: “testing set: January 1, 2023, to December 31, 2023”

Meta's Edge AI Gambit

While the AI industry obsesses over ever-larger cloud models, Meta just made a somewhat contrarian bet with Llama 3.2. Instead of chasing GPT-4 with another massive model, they’re going small and local — releasing lightweight AI models designed to run entirely on your phone. The technical achievement is genuinely impressive: vision-capable models that can analyze images and text, plus compact versions that “fit in as little as 1GB of memory.” But the real story might be more strategic. Meta is essentially arguing that the future of AI isn’t in OpenAI’s cloud-centric paradigm, but in edge computing where your data never leaves your device.

“The on-device models are designed to enable developers to build personalized experiences that don’t require an internet connection and keep your data private.”

There’s some irony here: Meta — a company built on harvesting user data — suddenly championing privacy. Beneath the marketing speak, though, this makes perfect business sense. Edge AI could democratize access to AI capabilities, reduce infrastructure costs, and conveniently sidestep the regulatory scrutiny facing cloud AI providers. By giving away competitive AI models, Meta simultaneously weakens competitors’ moats while positioning themselves as the champion of AI democratization. It’s the classic platform play: make the complementary technology free to increase demand for your scarce resource—in this case, developer mindshare and ecosystem control.

Whether on-device models can match cloud performance remains to be seen. But Meta is betting that “good enough” plus privacy plus offline capability beats “perfect” in the cloud. In a world increasingly skeptical of Big Tech data practices, that might just be a winning hand.

How Some Active Funds Create Their Own Returns

(1) Many active funds hold concentrated portfolios. Flow-driven trading in these securities causes price pressure, which pushes up the funds’ existing positions, resulting in realized returns. (2) The researchers decompose fund returns into a price pressure (self-inflated) and a fundamental component and show that when allocating capital across funds, investors are unable to identify whether realized returns are self-inflated or fundamental. (3) Because investors chase self-inflated fund returns at a high frequency, even short-lived impact meaningfully affects fund flows at longer time scales. (4) The combination of price impact and return chasing causes an endogenous feedback loop and a reallocation of wealth to early fund investors, which unravels once the price pressure reverts. (5) The researchers find that flows chasing self-inflated returns predict bubbles in ETFs and their subsequent crashes, and lead to a daily wealth reallocation of $500 million from ETFs alone. (6) Around 2% of all daily flows and 8-12% of flows in the top decile of illiquid funds can be attributed to “Ponzi flows”. The researchers estimate that every day around $500 million of investor wealth is reallocated because of the price impact of Ponzi flows.

In active funds, investors are unable to identify whether realized returns are self-inflated or fundamental. Here’s how the magic trick works: many active funds hold concentrated portfolios, and flow-driven trading in these securities causes price pressure, which pushes up the funds’ existing positions, resulting in realized returns. The mechanism is as follows: fund managers pick concentrated positions, new money flows in, that money pushes up prices of the fund’s existing holdings, creating impressive returns that attract more money, which pushes prices higher still. As the researchers put it:

Via their own price impact, active funds effectively reallocate capital from late to early investors.

The numbers are staggering. Around 2% of all daily flows and 8-12% of flows in the top decile of illiquid funds can be attributed to Ponzi flows, with around $500 million of investor wealth reallocated daily because of this price impact. Even more striking: funds with high Ponzi flows experience subsequent drawdowns of over 200%. This isn’t just academic theorizing—flows chasing self-inflated returns predict bubbles in ETFs and their subsequent crashes. The researchers propose a simple fix: a fund illiquidity measure that captures a fund’s potential for self-inflated returns.
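To make the feedback loop concrete, here is an illustrative two-equation sketch (my notation and simplification, not the paper's exact decomposition):

```latex
% Illustrative notation (mine), not the paper's exact model.
% Realized fund return = fundamental component + price pressure from the
% fund's own inflows f_t, scaled by the illiquidity of its holdings \lambda:
\[
  r_t = r_t^{\text{fund}} + \underbrace{\lambda f_t}_{\text{self-inflated}}
\]
% Return chasing closes the loop, since new flows respond to past returns:
\[
  f_{t+1} = \beta\, r_t, \qquad \beta > 0
\]
% Illiquid holdings (large \lambda) plus return chasing (large \beta) let
% inflows manufacture returns that attract further inflows, until the price
% pressure reverts and late investors bear the losses.
```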

OpenAI Cuts Prices, Raises Stakes

OpenAI’s GPT-4o launch is a classic Silicon Valley competitive strategy disguised as a product announcement.

GPT-4o is 2x faster, half the price, and has 5x higher rate limits compared to GPT-4 Turbo

The real headline isn’t the multimodal wizardry — though watching an AI tutor walk through math problems or harmonize in real-time is genuinely impressive. It’s the economics. OpenAI is essentially paying developers to build on their platform while making it prohibitively expensive for competitors to match these specs profitably.

The free tier expansion is equally calculated. By giving ChatGPT’s 100+ million users access to frontier AI capabilities, OpenAI creates a consumer expectation that every other AI assistant will struggle to meet. It’s the Amazon playbook: lose money on the product, make it back on the ecosystem.

That being said, the technical achievement shouldn’t be underestimated—training a single model end-to-end across text, vision, and audio represents a genuine breakthrough in multimodal AI. Real-time voice conversation with natural interruptions moves us from “chatbot” to something approaching actual dialogue. Strip away the demos and you’re left with a company making an aggressive bet that they can outspend the competition into submission. Whether that works depends on how quickly Google, Anthropic, and others can respond—and whether OpenAI’s cash reserves outlast their patience. The AI wars just got expensive. For everyone except OpenAI’s customers.

AlphaFold 3: Free for Science

Nothing says “we’re serious about dominating a market” quite like giving away breakthrough technology for free. Google’s latest move with AlphaFold 3 might be their most audacious version of this strategy yet.

“AlphaFold 3 can predict the structure and interactions of all of life’s molecules with unprecedented accuracy”

This isn’t just an incremental improvement: while previous versions of AlphaFold could predict protein structures, AlphaFold 3 models the interactions between proteins, DNA, RNA, and small molecules. It’s the difference between having a parts catalog and understanding how the entire machine works.

Drug discovery typically costs billions and takes decades. If AlphaFold 3 can meaningfully accelerate that process, even by modest percentages, the value creation is staggering. Yet Google is handing it to researchers for free through the AlphaFold Server, with the predictable caveat of commercial restrictions. Is this Google’s cloud strategy playing out in life sciences? Establish the platform, get everyone dependent on your infrastructure, then monetize the ecosystem. The pharmaceutical industry, already grappling with AI disruption, now faces a world where molecular interactions can be predicted with “50% better accuracy” than existing methods. The real question isn’t whether AI will transform drug discovery; it’s whether Google will own that transformation.

My First 'Optimal' Portfolio

My introduction to quantitative portfolio optimization happened during my undergraduate years, inspired by Attilio Meucci’s Risk and Asset Allocation and the convex optimization teachings of Diamond and Boyd at Stanford. With enthusiasm and perhaps more confidence than expertise, I created my first “optimal” portfolio. What struck me most was the disconnect between theory and accessibility. Modern Portfolio Theory had been established for decades (Markowitz’s work earned a Nobel Prize in 1990), yet the optimization tools remained largely locked behind proprietary software.

Nevertheless, only a few comprehensive software models are available publicly to use, study, or modify. We tackle this issue by engineering practical tools for asset allocation and implementing them in the Python programming language.

This gap inspired what would eventually be published as: A Python integration of practical asset allocation based on modern portfolio theory and its advancements.

My approach centered on a simple philosophy:

The focus is to keep the tools simple enough for interested practitioners to understand the underlying theory yet provide adequate numerical solutions.

Today, the landscape has evolved dramatically. Projects like PyPortfolioOpt and Riskfolio-Lib have established themselves as sophisticated open-source alternatives, far surpassing my early efforts in both scope and sophistication. Despite its limitations, the project yielded several meaningful insights:

Efficient frontier visualization: First I set out to visualize Modern Portfolio Theory’s fundamental principle, the risk-return tradeoff that drives optimization decisions. [Figure: Scatter plot of the efficient frontier.]

Benchmark vs. optimized results: My first optimization maintained a 9.386% return while reducing volatility from 14.445% to 5.574%, effectively tripling the Sharpe ratio from 0.650 to 1.684.

Risk aversion parameter effects: By varying the risk aversion parameter (gamma), the framework successfully adapted to different investor profiles. [Figure: Efficient frontier plot with different gamma values, illustrating how the optimization framework adapts to different investor risk preferences.]

Out-of-sample performance: Perhaps most importantly, out-of-sample testing across diverse market conditions, including the 2018 bear market and 2019 bull market, demonstrated consistent CVaR reduction and improved risk-adjusted returns.

We demonstrate how even in an environment with high correlation, achieving a competitive return with a lower expected shortfall and lower excess risk than the given benchmark over multiple periods is possible.
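At the core of that framework sat a risk-aversion-parameterized mean-variance problem. Here is a minimal cvxpy sketch in the spirit of Diamond and Boyd's examples, with toy inputs rather than the paper's actual code:

```python
import cvxpy as cp
import numpy as np

# Minimal risk-aversion-parameterized mean-variance problem, in the spirit
# of the Diamond/Boyd cvxpy examples; toy inputs, not the paper's code.
np.random.seed(0)
n = 10
mu = np.abs(np.random.randn(n)) * 0.05        # toy expected returns
A = np.random.randn(n, n)
Sigma = A @ A.T / n                           # toy PSD covariance matrix

w = cp.Variable(n)                            # portfolio weights
gamma = cp.Parameter(nonneg=True)             # risk aversion (the "gamma" above)
ret = mu @ w
risk = cp.quad_form(w, Sigma)

problem = cp.Problem(cp.Maximize(ret - gamma * risk),
                     [cp.sum(w) == 1, w >= 0])  # fully invested, long-only

# Sweeping gamma traces the efficient frontier for different risk profiles.
for g in [0.1, 1.0, 10.0]:
    gamma.value = g
    problem.solve()
    print(f"gamma={g:>5}: return={ret.value:.4f}, risk={risk.value:.4f}")
```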

Looking back, the project feels embarrassingly naive—and surprisingly foundational. While it earned some recognition at the time, it now serves as a valuable reminder: sometimes the best foundation is built before you know enough to doubt yourself.