Kalshi, a prediction market startup, is using its federal financial license to offer sports betting nationwide, even in states where it’s not legal. The move has earned them cease-and-desist letters from state gaming regulators, but CEO Tarek Mansour isn’t backing down:
We can go one by one for every financial market and it would fall under the definition of gambling. So what’s the difference?
It’s a question that cuts to the heart of modern finance. The founders argue that Wall Street blurred the line between investing and gambling long ago, and casting Kalshi as the latter is inconsistent at best. They have a point—if you can bet on oil futures, Nvidia’s stock price, or interest rate movements, why is wagering on NFL touchdowns more objectionable?
Benefiting from the Trump administration’s hands-off regulatory approach, with the CFTC dropping its legal challenge to their election contracts, the odds might be in their favor. Even better, a Kalshi board member is awaiting confirmation to lead the very agency that was previously their biggest antagonist.
The technical distinction matters: Kalshi operates as an exchange between traders rather than a house taking bets against customers. But functionally, with 79% of their recent trading volume being sports-related, they’re forcing us to confront an uncomfortable reality about risk, speculation, and what we choose to call “investing.”
Whether you call it innovation or regulatory arbitrage, Kalshi is exposing the arbitrary nature of the lines we’ve drawn around acceptable financial speculation.
(17/06/2025) Update: Matt Levine - one of the finance columnists I enjoy reading most - just published a long piece “It’s Not Gambling, It’s Predicting” in his newsletter on exactly this issue:
Kalshi offers a prediction market where you can bet on sports. No! Sorry! Wrong! It offers a prediction market where you can predict which team will win a sports game, and if you predict correctly you make money, and if you predict incorrectly you lose money. Not “bet on sports.” “Predict sports outcomes for money.” Completely different.
Their intrinsic complexity and lack of transparency pose significant challenges, especially in the highly regulated financial sector
Unlike other industries where “the model said so” might suffice, finance demands audit trails, bias detection,
and explainable decision-making—requirements that sit uncomfortably with neural networks containing billions of parameters.
The research highlights a fundamental tension that’s about to reshape fintech:
the same complexity that makes LLMs powerful at parsing market sentiment or generating investment reports also makes them regulatory nightmares
in a sector where you need to explain every decision to examiners.
Something interesting just happened at the National Bureau of Economic Research (NBER):
We study the optimal monetary policy response to the imposition of tariffs in a model
with imported intermediate inputs. In a simple open-economy framework, we show
that a tariff maps exactly into a cost-push shock in the standard closed-economy New
Keynesian model, shifting the Phillips curve upward. We then characterize optimal
monetary policy, showing that it partially accommodates the shock to smooth the
transition to a more distorted long-run equilibrium—at the cost of higher short-run
inflation.
Here’s where it gets interesting for current policy: Werning et al.
show that “optimal” monetary policy would actually call for partial accommodation
of tariff shocks—essentially allowing some inflation to persist to smooth the transition
to what they euphemistically call “a more distorted long-run equilibrium.”
With core PCE still running above the Fed’s 2% target and renewed tariff threats on the horizon,
this research suggests Powell may need to abandon his recent dovish pivot and prepare
for rate hikes that prioritize price stability over employment concerns.
The dual mandate was never meant to be dual when the two mandates point in opposite directions.
A fascinating new paper from Stefano Iabichino at UBS Investment Bank explores what happens when you take the attention mechanisms powering modern AI and apply them to Wall Street’s most fundamental pricing problems, tackling what might be quantitative finance’s most intractable challenge.
The problem is elegantly simple yet profound: machine learning models are great at finding patterns in historical data, but financial theory demands that arbitrage-free prices be independent of past information. As the authors put it:
We contend that a fundamental tension exists between the usage of ML methodologies in risk and pricing and the First Fundamental Theorem of Finance (FFTF). While ML models rely on historical data to identify recurring patterns, the FFTF posits that arbitrage-free market prices are independent of past information.
Their solution? Transition Probability Tensors (TPTs) that function like attention mechanisms in neural networks, dynamically weighting relationships between risk factors while maintaining mathematical rigor. Instead of learning from history, these tensors capture “dynamic, context-aware relationships across dimensions” in real-time.
The practical results are impressive: simulating 210 quantitative investment strategies across 100,000 market scenarios in just 70 seconds, while identifying optimal hedging strategies and stress-testing future market conditions. The framework even adapts to different volatility regimes, shifting focus toward tail events during high-volatility periods, exactly as attention mechanisms focus on relevant context. Whether it scales beyond this proof-of-concept remains to be seen, but it seems to be a genuine attempt to resolve the fundamental tension between AI’s pattern-seeking nature and finance’s requirement for arbitrage-free pricing.
A new academic review by Ali Farhani reveals that institutional Total Value Locked in DeFi protocols hit $42 billion in 2024, with BlackRock leading the charge by launching a $250 million tokenized fund on Centrifuge.
The numbers tell a remarkable story of maturation. Layer 2 solutions like Optimism and Arbitrum now dominate the scaling landscape, while zero-knowledge proofs have reduced compliance costs by 30%. Even the terminology is evolving—researchers now discuss “Total Value Redeemable” instead of the traditional TVL metric, acknowledging that not all locked value is immediately liquid. Despite technological advances, security incidents persist with painful regularity: $350 million lost in the Wormhole bridge exploit, $81 million in Orbit Chain’s multi-signature failure. Cross-chain bridges remain “high-risk attack targets,” a sobering reminder that connecting different blockchains is still more art than science. The regulatory landscape is complicated as well. Europe’s MiCA regulation provides clear frameworks, while the SEC maintains its enforcement-first approach. Hong Kong’s innovation sandbox offers a third path, balancing experimentation with oversight.
DeFi is transitioning from a disruptive experiment to an integrated component of the global financial system
That transition isn’t complete—Layer 2 solutions are projected to host over 70% of DeFi TVL by mid-2025—but the direction is clear.
This post is based in part on a 2022 presentation I gave for the ICBS Student Investment Fund and my seminar work at Imperial College London.
As we were looking for new investment strategies for our Macro Sentiment Trading team, OpenAI had just published their GPT-3.5 Model. After first experiments with the model, we asked ourselves: How would large language models like GPT-3.5 perform in predicting sentiment in financial markets, where the signal-to-noise ratio is notoriously low? And could they potentially even outperform industry benchmarks at interpreting market sentiment from news headlines? The idea wasn’t entirely new. Studies[2][3] have shown that investor sentiment, extracted from news and social media, can forecast market movements. But most approaches rely on traditional NLP models or proprietary systems like RavenPack. With the recent advances in large language models, I wanted to test whether these more sophisticated models could provide a competitive edge in sentiment-based trading. Before looking at model selection, it’s worth understanding what makes trading on sentiment so challenging. News headlines present two fundamental problems that any robust system must address.
First, headlines are inherently non-stationary. Unlike other data sources, news reflects the constantly shifting landscape of global events, political climates, economic trends, etc. A model trained on COVID-19 vaccine headlines from 2020 might struggle with geopolitical tensions in 2023. This temporal drift means algorithms must be adaptive to maintain relevance.
Second, the relationship between headlines and market impact is far from obvious. Consider these actual headlines from November 2020: “Pfizer Vaccine Prevents 90% of COVID Infections” drove the S&P 500 up 1.85%, while “Pfizer Says Safety Milestone Achieved” barely moved the market at -0.05%. The same company, similar positive news, dramatically different market reactions.
When developing a sentiment-based trading system, you essentially have two conceptual approaches: forward-looking and backward-looking.
Forward-looking models try to predict which news themes will drive markets, often working qualitatively by creating logical frameworks that capture market expectations. This approach is highly adaptable but requires deep domain knowledge and is time-consuming to maintain.
Backward-looking models analyze historical data to understand which headlines have moved markets in the past, then look for similarities in current news. This approach can leverage large datasets and scale efficiently, but suffers from low signal-to-noise ratios and the challenge that past relationships may not hold in the future.
For this project, I chose the backward-looking approach, primarily for its scalability and ability to work with existing datasets.
Rather than rely on traditional approaches like FinBERT (which only provides discrete positive/neutral/negative classifications), I decided to test OpenAI’s GPT-3.5 Turbo model. The key advantage was its ability to provide continuous sentiment scores from -1 to 1, giving much more nuanced signals for trading decisions.
I used news headlines from the Dow Jones Newswire covering the 30 DJI companies from 2018-2022, filtering for quality sources like the Wall Street Journal and Bloomberg. After removing duplicates, this yielded 2,072 headlines. I then prompted GPT-3.5 to score sentiment with the instruction: “Rate the sentiment of the following news headlines from -1 (very bad) to 1 (very good), with two decimal precision.”
To validate the approach, I compared GPT-3.5 scores against RavenPack, the industry’s leading commercial sentiment provider.
The correlation was 0.59, indicating the models generally agreed on sentiment direction while providing different granularities of scoring. More interesting was comparing the distribution of the sentiment ratings between the two models. The two distributions could probably have been brought closer together with some fine-tuning of the (minimal) prompt used earlier.
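As a rough illustration of the scoring and validation steps, the sketch below parses a numeric score out of a model reply and computes a Pearson correlation between two score series. The prompt string is the one quoted above; `parse_score` and `pearson` are hypothetical helper names, the reply format is an assumption, and no API call is made here.

```python
import re
from math import sqrt

# Instruction text as quoted in the post.
PROMPT = (
    "Rate the sentiment of the following news headlines from -1 (very bad) "
    "to 1 (very good), with two decimal precision."
)


def parse_score(reply):
    """Extract the first number in a reply and clamp it to [-1, 1].

    Returns None when the reply contains no number at all.
    Assumption: the model answers with a single score per headline.
    """
    match = re.search(r"-?\d+(?:\.\d+)?", reply)
    if match is None:
        return None
    return max(-1.0, min(1.0, float(match.group())))


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)
```

Clamping guards against replies that fall outside the requested range; a production version would probably also want to log or retry replies that fail to parse rather than silently dropping them.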
I implemented a simple strategy: go long when sentiment hits the top 5% of scores, close positions at 25% profit (to reduce transaction costs), and maintain a fully invested portfolio with 1% commission per trade.
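The rules above can be sketched in a few lines. This is a simplified illustration, not the backtest code used for the results: the function names are hypothetical, and charging the 1% commission on both the buy and the sell is an assumption about how "per trade" was applied.

```python
def long_signals(scores, top_fraction=0.05):
    """Return the indices of headlines whose score is in the top fraction."""
    n_top = max(1, round(len(scores) * top_fraction))
    threshold = sorted(scores)[-n_top]
    return [i for i, score in enumerate(scores) if score >= threshold]


def should_take_profit(entry_price, current_price, target=0.25):
    """True once the open position has gained at least `target` (25%)."""
    return current_price / entry_price - 1 >= target


def net_return(entry_price, exit_price, commission=0.01):
    """Round-trip return after paying `commission` on entry and on exit."""
    paid = entry_price * (1 + commission)
    received = exit_price * (1 - commission)
    return received / paid - 1
```

Note that ranking scores over the whole sample uses information that would not have been available at trade time; a live version would more plausibly compute the threshold on an expanding or rolling window of past scores.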
The results were mixed but promising. Over the full 2018-2022 period, the GPT-3.5 strategy generated 41.02% returns compared to RavenPack’s 40.99%, essentially matching the industry benchmark. However, both underperformed a simple buy-and-hold approach (58.13%) during this generally bullish period. Relying on market sentiment when news flow is low can be a tricky strategy. As can be seen from the example of the Salesforce stock performance, the strategy remained uninvested over a large period of time due to a (sometimes long-lasting) negative sentiment signal.
When I tested different timeframes, the sentiment strategy showed its strength during volatile periods. From 2020-2022, it outperformed buy-and-hold (22.83% vs 21.00%). As expected, sentiment-based approaches work better when markets are less directional and more driven by news flow. To evaluate whether the scores generated by our GPT prompt were more accurate than those from the RavenPack benchmark, I calculated returns for different holding windows. The scores generated by our GPT prompt perform significantly better in the short term (1 and 10 days) for positive sentiment and in the long term (90 days) for negative sentiment.
(Note: For lower sentiment, negative returns are desirable since the stock would be shorted)
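The holding-window comparison boils down to averaging forward returns after each signal day. A minimal sketch, assuming one closing price per trading day and hypothetical function names (the 1, 10, and 90 day horizons are the ones mentioned above):

```python
def forward_return(prices, start, horizon):
    """Return from day `start` to day `start + horizon`, or None if out of data."""
    end = start + horizon
    if end >= len(prices):
        return None
    return prices[end] / prices[start] - 1


def average_forward_returns(prices, signal_days, horizons=(1, 10, 90)):
    """Average forward return per horizon across all signal days."""
    averages = {}
    for horizon in horizons:
        returns = []
        for day in signal_days:
            value = forward_return(prices, day, horizon)
            if value is not None:
                returns.append(value)
        averages[horizon] = sum(returns) / len(returns) if returns else None
    return averages
```

Signals too close to the end of the series are skipped for the longer horizons, so the averages for different windows may be based on different numbers of observations.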
While the model performed well technically, this project highlighted several practical challenges. First, data accessibility remains a major hurdle—getting real-time, high-quality news feeds is expensive and often restricted. Second, the strategy worked better in a more volatile environment, which prompted many individual trades, creating substantial transaction costs that significantly impact returns. Perhaps most importantly, any real-world implementation would need to compete with high-frequency traders who can act on news within milliseconds. The few seconds required for GPT-3.5 to process headlines and generate sentiment scores are far from being competitive. Despite these challenges, the project demonstrated that LLMs can match industry benchmarks for sentiment analysis—and this was using a general-purpose model, not one specifically fine-tuned for financial applications. OpenAI (and others) today offer more powerful models at very low cost as well as fine-tuning capabilities that could further improve performance. The bigger opportunity might be in combining sentiment signals with other factors, using sentiment as one input in a more sophisticated trading system rather than the sole decision criterion. There’s also potential in expanding beyond simple long-only strategies to include short positions on negative sentiment, or developing “sentiment indices” that smooth out individual headline noise.
Market sentiment strategies may not be optimal for long-term investing, but they show clear promise for shorter-term trading in volatile environments. As LLMs continue to improve and become more accessible, this might offer an opportunity to revisit this project.
(1) A new academic paper suggests the rise of passive investing may be fueling fragile market moves.
(2) According to a study to be published in the American Economic Review, evidence is building that active managers are slow to scoop up stocks en masse when prices move away from their intrinsic worth.
(3) Thanks to this lethargic trading behavior and the relentless boom in benchmark-tracking index funds, the impact of each trade on prices gets amplified, explaining how sell orders can induce broader equity gyrations
Passive investing, the supposedly boring strategy of buying and holding index funds, might actually be making markets more volatile. A new study set to be published in the American Economic Review finds that active managers are slow to scoop up stocks when prices move away from their intrinsic worth. Meanwhile, the relentless boom in benchmark-tracking index funds means that each trade gets amplified, explaining how sell orders can induce broader equity gyrations.
Justina Lee for Bloomberg writes that this week’s AI-fueled market swings perfectly illustrate the phenomenon. Big equity gauges plunged on Monday over fears about an AI model, before swiftly rebounding.
Thanks to this lethargic trading behavior and the relentless boom in benchmark-tracking index funds, the impact of each trade on prices gets amplified.
The researchers from UCLA, Stockholm School of Economics, and University of Minnesota have identified what they call “Big Passive”—a financial landscape that’s proving less dynamic and more volatile. When most investors are on autopilot, the few remaining active traders have disproportionate influence.
This doesn’t invalidate passive investing’s core benefits—lower costs and better long-term returns for most investors remain compelling. But it does suggest that our increasingly passive financial system has some unintended consequences.
Last year I put a Continuous Glucose Monitor (CGM) sensor, specifically the Abbott FreeStyle Libre 3, on my left arm. Why? I wanted to optimize my nutrition for endurance cycling competitions. Where I live, the sensor is easy to get (no medical prescription needed) and even easier to use. Unfortunately, Abbott’s FreeStyle LibreLink app is less than optimal (3,250 other people with an average rating of 2.9/5.0 seem to agree). In their defense, the web app LibreView does offer some nice reports which can be generated as PDFs: not very dynamic, but still something! What I had in mind was more in the fashion of the Ultrahuman M1 dashboard. Unfortunately, I wasn’t allowed to use my Libre sensor (EU firmware) with their app (yes, I spoke to customer service).
At that point, I wasn’t left with much enthusiasm, only a coin-sized sensor in my arm. The LibreView website fortunately lets you download most of your (own) data in a CSV report (there is also a reverse engineered API), which is nice. So that’s what I did: download the data, pd.read_csv() it into my notebook, calculate summary statistics, and plot the values.
After some interpolation, I now had the same view as the LibreLink app (which I had rejected earlier) provided. Yet, this setup allowed me to do further analysis and visualizations by adding other datapoints (workouts, sleep, nutrition) I was also collecting at that time:
Blood sugar from LibreView: Measurement timestamps + glucose values
Nutrition from MacroFactor: Meal timestamps + macronutrients (carbs, protein, and fat)
Sleep data from Sleep Cycle: Sleep start timestamp + time in bed + time asleep (+ sleep quality, which is a proprietary measure calculated by the app)
Cardio workouts from Garmin: Workout start timestamp + workout duration
Strength workouts from Hevy: Workout start timestamp + workout duration
After structuring those datapoints in a dataframe and normalizing timestamps, I was able to quickly highlight sleep (blue boxes with callouts for time in bed, time asleep, and sleep quality) and workouts (red traces on glucose measurements for strength workouts, green traces for cardio workouts) by plotting highlighted traces on top of the historic glucose trail for a set period. Furthermore, I was able to add annotations for nutrition events with the respective macronutrients.
I asked Claude to create some sample data and streamline the functions to reduce dependencies on the specific data sources I used. The resulting notebook is a comprehensive CGM data analysis tool that loads and processes glucose readings alongside lifestyle data (nutrition, workouts, and sleep), then creates an integrated dashboard for visualization. The code handles data preprocessing including interpolation of missing glucose values, timeline synchronization across different data sources, and statistical analysis with key metrics like time-in-range and coefficient of variation. The main output is a day-by-day dashboard that overlays workout periods, nutrition events, and sleep phases onto continuous glucose monitoring data, enabling users to identify patterns and correlations between lifestyle factors and blood sugar responses.
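To give a feel for the preprocessing and metrics mentioned above, here is a dependency-free sketch (the notebook itself works on dataframes, so this is not its actual code). The function names are hypothetical, and the 70-180 mg/dL target range is an assumption based on the commonly used default, not something taken from the notebook.

```python
from statistics import mean, stdev


def interpolate_gaps(readings):
    """Linearly fill None values that sit between two known readings.

    Leading or trailing gaps are left as None because there is no
    second neighbor to interpolate toward.
    """
    filled = list(readings)
    known = [i for i, value in enumerate(filled) if value is not None]
    for left, right in zip(known, known[1:]):
        gap = right - left
        for step in range(1, gap):
            delta = filled[right] - filled[left]
            filled[left + step] = filled[left] + delta * step / gap
    return filled


def time_in_range(readings, low=70, high=180):
    """Percentage of valid readings between `low` and `high` (mg/dL)."""
    valid = [value for value in readings if value is not None]
    return 100 * sum(low <= value <= high for value in valid) / len(valid)


def coefficient_of_variation(readings):
    """Sample standard deviation as a percentage of the mean."""
    valid = [value for value in readings if value is not None]
    return 100 * stdev(valid) / mean(valid)
```

For example, `interpolate_gaps([100, None, None, 130])` fills the two missing readings with evenly spaced values between 100 and 130. This treats readings as equally spaced in time; with irregular timestamps the interpolation weights would need to use the actual time differences.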
The difference between green finance that works and green finance that doesn’t work seems to be commitment: Using a Difference-in-Differences model analyzing 2013-2023 bond data, researchers found no significant correlation between green bond issuance and CO2 emissions after net-zero policies were adopted. That’s the disappointing part. On the upside: companies issuing only green bonds showed higher ESG ratings, lower CO2 emissions, and lower financing costs, achieving substantial environmental benefits and economic advantages. Meanwhile, entities issuing both conventional and green bonds showed no environmental benefits, raising concerns about potential greenwashing.
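For readers unfamiliar with the method, the core of a Difference-in-Differences estimate is a simple subtraction of two changes. The sketch below shows only the textbook two-group, two-period case; the study's actual specification (controls, fixed effects, sample) is not described here, and the numbers in the usage note are invented purely for illustration.

```python
def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Two-group, two-period Difference-in-Differences estimate.

    Each argument is the average outcome (for example CO2 emissions)
    of one group in one period. The result is the change in the treated
    group beyond the change that the control group also experienced.
    """
    treated_change = treated_post - treated_pre
    control_change = control_post - control_pre
    return treated_change - control_change
```

With made-up averages, `did_estimate(100, 90, 100, 95)` gives -5: the treated group fell by 10, but 5 of that also happened in the control group, so only the remaining 5 is attributed to the treatment.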
Those issuing only green bonds tend to have higher ESG ratings, lower CO2 emissions, and lower financing costs.
This could be called the commitment premium: Companies that go all-in on green finance see real results – both environmental and financial. Those trying to have it both ways? They’re essentially paying green bond premiums for conventional bond performance while fooling nobody about their environmental impact.
What are the implications for investors? We should favor pure-play green issuers, and regulators need standards that discourage this mixed-portfolio greenwashing. The study suggests current carbon reduction policies haven’t created sufficient pressure on bond issuers, but perhaps the market is already creating its own incentives.
In late 2021, Lars Kaiser’s paper on seasonality in cryptocurrencies inspired me to use my Kraken API Key to try and make some money. A quick summary of the paper:
(1) Kaiser analyzes seasonality patterns across 10 cryptocurrencies (Bitcoin, Ethereum, etc.), examining returns, volatility, trading volume, and spreads.
(2) Finds no consistent calendar effects in cryptocurrency returns, supporting weak-form market efficiency.
(3) Observes robust patterns in trading activity: lower volume, volatility, and spreads in January, weekends, and summer months.
(4) Documents significant impact of January 2018 market sell-off on seasonality patterns.
(5) Reports a “reverse Monday effect” for Bitcoin (positive Monday returns) and “reverse January effect” (negative January returns).
(6) Trading activity patterns suggest crypto markets are dominated by retail rather than institutional investors.
The paper’s main finding: crypto markets appear efficient in terms of returns but show behavioral patterns in trading.
The efficient-market hypothesis (EMH) is a hypothesis in financial economics that states that asset prices reflect all available information. A direct implication is that it is impossible to “beat the market” consistently on a risk-adjusted basis since market prices should only react to new information.
The EMH has interesting implications for cryptocurrency markets. While major cryptocurrencies like Bitcoin and Ethereum have gained significant institutional adoption and liquidity, they may still be less efficient than traditional markets due to their relative youth and large audience of retail traders (who might not act as rationally as larger, institutional traders). This inefficiency becomes even more pronounced with smaller altcoins, which often have: (1) Lower trading volumes and liquidity (2) Less institutional participation (3) Higher information asymmetries (and/or greater susceptibility to manipulation). These factors create opportunities for exploiting market inefficiencies, particularly in the short term when prices may overreact to news or technical signals before eventually correcting.
Unlike Kaiser’s seasonality research, I didn’t focus on calendar-based anomalies over longer time horizons. After reviewing further research on cryptocurrency market inefficiencies [1][2][3][4], I was intrigued by predictable patterns in returns following large price movements. This led me to develop a classic mean reversion strategy instead (mean reversion suggests that asset prices tend to revert to their long-term average after extreme movements due to market overreactions and subsequent corrections).
First, I had to find “change points.” The PELT algorithm efficiently identifies points in ETH/EUR where the statistical properties of the time series change significantly. These changes could indicate market events, trend reversals, or volatility shifts in the cryptocurrency price.
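To make the idea concrete, here is a small pure-Python sketch of PELT for shifts in the mean. It is not the code I ran on ETH/EUR: the squared-error cost, the `penalty` value, and the function name are assumptions chosen for illustration, and a real analysis would more likely rely on an existing change point library.

```python
def pelt_mean_shift(values, penalty):
    """Return change point indices found by PELT with a squared-error cost.

    An index k in the result means a new segment starts at values[k].
    `penalty` is the cost added per change point; higher means fewer splits.
    """
    n = len(values)
    # Prefix sums let us compute any segment's cost in constant time.
    sum1, sum2 = [0.0], [0.0]
    for value in values:
        sum1.append(sum1[-1] + value)
        sum2.append(sum2[-1] + value * value)

    def cost(start, end):
        """Sum of squared deviations from the mean of values[start:end]."""
        total = sum1[end] - sum1[start]
        return (sum2[end] - sum2[start]) - total * total / (end - start)

    best = [0.0] * (n + 1)   # best[t]: minimal penalized cost of values[:t]
    best[0] = -penalty
    last = [0] * (n + 1)     # last[t]: start of the final segment in that optimum
    candidates = [0]
    for t in range(1, n + 1):
        scored = [(best[s] + cost(s, t) + penalty, s) for s in candidates]
        best[t], last[t] = min(scored)
        # Pruning step: drop starts that can never be optimal again.
        candidates = [s for total, s in scored if total - penalty <= best[t]]
        candidates.append(t)

    # Walk back through the stored segment starts to collect change points.
    change_points = []
    t = n
    while last[t] > 0:
        change_points.append(last[t])
        t = last[t]
    return sorted(change_points)
```

On a toy series such as `[1.0] * 20 + [5.0] * 20` with `penalty=5`, this returns `[20]`, the index where the level jumps. On price data one would typically run it on returns or log prices rather than raw prices, and the penalty needs tuning to avoid flagging every small wiggle.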
I then implemented an automated mean reversion trading strategy following this logical flow: Continuous monitoring → Signal detection → Buy execution → Hold period → Sell execution. The script continuously monitored prices for certain cryptocurrencies on Kraken exchange. It executed buy orders when the price moved more than four standard deviations over a 2-hour period, then automatically sold after exactly 2 hours regardless of price movement. The strategy used fixed position sizes and limit orders to minimize fees. It assumed that large price drops represent temporary market overreactions that will reverse within the holding period.
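The signal detection step can be sketched as follows. This is a reconstruction under assumptions rather than the original script: it reads "moved more than four standard deviations" as a 2-hour return lying more than four standard deviations below the mean of recent 2-hour returns, the `lookback` length and function name are hypothetical, and no exchange API is involved.

```python
from statistics import mean, stdev


def detect_drop_signals(prices, window, lookback, threshold=4.0):
    """Return price indices where a sharp drop would trigger a buy.

    `window` is the number of price samples spanning the 2-hour period,
    `lookback` is how many past window-returns define "normal" behavior.
    """
    # Return over the trailing window, one value per price from index `window`.
    returns = [
        prices[i] / prices[i - window] - 1 for i in range(window, len(prices))
    ]
    signals = []
    for j in range(lookback, len(returns)):
        history = returns[j - lookback:j]
        sigma = stdev(history)
        if sigma == 0:
            continue  # flat history: a z-score is undefined
        z_score = (returns[j] - mean(history)) / sigma
        if z_score < -threshold:
            signals.append(j + window)  # map back to an index into `prices`
    return signals


def exit_index(entry_index, window):
    """Fixed holding period: sell exactly one window after the entry."""
    return entry_index + window
```

Because the history window excludes the current return, the drop itself does not inflate the standard deviation it is compared against. Overlapping window-returns are strongly autocorrelated, so the "four standard deviations" here is a heuristic trigger, not a statement about statistical significance.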
This little script earned some good change—but then again, it was 2021.