65% of Hacker News Posts Have Negative Sentiment, and They Outperform

Negativity Bias and Engagement on Hacker News

This Hacker News sentiment analysis began with a simple observation: posts with negative sentiment average 35.6 points, while the site-wide average is 28. That’s a 27% performance premium for negativity.
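The premium quoted above follows directly from the two averages; a minimal sketch of the arithmetic:

```python
# The two averages reported in the study.
negative_avg = 35.6   # mean score of posts classified as negative
overall_avg = 28.0    # site-wide mean score

# Relative premium of negative posts over the overall average.
premium = (negative_avg - overall_avg) / overall_avg
print(f"{premium:.0%}")  # → 27%
```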

Hacker News sentiment analysis distribution across 32,000 posts showing negative skew

This finding comes from an empirical study I’ve been running on HN attention dynamics, covering decay curves, preferential attachment, survival probability, and early-engagement prediction; the preprint is available on SSRN. I already had a gut feeling, and the data bears it out: across 32,000 posts and 340,000 comments, nearly 65% register as negative. That skew could be an artifact of a classifier miscalibrated toward negativity, yet the pattern holds across six different models.

Six-Model Sentiment Comparison: Transformers vs LLMs

Sentiment classification comparison across six NLP models: DistilBERT, BERT Multi, RoBERTa, Llama 3.1 8B, Mistral 3.1 24B, and Gemma 3 12B

I tested three transformer-based classifiers (DistilBERT, BERT Multi, RoBERTa) and three LLMs (Llama 3.1 8B, Mistral 3.1 24B, Gemma 3 12B). The distributions vary, but the negative skew persists across all of them (panels 2–6 use an inverted scale). The results I use in my dashboard come from DistilBERT, because it runs efficiently in my Cloudflare-based pipeline.
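One simple way to cross-check that the skew isn't an artifact of any single classifier is a per-post majority vote across the six models. The post doesn't say it ensembles the labels this way; this is an illustrative sketch, and `majority_label` is a hypothetical helper:

```python
from collections import Counter

def majority_label(labels):
    """Majority vote across per-model sentiment labels; ties fall back to 'neutral'."""
    counts = Counter(labels)
    top, n = counts.most_common(1)[0]
    # If another label shares the top count, there is no clear majority.
    if sum(1 for c in counts.values() if c == n) > 1:
        return "neutral"
    return top

# Hypothetical labels for one post from the six classifiers:
votes = ["negative", "negative", "neutral", "negative", "positive", "negative"]
print(majority_label(votes))  # → negative
```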

What counts as “negative” here? Criticism of technology, skepticism toward announcements, complaints about industry practices, frustration with APIs. The usual. It’s worth noting that technical critique reads differently from personal attacks; most HN negativity is substantive rather than toxic. But does negativity cause engagement, or does controversial content attract both negative framing and attention? Probably some of both.


HackerBook Dataset: Cross-Validation With 22GB of Hacker News Data

Related to this, I also came across a Show HN: 22GB of Hacker News in SQLite, served via WASM shards. I downloaded the HackerBook export and ran a subset of my paper’s analytics on it.

Caveat: HackerBook is a single static snapshot with no time-series data, so I could not run lifecycle analysis, early-velocity prediction, or decay fitting. What can be computed: distributional statistics, inequality metrics, and circadian patterns.

Summary statistics table for HackerBook Hacker News data sample

Score Distribution and Power-Law Fit

Hacker News score distribution CCDF with power-law fit showing heavy-tailed engagement
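A CCDF plot like this comes from two computations: the empirical tail probability P(X ≥ x), and a tail-exponent estimate. A minimal sketch, assuming the continuous maximum-likelihood estimator α = 1 + n / Σ ln(xᵢ/xmin) for scores above a chosen cutoff (the post doesn't specify its fitting method):

```python
import math

def ccdf(scores):
    """Empirical CCDF: list of (x, P(X >= x)) over distinct score values."""
    n = len(scores)
    return [(x, sum(1 for s in scores if s >= x) / n)
            for x in sorted(set(scores))]

def powerlaw_alpha(scores, xmin):
    """Continuous MLE tail exponent over the sample with score >= xmin."""
    tail = [s for s in scores if s >= xmin]
    return 1 + len(tail) / sum(math.log(s / xmin) for s in tail)
```

On a log-log plot, a power-law tail with exponent α appears in the CCDF as a straight line of slope 1 − α.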

Attention Inequality: Lorenz Curve and Gini Coefficient

Lorenz curve of Hacker News story scores measuring attention inequality with Gini coefficient
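For reference, the Lorenz curve and Gini coefficient behind a chart like this can be computed directly from the raw scores; a minimal sketch using the standard sorted-rank identity:

```python
def lorenz_points(values):
    """Lorenz curve: (cumulative population share, cumulative score share) points."""
    xs = sorted(values)
    total = sum(xs)
    pts, cum = [(0.0, 0.0)], 0.0
    for i, x in enumerate(xs, 1):
        cum += x
        pts.append((i / len(xs), cum / total))
    return pts

def gini(values):
    """Gini coefficient: 0 = perfect equality, (n-1)/n = one item holds everything."""
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return (2 * weighted) / (n * total) - (n + 1) / n
```

The Gini coefficient equals twice the area between the Lorenz curve and the 45° equality line.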

Circadian Posting Patterns

Hacker News circadian posting patterns in UTC showing volume versus mean score by hour
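The volume-versus-mean-score-by-hour view reduces to bucketing posts by UTC hour of their Unix timestamp; a minimal sketch over (timestamp, score) pairs:

```python
from collections import defaultdict
from datetime import datetime, timezone

def hourly_stats(posts):
    """Bucket (unix_time, score) pairs by UTC hour -> {hour: (post_count, mean_score)}."""
    buckets = defaultdict(list)
    for ts, score in posts:
        hour = datetime.fromtimestamp(ts, tz=timezone.utc).hour
        buckets[hour].append(score)
    return {h: (len(v), sum(v) / len(v)) for h, v in buckets.items()}
```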

Score vs Comment Engagement

Hacker News score versus direct comments log-log scatter plot
Direct comments distribution CCDF on Hacker News showing power-law tail
Mean score versus direct comments on Hacker News binned in log-spaced buckets
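The log-spaced bucketing in the last chart can be sketched with power-of-two bins, assuming integer comment counts (the post doesn't state its exact bin edges):

```python
import math
from collections import defaultdict

def log_binned_means(pairs):
    """Mean score per power-of-two comment bucket: bucket k holds 2**k <= comments < 2**(k+1).

    Returns {bucket_lower_edge: mean_score}. Zero-comment posts have no log bucket
    and are skipped.
    """
    buckets = defaultdict(list)
    for comments, score in pairs:
        if comments < 1:
            continue
        k = comments.bit_length() - 1  # floor(log2(comments)), exact for ints
        buckets[k].append(score)
    return {2 ** k: sum(v) / len(v) for k, v in sorted(buckets.items())}
```

Averaging within log-spaced buckets is what turns the noisy log-log scatter into the smooth mean-score trend line.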
Topics: Hacker News sentiment analysis, negativity bias social media engagement, BERT RoBERTa sentiment comparison, attention inequality power law social media, Hacker News data analysis engagement