Tech FAQ

Question 1

Why move a Hugo blog off GitHub Pages?

Accepted Answer

GitHub Pages has no access logs, a hard cap on file size, no control over response headers without a Cloudflare Worker in front, and no place to run anything dynamic alongside the static site. Moving to a self-hosted box adds operational responsibility but removes all of those limits, and it puts the site's data in a jurisdiction I chose.

Question 2

Why Hetzner instead of a hyperscaler?

Accepted Answer

Cost, transparency, and jurisdiction. A Hetzner CPX21 in Nuremberg has more than enough capacity for a Hugo blog plus supporting services, costs a fraction of equivalent compute on AWS or DigitalOcean, runs on contracted renewable energy, and is operated by a German company under EU law. The tradeoff is fewer managed services, which I wanted anyway.

Question 3

Why Cloudflare R2 EU instead of Hetzner Object Storage or Scaleway?

Accepted Answer

Cloudflare R2 EU jurisdiction keeps the data in EU data centers under stricter handling rules. R2 also has zero egress fees, an S3-compatible API, and built-in integration with my existing Cloudflare Workers, including an image-resizing pipeline I had already built. Cloudflare is still US-headquartered, so this is a partial sovereignty win, not a complete one.

Question 4

Does Cloudflare R2 EU jurisdiction protect against the US CLOUD Act?

Accepted Answer

Partially. R2 EU jurisdiction guarantees data is stored and processed inside EU data centers and applies stricter handling rules, but Cloudflare remains a US-headquartered company subject to the CLOUD Act, so a US court could in principle compel disclosure. For posts and images that tradeoff was acceptable. For newsletter subscriber lists I moved off third-party services entirely onto a self-hosted Listmonk instance.

Question 5

Why keep Cloudflare and Resend on US infrastructure?

Accepted Answer

Cloudflare's CDN, Workers, cache rules, and rate limiting make the site fast everywhere and keep brute-force probes away from self-hosted admin panels. Replacing that layer with a European alternative would be a significant downgrade in capability. Resend handles SPF, DKIM, DMARC, and reputation management for outbound mail. European alternatives exist; ones I trust for newsletter deliverability do not.

Question 6

What's the operational overhead of self-hosting a Hugo blog?

Accepted Answer

Higher than a managed service, lower than people assume. The box runs Caddy for TLS and reverse proxy, Postgres for the supporting services, SQLite for analytics, restic for nightly encrypted backups to R2 EU, and templated systemd alerts that email on any failed unit. Backups have monthly integrity checks. Most ongoing work is reading log digests and applying OS updates.
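A minimal sketch of the alerting piece, assuming the templated unit is something like a hypothetical `alert@.service` that other units reference through `OnFailure=alert@%n.service` and that invokes a small script with the failed unit's name. The function name, host name, and addresses below are illustrative assumptions, not the actual setup; only the message is composed here, and delivery is left out.

```python
from email.message import EmailMessage


def build_failure_alert(unit: str, host: str, sender: str, recipient: str) -> EmailMessage:
    """Compose the alert email for one failed systemd unit (hypothetical helper)."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"[{host}] systemd unit failed: {unit}"
    # Point the reader at the unit's own journal so triage starts in the right place.
    msg.set_content(
        f"Unit {unit} entered the failed state on {host}.\n"
        f"Inspect it with: journalctl -u {unit} --since today\n"
    )
    return msg


if __name__ == "__main__":
    # Example values only; a real templated unit would pass the unit name as an argument.
    alert = build_failure_alert(
        "restic-backup.service", "blog-1", "alerts@example.com", "me@example.com"
    )
    print(alert["Subject"])
```

Keeping the message builder separate from delivery means the same template serves every unit, which is the point of a templated alert.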

Question 7

How are backups handled and verified?

Accepted Answer

Nightly restic snapshots go to a Cloudflare R2 EU bucket, encrypted with a passphrase stored in two places. Monthly partial-read integrity checks read back a subset of the repository data to catch corruption early. A full restore drill onto a clean parallel box was the hard gate before flipping DNS, and that drill is now part of the runbook.
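The nightly and monthly jobs can be sketched as command builders. This is an illustration under assumptions: the account ID, bucket name, and backed-up path are placeholders, the endpoint form for EU-jurisdiction buckets is taken as `<account_id>.eu.r2.cloudflarestorage.com`, and the choice of reading one twelfth of the data each month is mine, not necessarily what the runbook does. The passphrase and S3 credentials are expected in the environment (`RESTIC_PASSWORD_FILE`, `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`), so they never appear in the command line.

```python
import datetime


def r2_eu_repository(account_id: str, bucket: str) -> str:
    """restic repository string for an R2 bucket in the EU jurisdiction (assumed endpoint form)."""
    return f"s3:https://{account_id}.eu.r2.cloudflarestorage.com/{bucket}"


def nightly_backup_cmd(repo: str, paths: list[str]) -> list[str]:
    """Command for the nightly encrypted snapshot."""
    return ["restic", "-r", repo, "backup", *paths]


def monthly_check_cmd(repo: str, today: datetime.date) -> list[str]:
    """Command for the partial-read check: subset n of 12, where n is the calendar month.

    Rotating the subset means twelve monthly runs read every pack file once.
    """
    return ["restic", "-r", repo, "check", f"--read-data-subset={today.month}/12"]


if __name__ == "__main__":
    repo = r2_eu_repository("ACCOUNT_ID", "blog-backups")  # placeholder values
    print(" ".join(nightly_backup_cmd(repo, ["/srv/blog"])))
    print(" ".join(monthly_check_cmd(repo, datetime.date.today())))
```

A partial read catches silent corruption in the sampled data; it does not replace the full restore drill.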

Question 8

What is Software 3.0 according to Karpathy?

Accepted Answer

Karpathy frames three eras of software. Software 1.0 is humans writing explicit code. Software 2.0 is humans curating datasets and training neural networks, where the weights are the program. Software 3.0 is humans writing prompts, where the LLM is the interpreter and the context window is the program. The unit of programming shifts from a function to a paragraph.
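One way to picture the three eras is the same toy task, deciding whether a review is positive, written three ways. This is my own illustration rather than Karpathy's example; the function names, word list, and prompt wording are all assumptions, and no model call is shown because the source names no API.

```python
# Software 1.0: a human writes the rule explicitly.
def is_positive_v1(text: str) -> bool:
    return any(word in text.lower() for word in ("great", "love", "excellent"))


# Software 2.0: a human curates labeled examples; the learned weights are the program.
def train_v2(examples: list[tuple[str, bool]]) -> dict[str, int]:
    weights: dict[str, int] = {}
    for text, label in examples:
        for word in text.lower().split():
            weights[word] = weights.get(word, 0) + (1 if label else -1)
    return weights


def is_positive_v2(weights: dict[str, int], text: str) -> bool:
    return sum(weights.get(word, 0) for word in text.lower().split()) > 0


# Software 3.0: a human writes a paragraph; an LLM would interpret it.
PROMPT_V3 = (
    "Decide whether the following review is positive. "
    "Answer with exactly one word, yes or no.\n\nReview: {review}"
)


if __name__ == "__main__":
    weights = train_v2([("love this phone", True), ("hate this phone", False)])
    print(is_positive_v1("I love it"), is_positive_v2(weights, "love"))
    print(PROMPT_V3.format(review="I love it"))
```

In the first version the logic lives in the function body, in the second it lives in `weights`, and in the third it lives in the text of `PROMPT_V3`.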

Question 9

When did agentic coding actually start working?

Accepted Answer

Karpathy points to December 2024 as the inflection point. Before then, agentic tools were 'kind of helpful' but required constant correction. Over the December break, the latest models crossed a line where Karpathy stopped correcting them and started trusting the system. He flagged this on the record, warning that anyone whose mental model of AI was set by ChatGPT was already a generation stale.

Question 10

What is the difference between vibe coding and agentic engineering?

Accepted Answer

Vibe coding raises the floor: it lets non-engineers build software they could not build before. Agentic engineering raises the ceiling: it lets professional engineers preserve the existing quality bar while moving much faster. Karpathy thinks the productivity gap for the best users now exceeds the old 10x engineer benchmark by a wide margin.

Question 11

Why are LLMs good at code and math but bad at common-sense tasks?

Accepted Answer

Frontier labs train models with reinforcement learning, which requires verifiable rewards. Code and math supply them cheaply: a test suite passes or fails, and an answer is right or wrong. Verifiable domains attract environments and signal, so they get the steepest gains. Everything outside the verifiable distribution stays jagged. Karpathy's takeaway for founders is that building a verifiable environment in your domain is real leverage. For workers, the more useful question than 'is my job safe' is 'is my job verifiable.'
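To make 'verifiable reward' concrete, here is a minimal sketch of a checker for a coding task: the reward is 1.0 only if the candidate program passes every test case. The function name, the `solve` entry point, and the test-case format are assumptions for illustration, not any lab's actual environment, and a real setup would run untrusted code in a sandbox rather than with a bare `exec`.

```python
def verifiable_reward(
    candidate_source: str,
    test_cases: list[tuple[tuple, object]],
    func_name: str = "solve",
) -> float:
    """Return 1.0 if the candidate passes every test case, otherwise 0.0."""
    namespace: dict = {}
    try:
        # Run the model-written code. Unsafe outside a sandbox; shown bare for brevity.
        exec(candidate_source, namespace)
        func = namespace[func_name]
        passed = all(func(*args) == expected for args, expected in test_cases)
        return 1.0 if passed else 0.0
    except Exception:
        # Syntax errors, missing functions, and runtime crashes all score zero.
        return 0.0


if __name__ == "__main__":
    cases = [((1, 2), 3), ((0, 0), 0)]
    good = "def solve(a, b):\n    return a + b\n"
    bad = "def solve(a, b):\n    return a - b\n"
    print(verifiable_reward(good, cases), verifiable_reward(bad, cases))
```

No human judgment is needed to score an attempt, which is what lets this kind of signal be generated at scale.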

Question 12

What does Karpathy mean by 'outsource your thinking, not your understanding'?

Accepted Answer

As agents do more execution, the bottleneck moves into the human's head. You still have to know what is worth building, why, and how to direct the work. Your value sits upstream of execution. Karpathy keeps building knowledge bases out of his own reading because the constraint of the next decade is less about compute than about how fast humans can deepen comprehension to keep directing systems that out-execute them.

Question 13

Why does Karpathy say MenuGen 'shouldn't exist'?

Accepted Answer

Karpathy built MenuGen as a full-stack app: photograph a restaurant menu, OCR it, generate dish images, render a new menu. Then he saw the Software 3.0 version: hand the photo to Gemini, say 'use NanoBanana to overlay the dishes,' and a single model call returns the rendered menu. The lesson is that a lot of what gets built today is scaffolding around a capability the model could perform end-to-end. Before writing the next CRUD app, ask whether the model is the app.

Question 14

How should hiring change in the agentic era?

Accepted Answer

Karpathy argues whiteboard puzzles measure the wrong thing. Hiring should look like giving someone a really big project, having them implement it, and then trying to break it. His example: build a Twitter clone for agents, make it secure, simulate activity, then have ten Codex 5.4-X-high instances try to break the website. If your interview loop has not changed since 2022, you are selecting for the previous era.

Question 15

Why have smartphone upgrade cycles slowed down?

Accepted Answer

The average global smartphone replacement cycle has stretched to 3.5 years. Cameras, screens, and processors have reached a quality plateau where year-over-year improvements are incremental rather than transformative. Battery life has overtaken price as the top purchase driver for the first time, suggesting hardware differentiation has stalled.

Question 16

How does Apple use Google Gemini for on-device AI?

Accepted Answer

Google gave Apple complete access to the Gemini model in Apple's own data centers. Apple uses a process called distillation, where smaller models learn from Gemini's reasoning outputs to produce efficient models with Gemini-like performance at a fraction of the compute. These distilled models can run on-device without an internet connection.
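For readers new to distillation, the sketch below shows the classic logit-matching form: the student is penalized by the KL divergence between the teacher's and its own temperature-softened output distributions. This is an assumption made for illustration only. The description above says the smaller models learn from Gemini's reasoning outputs, which suggests training on generated text, and nothing here should be read as Apple's actual method. The logits and temperature are made-up values.

```python
import math


def softmax(logits: list[float], temperature: float = 1.0) -> list[float]:
    """Convert logits to probabilities; higher temperature flattens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(
    teacher_logits: list[float],
    student_logits: list[float],
    temperature: float = 2.0,
) -> float:
    """KL(teacher || student) over temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


if __name__ == "__main__":
    teacher = [4.0, 1.0, 0.5]  # illustrative next-token logits
    print(distillation_loss(teacher, [4.0, 1.0, 0.5]))  # student matches teacher
    print(distillation_loss(teacher, [0.5, 1.0, 4.0]))  # student disagrees
```

The loss is zero when the student reproduces the teacher's distribution and grows as the two diverge, so minimizing it pulls the small model toward the large one's behavior.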

Question 17

What is the Apple Foundation Model?

Accepted Answer

Apple's on-device Foundation Model is a roughly 3 billion parameter language model optimized for Apple Silicon through innovations like KV-cache sharing and 2-bit quantization. It runs at 30 tokens per second on iPhone 15 Pro and powers Apple Intelligence features including summarization, writing tools, and Siri enhancements.
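The reason 2-bit quantization matters is plain arithmetic on weight storage. A rough sketch, assuming exactly 3 billion parameters and counting only the weights (no KV cache, activations, or any layers kept at higher precision):

```python
def weight_footprint_gb(parameters: float, bits_per_weight: float) -> float:
    """Storage for the weights alone, in decimal gigabytes."""
    return parameters * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    params = 3e9  # assumed round figure for a roughly 3 billion parameter model
    for bits in (16, 4, 2):
        print(f"{bits:>2}-bit weights: {weight_footprint_gb(params, bits):.2f} GB")
```

Under these assumptions the weights shrink from 6 GB at 16 bits to 0.75 GB at 2 bits, which is the difference between crowding out a phone's memory and fitting alongside other apps.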

Question 18

Could on-device AI model size become a marketing spec like megapixels?

Accepted Answer

Yes, and there are early signs of this. Samsung's Exynos 2600 markets 80 TOPS of NPU performance, more than double the prior generation. Samsung targets 800 million AI-enabled devices by end of 2026. But like megapixels before it, raw parameter count or TOPS may not correlate with actual user experience.

Question 19

Is it worth upgrading my phone for AI features in 2026?

Accepted Answer

It depends on your current device. On-device AI requires specific hardware: Apple Intelligence needs an A17 Pro or later, and Android AI features require recent NPUs. If your phone is more than two generations old, you cannot run the latest on-device models at all. Morgan Stanley's 2026 survey found iPhone upgrade intentions at an all-time high of 37%, driven partly by AI capabilities.

Question 20

How many parameters can a smartphone run on-device?

Accepted Answer

Current smartphones run 1-3 billion parameter models natively. Apple's Foundation Model is roughly 3 billion parameters. Google's Gemini Nano ships at 1.8 to 3.25 billion parameters. Developers have also demonstrated running a 400 billion parameter Mixture of Experts model on iPhone 17 Pro, though only 17 billion parameters are active for any given token.
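The gap between total and active parameters in that Mixture of Experts demo can be put in numbers. The sketch below uses the common rule of thumb of about 2 floating-point operations per active parameter per generated token. That rule is an approximation I am assuming here, not a measurement from the demo.

```python
def active_fraction(active_params: float, total_params: float) -> float:
    """Share of the model's weights that take part in producing one token."""
    return active_params / total_params


def forward_gflops_per_token(active_params: float) -> float:
    """Rule-of-thumb forward-pass cost: about 2 FLOPs per active parameter."""
    return 2 * active_params / 1e9


if __name__ == "__main__":
    total, active = 400e9, 17e9
    print(f"active share: {active_fraction(active, total):.2%}")
    print(f"MoE cost per token:   {forward_gflops_per_token(active):.0f} GFLOPs")
    print(f"dense cost per token: {forward_gflops_per_token(total):.0f} GFLOPs")
```

Only about 4% of the weights do work on each token, so per-token compute looks like a 17 billion parameter model even though all 400 billion parameters still have to be stored somewhere.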