May 2026 Bakery
js-notes is my personal knowledge management system. It captures articles, podcasts, and videos, extracts their full text, and enriches them with AI-generated summaries and tags. This is everything that I added to my notes system in May 2026.
Sean Goedecke lays out the two ways to use LLMs in software — developer-controlled pipelines versus LLM-controlled agents — and argues agents should be the default: they're smarter, easier to build, and benefit more from each new model, with pipelines reserved for strict cost, context-size, or local-model constraints.

Steve Yegge argues the technical interview is finally dying after 50 unchanged years, citing the time Google's hiring committee unknowingly voted to reject its own members' packets. His proposed replacement is the "campfire": short paid stints of real work that generate portable, verifiable records of what a candidate actually produced.

Eric Seufert argues that independent agentic commerce is largely a mirage: Amazon blocks shopping agents at discovery to protect its $56B ad business and Shopify restricts them at purchase to protect transaction data, and neither dominant platform will cede the user relationship to a third-party agent rather than build its own.

Ben Thompson interviews Eric Seufert on using ad platforms as teacher models for creative pre-testing, why agentic checkout failed, and how Google quietly executed a Ship-of-Theseus transition of Search into AI Mode. Seufert closes with his "Prosperous Society" thesis: AI advertising is positive-sum because it matches infinite human desire to an exploding supply of products.
Dax Raad, co-founder of OpenCode, tells Gergely Orosz how the open-source coding agent grew to nearly 8 million monthly users — while candidly admitting AI tools haven't made his team move faster, just feel like they are. His warning: AI mutes the "prickle" of writing hacks, so judgment quietly degrades and tech debt piles up.
Jeff Weinstein's checklist for why your product isn't growing: a burning user problem, users reachable in a Slack/WhatsApp room, a 10X narrowly-scoped solution, a daily active-user chart, and a co-located team of 2-5 who like each other.
This came up twice today, so figured I'd tweet it.
People, internally and externally, talk [read: complain] to me that their product isn't growing, or isn't growing enough.
100% of the time they are missing at least one of the following minimum requirements for success:
(A) A…
— Jeff Weinstein (@jeff_weinstein) May 27, 2026
Scott Alexander used Claude as a research assistant for his California primary ballot and found it produced better voter guidance than any guide he'd seen — matching his own pick 5 of 10 times and his second choice 3 more — with most misses traceable to his own underspecified prompt rather than the model.

Ryan Carson walks Peter Yang through running a startup solo on AI agents, built around the principle that "agents are cron jobs and markdown files." His big inversion of the old startup playbook: spend heavily upfront on systems, docs, and agent config — that's what unlocks the leverage of ten people.
A fifteen-year Ruby veteran makes the case for Ruby as quietly underrated, spotlighting lesser-known features (refinements, Forwardable, tap/then, numbered block params) and modern tooling (Ruby LSP, ZJIT, Kamal) — and argues Ruby's syntactic density makes it unusually token-efficient when feeding code to LLMs.
Scott Alexander breaks "taste" into eight distinct components and argues that conflating them — especially folding novelty and provenance into aesthetic value — produces dishonest criticism. Drawing an analogy to RCT-grade medical rigor, he asks what beauty would be left if you controlled out all the context and novelty effects.

The story of rubyfmt, a zero-config Rust-based Ruby autoformatter that grew from a personal project into the tool Stripe used to format its 25-million-line monorepo in a single Saturday morning — including the wild detour of linking a full Ruby VM into a Rust binary to walk parse trees in memory.

Scott Alexander uses three "model organisms" — Reddit flag-design debates, movie plot holes, and tech company names — to dissect how taste rules work, arguing that many calcify from obsolete practical constraints into orthodoxy policed long after their reason for existing is gone.

Ron Shah argues that eyes-open pragmatism is a more useful kind of tech optimism than blind enthusiasm, forecasting far fewer full-time knowledge-worker jobs, real AGI risks, and shocks to centralized finance — with his ultimate bet placed on human collective intelligence carrying us through.

Steve Huynh argues that defaulting to pessimism — even when it feels like realism — quietly stalls careers by eroding leadership's trust and degrading the team around you. The fix is to follow every named risk with a path forward rather than a verdict of failure.
Scott Alexander defines bad taste as the overuse of cheap, easily-scripted tricks that wow unsophisticated audiences — whether deployed by an AI model, a Kenyan exam-taker, or a Lisa Frank poster — and good taste as the deliberate restraint of those tricks, then questions whether sophisticated taste actually delivers more pleasure or just more gatekeeping.

Ron Shah uses his daughter's 12th birthday as a reminder of how fast childhood passes, sharing seven principles for being a more present parent — daily 1:1 time, showing up for the hard moments, and loving each kid for their differences rather than as a smaller version of yourself.

Drawing on Amazon's slipping Kindle launch, the author distinguishes performative optimism (empty reassurance) from professional optimism (honest diagnosis paired with a grounded plan) — and argues only the latter earns lasting trust, because you have to do the homework before your optimism gets the right to enter the room.
Sean Goedecke argues prompts are a worse form of technical debt than code because they decay silently with every model release rather than failing loudly. His advice: use third-party tools left as unconfigured as possible, and keep AGENTS.md limited to concrete project facts rather than behavioral steering.

Fork your dependencies, trim them to your use case, never update unless it breaks for users. Mitchell Hashimoto says updating is riskier than tracked latent bugs — and feels vindicated by the wave of supply-chain attacks.
Fork your dependencies, trim them to only your use case, never update unless it breaks for your users. I’ve been vocal about this for 10+ years. I’ve always said that updating is way riskier than latent bugs (which can be tracked and CVEs monitored).
If you are updating a…
— Mitchell Hashimoto (@mitchellh) May 20, 2026
Sean Goedecke benchmarked Kelsey Piper's elaborate o3 "GeoGuessr protocol" prompt against a basic one across 200 images and found the fancy prompt performed no better — often slightly worse — a clean demonstration of how easy it is to fool yourself into thinking prompt engineering is working when the model was already capable.

Thariq's go-to prompt: implement a spec and keep a running implementation-notes file logging every decision, deviation, and tradeoff — turning the AI's reasoning into a living audit trail you can actually review.
a prompt I've been using a lot recently:
implement <SPEC> and while you do, keep a running implementation-notes.html file (or markdown) with decisions you had to make weren't in the spec, things you had to change, tradeoffs you had to make or anything else I should know
— Thariq (@trq212) May 18, 2026
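One way to reuse the prompt above is to wrap it in a small helper that fills in the spec. This is only a sketch: the function name `build_prompt`, the default notes filename, and the `<SPEC>` delimiters are illustrative assumptions, and the wording is lightly paraphrased from the tweet rather than copied from any official template.

```python
# Hypothetical helper: wraps a spec in the "implement and keep notes" instruction.
# The wording paraphrases the tweet; names and defaults here are assumptions.
NOTES_INSTRUCTION = (
    "and while you do, keep a running {notes_file} file with decisions you had "
    "to make that weren't in the spec, things you had to change, tradeoffs you "
    "had to make, or anything else I should know"
)


def build_prompt(spec_text: str, notes_file: str = "implementation-notes.md") -> str:
    """Return a prompt asking the agent to implement the spec and log its decisions."""
    instruction = NOTES_INSTRUCTION.format(notes_file=notes_file)
    return (
        f"Implement the following spec, {instruction}.\n\n"
        f"<SPEC>\n{spec_text.strip()}\n</SPEC>"
    )


if __name__ == "__main__":
    print(build_prompt("Add a CSV export button to the reports page."))
```

Passing `notes_file="implementation-notes.html"` gives the HTML variant the tweet mentions.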
Sean Goedecke argues the senior engineer whose job is to block complexity and say no was a creature of the ZIRP era's bloated teams — and that the post-ZIRP refocus on velocity and profitability, not AI, is what turned that archetype from an asset into a liability.

Neil Hacker traces how ASML, a struggling Philips spinout, became the sole maker of EUV lithography machines through modular outsourcing, transatlantic public-private partnerships, deep customer co-investment, and a decades-long bet on a technology many doubted would ever work.

Derek Thompson surveys 2026's megatrends, leading with America's "anti-social century" — a broad decline in time spent with partners, friends, coworkers, and kids — and the expanding therapeutic reach of GLP-1 drugs well beyond weight loss, into addiction, mental distress, and fatty liver disease.

Rohin Dharmakumar argues that Anthropic and OpenAI have flipped from enabler platforms into active consumers of internet value, running an Ingest-Codify-Sever playbook that absorbs and then displaces players across legal, finance, marketing, and IT services — making them existential threats to everyone from TCS to Intuit.

Sean Goedecke revisits his LLM workflow a year on: agents now write entire PRs and diagnose 80% of bugs on their own, with human judgment reserved for narrowing hard bugs, public communication, and UI testing. The core skill, he says, is calibrating how much to offload without going too far.

Anthropic argues the US and its allies must defend their compute advantage through tighter export controls and a crackdown on chip smuggling and distillation attacks, presenting two 2028 scenarios — a decisive democratic lead versus the CCP reaching near-parity — and warning the policy window is narrow.
Sean Goedecke explores "steering" — manipulating a model's activations mid-inference — now that DeepSeek-V4-Flash makes local experimentation practical, but concludes it's largely outcompeted by prompting for simple tasks and by fine-tuning for ambitious ones. Skeptical but curious, he'll be watching the open-source community.

Anthropic details how Claude Code navigates large enterprise codebases through live agentic search rather than RAG indexing, and lays out the layered "harness" — CLAUDE.md files, hooks, skills, plugins, LSP, MCP servers, subagents — that matters as much as the model itself for real-world performance.

Patrick McKenzie interviews Aaron Brown on why institutions that produce bad statistics face so few consequences — from contradictory tractor-fuel data to a flawed bus-safety study — arguing that prestige, missing audit trails, and career incentives suppress error correction, and that betting is "a tax on bullshit."

Marshall Houston offers a ten-dimension framework for distinguishing builder from naysayer behaviors, stressing that these are movable patterns rather than fixed identities and that the real differentiator isn't positivity but whether criticism comes with ownership of an alternative — skin in the game.

Bobby Morgan argues that implementation was always at most a quarter of software engineering — the rest is problem definition, design, communication, observability, product ownership, and strategy — and that as AI makes the code cheap, the judgment surrounding it becomes the real source of an engineer's value.
A startup founder describes how compulsive checking of email, Slack, and social was manufacturing chronic anxiety by simulating false urgency. Scheduling three fixed internet windows a day and protecting the morning largely dissolved it — revealing the emergency had been self-generated all along.

Sean Goedecke pushes back on the popular claim that heat dissipation makes space-based AI datacenters impossible — radiative cooling actually works better in vacuum — and argues the real obstacle is the enormous combined mass of solar panels, GPUs, and hardware you'd have to launch, not the cooling.

Shopify's Q1 2026 data shows shoppers arriving from AI platforms convert nearly 50% better and spend 14% more than organic-search visitors, driven by "journey compression" — AI collapses multi-session research into a single conversation and delivers pre-qualified buyers straight to product pages.
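To see how the two figures above compound, here is a back-of-the-envelope sketch. It assumes (this is not stated in the summary) that "spend 14% more" means a higher average order value per converting shopper, and it treats "nearly 50%" as an upper bound of 50%. The function name is hypothetical.

```python
def revenue_per_visitor_lift(conversion_lift: float, spend_lift: float) -> float:
    """Combined relative lift in revenue per visitor.

    Assumes revenue per visitor = conversion rate * average order value,
    so the two lifts multiply rather than add.
    """
    return (1 + conversion_lift) * (1 + spend_lift) - 1


# Upper-bound figures from the summary: ~50% better conversion, 14% more spend.
lift = revenue_per_visitor_lift(0.50, 0.14)
print(f"Revenue per visitor lift: about {lift:.0%}")
```

Under those assumptions an AI-referred visitor would be worth roughly 1.7 times an organic-search visitor, although the underlying data may define both metrics differently.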

Patrick McKenzie investigates how an SPLC-led nonprofit coalition ran a multi-year pressure campaign to get tech and financial companies to censor content and cut off funding to targeted groups — and documents evidence the supposedly nonpartisan effort intervened against a declared presidential candidate's infrastructure, potentially breaching 501(c)(3) rules.

Gergely Orosz revisits Brooks's 1986 "No Silver Bullet" forty years on, arguing open source (supercharged by GitHub) is the strongest historical candidate for a true order-of-magnitude productivity gain, while AI agents produce far more code but only modest, mixed gains in real productivity and reliability.

Sean Goedecke unpacks Thinking Machines' first release — not a frontier model but a scaled-up, multimodal fully-duplex conversational system. The genuine achievement, he argues, is scaling a fully-duplex model large enough to ingest video; the reasoning-delegation and interruption features are partly clever benchmark-gaming.

Paul Krugman uses Ken Griffin's furious reaction to a proposed New York pied-à-terre tax to argue that extreme wealth breeds a dangerous pettiness — and that the personal grievances of oligarchs increasingly drive consequential public policy, with Musk's USAID cuts as the extreme case.
Sean Goedecke argues left-wing opposition to AI is mostly historical accident — guilt by association with crypto and Trump-aligned CEOs — and lays out genuinely left-coded arguments for LLMs as disability aids, tools for medical self-advocacy, class equalizers, and educational democratizers.

Tyler Cowen interviews Craig Newmark on a career of deliberate subtractions — keeping Craigslist plain, stepping aside as CEO, funding people and getting out of the way — and his theory that recognizing your own limits and relying on networks of networks is how you actually get more done.

A Claude Code team member argues HTML has surpassed Markdown as the best output format for AI agents — richer visuals, interactivity, and shareable links mean specs and reports actually get read — and that the format keeps humans more meaningfully in the loop despite slower generation and noisier diffs.
— Thariq (@trq212) May 8, 2026
A veteran engineer describes "output-competence decoupling" — generative AI letting workers produce expert-looking artifacts in domains they've never trained in, with sycophantic models never flagging the problem — and warns it bloats workplace documentation and hollows out the pipelines that used to build real expertise.
Jeff Atwood's classic argument that you should always assume a bug is in your own code, not the OS, compiler, or library — citing the "select isn't broken" parable and studies pegging ~95% of errors on programmers — and that taking full ownership is how you earn credibility.

Simon Willison admits his once-clear line between "vibe coding" and responsible "agentic engineering" has begun to blur, as reliable agents tempt even experienced engineers to stop reviewing every line — and explores how 10x code output shifts bottlenecks across the whole software lifecycle, while remaining unworried about his career.
Sean Goedecke argues coding agents have raised the floor for weak engineers, turning net-negative output into merely mediocre code — with some low performers now acting as human relay stations for an LLM. He warns it's unsustainable: companies will eventually ask what value the engineer adds to the AI, not the reverse.

Sean Goedecke shares counterintuitive lessons on incident response: most incidents are boring and resolve on their own, hasty intervention usually makes them worse, the best first move is often to do nothing, and deep system knowledge beats raw talent — though heroically resolving incidents is not a durable basis for power.

Sean Goedecke explains Stripe's new "Tempo" blockchain: not a new cryptocurrency but a payment rail for transacting existing stablecoins. Stripe's real motive is to own the rail itself rather than skim a margin on top of Visa or Mastercard, capturing the full transaction fee.

David George argues the AI-unemployment panic is just the debunked lump-of-labor fallacy in new branding — from tractors to spreadsheets, transformational tech has always expanded the labor market — and points to rising software and PM hiring as early evidence AI is augmenting workers rather than replacing them.
— David George (@DavidGeorge83) May 6, 2026
Sean Goedecke answers Dwarkesh Patel's challenge on why AI hasn't slowed despite longer-horizon RL costing more FLOPs per reward: efficiency gains from fixing engineering bugs, unreliable human intuition near human-level intelligence, and the fact that capability depends on many traits beyond raw intelligence.

Owen Williams, a design manager at Stripe, built Protodash — an internal AI prototyping tool that generates high-fidelity, on-brand prototypes from Stripe's own design system, sidestepping the "Tailwind indigo slop" problem. Surprisingly, PMs adopted it even more than designers, shifting reviews toward substance.
Nate B Jones argues Stripe's agentic-commerce launches signal power shifting from sellers to buyers, as AI agents form purchasing intent before ever entering a seller's funnel — forcing merchants to become legible and callable by software, and relocating payment authority and brand into the buyer's agent.
The last 20% isn't most of the work — it's all of the work. Jason Fried's one-liner inverting the Pareto framing of where the real difficulty in shipping lives.
The last 20% isn't most of the work, it's all of the work.
— Jason Fried (@jasonfried) May 5, 2026
Ben Thompson reads Google's and Meta's Q1 2026 earnings side by side: Google Cloud's 63% growth (and a chunk of its profit) is heavily tied to Anthropic's compute spend, while Meta's strong numbers were punished because Zuckerberg framed AI spending as existential with no near-term payoff — a bet Thompson thinks may prove an advantage.
Ben Thompson analyzes Microsoft's pivot from per-seat to "seats plus consumption" pricing as agents threaten to shrink seat counts, and Apple's surprise Mac shortage — driven not by on-device inference but by agentic cloud-LLM workflows that hammer the CPU and exposed even Apple's supply chain.
Ben Thompson argues Amazon Supply Chain Services validates his decade-old prediction that Amazon would commercialize its logistics the way it did AWS — be its own best customer, then sell to third parties — and that AWS's architectural bets on custom silicon and disaggregated compute now suit the inference-dominated AI market.

A history of dishwashing from Sumerian soap and sand-scouring through London's coal-driven shift to soap-and-hot-water, to Josephine Cochrane's first commercially successful dishwasher — concluding that modern dishwashers use roughly a third the energy and a seventh the water of washing by hand.

Gruber and Thompson dig into macOS Tahoe's washed-out, low-contrast UI before turning to Q1 earnings — Google Cloud's Anthropic-fueled surge versus Meta's cloudless "trust me" call — and the strategic split between owning the full AI stack and partnering for models.

Gruber and Thompson cover Tahoe's blinding whiteness, the Musk v. OpenAI and Apple v. Epic cases, and why Microsoft amended its Azure exclusivity deal with OpenAI — choosing to protect its hundreds-of-billions stake over locking the API to its own cloud.

Boris Cherny, creator of Claude Code, explains how it was deliberately built six months before product-market fit to anticipate model gains, and describes his current setup where Claude writes 100% of his code via hundreds of parallel agents and loops — predicting coding will become as universal as literacy.
Matt Sitman and Sam Adler-Bell discuss their differing religious backgrounds and how faith shapes their politics — why observance now correlates with higher education, how digital technology wages a kind of war on attention and meaning, and whether religion is the only sufficient defense of human dignity against AI and social media.

Cat Wu, Head of Product for Claude Code, explains how Anthropic compressed its shipping cadence from months to days by building products before the models are ready, how AI is reshaping the PM role around evals and model introspection, and why mission alignment and a "just do things" culture remove organizational friction.

Sean Goedecke argues that even if AI erodes engineers' long-term skills, market forces may compel its use anyway — the way construction workers must lift heavy objects despite the toll — and that software engineers may be the first generation facing athlete-like career spans and should plan accordingly.

Scott Alexander argues that apps and argument-maps to "solve debate" are doomed: real disagreements rarely turn on mappable logic or false facts but on differing values and weightings of evidence, and the bootstrapping problem plus most people's preference for drive-by potshots makes such tools impractical at scale.

Kevin Kwok dissects the Cursor/SpaceX deal — SpaceX can acquire Cursor for $60B or pay $10B — as both companies racing to close the "complete loop" of owning model and product together, since competing at the frontier of coding agents now requires both layers at once.

Amelia Tait examines "Disney adults" who go thousands into debt for repeat park trips, exploring how Disney engineers an ecosystem of financial entanglement blending nostalgia, status, and escapism — and why nearly all the debtors she interviews say they have no regrets.

Jay Michaelson traces how Candace Owens's "Sabbatean" conspiracy theories originated in a 1974 book by Orthodox rabbi Marvin Antelman, were amplified by David Icke, and reached Owens — a strange journey of obscure Jewish messianic history twisted into modern antisemitism, and a case study in why debunking rarely works.

Sean Goedecke critiques the common advice to target Will Larson's staff-engineer archetypes, arguing you don't become a tech lead by trying to be one — you get there by doing good work until the skills and trust emerge. The real defining trait of a staff engineer is simply being useful to the company, no matter what.

Noah Smith marshals immigration data and polling to argue Japan is not abnormally xenophobic: foreign residents have surged since Abe deliberately opened the country, and the Japanese public is more pro-immigration than the median developed nation — driven by economic necessity and an aging population.

An essay drawing on Clausewitz's account of Prussia's 1806 collapse — hollow institutions, a disconnected elite, and a society drained of civic purpose — to warn that contemporary America shows the same pattern of strategic incoherence, bureaucratic ossification, and broad-based nihilism beneath an apparently prosperous surface.

Sam Altman and Patrick Collison discuss the recent inflection in AI coding ability and OpenAI's ambition to become a low-margin, utility-like infrastructure provider aligned with customer success — and why the most effective AI adopters are CEOs who personally automate their own work and grant permissive data access.
Stripe's Dan Hill demos Link's Wallet for Agents, which lets AI agents spend on a user's behalf via one-time virtual cards or shared payment tokens — with every purchase requiring human approval, raw credentials never exposed, and contractual guardrails enforced by Stripe as the issuer.
John Collison mines Stripe's data (nearly 2% of global GDP) for three trends: a structural surge in business dynamism with solopreneurs scaling faster than ever, the rise of agentic commerce, and the growing value of AI's complements — proprietary data, network effects, and real-world operations.
Patrick McKenzie tells the improbable story of how a civil-rights nonprofit ran a private intelligence agency whose blacklist became a screening tool across the financial industry, and how, per an April DOJ indictment, it allegedly opened front-business bank accounts to pay covert assets.

Scott Alexander explores "deontological bars" — hard moral rules that constrain consequentialist reasoning — and tries to derive a principle like "don't violate well-functioning norms unless you'd be cooperating while enemies defect," applying it to debates over working with AI companies versus mass political action.