
The Self-Experiment

Randomization at n=1 · Example: Gwern Branwen, 2010→ · gwern.net/nootropics

Any protocol can be Goodharted. The defense is long content: publish the data, the code, and the nulls so anyone can check.

[Figure: self-blinding protocol. Steps: prepare capsules (active + placebo); randomize (draw blind from bag); take + record (predict before reveal). Results published, including nulls: melatonin, improved sleep onset; modafinil, clear wakefulness effect; ALCAR, no effect detected; magnesium, 94% prob. of harm; ZMA, 127 days, no signal; caffeine + theanine, synergistic effect.]

The method

Fisher designed randomization for groups: split subjects into treatment and control, assign randomly, measure the difference. The method assumes you have dozens or hundreds of subjects. But the logic doesn't require groups. It requires that the experimenter cannot predict which condition comes next. If you can blind yourself — active and placebo in identical capsules, drawn from a bag — the randomization works at n=1. You become both experimenter and subject, separated by a wall of ignorance.
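The claim that randomization works at n=1 can be made concrete with a randomization (permutation) test. What follows is a minimal sketch, not the analysis used in the source: the helper name `permutation_p_value` and the daily scores are invented for illustration. Because each day's condition was assigned by chance, reshuffling the labels shows how large a difference chance alone would tend to produce.

```python
import random
from statistics import mean


def permutation_p_value(active, placebo, n_shuffles=10_000, seed=0):
    """Two-sided permutation test for a difference in mean daily scores.

    Hypothetical helper: under the null hypothesis the condition labels are
    arbitrary, so we reshuffle them and count how often the shuffled
    difference is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = mean(active) - mean(placebo)
    pooled = list(active) + list(placebo)
    n_active = len(active)
    extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        diff = mean(pooled[:n_active]) - mean(pooled[n_active:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimate away from an impossible p = 0.
    return observed, (extreme + 1) / (n_shuffles + 1)


# Made-up daily self-ratings, for illustration only.
active_days = [7, 6, 8, 7, 9, 6, 8]
placebo_days = [6, 5, 7, 6, 6, 5, 7]
diff, p = permutation_p_value(active_days, placebo_days)
print(f"observed difference: {diff:.2f}, permutation p-value: {p:.3f}")
```

This sketch shuffles all days freely, which matches simple day-by-day randomization. If the schedule was randomized within blocks, the shuffling should probably be restricted to within each block as well.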

The protocol: fill gel capsules with active compound or inert powder. Pre-randomize into blocks of 28 days. Draw one daily without looking. Record your observations before unblinding. Run for months. Analyze with both Bayesian and frequentist methods. Publish everything.
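The pre-randomization step could be sketched in software roughly as follows. This is an illustration under stated assumptions, not the source's procedure: the function name `blocked_schedule` is hypothetical, and the code assumes each 28-day block holds equal numbers of active and placebo days. In the physical protocol the bag does this job; a generated schedule would only preserve blinding if someone else fills the capsules or the key stays sealed until the end.

```python
import random


def blocked_schedule(n_blocks, block_len=28, seed=None):
    """Return a list of 'active'/'placebo' labels, balanced within each block.

    Assumption (not from the source): each block holds equal numbers of
    active and placebo days, shuffled independently of other blocks.
    """
    if block_len % 2:
        raise ValueError("block_len must be even to balance the two arms")
    rng = random.Random(seed)
    schedule = []
    for _ in range(n_blocks):
        block = ["active"] * (block_len // 2) + ["placebo"] * (block_len // 2)
        rng.shuffle(block)
        schedule.extend(block)
    return schedule


# Four blocks of 28 days; the seed is arbitrary and only makes the key reproducible.
schedule = blocked_schedule(n_blocks=4, seed=2010)
print(len(schedule), schedule[:7])
```

Balancing within blocks keeps the two arms the same size even if the experiment stops early at a block boundary, which is one plausible reason to randomize in blocks rather than flipping a coin each day.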

This is what Feynman's first principle looks like as engineering. You cannot be biased toward the active condition if you don't know which condition you're in. You cannot selectively report if you've committed to publishing the full dataset. The protocol does the work that integrity alone cannot. The difference between self-experimentation and self-promotion is the control condition.

| | Bryan Johnson | Gwern Branwen |
|---|---|---|
| Budget | $2M/year | Pharmacy prices |
| Blinding | None | Self-blinded, capsules drawn from bag |
| Control condition | None | Placebo blocks, pre-randomized |
| Publishes | Biomarkers, supplement stacks, before/after photos | Blinding protocol, R code, raw data, power analysis |
| Null results | Not featured | Published with equal prominence |
| What the data proves | That he's doing a lot | Whether any of it works |

What the method catches

Mayo's severity criterion: a test provides evidence only if it had a real chance of producing a different result were the hypothesis false. A months-long blinded trial of magnesium citrate, analyzed with Bayesian methods, found a 94% probability that the supplement was actively harmful. That's a severe test producing an uncomfortable answer. Without the blinding, the experimenter would likely have attributed bad days to other causes and kept taking the supplement. The method caught something that unblinded self-observation would probably have missed.
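A figure like "94% probability of harm" is a posterior probability that the true effect is negative. The source's own model is not reproduced here; the sketch below shows one simple way such a number can arise, assuming a flat prior and a normal approximation to the posterior of the mean difference. The helper `prob_harm` and the scores are invented for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def prob_harm(active, placebo):
    """Approximate posterior probability that the true effect is negative.

    Assumptions (ours, not the source's): higher scores are better, a flat
    prior on the mean difference, and a normal approximation, so the
    posterior is N(observed difference, standard error squared).
    """
    diff = mean(active) - mean(placebo)
    se = sqrt(variance(active) / len(active) + variance(placebo) / len(placebo))
    return NormalDist(mu=diff, sigma=se).cdf(0.0)


# Made-up daily scores, for illustration only.
active_scores = [3, 2, 4, 3, 2, 3]
placebo_scores = [4, 3, 4, 5, 3, 4]
print(f"P(harm) is roughly {prob_harm(active_scores, placebo_scores):.2f}")
```

Unlike a p-value, this number answers the question the experimenter actually cares about: given the data, how likely is it that the supplement is making things worse? The answer depends on the prior and the model, which is one reason publishing the code matters.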

Value-of-information analysis precedes expensive experiments. For fish oil and cognition, power analysis showed roughly 70 paired observation blocks were needed to detect a plausible effect size at 80% power. The question "is this experiment worth running?" has a quantitative answer. Fisher's machinery includes the calculation of whether to turn the machinery on.
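The kind of calculation behind such a sample-size figure can be sketched with the standard normal approximation for a paired, two-sided test. This is not the source's analysis: the function name `pairs_needed` and the effect sizes tried are assumptions chosen for illustration, and an exact t-based calculation would give slightly larger numbers.

```python
from math import ceil
from statistics import NormalDist


def pairs_needed(effect_size, power=0.80, alpha=0.05):
    """Approximate number of paired observations for a two-sided test.

    Uses the normal approximation n = ((z_{1-alpha/2} + z_{power}) / d)^2,
    where d is the standardized mean of the paired differences.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return ceil(((z_alpha + z_power) / effect_size) ** 2)


# Illustrative standardized effect sizes (not taken from the source).
for d in (0.2, 0.3, 0.5):
    print(f"d = {d}: about {pairs_needed(d)} pairs")
```

Under this approximation, a requirement of roughly 70 pairs at 80% power corresponds to a standardized effect of about a third of a standard deviation. That is a back-calculation from the formula, not a figure stated in the source.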

The null results

| Substance | Duration | Result |
|---|---|---|
| ALCAR | Full bottle | No effects, alone or mixed with choline + piracetam |
| Adrafinil | Full course | "Did nothing whatsoever that I noticed" |
| Huperzine-A | Full bottle | No side-effects, no improvements of any kind |
| Coconut oil | Months | "No longer particularly convinced it was doing anything" |
| ZMA | 127 days | No statistically significant effects on sleep |
| Magnesium citrate | Months, blinded | 94% Bayesian probability of net harm |

Positive results circulate on their own. Null results die in file drawers. Ioannidis showed that publication bias inflates the false positive rate across entire fields. Every published null is one fewer false positive someone else would have to discover independently.

Discussion

The self-experiment has a long history. Bacon died of pneumonia after stuffing a chicken with snow to test refrigeration. Barry Marshall drank H. pylori to prove it caused ulcers. The difference here is not courage but protocol. Bacon had no control group. Marshall had no blinding. The self-blinding method adds Fisher's randomization to the experimenter's willingness, and that turns anecdote into data.

Chamberlin warned that a "ruling theory" becomes a "ruling passion." Self-experimenters are especially vulnerable: you want the substance to work because you're taking it. Blinding is the structural answer to Chamberlin's problem. You can't have a ruling theory about today's capsule if you don't know what's in it.

Ioannidis showed that labs, IRBs, and grants don't prevent false findings. The institution provides the method but not the accountability. Publishing the raw data, the analysis code, and the uncomfortable results is the accountability. The paper trail can't be Goodharted, because gaming it requires fabrication, which the trail itself exposes.
