Marketing AutoResearch · 2026

Run 36,500 Experiments This Year.

Andrej Karpathy built AI that trains itself overnight. Eric Siu applied the same loop to marketing. Here's the pattern — and the tool to run it yourself.


"Most marketing teams run 20-30 experiments a year. Maybe 52 if they're 'good'. New landing page. New ad creative. Maybe a subject line test. That's considered data-driven marketing. But the next generation of marketing systems will run 36,500+ experiments per year."

Eric Siu · Founder, Single Grain · Applying Karpathy's AutoResearch to Marketing

Where This Came From

On March 7, 2026, Andrej Karpathy pushed a 630-line Python script to GitHub and went to sleep. By morning his AI agent had run 50 experiments, discovered a better learning rate, and committed the proof — without a single human instruction in between.

The script wasn't doing anything exotic. It was automating what every researcher does: modify something, measure the result, keep or discard, repeat. Karpathy just removed the human from the loop.

Within days, Eric Siu made the marketing translation explicit: replace the training script with a marketing asset. Replace validation loss with reply rate. The loop is identical.

The Three Primitives

AutoResearch isn't magic. It's three constraints — and each constraint is doing specific engineering work:

01. Editable Asset

One thing the agent can modify. A confined search space keeps results interpretable: when the metric moves, you know which change moved it.

ML: train.py (model architecture + hyperparams)
MKTG: subject line, headline, CTA, email body

02. Scalar Metric

One number that determines whether a change was an improvement. No committees. No human judgment required.

ML: val_bpb (validation bits per byte)
MKTG: open rate, reply rate, conversion rate

03. Time-Boxed Cycle

Fixed duration makes every experiment directly comparable regardless of what changed.

ML: exactly 5 minutes of training per run
MKTG: 24-48hr deployment window per variant
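
The three primitives can be pinned down as a small config object. What follows is a minimal Python sketch, not code from Karpathy's script or from the tool: `ExperimentBrief`, its field names, and `ALLOWED_METRICS` are hypothetical, and the baseline subject line is made up for illustration.

```python
from dataclasses import dataclass

# Hypothetical names for illustration; not taken from Karpathy's script or any tool.
ALLOWED_METRICS = {"open_rate", "reply_rate", "conversion_rate"}


@dataclass(frozen=True)
class ExperimentBrief:
    asset_kind: str    # the ONE editable asset, e.g. "subject_line"
    baseline: str      # current text of that asset
    metric: str        # the ONE scalar metric used to judge every variant
    window_hours: int  # fixed deployment window applied to every variant

    def __post_init__(self):
        # Enforce a single, known metric so results stay comparable.
        if self.metric not in ALLOWED_METRICS:
            raise ValueError(f"metric must be one of {sorted(ALLOWED_METRICS)}")
        # Enforce the 24-48 hour time box described above.
        if not 24 <= self.window_hours <= 48:
            raise ValueError("window_hours should stay in the 24-48 hour range")


# Made-up example values.
brief = ExperimentBrief(
    asset_kind="subject_line",
    baseline="Quick question about your Q3 pipeline",
    metric="reply_rate",
    window_hours=24,
)
print(brief)
```

Freezing the dataclass is a deliberate choice in this sketch: once a loop starts, the asset type, metric, and window should not drift, or successive experiments stop being comparable.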

The Translation

Every component of Karpathy's ML loop has a direct marketing equivalent:

AutoResearch (ML)     | Marketing Loop        | Example
train.py              | Your marketing asset  | Cold email, landing page, ad creative
program.md            | Your experiment brief | Audience, goal, constraints, what not to change
val_bpb metric        | Your success signal   | Reply rate, open rate, meeting book rate
AI modifies train.py  | AI generates variants | 10 subject line variants each with a hypothesis
5-min training run    | Deployment window     | Send to test segment, measure for 24hrs
Keep or discard       | Keep or discard       | Winner becomes new baseline, loop continues
100 experiments/night | Compounding advantage | Proprietary map of what resonates with YOUR audience
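
Read top to bottom, the translation describes a greedy keep-or-discard loop. A minimal Python sketch of that loop follows. It is an illustration under stated assumptions, not Karpathy's or Siu's implementation: `run_loop`, `toy_propose`, and `toy_measure` are hypothetical names, and the toy scoring function is invented so the example runs offline. In practice `propose` would call a language model and `measure` would return the real metric after the deployment window.

```python
from typing import Callable, List, Tuple

LogRow = Tuple[str, float, str]  # (variant text, score, "KEEP" or "DISCARD")


def run_loop(
    baseline: str,
    baseline_score: float,
    propose: Callable[[str, int], List[str]],
    measure: Callable[[str], float],
    rounds: int,
    variants_per_round: int = 10,
) -> Tuple[str, float, List[LogRow]]:
    """Each round: generate variants, score them, keep the best only if it beats the baseline."""
    log: List[LogRow] = []
    for _ in range(rounds):
        scored = [(v, measure(v)) for v in propose(baseline, variants_per_round)]
        if not scored:
            continue
        best_i = max(range(len(scored)), key=lambda i: scored[i][1])
        improved = scored[best_i][1] > baseline_score
        for i, (variant, score) in enumerate(scored):
            kept = improved and i == best_i
            log.append((variant, score, "KEEP" if kept else "DISCARD"))
        if improved:
            # Winner becomes the new baseline; the next round builds on it.
            baseline, baseline_score = scored[best_i]
    return baseline, baseline_score, log


# Offline stand-ins (assumptions): a real setup would swap both of these out.
def toy_propose(current: str, n: int) -> List[str]:
    return [f"{current} v{i}" for i in range(n)]


def toy_measure(text: str) -> float:
    # Invented deterministic score in [0, 0.1); NOT a real reply rate.
    return (sum(map(ord, text)) % 100) / 1000


start = "Hello"
best, best_score, log = run_loop(start, toy_measure(start), toy_propose, toy_measure, rounds=3)
print(best, best_score, len(log))
```

Only a strict improvement replaces the baseline, so the tracked score never goes down between rounds. That is the "keep or discard" row of the table in code form.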

Before vs After

Traditional Marketing Team
Experiments / year: 30
Average test cycle: 2wk
Variables tested at a time: 1

Human in critical path. Every experiment requires design, copy, deployment, review.

AutoResearch Marketing Loop
Experiments / year: 36,500+
Minimum test cycle: 24hr
Variants per session: 10x

Human sets goal and approves parameters. AI handles hypothesis generation. You review the morning report.
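
The page does not spell out where 36,500 comes from. One reading consistent with the "100 experiments/night" figure is 100 variants per day across 365 days. The sketch below works through that arithmetic in Python. The assumptions are that each variant counts as one experiment, that variants within a day run in parallel on separate segments, and that a session produces 10 variants; the function names are hypothetical.

```python
DAYS_PER_YEAR = 365
WEEKS_PER_YEAR = 52


def variants_per_day_needed(target_per_year: int) -> float:
    """Daily variant count needed to reach a yearly experiment target."""
    return target_per_year / DAYS_PER_YEAR


def sequential_tests_per_year(cycle_weeks: int) -> int:
    """Upper bound on back-to-back tests when only one runs at a time."""
    return WEEKS_PER_YEAR // cycle_weeks


# Marketing side: what 36,500 experiments a year implies per day.
per_day = variants_per_day_needed(36_500)           # 100.0 variants per day
sessions_per_day = per_day / 10                     # 10.0 sessions, assuming 10 variants each

# Traditional side: a 2-week cycle, one test at a time.
traditional_ceiling = sequential_tests_per_year(2)  # 26 tests per year

# ML side: 100 runs at 5 minutes each, in hours.
overnight_hours = 100 * 5 / 60                      # about 8.3 hours

print(per_day, sessions_per_day, traditional_ceiling, round(overnight_hours, 1))
```

Two things fall out of the numbers. A 2-week sequential cycle caps out at 26 tests a year, which matches the "20-30" range in the opening quote. And with a 24-hour minimum cycle, reaching 36,500 depends on running roughly 100 variants in parallel each day, so list size and segment count become the practical limit.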

How To Run It

1. Define Your Asset + Metric

Paste your current subject line, headline, or CTA. Pick your success metric. This is your program.md.

↓ LOOP BEGINS
2. AI Generates Hypotheses + Variants

Claude generates N variants, each with an explicit hypothesis about what assumption is being tested.

↓ YOU DEPLOY AND MEASURE
3. Enter Your Results

Run each variant against a segment of your list. Enter the result. Mark KEEP or DISCARD.

↓ LOOP LEARNS
4. Continue — AI Learns From Winners

Hit Continue and the AI generates the next batch using your winners as context. Patterns compound.

↓ COMPOUNDING ADVANTAGE BUILDS
5. You Build a Proprietary Map

After dozens of iterations you have a ranked experiment log unique to your audience. Competitors can't buy it.
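
Steps 3 to 5 amount to keeping a ranked log and feeding the winners back into the next prompt. Here is a minimal Python sketch of that bookkeeping. It is not the tool's actual code: `Result`, `ranked`, and `next_batch_context` are hypothetical names, the prompt wording is an assumption, and the sample results are made up.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Result:
    variant: str     # the text that was deployed
    hypothesis: str  # the assumption this variant was testing
    score: float     # the single chosen metric, e.g. reply rate as a fraction
    decision: str    # "KEEP" or "DISCARD", entered by the human


def ranked(log: List[Result]) -> List[Result]:
    """Experiment log sorted best-first by the chosen metric."""
    return sorted(log, key=lambda r: r.score, reverse=True)


def next_batch_context(log: List[Result], n_variants: int, top_k: int = 3) -> str:
    """Build the text handed to the model for the next batch, using kept winners only."""
    winners = [r for r in ranked(log) if r.decision == "KEEP"][:top_k]
    lines = [f"- {r.variant!r} ({r.score:.1%}): {r.hypothesis}" for r in winners]
    return (
        f"Generate {n_variants} new variants, each with a one-line hypothesis.\n"
        "Winners so far:\n" + "\n".join(lines)
    )


# Made-up results for illustration only.
log = [
    Result("Quick question", "Short subjects feel personal", 0.041, "DISCARD"),
    Result("Saw your Q3 launch", "Specific references earn replies", 0.062, "KEEP"),
    Result("Idea for your pricing page", "Naming the asset signals research", 0.055, "KEEP"),
]
print(next_batch_context(log, n_variants=10))
```

Storing the hypothesis next to each score is the design choice that matters here. The ranked list then records which assumptions about the audience held up, not just which strings won.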

The Real Moat

"The companies that win won't have better marketers. They'll have faster experiment loops." — Eric Siu

This isn't about subject lines. The overnight loop is a compounding knowledge machine. Every experiment builds a dataset of what works for your specific audience, offer, and market position. That dataset is yours alone.

Karpathy's insight applies directly: the bottleneck of progress is no longer the ability to run experiments — it's the ability to define the right constraints. The human role shifts from experimenter to experimental designer.

Run Your First Loop

Bring your own Anthropic API key. It stays in your browser and is never stored or shared with anyone.
