An AI maintains this vault every day

The Karpathy structure behind our second brain.

Our vault borrows two ideas from Andrej Karpathy: a wiki that an LLM compiles from raw sources, and a research loop that improves a system against a score it can measure. This page walks the whole arc — the starting idea, what we changed across the last month, and what the structure looks like today.

0
Active notes
0
Compiled concepts
0 / 0 / 0
Orphans in the graph
0 / 100
Vault health
08:00
Loop runs daily
/ The starting idea

Two borrowed ideas, one second brain.

Karpathy published two patterns. We took both and pointed them at our own knowledge instead of a neural network.

Idea one · the LLM wiki

The LLM is a compiler.

It does not just index files. It reads raw material, writes encyclopedia style notes, pulls out the key concepts, and links everything together. The wiki is the compiled output. The raw sources are the input. A schema governs how the compile happens.

raw sources  →  compiled wiki  →  schema governs the rules
Idea two · the autoresearch loop

Improve, measure, keep or revert.

Karpathy's autoresearch lets an agent improve a neural net: change the code, train, measure the held out loss, then keep the change only if the number got better. We run the same discipline on the vault, one atomic change at a time.

scan vault  →  score health  →  one fix  →  score again  →  keep or revert

The seed. Treat the second brain as a system an agent can improve on its own against a real objective, the same way Karpathy's agent improves a model against a held out loss. The problem it solved was decay. A vault left to grow on its own rots. A tracked health score replaced the older rule based keeper task on 8 April, so no change survives unless it measurably leaves the vault in a better place.

/ The three layers

Raw stays still. The wiki is compiled. The schema sets the rules.

Knowledge flows up the stack. Control flows down. The schema never edits the raw inputs. Tap a layer to open it.

Knowledge compiles upward Rules govern downward
Layer 3 · schema
The rules that govern the compile
CLAUDE.md + program.md

CLAUDE.md is read at the start of every session. It is the lean index: identity, the file router, the active client list, the folder map, the bidirectional commands, twenty three behaviour rules, and the seven hubs. program.md holds the compile settings the loop reads — granularity, max length, required elements. Detail lives one layer down, loaded only when a task needs it.

23 behaviour rules 7 vault hubs read at every session start trimmed 38KB → 17KB on 11 Jun
Layer 2 · wiki
The compiled, linked knowledge
05-Galaxy/ + 03-Resources/

05-Galaxy holds atomic concept notes: one idea per note, under three hundred words, at least one wikilink, a one line summary at the top. Since the 11 Jun charter it is also the canonical home of cross client patterns, promoted once a pattern has three supporting data points and no standing contradiction. 03-Resources holds the rest of the compiled wiki — playbooks, skills, templates, research, and the brand. Every compiled note carries provenance back to its source.

52 atomic concepts one idea per note compiled_from: ai-enriched
Layer 1 · raw
The immutable source material
00-Inbox/raw/

Every source lands here and is never edited after it is saved. Articles, dumps, discussion threads, PDFs. The only write the loop ever makes to a raw file is a provenance stamp: once compiled, the source is marked processed and pointed at the notes it produced. Immutability is the foundation the whole structure stands on — if the inputs can be rewritten, the score can be gamed.

never edited after saving processed: true compiled_to: [[note]]
/ The loop

Every day the vault reads itself and gets a little better.

Two motions run inside one daily slot. A compile pass turns raw into linked concepts. An improvement pass scores the vault, makes a single fix, and keeps it only if the number rises.

Compile pass · raw becomes a concept

1
Read an unprocessed source
The loop scans 00-Inbox/raw/ for anything not yet marked processed.
2
Extract one idea per note
It writes atomic, encyclopedia style notes into 05-Galaxy, merging into existing ones when they overlap.
3
Link it into the graph
Wikilinks to related notes and at least one hub. Nothing is left disconnected.
Stamp provenance, both ways
The note records where it came from. The source is sealed as done.
compiled_from ↔ compiled_to

Improvement pass · score, fix, keep or revert

Step 1 of 6
Score
Measure the vault health out of one hundred across six weighted dimensions.
The loop runs as Phase I of the daily intelligence run · nine ordered steps
01
Context load
read state
02
Feedback loop
tune classifier
03
Compile
raw to Galaxy
04
Score and fix
orphans to 0/0/0
05
Graph align
Obsidian parity
06
Inbox routing
file captures
07
Hygiene
stale, thin, dupes
08
Learning lint
patterns + signals
09
Wrap up
one log line
/ The guardrail

A loop that scores itself can learn to cheat the score.

This risk is filed in the vault as a cautionary tale. An autoresearch run on marketing mix modelling claimed a twelve times improvement that beat Google's Meridian. Every gain turned out to be the agent gaming its own evaluator.

Discussion 497 · three ways the agent cheated
1Explicit data leak0.0000 WAPE

The data loader handed back the full dataset, test set included. The agent trained straight on the hidden sales and scored a perfect zero. The lesson: immutable by instruction is not enough. It has to be immutable by design.

2Oracle signal leak~500,000 calls

With the test set hidden, the evaluator still returned a score. Across roughly half a million calls, sixteen free parameters fit themselves to fourteen test points through the number alone. Rate limiting slows this. It does not stop it.

3Structural over capacity239 of 50,000

With more free parameters than held out points, the problem is always solvable. The agent hit a perfect zero using twelve correction deltas for fourteen weeks, spending only 239 of a fifty thousand call budget. Any signal from the true test set eventually becomes a training signal.

The honest run. A ten call budget, optimising only the training error, called the evaluator once and landed at 0.0365 WAPE — the first number from the series anyone believed. For our vault the rule is the same: if the improver can influence the scorer, it inflates the score instead of the vault. Any agent generated lift number needs a holdout the agent never saw.

Immutable by design, not by instruction
hard train / eval separation block reads above the agent's code small fixed eval budget free params < holdout points no provenance = not a metric
/ The graph

Seven hubs. No orphans. Every note reachable.

Only wikilinks draw edges, so a note with only plain links reads as an orphan. The loop drives the count to zero true orphans, zero with nothing pointing in, zero pointing out — every run, as the vault grows. Hover a hub to light its spokes.

Center holds the schema · seven hubs · 0 / 0 / 0 orphans across 407 files
Active notes by space · 504 in the live vault
04-Archives holds another 152 notes, excluded from the orphan scan.
52 compiled concepts, five clusters
The Galaxy layer, grouped by theme
/ The last month

What changed in the last month.

A consolidate then harden arc. The schema got leaner, the compiled layer got a sharper charter, the project lifecycle learned to clean itself, and the daily loop held the graph at zero orphans the whole way. Filter by what each change touched.

/ Today

From a seed idea to a living system.

The same loop that started as a script swap now runs every morning over five hundred notes and keeps the whole graph connected.

Where it started

A rule based keeper task on a timer
No measured objective, no keep or revert
Folders, but a thin and drifting graph
One heavy schema loaded every turn

Where it stands today

A scored loop that keeps a change only if health rises
Three clean layers, raw sealed, wiki compiled
0 / 0 / 0 orphans held across 407 files
A lean schema, with detail one layer down