# The Obsidian Second Brain

A Claude-Code-powered personal knowledge management system that captures, curates, and queries everything you do — built around Andrej Karpathy's LLM Wiki pattern.

---

## Why I built this

I had notes in three apps, conversation exports in five folders, articles in a "read later" pile that never got read, and zero way to ask any of it a question. Every project started from scratch. Every decision I'd already made got re-litigated. The context I needed was always somewhere else.

I built this to fix one specific problem: **the stuff I learn should compound**.

You can build the same system in a weekend of intermittent sessions — it scales from a handful of notes up to the thousand-source vault I run today. Either way, the bones are the same.

## The pattern: Karpathy's LLM Wiki

This system is a working implementation of the LLM Wiki pattern Andrej Karpathy has talked about: structure your personal knowledge so an LLM can consume it directly. Source notes stay raw and untouched; a curated layer above them does the synthesis, the linking, and the attribution. Every claim points back to where it came from.

In practice that means:

- **Strict separation** between source material (`raw/`) and curated knowledge (`wiki/`)
- **Source attribution on every claim** — no synthesis without a citation
- **Atomic pages** — one idea per file, linked together so the LLM can follow the graph
- **Frontmatter discipline** — structured metadata that retrieval and ranking can lean on

The conventions in [`starter-files/CLAUDE.md`](starter-files/CLAUDE.md) enforce this pattern; the four slash commands operationalize it.

## How the parts fit together

The system has three moving parts:

1. **`raw/`** — the read-only intake folder. Every conversation, transcript, article, and note lands here. Nothing in `raw/` ever gets edited or deleted; it's the audit trail.
2. **`wiki/`** — Claude's curated output. Concept pages, project pages, people pages — all generated from `raw/` by the `/ingest` slash command, all linked together with `[[wiki-links]]`.
3. **A nightly cron** that runs `/ingest` and `/lint` while I sleep, so the wiki keeps current with whatever I dropped into `raw/` that day.

By the end of this guide, you'll have all three running on your own machine, plus a capture hook that saves every Claude Code conversation directly into `raw/` so the system feeds itself.

```
                  drop sources              promote to pages
   you ──────► raw/<source>/  ─── /ingest ──► wiki/concepts|projects|people/
                    ▲                                   │
   capture hook ────┘                                   │
   (Claude sessions)                                    ▼
                                                      /query  ──► you ask, Claude answers
                                                      /log    ──► you append, log.md grows
                                                      /lint   ──► nightly health check
```

---

## Prerequisites

- **Claude Code** installed and logged in. Test with `claude --version`.
- **Python 3** on your `PATH`. Test with `python3 --version` (or `py --version` on Windows).
- An **Obsidian-compatible Markdown editor**. The vault works in any Markdown viewer; Obsidian is recommended because it understands `[[wiki-links]]` natively and gives you a graph view + backlinks panel for free.
- **A way to run nightly cron jobs** — Windows Task Scheduler, macOS launchd, or Linux cron. Any of them works.

---

## Privacy first — read this before you push anything

The capture hook records full Claude Code conversations into `raw/claude-exports/auto/`. Some of those will contain credentials you typed, paths inside your home directory, names of people, internal project details — anything you said to Claude.

**Never push the following directories to a public repo:**

- `raw/` — every source you've ingested
- `wiki/` — your curated knowledge
- `.claude/logs/` — verbose output of every cron run
- `.claude/state/` — last-run status JSON, may include vault paths
- `.claude/backups/` — wiki snapshots
- `*claude-exports/auto/` — auto-captured session transcripts

The starter `.gitignore` in this repo handles all of these by default. **Run `git status` before your first commit and confirm those paths are not staged.** If you want to publish your setup, fork *this* starter repo (which contains only templates) and keep your own vault private.
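
The `git status` check can also be scripted. Below is a minimal sketch, assuming you run it from the repository root; `find_private` and `staged_paths` are hypothetical helper names, not part of the starter files, and the path list simply mirrors the directories named above.

```python
import subprocess

# Private locations from the list above (hypothetical constants for this sketch).
PRIVATE_PREFIXES = ("raw/", "wiki/", ".claude/logs/", ".claude/state/", ".claude/backups/")
PRIVATE_FRAGMENT = "claude-exports/auto/"


def find_private(paths):
    """Return the repo-relative paths that should never be pushed."""
    return [
        p for p in paths
        if p.startswith(PRIVATE_PREFIXES) or PRIVATE_FRAGMENT in p
    ]


def staged_paths():
    """List staged files via `git diff --cached --name-only`."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--name-only"],
        capture_output=True, text=True, check=True,
    )
    return [line for line in result.stdout.splitlines() if line]
```

Calling `find_private(staged_paths())` before a commit returns an empty list when nothing private is staged; anything else is a path to unstage.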

---

## Step 1 — Scaffold the vault

Pick a directory you'll be happy with for years. Mine is `~/Documents/second-brain`. Create the skeleton:

```bash
mkdir -p ~/Documents/second-brain/{raw,wiki/concepts,wiki/projects,wiki/people,agents,content,journal,.claude/commands,.claude/hooks,.claude/scripts}
mkdir -p ~/Documents/second-brain/raw/{claude-exports/auto,chatgpt-exports,articles,notes,voice-memos,youtube-transcripts}
```

The `raw/` subfolders are suggestions, not requirements: add or drop categories to match what you actually capture. The only hard rule: **don't modify anything in `raw/` once it's there**. Drop new files in, but never edit or rename existing ones. That's the audit trail.

`agents/`, `content/`, and `journal/` start empty. They're scaffolding for things you'll grow into.
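
The `mkdir` commands above rely on shell brace expansion, which is not available in every shell (PowerShell, for example). As a portable alternative, here is a minimal sketch that creates the same directories with Python's `pathlib`; the `scaffold` function name is an illustration, not something shipped in the starter files.

```python
from pathlib import Path

RAW_DIRS = ["claude-exports/auto", "chatgpt-exports", "articles",
            "notes", "voice-memos", "youtube-transcripts"]
WIKI_DIRS = ["concepts", "projects", "people"]
TOP_DIRS = ["agents", "content", "journal",
            ".claude/commands", ".claude/hooks", ".claude/scripts"]


def scaffold(vault: Path) -> list[Path]:
    """Create the vault skeleton; re-running is safe because existing dirs are kept."""
    targets = (
        [vault / "raw" / d for d in RAW_DIRS]
        + [vault / "wiki" / d for d in WIKI_DIRS]
        + [vault / d for d in TOP_DIRS]
    )
    for path in targets:
        path.mkdir(parents=True, exist_ok=True)
    return targets
```

For the location used in this guide, that would be `scaffold(Path.home() / "Documents" / "second-brain")`.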

---

## Step 2 — Install CLAUDE.md

`CLAUDE.md` is the contract. Claude Code reads it on every session and uses it to decide how to behave inside this vault: file structure, frontmatter rules, style guide, your domain context.

Copy [`starter-files/CLAUDE.md`](starter-files/CLAUDE.md) to the root of your vault. Then open it and edit **Section 4 — Domain Context**. Replace the TODO placeholders with your own:

- The 2–4 domains you actually work in
- Active projects you're tracking
- People who show up in your sources
- What you want the wiki to weight

Leave the other sections alone. The frontmatter schema, page conventions, style rules, and Karpathy coding principles are battle-tested across hundreds of pages — change them only if you have a strong reason.

The frontmatter schema in particular is the contract that makes everything else work. Every wiki page must open with:

```yaml
---
title: "Page Title"
type: concept          # concept | entity | source-summary | comparison | project | person
sources:
  - raw/articles/2026-04-18-example.md
related:
  - "[[other-page]]"
created: 2026-04-18
last-updated: 2026-04-18
---
```

`/lint` will catch you if you skip a field. `/query` uses `related` to find connected pages, and `last-updated` is how `/lint` flags stale pages. All six fields earn their keep.
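
To make the required-field check concrete, here is a rough sketch of how the six fields could be verified outside of Claude. It is an assumption about the shape of the check, not how `/lint` is actually implemented, and it uses a deliberately simple key scan rather than a full YAML parser.

```python
REQUIRED_FIELDS = ("title", "type", "sources", "related", "created", "last-updated")


def frontmatter_keys(text: str) -> set[str]:
    """Return top-level keys in the leading `---` block, or an empty set if absent."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return set()
    keys = set()
    for line in lines[1:]:
        if line.strip() == "---":
            return keys
        # Top-level keys start in column 0; indented lines are list items.
        if line and not line[0].isspace() and ":" in line:
            keys.add(line.split(":", 1)[0].strip())
    return set()  # no closing delimiter: treat as missing frontmatter


def missing_fields(text: str) -> list[str]:
    """List the required fields a page's frontmatter does not define."""
    present = frontmatter_keys(text)
    return [field for field in REQUIRED_FIELDS if field not in present]
```

A page that opens with the block shown above yields an empty list; a page with no frontmatter yields all six names.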

---

## Step 3 — Initialize `index.md` and `log.md`

Two pages bootstrap the wiki:

**`wiki/index.md`** — the master catalog. `/ingest` keeps it current; you rarely edit it by hand. Start with:

```markdown
# Wiki Index

## Concepts

## Projects

## People
```

**`wiki/log.md`** — the activity log. Append-only, newest entries at the top. `/log` and `/ingest` both write here. Start with just:

```markdown
# Activity Log
```

That's it. Both files grow as you use the system.
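
The "newest entries at the top" convention means each new entry is inserted directly under the heading rather than at the end of the file. A minimal sketch of that insertion follows; the entry format (`- YYYY-MM-DD HH:MM note`) and the `add_log_entry` name are assumptions for illustration, since the real format is whatever the `/log` command file defines.

```python
from datetime import datetime


def add_log_entry(log_text: str, note: str, when: datetime) -> str:
    """Insert a timestamped entry directly under the first line (the heading)."""
    heading, _, rest = log_text.partition("\n")
    entry = f"- {when:%Y-%m-%d %H:%M} {note}"  # hypothetical entry format
    body = rest.strip("\n")
    parts = [heading, "", entry]
    if body:
        parts.append(body)  # older entries stay below the new one
    return "\n".join(parts) + "\n"
```

Because every write rewrites the top of the file, two writers running at once can overwrite each other, which is why the nightly runner takes a lock.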

---

## Step 4 — Install the slash commands

Four custom slash commands are the heart of the workflow. Copy all four files from [`starter-files/.claude/commands/`](starter-files/.claude/commands/) into your vault's `.claude/commands/` directory.

| Command | What it does |
|---|---|
| `/ingest` | Reads files in `raw/` that have not been ingested yet, creates `wiki/` pages with frontmatter, cross-links related pages via `[[wiki-links]]`, updates the index, appends to the log. Emits a machine-readable sentinel footer the cron parses. |
| `/log "thought"` | Appends a timestamped note to `wiki/log.md`. If the note mentions a page that already exists, also appends to that page's "Log Entries" section. |
| `/query "question"` | Answers grounded only in the wiki, with citations. Surfaces contradictions explicitly. Refuses to invent. |
| `/lint` | Read-only health check: missing frontmatter, broken links, orphan pages, stale pages, contradictions, index mismatches. Reports first, asks before fixing. |

Sanity-check each one once before moving on:

```bash
cd ~/Documents/second-brain

# Drop a tiny test file in raw/ (printf, because bash's echo does not expand \n by default)
printf '# Test\nThis is a test note about widgets.\n' > raw/notes/test.md

claude /ingest          # should create a wiki page and update index + log
claude /log "I started building my second brain today"
claude /query "what did I write about widgets?"
claude /lint            # should report 0 or near-0 issues
```

If any of those misbehave, double-check the file made it into `.claude/commands/` and that the frontmatter at the top of the command file is intact.
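
Two of the `/lint` checks, broken links and orphan pages, are simple graph questions over `[[wiki-links]]`. The sketch below shows one way to compute them. It is an illustration under assumptions, not the command's actual logic: pages are keyed by file stem, and `index` is excluded from the orphan list because nothing is expected to link to it.

```python
import re

# Captures the link target, stopping before any |alias or #heading suffix.
WIKI_LINK = re.compile(r"\[\[([^\]|#]+)")


def link_report(pages: dict[str, str]) -> tuple[dict[str, list[str]], list[str]]:
    """pages maps page name (file stem) to Markdown text.

    Returns (broken, orphans): broken maps a page to link targets that do not
    exist; orphans lists pages that no other page links to.
    """
    broken: dict[str, list[str]] = {}
    linked_to: set[str] = set()
    for name, text in pages.items():
        for target in (t.strip() for t in WIKI_LINK.findall(text)):
            if target not in pages:
                broken.setdefault(name, []).append(target)
            elif target != name:  # self-links do not rescue an orphan
                linked_to.add(target)
    orphans = sorted(n for n in pages if n not in linked_to and n != "index")
    return broken, orphans
```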

---

## Step 5 — Wire the capture hook

This is the piece that makes the system feed itself. Every time a Claude Code session ends, the hook saves the conversation transcript as a Markdown file in `raw/claude-exports/auto/`. The next `/ingest` picks it up.

Two files to install:

1. Copy [`starter-files/.claude/hooks/capture-session.py`](starter-files/.claude/hooks/capture-session.py) to your vault's `.claude/hooks/capture-session.py`. No edits needed — the script figures out the vault root from its own location.
2. Open your vault's `.claude/settings.json` (create it if it doesn't exist), and merge the `hooks` block from [`starter-files/.claude/settings.json`](starter-files/.claude/settings.json). **Don't replace the whole file** — if you already have a `permissions` array or anything else there, leave it. Just add the `hooks` block.

Two events fire the same script:

- **`SessionEnd`** — the normal trigger, runs when you exit a Claude Code session.
- **`PreCompact`** — runs before context compaction, so a long session that gets compacted in-flight still leaves a snapshot.

On Linux/macOS, change the `command` from `py` to `python3`:

```json
"command": "python3 \"$CLAUDE_PROJECT_DIR/.claude/hooks/capture-session.py\""
```
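
If you would rather script the merge than edit JSON by hand, the sketch below adds the starter `hooks` block to an existing settings file without touching other keys. It assumes the `hooks` block maps event names to lists of entries; check that against the starter `settings.json` before relying on it. `merge_hooks` and `merge_settings_file` are hypothetical helpers.

```python
import json
from pathlib import Path


def merge_hooks(existing: dict, starter: dict) -> dict:
    """Return existing settings plus the starter's hook entries; other keys are kept."""
    merged = dict(existing)
    # Copy the lists so the caller's dict is not mutated.
    hooks = {event: list(entries) for event, entries in existing.get("hooks", {}).items()}
    for event, entries in starter.get("hooks", {}).items():
        current = hooks.setdefault(event, [])
        for entry in entries:
            if entry not in current:  # skip duplicates so re-running is harmless
                current.append(entry)
    merged["hooks"] = hooks
    return merged


def merge_settings_file(vault_settings: Path, starter_settings: Path) -> None:
    """Merge on disk; a missing vault settings file is treated as empty."""
    existing = json.loads(vault_settings.read_text()) if vault_settings.exists() else {}
    starter = json.loads(starter_settings.read_text())
    vault_settings.write_text(json.dumps(merge_hooks(existing, starter), indent=2) + "\n")
```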

**Verify it works:** start a Claude Code session in your vault, type a message or two, end the session, then check `raw/claude-exports/auto/`. You should see a new `YYYY-MM-DD-HHMMSS-session-end-<id>.md` file. Open it — the conversation should be in there as Markdown.
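
For orientation, here is a sketch of the two small pieces of logic described in this step: deriving the vault root from the script's own location, and building the timestamped filename. It is not the code in `capture-session.py`. The `pre-compact` slug in particular is an assumption, since only the `session-end` filename is shown in this guide.

```python
from datetime import datetime
from pathlib import Path

# Hypothetical mapping from hook event name to the slug used in the filename.
EVENT_SLUGS = {"SessionEnd": "session-end", "PreCompact": "pre-compact"}


def vault_root(script: Path) -> Path:
    """<vault>/.claude/hooks/capture-session.py -> <vault>."""
    return script.resolve().parents[2]


def capture_path(vault: Path, event: str, session_id: str, now: datetime) -> Path:
    """Build raw/claude-exports/auto/YYYY-MM-DD-HHMMSS-<event>-<id>.md."""
    slug = EVENT_SLUGS.get(event, event.lower())
    name = f"{now:%Y-%m-%d-%H%M%S}-{slug}-{session_id}.md"
    return vault / "raw" / "claude-exports" / "auto" / name
```

Inside the hook script, `vault_root(Path(__file__))` would give the directory to write under, which is why no path configuration is needed.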

---

## Step 6 — Schedule the nightly auto-ingest

The cron job is the second half of "feeds itself." It runs `/ingest` and `/lint` every night while you sleep, so by morning your wiki has absorbed yesterday's captures.

Copy the contents of [`starter-files/.claude/scripts/`](starter-files/.claude/scripts/) into your vault's `.claude/scripts/`. There are three runners — pick the one that matches your OS — plus a README.

**Windows (Task Scheduler):**

```powershell
cd C:\path\to\your-vault\.claude\scripts

# Sanity-check the runner without invoking Claude:
.\nightly-ingest.ps1 -DryRun

# Register the daily 02:00 task:
.\register-nightly-ingest.ps1
```

**Linux / macOS (cron):**

```bash
cd ~/your-vault/.claude/scripts
chmod +x nightly-ingest.sh

# Sanity-check:
VAULT_ROOT=~/your-vault ./nightly-ingest.sh --dry-run

# Add to crontab (runs daily at 02:00):
crontab -e
# 0 2 * * * VAULT_ROOT=/home/you/your-vault /home/you/your-vault/.claude/scripts/nightly-ingest.sh >> /tmp/second-brain-cron.log 2>&1
```

Each runner does the same thing:

1. **Acquire a lock** so two runs can't collide. Stale locks (>6h old) are reclaimed automatically.
2. **Snapshot `wiki/`** to `.claude/backups/nightly-ingest/` — your rollback path if a run goes sideways.
3. **Run `/ingest`** with a 30-minute timeout. Parse the sentinel footer for processed/created/updated/remaining counts.
4. **Run `/lint`** with a 30-minute timeout. Parse the issue count from the report.
5. **Write a status JSON** to `.claude/state/nightly-status.json` and a human-readable summary to `nightly-summary.md`.
6. **Release the lock**, then exit with a code that maps to the outcome (0 ok, 1 fatal, 2 lock contention, 3 partial).
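
The locking in step 1 can be sketched as follows. This is an illustration of the technique (atomic lock-file creation plus reclaiming locks older than six hours), not a transcription of the shipped runners; the function names and the choice to store the PID in the lock file are assumptions.

```python
import os
import time
from pathlib import Path
from typing import Optional

STALE_AFTER_SECONDS = 6 * 60 * 60  # locks older than 6h are treated as abandoned


def acquire_lock(lock: Path, now: Optional[float] = None) -> bool:
    """Try to take the lock; return False if another live run holds it."""
    now = time.time() if now is None else now
    try:
        if now - lock.stat().st_mtime > STALE_AFTER_SECONDS:
            lock.unlink()  # reclaim a stale lock left by a crashed run
    except FileNotFoundError:
        pass  # no lock present, or it vanished between stat and unlink
    try:
        # O_EXCL makes creation atomic: exactly one process can succeed.
        fd = os.open(lock, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as handle:
        handle.write(str(os.getpid()))
    return True


def release_lock(lock: Path) -> None:
    """Remove the lock file; a missing file is not an error."""
    lock.unlink(missing_ok=True)
```

A runner built this way would exit with code 2 when `acquire_lock` returns `False`, matching the lock-contention code in the list above.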

Inspect `.claude/state/nightly-status.json` after the first run to confirm everything works:

```json
{
  "startedAt": "2026-05-02T02:00:00Z",
  "completedAt": "2026-05-02T02:04:18Z",
  "vault": "/home/you/second-brain",
  "overall": "ok",
  "ingest": { "status": "ok", "processed": 5, "created": 3, "updated": 4, "remaining": 12 },
  "lint":   { "status": "ok", "issuesFound": 0 }
}
```
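
Because the status file is plain JSON, other tooling can read it without scraping logs. Here is a small sketch that turns the payload above into a one-line summary; `load_status` and `summarize` are hypothetical helpers, and the field names are taken from the example.

```python
import json
from datetime import datetime
from pathlib import Path


def load_status(vault: Path) -> dict:
    """Read .claude/state/nightly-status.json from the vault."""
    return json.loads((vault / ".claude" / "state" / "nightly-status.json").read_text())


def parse_utc(stamp: str) -> datetime:
    """Parse an ISO-8601 timestamp; 'Z' is rewritten for older Python versions."""
    return datetime.fromisoformat(stamp.replace("Z", "+00:00"))


def summarize(status: dict) -> str:
    """One-line summary of a nightly-status.json payload."""
    elapsed = parse_utc(status["completedAt"]) - parse_utc(status["startedAt"])
    ingest, lint = status["ingest"], status["lint"]
    return (
        f"{status['overall']} in {int(elapsed.total_seconds())}s: "
        f"{ingest['processed']} processed, {ingest['created']} created, "
        f"{ingest['updated']} updated, {ingest['remaining']} remaining; "
        f"{lint['issuesFound']} lint issues"
    )
```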

---

## Day-to-day rhythm

Once everything's wired up, the loop is small:

- **Add a source.** Drop a file into `raw/<subfolder>/`. The next `/ingest` (manual or overnight) absorbs it.
- **Capture a thought.** `claude /log "I just realized X about Y."`
- **Ask the wiki.** `claude /query "what did I decide about <thing>?"`. Claude reads `index.md`, pulls the relevant pages, answers with citations, and tells you when the wiki doesn't know.
- **Browse.** Open the vault in Obsidian. The graph view shows you the shape of what you know.
- **Check health.** Once a week, `claude /lint` and accept the proposed fixes for orphans and broken links.

The capture hook + nightly cron mean you can mostly forget about the system. Things you do during the day land in `raw/` automatically; tomorrow they're in the wiki.

## What a real first run looks like

During my first weekend running `/ingest` on my own vault, it processed 8 sources, wrote 10 new wiki pages, and cross-linked them automatically. One of those pages, `home-server`, mapped my whole home-lab infrastructure on first ingest, linking out to the OS, the config concepts, and prior project history I'd captured months earlier. I never pointed Claude at any of those connections; the frontmatter and wiki-link discipline did the work.

`/lint` on first run flagged 4 orphan pages. I asked Claude to fix the relations and it resolved all four. `/log` is what I reach for when I have a mid-day thought I want anchored to a specific project — typing `/log "kicking the fan-noise issue back into standby until I'm onsite"` quietly updates `home-server.md`'s frontmatter and appends the note to its log section, no manual file-finding required.

That weekend was the first time the system felt like a second brain instead of a folder of notes.

---

## Lessons learned (the parts that bit me)

- **Frontmatter discipline matters.** Skip a field once and `/lint` catches you forever. The schema looks heavy but it pays off the first time `/query` finds the right page through `related`.
- **Don't manually edit `raw/`.** It's the audit trail. Any mistake there is silent — you'll never know what got mangled.
- **The concurrency lock is non-negotiable.** Two `/ingest` runs racing each other will corrupt `log.md`, because both insert new entries at the top of the same file. Keep the lock.
- **Sentinel footers are the secret sauce.** `/ingest` ends every successful run with `<!-- ingest-result: ... -->`. The cron parses that line. Without it, automation has nothing to reason about. Don't strip it.
- **Snapshot pages need the marker.** Anything dated and append-only (weekly summaries, handoffs) should include `<!-- standup: exclude snapshot -->` under the title so health checks don't flag them as stale.
- **Status JSON beats scraping logs.** Write structured run results from day one. `nightly-status.json` is the file every future automation will read.
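
The snapshot-marker rule is easy to picture as code. The sketch below shows a staleness check that honors the marker. The 90-day threshold is a made-up value for illustration (this guide does not state one), and `is_stale` is not part of the starter files.

```python
from datetime import date, timedelta

SNAPSHOT_MARKER = "<!-- standup: exclude snapshot -->"
STALE_AFTER = timedelta(days=90)  # hypothetical threshold; pick your own


def is_stale(page_text: str, last_updated: date, today: date) -> bool:
    """Snapshot pages are never stale; other pages go stale past the threshold."""
    if SNAPSHOT_MARKER in page_text:
        return False
    return today - last_updated > STALE_AFTER
```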

---

## Going further (not covered here)

The four-command kernel above is the minimum viable second brain. There's more I built on top of it that didn't fit the "open-ended starter" goal of this guide:

- **Weekly `/standup` automation** that surfaces top 1–2 active projects with handoff briefs and posts to a chat channel.
- **A multi-agent council** (Daedalus for triage, Intel for research, Hermes for comms) that uses the vault as shared memory.
- **Auto-memory across Claude Code sessions** so context persists between conversations without you having to re-explain.

Those live in my private repo. If you've gotten the kernel working and want to discuss extensions, my contact is at the bottom of the showcase page.

---

## Credits

Built on [Claude Code](https://claude.com/claude-code). The structural pattern is inspired by Andrej Karpathy's LLM Wiki approach; the editor of choice is [Obsidian](https://obsidian.md/) but the vault is plain Markdown and works in anything. The conventions in `CLAUDE.md` are mine and yours to bend — the only rule I'd call non-negotiable is "raw/ is read-only."

---

Built and documented by **Dylan Palumbo**. Reach out: <machina.infastructure@gmail.com>.
