This is the bound, systematic version of the Inductive Biases in Neural Networks series. Eight chapters, three parts, built on the pure-Python scratchnn library. It is pedagogical, not new research: there is no theorem here that was not already known. What it offers is one lens applied all the way through, with the hand-derived backward passes, the actual code, and the experiments the posts only gesture at.

A chapter-by-chapter HTML edition, rendered from the LaTeX source, is in progress and will be embedded here. For now the PDF is below.

Discussion & Related

Attention Weight Is Not Information Flow

The trained pointer model reads exactly the right memory cell, provably. Its attention barely shows where. The gap, and the causal probe that closes it.

June 9, 2026 · 4 min read
