Recommended Reading

Each list below pairs with a series and extends it with the canonical literature of its topic.

Entries marked ✦ are the works I would hand someone starting out; the rest is depth. Each list ends with a short “How this list is opinionated” section that states what is included, what is deliberately left out, and why.

  • Recommended Reading

    This list extends the Minds and Machines series with the technical and philosophical literature the essays and fiction are in conversation with.

    The sections mirror the layered argument of the series: first the overview-level texts, then the specific technical problems (mesa-optimization, deceptive alignment, reward modeling), then interpretability, then the philosophical substrate the fiction draws on.

    Foundations

    The books you read first if you want to understand why anyone thinks alignment is hard.

    • Human Compatible by Russell (2019) [book] ✦. The “AI should be uncertain about its objective” framing. Probably the clearest single entry point. Publisher.
    • Superintelligence by Bostrom (2014) [book] ✦. The instrumental-convergence and orthogonality-thesis book. Dense, alarmist on purpose, still the reference.
    • The Alignment Problem by Christian (2020) [book]. More historical and journalistic; useful counterpoint to Bostrom’s philosophical register.
    • Life 3.0 by Tegmark (2017) [book]. Broader speculative arc; weaker on technical depth but good at framing the stakes for general audiences.

    Technical Alignment

    Specific problems and specific attempts at solving them. The papers the essays reference.

    • Concrete Problems in AI Safety by Amodei, Olah, Steinhardt, Christiano, Schulman, Mané (2016) [paper] ✦. The canonical problem-taxonomy paper: reward hacking, side effects, safe exploration, distributional shift. arXiv.
    • Risks from Learned Optimization in Advanced Machine Learning Systems by Hubinger, van Merwijk, Mikulik, Skalse, Garrabrant (2019) [paper] ✦. Mesa-optimization and deceptive alignment. The technical spine under “The Policy” novel. arXiv.
    • Deep Reinforcement Learning from Human Preferences by Christiano, Leike, Brown, Martic, Legg, Amodei (2017) [paper]. RLHF, the method behind modern instruction-tuned LLMs. arXiv.
    • Scalable Agent Alignment via Reward Modeling by Leike, Krueger, Everitt, Martic, Maini, Legg (2018) [paper]. Recursive reward modeling as a scalable oversight proposal. arXiv.
    • AI Safety via Debate by Irving, Christiano, Amodei (2018) [paper]. Debate as an alignment protocol. arXiv.
    • The Alignment Problem from a Deep Learning Perspective by Ngo, Chan, Mindermann (2022) [paper]. Updated survey after RLHF landed. arXiv.
    • Unsolved Problems in ML Safety by Hendrycks, Carlini, Schulman, Steinhardt (2021) [paper]. A research-agenda paper covering robustness, monitoring, alignment, systemic safety. arXiv.

    Interpretability

    The empirical substrate that makes alignment tractable, or shows why it is not.

    Read More →
  • Recommended Reading

    This list extends the RL Assumptions series with the literature the interactive posts are in conversation with.

    The sections mirror the series structure: the foundational textbook first, then the classical tabular and linear methods, then deep RL, then the model-based and universal-agent frame that anchors the series’ final synthesis.

    Foundations

    The books you read before everything else.

    • Reinforcement Learning: An Introduction by Sutton, Barto (2018, 2nd ed.) [book] ✦. The canonical text. Free PDF from the authors. incompleteideas.net.
    • Dynamic Programming and Optimal Control by Bertsekas (2017, Vol I 4th ed.) [book]. The control-theoretic counterpart. Denser than Sutton-Barto but complementary.
    • Algorithms for Reinforcement Learning by Szepesvári (2010) [book]. Short, rigorous, free.
    • Reinforcement Learning Course by Silver (2015) [course]. David Silver’s DeepMind lectures. The standard video curriculum.

    Tabular and Linear Methods

    The classical core: Q-learning, SARSA, TD, and their convergence theory.

    • Learning from Delayed Rewards by Watkins (1989) [paper]. The Q-learning thesis.
    • Q-Learning by Watkins, Dayan (1992) [paper] ✦. The convergence-to-optimal proof for tabular Q-learning. Machine Learning 8.
    • Learning to Predict by the Methods of Temporal Differences by Sutton (1988) [paper]. The original TD-learning paper. Machine Learning 3.
    • An Analysis of Temporal-Difference Learning with Function Approximation by Tsitsiklis, Van Roy (1997) [paper] ✦. Why linear TD converges and nonlinear TD may not. IEEE Trans. Automatic Control 42.
    • Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning by Williams (1992) [paper]. REINFORCE, the ancestor of every modern policy-gradient method. Machine Learning 8.
    • Policy Gradient Methods for Reinforcement Learning with Function Approximation by Sutton, McAllester, Singh, Mansour (2000) [paper]. The policy-gradient theorem. NIPS.
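
    The classical core above is small enough to run. Below is a minimal sketch of Watkins' tabular Q-learning with ε-greedy exploration; the env_step(s, a) → (next_state, reward, done) interface and the toy two-state chain are illustrative assumptions, not taken from any of the papers:

```python
import random

def q_learning(env_step, n_states, n_actions, episodes=500,
               alpha=0.1, gamma=0.99, eps=0.1, seed=0):
    """Watkins' tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # Explore with probability eps, otherwise act greedily.
            a = (rng.randrange(n_actions) if rng.random() < eps
                 else max(range(n_actions), key=lambda i: Q[s][i]))
            s2, r, done = env_step(s, a)
            # TD target bootstraps on the best action at the next state.
            target = r if done else r + gamma * max(Q[s2])
            Q[s][a] += alpha * (target - Q[s][a])
            s = s2
    return Q

# Toy two-state chain: action 1 ends the episode with reward 1,
# action 0 loops in state 0 with reward 0.
def step(s, a):
    return (1, 1.0, True) if a == 1 else (0, 0.0, False)

Q = q_learning(step, n_states=2, n_actions=2)
```

    This is the Watkins-Dayan result in miniature: the estimate Q[0][1] approaches the optimal value 1, while the bootstrapped value of looping stays a factor γ below it.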

    Deep RL

    The practical explosion after 2013, and the architectures that made function approximation work.

    • Human-Level Control through Deep Reinforcement Learning by Mnih, Kavukcuoglu, Silver, et al. (2015) [paper] ✦. DQN. The paper that showed deep RL works. Nature 518. arXiv.
    • Continuous Control with Deep Reinforcement Learning by Lillicrap, Hunt, Pritzel, et al. (2016) [paper]. DDPG; deep RL in continuous action spaces. arXiv.
    • Trust Region Policy Optimization by Schulman, Levine, Moritz, Jordan, Abbeel (2015) [paper]. TRPO. arXiv.
    • Proximal Policy Optimization Algorithms by Schulman, Wolski, Dhariwal, Radford, Klimov (2017) [paper] ✦. PPO. The workhorse of modern policy-gradient RL. arXiv.
    • Soft Actor-Critic by Haarnoja, Zhou, Abbeel, Levine (2018) [paper]. SAC. Maximum-entropy RL done right. arXiv.
    • A Distributional Perspective on Reinforcement Learning by Bellemare, Dabney, Munos (2017) [paper]. C51 and the distributional-RL program. arXiv.

    Model-Based and Universal RL

    The horizon the series closes on: representation is prior, and the limit of the prior is AIXI.

    Read More →
  • Recommended Reading

    This list extends the Sequential Prediction series with the formal theory behind predicting xₙ from x₁..xₙ₋₁.

    The sections mirror the series’ arc: information-theoretic foundations first (prediction equals compression), then the CTW line specifically, then universal prediction (the Solomonoff connection), then modern neural sequence models.

    Information Theory

    The framework that unifies compression, coding, and prediction.

    • A Mathematical Theory of Communication by Shannon (1948) [paper] ✦. The founding paper. Entropy, channel capacity, source coding. Bell System Technical Journal 27. PDF.
    • Elements of Information Theory by Cover, Thomas (2006, 2nd ed.) [book] ✦. The standard graduate text. Entropy, mutual information, channel coding, source coding.
    • Information Theory, Inference, and Learning Algorithms by MacKay (2003) [book] ✦. Free PDF from the author; pairs information theory with Bayesian ML. Inference Group.

    Context-Tree Weighting

    The specific algorithm the series focuses on, and its immediate lineage.

    • The Context-Tree Weighting Method: Basic Properties by Willems, Shtarkov, Tjalkens (1995) [paper] ✦. The CTW paper. Exact Bayesian model averaging over tree sources, provable redundancy bounds. IEEE Trans. Information Theory 41(3).
    • The Context-Tree Weighting Method: Extensions by Willems (1998) [paper]. Beta versions of the estimator, longer contexts. IEEE Trans. Information Theory 44.
    • The Performance of Universal Source Coding Methods by Rissanen (1984) [paper]. The redundancy bounds CTW achieves.
    • The Minimum Description Length Principle by Grünwald (2007) [book]. CTW is an MDL method at heart.
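
    The leaf estimator the CTW papers mix over is worth seeing once. Below is a minimal sketch of the Krichevsky-Trofimov (KT) sequential estimator; "code length" here means the ideal −log₂ probability, the quantity the redundancy bounds above are stated in:

```python
import math

def kt_code_length(bits):
    """Ideal code length (in bits) of a binary sequence under the
    Krichevsky-Trofimov estimator: P(next = 1) = (ones + 1/2) / (n + 1).
    """
    zeros = ones = 0
    total = 0.0
    for b in bits:
        p1 = (ones + 0.5) / (zeros + ones + 1)
        p = p1 if b else 1.0 - p1
        total += -math.log2(p)  # accumulate -log2 P(x_t | x_<t)
        zeros += 1 - b
        ones += b
    return total
```

    On an all-zero sequence of length n the KT code length grows like ½·log₂ n rather than n; that parametric redundancy is the behavior the 1995 bounds extend from single contexts to whole tree sources.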

    Universal Prediction

    The optimal-but-incomputable ideal, and its computable approximations.

    • A Formal Theory of Inductive Inference by Solomonoff (1964) [paper] ✦. Solomonoff induction as the universal predictor. Parts I and II.
    • Universal Artificial Intelligence by Hutter (2005) [book] ✦. AIXI: Solomonoff induction plus utility maximization. Publisher.
    • An Introduction to Kolmogorov Complexity and Its Applications by Li, Vitányi (2019, 4th ed.) [book]. The algorithmic-information-theory reference.
    • Optimal Sequential Prediction in Unknown Sparse Environments by Vovk (various) [paper]. Vovk’s prediction-with-experts line generalizes CTW-style regret bounds.

    Data Compression

    The practical side of the prediction-compression connection.

    • A Universal Algorithm for Sequential Data Compression by Ziv, Lempel (1977) [paper] ✦. LZ77, the ancestor of most modern general-purpose compressors. IEEE Trans. Information Theory 23(3).
    • Compression of Individual Sequences via Variable-Rate Coding by Ziv, Lempel (1978) [paper]. LZ78.
    • Arithmetic Coding for Data Compression by Witten, Neal, Cleary (1987) [paper]. The practical implementation of arithmetic coding, the bridge from probabilistic prediction to bits on disk. CACM 30(6).

    Neural Sequence Modeling

    The modern counterpart: expressive models, less theory, sometimes better in practice.

    Read More →
  • Recommended Reading

    This list extends the SICP series with the companion texts, the Scheme-family pedagogy, and the interpreter-and-PL literature SICP points at.

    The sections are short on purpose: SICP already does the heavy lifting, and most of these books extend or refine specific moves SICP makes.

    The Book and Its Courses

    • Structure and Interpretation of Computer Programs by Abelson, Sussman (1996, 2nd ed.) [book] ✦. Free online. mitpress.mit.edu.
    • MIT 6.001 Lectures by Abelson, Sussman (1986) [course] ✦. The original video lectures; they have aged better than most 1980s software lectures. MIT OCW.
    • Structure and Interpretation of Classical Mechanics by Sussman, Wisdom (2015, 2nd ed.) [book] ✦. SICP applied to physics. Makes the “programming as a medium for ideas” argument concrete in a new domain. Free online.
    • Functional Differential Geometry by Sussman, Wisdom (2013) [book]. Same authors, same method, applied to tensors. The clearest treatment of covariant derivatives I have read.

    SICP-Adjacent Texts

    Successors and parallel developments.

    • How to Design Programs by Felleisen, Findler, Flatt, Krishnamurthi (2018, 2nd ed.) [book] ✦. HtDP: the structured-recipes pedagogy that grew out of SICP teaching experience. Free online.
    • Concrete Abstractions by Hailperin, Kaiser, Knight (1999) [book]. A SICP-inspired text that stays with Scheme longer and handles concurrency more carefully.
    • The Little Schemer by Friedman, Felleisen (1995, 4th ed.) [book]. The Socratic introduction to recursion. Short, essential.
    • The Seasoned Schemer by Friedman, Felleisen (1996) [book]. Continuations and higher-order abstractions.
    • The Reasoned Schemer by Friedman, Byrd, Kiselyov, Hemann (2018, 2nd ed.) [book] ✦. Logic programming in the Little-Schemer style. miniKanren.
    • The Little Typer by Friedman, Christiansen (2018) [book]. Dependent types in the same style.

    Interpreters and Programming Languages

    The direct downstream: implementing the languages that SICP teaches you to reason about.

    • Essentials of Programming Languages by Friedman, Wand (2008, 3rd ed.) [book] ✦. The classic interpreter-design text; pairs perfectly with SICP Chapters 4 and 5.
    • Programming Languages: Application and Interpretation by Krishnamurthi (2017, 3rd ed.) [book]. Modern PL-design pedagogy. Free online.
    • Paradigms of Artificial Intelligence Programming by Norvig (1991) [book] ✦. Classic AI algorithms implemented in Common Lisp. A graduate course in software design as much as in AI.
    • On Lisp by Graham (1993) [book]. Macros and the fully-expressed Lisp-as-meta-language style. Free from the author.
    • Lisp in Small Pieces by Queinnec (1996) [book]. Implementations of Scheme at multiple levels of sophistication, including ones with continuations and first-class environments.
    • Crafting Interpreters by Nystrom (2021) [book]. Modern, practical, covers both a tree-walking and a bytecode interpreter. Free online.

    How this list is opinionated

    The thread: computation is a medium for expressing ideas, and good abstractions come from understanding the algebraic structure of what you are computing. SICP is the axiomatic statement; the rest of this list either extends the method to new domains (mechanics, types, logic) or implements the languages the method uses.

    Read More →
  • Recommended Reading

    This list extends the Statistical Reliability series with the canonical literature of the field.

    The list is organized to mirror the series: the foundational survival-analysis canon first, then parametric models, then the specific thread the series develops (competing risks under masking), then the likelihood machinery, then software.

    Foundations

    The graduate-level canon. If you are going to own two books on survival analysis, these are the two.

    • The Statistical Analysis of Failure Time Data by Kalbfleisch & Prentice (2002, 2nd ed.) [book] ✦. Censoring, likelihood construction, parametric and semiparametric models, competing risks. The reference I reach for first.
    • Survival Analysis: Techniques for Censored and Truncated Data by Klein & Moeschberger (2003, 2nd ed.) [book] ✦. The applied-stats counterpoint to Kalbfleisch-Prentice. Worked examples, cleaner for self-study.
    • Regression Models and Life-Tables by Cox (1972) [paper] ✦. The proportional hazards model. JRSS B.
    • Nonparametric Estimation from Incomplete Observations by Kaplan & Meier (1958) [paper]. The product-limit estimator. Foundational for anything involving right-censored data. JASA.
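
    The product-limit estimator fits in a dozen lines. A minimal sketch, assuming right-censoring only; the (times, events) interface is an illustrative choice, not from the paper:

```python
def kaplan_meier(times, events):
    """Kaplan-Meier product-limit estimator.

    times:  observed times (failure or censoring)
    events: 1 if the failure was observed at that time, 0 if censored
    Returns (t, S(t)) at each distinct time with at least one observed
    failure, where S multiplies (1 - d_i / n_i) over event times t_i <= t.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    s = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        group = [e for tt, e in data if tt == t]
        d = sum(group)            # observed failures at t
        if d > 0:
            s *= 1.0 - d / n_at_risk
            curve.append((t, s))
        n_at_risk -= len(group)   # failures and censorings both leave the risk set
        i += len(group)
    return curve
```

    With times (1, 2, 3) and events (1, 0, 1), the curve drops to 2/3 at t = 1, the censoring at t = 2 shrinks the risk set without a drop, and the last observed failure takes S to 0.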

    Parametric Models

    The series works almost entirely in parametric land: specifying distribution families for component lifetimes and estimating parameters via MLE. These are the texts that made that setting rigorous.

    • Statistical Methods for Reliability Data by Meeker, Escobar & Pascual (2022, 2nd ed.) [book] ✦. The reliability-engineering companion to Kalbfleisch-Prentice. Strong on parametric likelihood, Fisher information, large-sample theory, accelerated life testing.
    • A Statistical Distribution Function of Wide Applicability by Weibull (1951) [paper]. The original Weibull paper. Short, historically essential. J. Appl. Mech. 18.
    • Statistical Models and Methods for Lifetime Data by Lawless (2003, 2nd ed.) [book]. Encyclopedic parametric reference; covers what the other two skim.
    • Applied Life Data Analysis by Nelson (1982, Wiley reprint 2005) [book]. Older but strong on engineering applications.

    Competing Risks and Masked Data

    The specific thread the series develops. Classical competing risks assumes the cause of each failure is observed; the series relaxes that assumption. This section traces the prior art that relaxation builds on.

    • The Analysis of Failure Times in the Presence of Competing Risks by Prentice, Kalbfleisch, Peterson, Flournoy, Farewell, Breslow (1978) [paper] ✦. Canonical framing of cause-specific hazards and the identifiability issues competing risks creates. Biometrics 34.
    • Estimating Component Reliability from Masked System Life Data by Flehinger, Reiser & Yashchin (1996; extended in 1998, 2001, 2002) [paper] ✦. A decade-long program on inference under masked component causes. Directly in the series’ lineage.
    • Bayesian Estimation of Component Reliability from Masked System Life Data by Reiser, Flehinger & Conn (1996) [paper]. The Bayesian counterpart to the frequentist program above.
    • A Nonparametric Estimation of Component Reliability from Masked System Life Test Data by Guess, Usher & Hodgson (1991) [paper]. Earlier treatment of the masked-cause problem; one of the starting points the foundation paper revisits.
    • Inference Based on the Weibull Model from Masked-Failure Data by Lin, Usher & Guess (1993) [paper]. Classical parametric (Weibull) treatment of masked series systems. Essentially the point of departure for the master’s thesis.
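
    The masked-cause likelihood these papers analyze can be sketched for the simplest case. A minimal sketch assuming a series system, exponential component lifetimes, and the standard symmetry assumption that masking does not depend on the true cause; the function name and the (t, candidates) shape are illustrative, not the series' R-package API:

```python
import math

def masked_loglik(rates, observations):
    """Log-likelihood for a series system with masked failure causes.

    rates:        failure rate lambda_j for each exponential component.
    observations: (t, candidates) pairs, where candidates is the set of
                  component indexes that could have caused the failure
                  observed at time t.
    Each observation contributes log(sum_{j in C} lambda_j) - (sum_k lambda_k) * t,
    i.e. the cause-specific densities summed over the candidate set.
    """
    total_rate = sum(rates)
    ll = 0.0
    for t, candidates in observations:
        cause_rate = sum(rates[j] for j in candidates)
        ll += math.log(cause_rate) - total_rate * t
    return ll
```

    Widening the candidate set is exactly where cause information is lost while the system-level timing information survives; that trade-off is what the Flehinger-Reiser-Yashchin program quantifies.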

    Likelihood Theory

    The likelihood-based foundation the whole series rests on. The R-package ecosystem (algebraic.mle, likelihood.model, and friends) is a direct expression of ideas in these works.

    Read More →
  • Recommended Reading

    This list extends the Stepanov series with the canon that the principle “algorithms arise from algebraic structure” belongs to.

    The sections mirror the series’ layers: first Stepanov’s own writings (the primary source), then the functional-algebra-of-programming lineage, then the practical C++ machinery that implements generic programming in modern form, then the type-theoretic and category-theoretic substrate.

    The Stepanov Canon

    Read these in this order.

    • From Mathematics to Generic Programming by Stepanov, Rose (2015) [book] ✦. The accessible entry point: starts with integer arithmetic and GCD, ends with groups, rings, and STL design. The book to read first.
    • Elements of Programming by Stepanov, McJones (2009) [book] ✦. The rigorous axiomatic reference. Dense but definitive.
    • Notes on Programming by Stepanov (2007) [notes]. The unpolished version of the same material, with his raw reasoning visible. Free PDF. stepanovpapers.com.
    • Efficient Programming with Components (A9 Lectures) by Stepanov [course]. The video series covering the same material in lecture form. YouTube.

    Algebra of Programming

    The functional lineage: reasoning about programs as equational manipulations of algebraic structures.

    • Algebra of Programming by Bird, de Moor (1997) [book] ✦. The canonical treatment of program derivation by equational reasoning over algebraic data types. Out of print but the free draft circulates.
    • Purely Functional Data Structures by Okasaki (1998) [book] ✦. Algebraic specifications as data-structure designs. The thesis (free) is also worth reading.
    • Pearls of Functional Algorithm Design by Bird (2010) [book]. Worked examples of equational reasoning applied to specific problems.
    • The Essence of Functional Programming by Wadler (1992) [paper] ✦. Monads as the algebraic structure under effectful computation. POPL.

    Modern C++ Generic Programming

    The practical machinery that implements Stepanov’s ideas in a language.

    • C++ Templates: The Complete Guide by Vandevoorde, Josuttis, Gregor (2017, 2nd ed.) [book] ✦. The reference for template metaprogramming, concepts, and the compile-time side of generic code.
    • Effective Modern C++ by Meyers (2014) [book] ✦. Why the modern language looks the way it does. Required reading for anyone writing C++ since C++11.
    • A Tour of C++ by Stroustrup (2022, 3rd ed.) [book]. The language designer’s short-and-readable overview of modern features.
    • C++20: The Complete Guide by Josuttis (2022) [book]. The concepts mechanism that finally made Stepanov’s style first-class.
    • Range-v3 by Niebler [software]. The library that inspired C++20 ranges; generic-programming-as-pipelines done rigorously.

    Type Theory and Category Theory

    The theoretical backdrop: type systems as algebraic structures, programs as arrows.

    Read More →
  • Recommended Reading

    This list extends The Learning Problem series with the formal theory that the essays summarize and apply.

    The sections mirror the series’ argument: Solomonoff’s optimal but incomputable starting point, then Bayesian inference as the tractable principled framework, then the statistical learning theory that bounds what is actually learnable, then MDL as the compression-prediction connection, then online learning for the adversarial-sequence setting.

    Universal Induction

    The theoretical optimum, what it means, and why you cannot run it.

    • A Formal Theory of Inductive Inference by Solomonoff (1964) [paper] ✦. Parts I and II. The original Solomonoff induction paper. Incomputable but optimal under every reasonable Bayes loss. Part I, Part II.
    • Universal Artificial Intelligence by Hutter (2005) [book] ✦. AIXI: the formal universal-agent construction, with rigorous convergence and optimality analysis. Publisher.
    • An Introduction to Kolmogorov Complexity and Its Applications by Li, Vitányi (2019, 4th ed.) [book] ✦. The reference for algorithmic information theory, Kolmogorov complexity, incompressibility, and the Solomonoff prior.
    • A Theory of Universal Artificial Intelligence Based on Algorithmic Complexity by Hutter (2000) [paper]. Shorter AIXI derivation if you want the machinery without the full book.

    Bayesian Foundations

    The principled framework every practical learning algorithm approximates.

    • Probability Theory: The Logic of Science by Jaynes (2003) [book] ✦. Bayesian reasoning derived from desiderata, not assumed. The epistemological spine under every informal “all induction is Bayesian” claim.
    • Information Theory, Inference, and Learning Algorithms by MacKay (2003) [book] ✦. Bayes, information theory, and neural networks together. Free PDF from the author. Inference Group.
    • Bayesian Data Analysis by Gelman, Carlin, Stern, Dunson, Vehtari, Rubin (2013, 3rd ed.) [book]. The applied-modeling counterpart to Jaynes.
    • Pattern Recognition and Machine Learning by Bishop (2006) [book]. The graduate textbook that taught a generation the Bayesian view of ML. Now free from the publisher.

    Statistical Learning Theory

    What is learnable from finite samples, under what assumptions, with what sample complexity.

    • A Theory of the Learnable by Valiant (1984) [paper] ✦. The PAC-learning framework. CACM 27(11).
    • The Nature of Statistical Learning Theory by Vapnik (2000, 2nd ed.) [book] ✦. VC dimension, structural risk minimization, the statistical foundations of SVMs.
    • Understanding Machine Learning: From Theory to Algorithms by Shalev-Shwartz, Ben-David (2014) [book]. Modern graduate text; best single source for PAC, VC, Rademacher, online learning. Free PDF from the authors.
    • Patterns, Predictions, and Actions by Hardt, Recht (2022) [book]. Short, modern, covers learning theory and the fairness/causality machinery that older texts skip. Free online.
    • Foundations of Machine Learning by Mohri, Rostamizadeh, Talwalkar (2018, 2nd ed.) [book]. The standard reference for formal generalization bounds.
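
    For a finite hypothesis class in the realizable case, the PAC guarantee in Valiant's framework reduces to a one-line bound, m ≥ (1/ε)(ln|H| + ln(1/δ)). A sketch of that calculation:

```python
import math

def pac_sample_size(h_size, eps, delta):
    """Realizable-case sample complexity for a finite hypothesis class H:
    with m >= (1/eps) * (ln|H| + ln(1/delta)) samples, every hypothesis
    consistent with the data has true error <= eps with prob. >= 1 - delta.
    """
    return math.ceil((math.log(h_size) + math.log(1.0 / delta)) / eps)
```

    Doubling |H| adds only (ln 2)/ε samples; that logarithmic dependence on class size is what makes learning over large hypothesis classes feasible at all.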

    Minimum Description Length

    Prediction as compression, compression as prediction. The practical cousin of Solomonoff.

    Read More →
  • Recommended Reading

    This list extends The Long Echo series with the philosophical and technical literature behind its approach to digital preservation.

    The sections mirror the Long Echo argument: first the Unix-philosophy discipline that motivates plain-text everything, then the explicit digital-preservation literature, then the memex lineage that frames personal archives as thinking tools, then the philosophical substrate on memory and legacy.

    Unix Philosophy

    Plain text, composable tools, narrow interfaces. The cultural substrate under graceful degradation.

    • The Unix Programming Environment by Kernighan, Pike (1984) [book] ✦. The book that taught a generation to build tools that compose. Still the cleanest statement of what “Unix philosophy” means in practice.
    • The Practice of Programming by Kernighan, Pike (1999) [book] ✦. Simpler-is-better applied to software in general.
    • The Art of Unix Programming by Raymond (2003) [book]. The explicit philosophical treatment: rule of separation, rule of composition, rule of transparency. Free online from the author.
    • A Small Matter of Programming by Nardi (1993) [book]. End-user programming, a parallel tradition compatible with Long Echo’s “users own their data” stance.

    Digital Preservation

    The specific literature on keeping bits readable across decades.

    • Avoiding Technological Quicksand: Finding a Viable Technical Foundation for Digital Preservation by Rothenberg (1999) [paper] ✦. The CLIR report that crystallized the emulation-versus-migration debate. Still the best single entry point.
    • OAIS Reference Model (ISO 14721) [standard]. The standard vocabulary for archival systems (SIP, AIP, DIP). Formal, useful as a thinking framework even when you ignore the formalism.
    • Paradigm: A Workbook on Digital Private Papers by Bodleian Libraries (2008) [book] ✦. Free online. Practical playbook for preserving personal digital records; rare as a resource aimed at the personal rather than institutional scale.
    • The Theory and Craft of Digital Preservation by Owens (2018) [book]. Modern survey; connects archival theory to current practice. Free from Johns Hopkins Press.

    Personal Knowledge Archives

    The memex-to-now lineage.

    • As We May Think by Bush (1945) [paper] ✦. The memex essay. Where the idea of a personal associative archive comes from. Atlantic Monthly. Atlantic.
    • Augmenting Human Intellect: A Conceptual Framework by Engelbart (1962) [paper] ✦. The framework paper behind the Mother of All Demos; explicit about tools as cognitive amplifiers.
    • Literary Machines by Nelson (1981) [book]. Xanadu’s vision of bidirectional hypertext. Never built, but the ideas have outlived the system.
    • Tools for Thought by Rheingold (1985, 2000 rev.) [book]. The history of personal computing as an attempt to realize Bush and Engelbart. Free online from the author.
    • How to Take Smart Notes by Ahrens (2017) [book]. Zettelkasten method; the closest contemporary expression of the memex impulse. Practical, narrow, useful.

    Memory, Identity, Legacy

    The philosophical questions the fiction side of the series engages.

    Read More →
  • Recommended Reading

    This list extends the Trapdoor Computing series with the prior art that the cipher-map framework reframes.

    The sections mirror the structure of the series: classical searchable encryption first, then oblivious-access primitives, then the approximate-set data structures the Bernoulli Model generalizes, then the information-theoretic foundations.

    Searchable Encryption

    The classical literature on search over encrypted data. The framework under which “cipher map” can be read as a generalization.

    • Practical Techniques for Searches on Encrypted Data by Song, Wagner, Perrig (2000) [paper] ✦. The paper that started the field. IEEE S&P.
    • Searchable Symmetric Encryption: Improved Definitions and Efficient Constructions by Curtmola, Garay, Kamara, Ostrovsky (2006) [paper] ✦. Established the modern security definitions (IND-CKA1, IND-CKA2) for SSE. IACR.
    • Secure Indexes by Goh (2003) [paper]. Z-IDX, the Bloom-filter-backed secure index. One of the direct ancestors of the cipher-map construction. IACR.
    • Public Key Encryption with Keyword Search by Boneh, Di Crescenzo, Ostrovsky, Persiano (2004) [paper]. PEKS: the public-key counterpart to SSE. IACR.
    • Privacy Preserving Keyword Searches on Remote Encrypted Data by Chang, Mitzenmacher (2005) [paper]. Extends Goh with keyword indexes.

    Oblivious Computation

    Access-pattern hiding and oblivious execution. The lineage that motivates the cipher-map focus on representation uniformity and indistinguishability of access.

    • Software Protection and Simulation on Oblivious RAMs by Goldreich, Ostrovsky (1996) [paper] ✦. The foundational ORAM paper. JACM 43(3).
    • Path ORAM: An Extremely Simple Oblivious RAM Protocol by Stefanov, van Dijk, Shi, Fletcher, Ren, Yu, Devadas (2013) [paper]. Modern, practical ORAM. CCS.
    • Candidate Indistinguishability Obfuscation and Functional Encryption for all Circuits by Garg, Gentry, Halevi, Raykova, Sahai, Waters (2013) [paper]. The iO construction. Adjacent but conceptually important for anyone thinking about functional concealment.

    Approximate Set Data Structures

    The engineering half of the series. These are the constructions the Bernoulli Model generalizes and the cipher-map implements.

    • Space/Time Trade-offs in Hash Coding with Allowable Errors by Bloom (1970) [paper] ✦. The Bloom filter. One of the most-cited data-structures papers in CS. CACM 13(7).
    • Cuckoo Filter: Practically Better Than Bloom by Fan, Andersen, Kaminsky, Mitzenmacher (2014) [paper]. A cuckoo-hashing-based approximate set supporting deletions with better space efficiency. CoNEXT.
    • Xor Filters: Faster and Smaller Than Bloom and Cuckoo Filters by Graf, Lemire (2020) [paper]. Tight approximate-set construction approaching the information-theoretic lower bound.
    • Hash, Displace, and Compress by Belazzougui, Botelho, Dietzfelbinger, Pagh (2009) [paper] ✦. CHD: one of the cleanest minimal perfect hash function constructions.
    • RecSplit: Minimal Perfect Hashing via Recursive Splitting by Esposito, Graf, Vigna (2020) [paper]. Practical near-optimal MPHF.
    • Network Applications of Bloom Filters: A Survey by Broder, Mitzenmacher (2004) [paper]. The classic survey.
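
    Bloom's construction is small enough to sketch whole. A minimal filter that derives its k probe positions from one SHA-256 digest, a double-hashing shortcut that is not in the 1970 paper:

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: k bit probes per key into an m-bit array.

    Membership tests may yield false positives (roughly (1 - e^{-kn/m})^k
    after n insertions) but never false negatives.
    """
    def __init__(self, m_bits=1024, k=4):
        self.m, self.k = m_bits, k
        self.bits = bytearray((m_bits + 7) // 8)

    def _probes(self, key):
        # Two 64-bit halves of one digest simulate k independent hashes.
        d = hashlib.sha256(key.encode()).digest()
        h1 = int.from_bytes(d[:8], "big")
        h2 = int.from_bytes(d[8:16], "big") | 1
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, key):
        for i in self._probes(key):
            self.bits[i // 8] |= 1 << (i % 8)

    def __contains__(self, key):
        return all(self.bits[i // 8] & (1 << (i % 8)) for i in self._probes(key))
```

    The space/time trade-off in Bloom's title is the choice of m and k: for n keys and a target false-positive rate p, the optimum is k = (m/n)·ln 2 and m ≈ 1.44·n·log₂(1/p) bits.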

    Information-Theoretic Foundations

    Where the model grounds its claims about confidentiality, leakage, and the optimality of specific constructions.

    Read More →