The Bernoulli Model

Random approximate computation: the Bernoulli model, its papers, and the monograph that unifies them. Bloom filters are the special case; this is the general theory.

June 15, 2026

The one-sentence version. A Bernoulli set is a set you can store far below the information-theoretic cost of the exact set, in exchange for a known, tunable rate of false positives and false negatives. The Bernoulli model is the framework that makes that trade precise and then carries it everywhere: from sets to maps, relations, and arbitrary algebraic types, with closed-form error propagation through every operation.

A Bloom filter is the special case everyone already knows: false positives, no false negatives, one of the two error rates pinned to zero. The model generalizes it. Both error rates are first-class and tunable, the construction is provably space-optimal at the information bound, and the same algebra that governs a single membership test governs unions, intersections, joins, function composition, and the Boolean post-processing of noisy measurements.

The monograph: the center of gravity

Everything below is being drawn together into one book.

Bernoulli Types: An Algebra of Approximate Computation builds the model from a single atom, the Bernoulli Boolean, and composes outward: random approximate sets, their entropy and space-optimal construction, maps, relations, and a full algebra of approximate data types. Each paper below is roughly a chapter’s worth of the argument. The monograph is where they become one coherent theory instead of a stack of separate preprints, and it is the thing I most want read. It is in active development.

Read it: the monograph as a PDF (276 pages).

The papers

A family of preprints, each a self-contained result the monograph consolidates. All carry minted Zenodo DOIs.

Foundations

PaperWhat it establishes
Bernoulli Sets: A Model for Random Approximate SetsThe model itself: the two axioms (element-wise independence and conditional block independence), the false-positive / false-negative parameterization, the error-count distributions, and higher-order composition. The hub the rest build on.
Composition of Bernoulli SetsClosed-form error propagation through union, intersection, complement, and difference, the monoid structure, and the relational subset and equality predicates.
Binary Classification Measures for Random Approximate SetsPPV, NPV, accuracy, and Youden’s J as random variables, with prevalence sensitivity by the delta method.
Entropy and Space Complexity of Random Approximate SetsThe information-theoretic lower bound on bits per element and the space-accuracy duality that every optimal construction must meet.
The Bernoulli Model: A Probabilistic Framework for Data Structures and TypesThe type-theoretic generalization: Bool as the atom, approximate sum and product types, and the computational basis from which the rest is built.

Optimal constructions

PaperWhat it establishes
The Bernoulli Hash Function: Optimal Bernoulli Sets and Bernoulli MapsThe space-optimal construction that meets the entropy bound: the threshold predicate, generalized acceptance, and a unified set/map framework.
The Perfect Hash FilterA concrete, efficient implementation of the positive Bernoulli set and map via perfect hashing, with a ranked-search application.

Extensions

PaperWhat it adds
The Algebra of the Random Approximate Map ModelFunctions from X to Y as first-class approximate objects, and the error rates of map composition.
The Abstract Data Type of the Approximate RelationRelational algebra (selection, projection, join) over approximate relations, and a closure dichotomy for structural invariants: a class-induced invariant is free, transitivity admits no bounded closure.

Applications

PaperWhat it adds
A Forward (ε, ω) Calculus for Quantum Readout ErrorThe model as the error calculus for the classical post-processing of quantum measurement, validated on real device calibration. Under review at Quantum Science and Technology.

The software

bernoulli-types is the reference implementation in Python: the atom, the sets, the maps, and the sketches, built outward in phases that mirror the monograph’s parts, with optional plain-C acceleration and pure-Python fallbacks. It is both the research artifact and the book’s executable companion. Source.

Notes and posts

Shorter writing on the model, tagged bernoulli-model:

  • A Boolean Algebra Over Trapdoors · June 2023
    A Boolean algebra framework over trapdoors for cryptographic operations. Introduces a homomorphism from powerset Boolean algebra to n-bit strings via cryptographic hash functions, enabling secure …

Connection to trapdoor computing

The Bernoulli model is also the error model for Trapdoor Computing. At the mathematical level a Bernoulli map is a cipher map: the construction’s salt is the trapdoor key, the acceptance predicate implements the cipher map, and the space formula \(-\log_2(\varepsilon) + \mu\) governs both. The composition theorem \(\eta_{\text{total}} = 1 - \prod_i (1 - \eta_i)\) is the same law on both sides. If you came here from the cryptography, that is the bridge.


Alexander Towell, metafunctor.com.

Topics

#bernoulli model #random approximate sets #bloom filters #probabilistic data structures #perfect hashing #information theory #approximate computing