Below you will find pages that utilize the taxonomy term “Algebraic_hashing”
The Beautiful Deception: How 256 Bits Pretend to be Infinity
July 1, 2024
How do you store infinity in 256 bits? You don’t. But you can fake it well enough that no bounded observer can tell the difference. This paper is about that deception, why it works, and what it tells us about randomness.
The impossible oracle
A random oracle maps any input to an infinite sequence of perfectly random bits. Try to implement one and you fail immediately:
- Memory unboundedness: each new query exhausts memory
- Non-serializability: can’t save/restore state
- Non-reproducibility: each instance generates different values
- Non-distributability: can’t share across systems
This isn’t a limitation of current hardware. It’s a constructive proof that true random oracles can’t exist in our computational universe.
The lie that works
From this impossibility comes something useful:
class LazyDigest:
def __init__(self, seed):
self.seed = seed
def __getitem__(self, index):
return hash(seed || index)[0]
256 bits of entropy generating what appears to be an infinite random sequence.
The deception:
- Appears: infinite random sequence
- Actually: deterministic function with 256 bits of state
- Information content: K(LazyDigest) = 256 bits + constant
- Apparent information: infinite
We’re achieving a compression ratio of infinity, representing unbounded data with bounded information.
Why it works
Computational indistinguishability. If h is a secure PRF, no polynomial-time algorithm can tell LazyDigest apart from truly random output. We’re not random. We’re computationally hard to distinguish from random. This weaker guarantee is sufficient for all of cryptography.
Since LazyDigest has finite state, it must eventually cycle. After at most 2^256 queries, it repeats. But the expected cycle length is 2^128, roughly 10^38. At a billion queries per second, cycling takes about 10^21 years. The universe is roughly 10^10 years old.
Advanced constructions
Hierarchical seeding extends the effective period:
epoch_seed = h(master_seed || "epoch" || epoch)
chunk_seed = h(epoch_seed || "chunk" || chunk)
value = h(chunk_seed || position)[0]
XOR multi-hash hedges against individual algorithm failures:
result = sha256(seed||index)[0] ^ sha512(seed||index)[0] ^
sha3_256(seed||index)[0] ^ blake2b(seed||index)[0]
The system stays secure if at least one hash function holds. This hedges against future cryptanalysis, quantum vulnerabilities, and implementation bugs.
Sponge construction reserves capacity bits that never leave the system, providing tunable security with 2^(capacity/2) collision resistance.
Random oracles and uncomputable reals
Most real numbers are uncomputable. They require infinite information to specify. A true random oracle is the cryptographic analog:
- Computable reals (like pi, e): measure zero, finite programs
- Uncomputable reals: measure one, infinite information
- LazyDigest: computable, appears random
- Random oracle: uncomputable, truly random
We’re using computable functions to approximate uncomputable ones.
Algebraic Hashing: Composable Hash Functions Through XOR
November 1, 2022
Most hash libraries treat hash functions as opaque blobs. You put data in, you get bits out, and that’s the end of the story. Algebraic Hashing takes a different approach: it exposes the mathematical structure underneath, so you can compose hash functions like algebraic expressions. And because this is C++20 with concepts and templates, the composition resolves entirely at compile time. Zero runtime overhead.
The observation
Hash functions form an abelian group under XOR:
- Closure:
h1 XOR h2is still a valid hash function - Associativity:
(h1 XOR h2) XOR h3 = h1 XOR (h2 XOR h3) - Identity: XOR with zero
- Inverses: each hash is its own inverse under XOR
This is a clean algebraic structure, and it’s the foundation for everything that follows.
What you can do with it
Compile-time composition. Using C++20 concepts and template metaprogramming:
auto composed = fnv1a<> ^ sha256<>;
auto hash = composed("data"); // Zero runtime overhead
All composition resolves at compile time. No virtual dispatch, no function pointers.
Provable properties. XOR composition preserves uniform distribution (under independence), avalanche effect, and collision resistance for cryptographic hashes. These aren’t just empirical observations. They follow from the group structure.
Universal interface. Works with any hash function: non-cryptographic (FNV-1a), perfect (FKS), or cryptographic (SHA-256). The algebra doesn’t care about the implementation details.
Practical uses
- Domain separation:
hash_user ^ hash_timestampprevents collision attacks across domains - Perfect hashing: FKS two-level scheme with pluggable base hash functions
- Composite keys: hash multiple fields independently, then XOR
- Type-based hashing: different hash functions for different types, composed generically
Connection to oblivious computing
This work shares DNA with my oblivious computing research. The common thread is making mathematical structure explicit in the type system. Just as Bernoulli types enforce privacy invariants algebraically, algebraic hashing enforces compositional invariants through group theory. Same philosophy, different domain.
The library is header-only C++20 with zero-cost abstractions via concepts.