These posts explore the philosophy and practice of generic programming as taught by Alex Stepanov, architect of the C++ Standard Template Library.

The central insight: algorithms arise from algebraic structure. The same algorithm that works on integers works on matrices, polynomials, and modular integers—not by accident, but because they share algebraic properties. When you recognize “this is a monoid” or “this is a Euclidean domain,” you immediately know which algorithms apply.

Each post demonstrates one beautiful idea with minimal, pedagogical C++ code (~100-300 lines). The implementations prioritize clarity over cleverness—code that reads like a textbook that happens to compile.

Core Topics

  • Monoids and exponentiation: The Russian peasant algorithm reveals the universal structure of repeated operations
  • Rings and fields: Modular arithmetic teaches which algorithms apply where
  • Euclidean domains: The same GCD algorithm works for integers and polynomials
  • Generic concepts: C++20 concepts express algebraic requirements as compile-time contracts

Further Reading

  • Stepanov & Rose, From Mathematics to Generic Programming
  • Stepanov & McJones, Elements of Programming
  • Finite algebraic structures and what they teach us about algorithms

    The Stepanov Perspective

    Alex Stepanov’s central insight: algorithms arise from algebraic structure. The same algorithm that works on integers works on matrices, polynomials, and modular integers—not by accident, but because they share algebraic properties.

    Integers modulo N form a ring: a set with addition and multiplication satisfying familiar laws. When N is prime, it’s a field—every non-zero element has a multiplicative inverse. Understanding these structures teaches us which algorithms apply where.

    The Ring Z/NZ

    Integers modulo N, written Z/NZ, are equivalence classes:

    • 0 = {…, -N, 0, N, 2N, …}
    • 1 = {…, -N+1, 1, N+1, 2N+1, …}

    Operations are inherited from integers:

    [a] + [b] = [a + b]
    [a] x [b] = [a x b]
    

    The implementation keeps one representative per class, in [0, N):

    template<int64_t N>
    struct mod_int {
        int64_t v;  // Always in [0, N)
    
        static constexpr int64_t normalize(int64_t x) {
            x %= N;
            return x < 0 ? x + N : x;
        }
    
        constexpr mod_int(int64_t x) : v(normalize(x)) {}
    };
    

    Ring Axioms

    A ring (R, +, x) satisfies:

    Addition forms an abelian group:

    • Associative: (a + b) + c = a + (b + c)
    • Identity: a + 0 = a
    • Inverses: a + (-a) = 0
    • Commutative: a + b = b + a

    Multiplication is a monoid:

    • Associative: (a x b) x c = a x (b x c)
    • Identity: a x 1 = 1 x a = a

    Distributive:

    • a x (b + c) = a x b + a x c
    • (b + c) x a = b x a + c x a

    These axioms enable algorithms. Power-by-squaring works because multiplication is associative. The extended GCD works because of the ring structure.

    Fermat’s Little Theorem

    When N is prime, something special happens: every non-zero element has a multiplicative inverse. The set of non-zero elements forms a multiplicative group of order N-1.

    Fermat’s Little Theorem: for prime p and a not congruent to 0 (mod p):

    a^(p-1) = 1 (mod p)
    

    This gives us the inverse:

    a x a^(p-2) = a^(p-1) = 1 (mod p)
    

    So a^(p-2) is the multiplicative inverse of a:

    constexpr mod_int inverse() const {
        return pow(N - 2);  // Using repeated squaring
    }
    

    The Connection to Peasant

    The power function uses the same peasant algorithm from elsewhere:

    Read More →
  • How the Russian peasant algorithm reveals the universal structure of exponentiation

    The Algorithm

    Russian peasants had a clever method for multiplication that doesn’t require memorizing times tables. To compute 23 x 17:

    23    17
    11    34     (halve, double)
     5    68
     2   136
     1   272
    

    Add the right column wherever the left is odd: 17 + 34 + 68 + 272 = 391. That’s 23 x 17.

    Why does this work? Because we’re really computing:

    23 x 17 = (16 + 4 + 2 + 1) x 17 = 16x17 + 4x17 + 2x17 + 17
    

    The algorithm only needs three operations on the multiplier:

    • half(n) — integer division by 2
    • even(n) — test if divisible by 2
    • Addition on the result

    From Multiplication to Exponentiation

    Here’s the insight that makes this interesting: the same algorithm computes powers.

    Replace “add to accumulator” with “multiply into accumulator” and “double the multiplicand” with “square the base”:

    T power(T base, int exp) {
        T result = 1;
        while (exp > 0) {
            if (!even(exp)) result = result * base;
            base = base * base;
            exp = half(exp);
        }
        return result;
    }
    

    This is O(log n) multiplications instead of O(n). Computing 2^1000 takes about 10 multiplications, not 1000.

    The Monoid Connection

    The peasant algorithm works whenever you have:

    1. An associative binary operation *
    2. An identity element 1 where 1 * x = x * 1 = x

    This structure is called a monoid. The algorithm computes x * x * ... * x (n times) using O(log n) operations.

    What makes this powerful is that many things form monoids:

    Type Operation Identity Computing x^n gives you…
    Integers x 1 Powers
    Matrices x I Matrix powers
    Strings concat "" String repetition
    Functions compose id Function iteration
    Permutations compose id Permutation powers
    Quaternions x 1 Rotation composition

    Examples in Code

    Fibonacci via Matrix Exponentiation

    The Fibonacci recurrence F(n) = F(n-1) + F(n-2) can be encoded as matrix multiplication:

    [F(n+1)]   [1 1]^n   [1]
    [F(n)  ] = [1 0]   x [0]
    

    Computing F(1,000,000) takes about 20 matrix multiplications:

    mat2 fib_matrix{1, 1, 1, 0};
    mat2 result = power(fib_matrix, 1000000);
    // result.b is F(1,000,000)
    

    Quaternion Rotations

    A rotation by angle theta around axis (x,y,z) is a unit quaternion. Composing rotations is quaternion multiplication. To rotate by theta x n:

    auto rot = rotation_z(theta);
    auto rot_n = power(rot, n);  // O(log n) multiplications
    

    Shortest Paths via Tropical Semiring

    In the tropical semiring, “addition” is min and “multiplication” is +. Matrix “multiplication” computes path lengths. Powers find multi-hop paths:

    Read More →