Reverse-Process Synthetic Data Generation for Math Reasoning
Training LLMs on mathematical reasoning by inverting easy-to-solve problems: generate derivatives, reverse them into integration exercises with full step-by-step solutions.
My master's project on maximum likelihood estimation for series systems with right-censored and masked failure data.
Many structures come in pairs: forward/reverse AD, push/pull iteration, encode/decode. Recognizing duality lets you transfer theorems and insights between domains.
A reflection on eleven explorations in generic programming, and how algorithms arise from algebraic structure.
Closed-form MLEs and Fisher information for exponential series systems with masked failure data. No numerical optimization required.
A C++17 header-only library that formalizes a pattern behind FFT, logarithmic arithmetic, and Bayesian inference: transform to a domain where your target operation is cheap.
A C++ header-only library that treats disjoint interval sets as proper mathematical objects with Boolean algebra operations.
A Python library for rule-based term rewriting with pattern matching, multiple input formats, and an interactive REPL.
Formalizing oblivious computing through cipher maps and algebraic cipher types, using category theory for functorial composition of privacy-preserving transformations.
Three approaches to computing derivatives (forward-mode AD, reverse-mode AD, and finite differences), each with different trade-offs for numerical computing and machine learning.
Space bounds, entropy requirements, and cryptographic security properties of perfect hash functions.
Numerical integration meets generic programming. By requiring only ordered field operations, the quadrature routines work with dual numbers, giving you differentiation under the integral for free.
The Bernoulli Model is a framework for reasoning about probabilistic data structures by treating noisy outputs as Bernoulli-distributed approximations of latent values, from Booleans to set-indicator functions.
Reverse-mode automatic differentiation is just the chain rule applied systematically. I built one in C++20 to understand what PyTorch and JAX are actually doing.
Choosing step size h for finite differences: small enough for a good approximation, not so small that floating-point errors eat your lunch.
Dual numbers extend the reals with an infinitesimal epsilon where epsilon^2 = 0. Evaluate f(x + epsilon) and you get f(x) + f'(x)*epsilon. The derivative falls out of the algebra.
elementa is a linear algebra library built to teach. Every design decision prioritizes clarity over cleverness. Code that reads like a textbook and compiles.
The same GCD algorithm works for integers and polynomials because both are Euclidean domains. One structure, many types, same algorithms.
Rational numbers give exact arithmetic where floating-point fails. The implementation connects GCD, the Stern-Brocot tree, and the algebraic structure of fields.
The Miller-Rabin primality test demonstrates how probabilistic algorithms achieve arbitrary certainty, trading absolute truth for practical efficiency.
Integers modulo N form a ring, an algebraic structure that determines which algorithms apply. Understanding this structure unlocks algorithms from cryptography to competitive programming.
The Russian peasant algorithm computes products, powers, Fibonacci numbers, and more, once you see the underlying algebraic structure.
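As a rough illustration of that last blurb, here is a minimal Python sketch of the Russian peasant idea, assuming the only requirement is an associative operation with an identity element. The names `power` and `mat_mul` are hypothetical and are not taken from the post itself.

```python
def power(x, n, op, identity):
    """Combine n copies of x under an associative op in O(log n) applications.

    Hypothetical helper for illustration; `op` must be associative and
    `identity` must be its identity element.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    result = identity
    while n > 0:
        if n & 1:              # lowest bit set: fold the current x into the result
            result = op(result, x)
        x = op(x, x)           # double (or square) x for the next bit
        n >>= 1
    return result


def mat_mul(a, b):
    """Multiply two 2x2 matrices given as nested tuples."""
    (a11, a12), (a21, a22) = a
    (b11, b12), (b21, b22) = b
    return ((a11 * b11 + a12 * b21, a11 * b12 + a12 * b22),
            (a21 * b11 + a22 * b21, a21 * b12 + a22 * b22))


# Addition gives products: 7 added to itself 13 times.
product = power(7, 13, lambda a, b: a + b, 0)

# Multiplication gives powers: 3 multiplied by itself 5 times.
cube_ish = power(3, 5, lambda a, b: a * b, 1)

# 2x2 matrix multiplication gives Fibonacci numbers:
# [[1, 1], [1, 0]]^n has F(n) in its off-diagonal entries.
fib_matrix = power(((1, 1), (1, 0)), 10, mat_mul, ((1, 0), (0, 1)))
fib10 = fib_matrix[0][1]
```

The same loop serves all three cases; only the operation and its identity change, which is the algebraic structure the blurb refers to.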