Reverse-Process Synthetic Data Generation for Math Reasoning
Training LLMs on mathematical reasoning by inverting easy-to-solve problems: generate derivatives, reverse them into integration exercises with full step-by-step solutions.
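The reverse-process idea described here can be sketched in a few lines: pick a random function where the forward operation (differentiation) is mechanical, then pose the result as the harder inverse problem (integration) whose answer is known by construction. This is an illustrative sketch, not the post's actual pipeline; the polynomial representation and function name are assumptions.

```python
import random

def make_integration_exercise(rng: random.Random) -> dict:
    # A polynomial as a coefficient list: coeffs[k] multiplies x^k.
    coeffs = [rng.randint(-5, 5) for _ in range(rng.randint(3, 5))]
    # Easy forward step: differentiate term by term.
    deriv = [k * c for k, c in enumerate(coeffs)][1:]

    def fmt(cs):
        terms = [f"{c}x^{k}" if k else f"{c}" for k, c in enumerate(cs) if c]
        return " + ".join(terms) or "0"

    # The reversed (hard) problem: integrate the derivative. The answer is
    # the original polynomial, up to the constant of integration.
    answer = [0] + coeffs[1:]
    return {"problem": f"integrate {fmt(deriv)} dx",
            "solution": f"{fmt(answer)} + C"}

ex = make_integration_exercise(random.Random(0))
```

Because the solution is generated before the problem, every exercise comes with a guaranteed-correct answer, which is the point of the reverse-process trick.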
Fine-tuning a small language model to generate ElasticSearch DSL queries from natural language, as a proof of concept for domain-specific LLM specialization.
The most dramatic possibility in AI might arrive through the most mundane mechanism. Not a beam of sacred light. A sufficiently good build system.
The classical AI curriculum teaches rational agents as utility maximizers. The progression from search to RL to LLMs is really about one thing: finding representations that make decision-making tractable.
A corpus-based language model using suffix arrays for O(m log n) pattern matching. The corpus is the model.
A logic programming system that alternates between wake and sleep phases, using LLMs for knowledge generation during wake and compression-based learning during sleep.
Validating Context Tree Weighting through experiments, including a bug that changed everything.
Solomonoff induction, MDL, speed priors, and neural networks are all special cases of one Bayesian framework with four knobs.
The evolution of neural sequence prediction, and how it connects to classical methods
The bias-data trade-off in sequential prediction: when to use CTW, n-grams, or neural language models.
How RLHF-trained language models may develop instrumental goals, and the information-theoretic limits on detecting them.
I experiment with simple predictive/generative models to approximate Solomonoff induction for a relatively simple synthetic data-generating process.
Abstractions let us reason about complex systems despite our cognitive limits. But some systems resist compression entirely.
How the limited capacity of human working memory acts as regularization, shaping our reasoning and possibly preventing cognitive overfitting.
The classical approach to sequence prediction: counting and smoothing
Markov processes and tree sources: understanding where sequences come from
Model averaging over hypotheses, the principled way to handle uncertainty in prediction
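The core of model averaging fits in a few lines: weight each hypothesis by its prior times the likelihood of the observed data, then predict with the posterior-weighted mixture. A toy sketch with two biased-coin hypotheses (the numbers and names are illustrative, not from the post):

```python
def bayes_mixture_predict(seq, hypotheses, priors):
    """Posterior-weighted probability that the next bit is 1.

    hypotheses: candidate values of P(bit = 1); priors: their prior weights.
    """
    weights = []
    for p, prior in zip(hypotheses, priors):
        like = 1.0
        for bit in seq:  # likelihood of the observed sequence under p
            like *= p if bit == 1 else 1 - p
        weights.append(prior * like)
    total = sum(weights)
    posterior = [w / total for w in weights]
    # Mixture prediction: average each hypothesis's forecast by its posterior.
    return sum(q * p for q, p in zip(posterior, hypotheses))

pred = bayes_mixture_predict([1, 1, 1, 1], [0.3, 0.7], [0.5, 0.5])
```

After four heads the posterior concentrates on the 0.7-biased coin, so the mixture prediction moves toward 0.7 rather than committing to either hypothesis outright.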
The optimal predictor is incomputable. What we can learn from it anyway.
The problem of predicting what comes next, from compression to language models