# Sequential Prediction Research Hub
A comprehensive research framework for studying sequential prediction algorithms, starting with Context Tree Weighting (CTW) and designed for expansion to other approaches.
## Overview
Sequential prediction—forecasting the next symbol in a sequence—is foundational to:
- Data compression (arithmetic coding, prediction by partial matching)
- Language modeling (from n-grams to transformers)
- Time series forecasting
- Reinforcement learning (model-based RL, world models)
This repository provides implementations, experiments, and educational documentation that connects classical information-theoretic approaches to modern machine learning.
## Installation

```bash
# Clone the repository
git clone https://github.com/username/sequential-prediction.git
cd sequential-prediction

# Install in development mode
pip install -e .

# Install with dev dependencies (pytest, coverage)
pip install -e ".[dev]"
```
## Quick Start

```python
from experiments.ctw import CTW, generate_sequence

# Generate a test sequence from a 3rd-order Markov process
seq = generate_sequence(10000, seed=42)

# Create a CTW predictor with depth 3
ctw = CTW(max_depth=3)

# Evaluate prediction accuracy
accuracy = ctw.accuracy(seq)
print(f"Prediction accuracy: {accuracy:.1%}")
```
## Project Structure

```
sequential-prediction/
├── README.md                    # This file
├── CLAUDE.md                    # Claude Code guidance
├── pyproject.toml               # Package configuration
│
├── docs/                        # Project-wide theory
│   ├── index.md                 # Overview and motivation
│   ├── problem_overview.md      # The sequence prediction problem
│   ├── dgp.md                   # Data Generating Processes
│   ├── solomonoff_induction.md  # Theoretical foundation
│   ├── bayesian_prediction.md   # Bayesian inference perspective
│   ├── ngram_models.md          # Classical n-gram approaches
│   ├── neural_models.md         # LLMs and neural approaches
│   └── comparison.md            # Trade-offs and comparison
│
├── blog/                        # Articles about experiments
│   └── README.md
│
└── experiments/
    └── ctw/                     # CTW experiment module
        ├── README.md            # CTW overview, how to run
        ├── src/
        │   ├── __init__.py
        │   ├── ctw.py           # Core CTW implementation
        │   └── generators/
        │       ├── __init__.py
        │       ├── markov.py    # Markov process generators
        │       └── README.md
        ├── tests/
        │   ├── __init__.py
        │   └── test_ctw.py
        ├── scripts/
        │   ├── __init__.py
        │   ├── ctw_paper.py     # Paper reproduction
        │   └── ctw_multiorder.py  # Multi-order analysis
        └── results/
            └── experimental_results.md
```
## Key Features

### CTW Implementation
- Pure Python implementation of Context Tree Weighting
- Efficient O(depth) updates and predictions
- Numerically stable log-space computations
- Comprehensive docstrings with examples
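
At the heart of each context-tree node is a Krichevsky–Trofimov (KT) estimator, updated sequentially in log space for numerical stability. A minimal sketch of that update for a binary alphabet (illustrative only; the function and variable names here are not `ctw.py`'s API):

```python
import math

def kt_log_prob_update(log_p, counts, symbol):
    """Sequentially update the log KT probability of one binary context.

    counts = [n0, n1] observed so far in this context. Returns the new
    log-probability of the whole sequence after seeing `symbol` (0 or 1),
    and updates counts in place.
    """
    # KT predictive probability: (n_symbol + 1/2) / (n0 + n1 + 1)
    p_next = (counts[symbol] + 0.5) / (counts[0] + counts[1] + 1.0)
    counts[symbol] += 1
    return log_p + math.log(p_next)

# Example: probability of the sequence 0, 1, 0 under the KT estimator
log_p, counts = 0.0, [0, 0]
for s in [0, 1, 0]:
    log_p = kt_log_prob_update(log_p, counts, s)
# KT(0,1,0) = 1/2 * 1/4 * 1/2 = 1/16
```

Working in log space keeps the per-symbol update O(1) per node (hence O(depth) per symbol overall) and avoids underflow on long sequences.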
### Markov Generators
- Configurable order and transition probabilities
- Pre-defined processes (paper’s 3rd-order, sparse, structured)
- Theoretical maximum accuracy computation
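
The "theoretical maximum accuracy" of a Markov source is what an oracle knowing the true transition table can achieve: in each state it predicts the most likely next symbol, so its accuracy is the stationary-weighted sum of each row's largest transition probability. A self-contained sketch of that computation (the function name and power-iteration approach are illustrative, not the generators module's API):

```python
def oracle_accuracy(P):
    """Best achievable one-step prediction accuracy for a Markov chain.

    P[i][j] = probability of moving from state i to state j.
    Returns sum_i pi(i) * max_j P[i][j], where pi is the stationary
    distribution, found here by simple power iteration.
    """
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(10000):  # iterate pi <- pi P until it converges
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return sum(pi[i] * max(P[i]) for i in range(n))

# A biased binary chain: state 0 stays w.p. 0.9, state 1 stays w.p. 0.8
P = [[0.9, 0.1],
     [0.2, 0.8]]
# Stationary pi = [2/3, 1/3], so the ceiling is 2/3*0.9 + 1/3*0.8 = 13/15
```

No predictor, CTW included, can beat this ceiling on average; it is the natural benchmark for the experiment scripts.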
### Experiments
- Reproduce results from the original CTW paper
- Multi-order experiments across DGP orders and CTW depths
- Configurable train/test splits and random seeds
## Running Tests

```bash
# Run all tests
python -m pytest experiments/ctw/tests/ -v

# With coverage
python -m pytest experiments/ctw/tests/ --cov=experiments/ctw/src --cov-report=term-missing
```
## Running Experiments

```bash
# Quick multi-order experiment
python -m experiments.ctw.scripts.ctw_multiorder --quick

# Full paper reproduction
python -m experiments.ctw.scripts.ctw_paper
```
## Documentation
The docs/ directory contains educational documentation covering:
- Problem Overview: Formal definition of sequential prediction
- Data Generating Processes: Understanding the source of sequences
- Solomonoff Induction: The theoretical ideal (and why it’s incomputable)
- Bayesian Prediction: The framework underlying CTW
- N-gram Models: Classical language modeling
- Neural Models: Modern deep learning approaches
- Comparison: Trade-offs between approaches
## Why CTW?
CTW offers a unique combination of properties:
| Property | CTW |
|---|---|
| Sample Efficiency | Excellent—learns from limited data |
| Theoretical Guarantees | Strong—provably optimal for tree sources |
| Interpretability | High—explicit context tree structure |
| Computational Cost | Minimal—O(depth) per symbol |
This makes it an ideal starting point for understanding sequence prediction before diving into more complex (but less theoretically grounded) neural approaches.
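
The theoretical guarantee in the table comes from CTW's recursive mixture: at each node s, the weighted probability is P_w^s = ½·P_e^s + ½·P_w^{0s}·P_w^{1s}, which implicitly averages over every tree model up to the maximum depth. A numerically stable log-space sketch of that one-node mixture (illustrative; not the repository's implementation):

```python
import math

def log_weighted(log_pe, child_log_pw=None):
    """CTW mixture at a single node, computed in log space.

    log_pe: log of the node's local KT estimate P_e.
    child_log_pw: pair (log P_w^{0s}, log P_w^{1s}) from the children,
    or None at maximum depth, where P_w = P_e.
    """
    if child_log_pw is None:
        return log_pe
    log_children = child_log_pw[0] + child_log_pw[1]
    # log(0.5*exp(a) + 0.5*exp(b)), shifted by the max for stability
    m = max(log_pe, log_children)
    return m + math.log(0.5 * math.exp(log_pe - m)
                        + 0.5 * math.exp(log_children - m))
```

Because the mixture is computed bottom-up along a single context path, each new symbol touches only `max_depth` nodes, which is where the O(depth) per-symbol cost in the table comes from.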
## Future Directions
This repository is designed to grow with additional prediction models:
- Prediction by Partial Matching (PPM)
- Hidden Markov Models
- Recurrent Neural Networks (simple baselines)
- Connections to compression algorithms (PAQ, CMIX)
## References

- Willems, F. M. J., Shtarkov, Y. M., & Tjalkens, T. J. (1995). "The context-tree weighting method: basic properties." IEEE Transactions on Information Theory, 41(3), 653–664.
- Solomonoff, R. J. (1964). "A formal theory of inductive inference." Information and Control, 7(1–2), 1–22.
## License
MIT License