December 3, 2025
Infinigram: Corpus-Based Language Modeling via Suffix Arrays with LLM Probability Mixing
A corpus-based language model that uses suffix arrays for O(m log n) pattern matching, where m is the pattern length and n is the corpus size. The corpus is the model.
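To make the O(m log n) claim concrete, here is a minimal sketch of suffix-array lookup over a tokenized corpus. It is an illustration under stated assumptions, not Infinigram's actual implementation: the function names (`build_suffix_array`, `find_range`, `next_token_counts`) are hypothetical, the construction is deliberately naive, and no smoothing or LLM mixing is shown.

```python
from collections import Counter

def build_suffix_array(tokens):
    # Naive construction by sorting all suffixes: fine for a toy corpus,
    # far too slow for a real one (assumption: illustration only).
    return sorted(range(len(tokens)), key=lambda i: tokens[i:])

def find_range(tokens, sa, pattern):
    # Two binary searches over the suffix array. Each step compares at
    # most m = len(pattern) tokens, giving O(m log n) overall.
    m = len(pattern)
    lo, hi = 0, len(sa)
    while lo < hi:  # first suffix whose m-token prefix is >= pattern
        mid = (lo + hi) // 2
        if tokens[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo
    hi = len(sa)
    while lo < hi:  # first suffix whose m-token prefix is > pattern
        mid = (lo + hi) // 2
        if tokens[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return start, lo  # half-open range of matching suffix-array slots

def next_token_counts(tokens, sa, context):
    # Count which token follows each occurrence of `context` in the corpus.
    start, end = find_range(tokens, sa, context)
    m = len(context)
    return Counter(
        tokens[sa[k] + m] for k in range(start, end) if sa[k] + m < len(tokens)
    )

corpus = "the cat sat on the mat the cat ran".split()
sa = build_suffix_array(corpus)
counts = next_token_counts(corpus, sa, ["the"])
total = sum(counts.values())
probs = {tok: c / total for tok, c in counts.items()}
```

Here `probs` is the raw next-token distribution after the context `["the"]`, read straight off the corpus counts. This is the sense in which the corpus is the model: there are no trained parameters, only the text and an index over it.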
The classical approach to sequence prediction: counting and smoothing