Speech and Language Processing (3rd ed. draft)
Notes
The canonical NLP book, updated for the LLM era.
Seminal blog post demonstrating the power of character-level RNNs. Shows Shakespeare generation, Wikipedia generation, LaTeX generation, and Linux kernel code generation. The visualizations of LSTM cells are particularly illuminating.
A corpus-based language model using suffix arrays for O(m log n) pattern matching. The corpus is the model.
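The idea behind suffix-array pattern matching can be sketched in a few lines. This is a toy illustration, not the post's implementation: the quadratic suffix-array construction and the function names are my own, and real corpus-scale systems use linear-time construction (e.g. SA-IS) over compressed indexes. The two binary searches below bound the block of sorted suffixes that begin with the pattern, giving the O(m log n) lookup the note mentions.

```python
def build_suffix_array(text):
    # Toy O(n^2 log n) construction: sort suffix start positions
    # lexicographically. Real systems use SA-IS or similar.
    return sorted(range(len(text)), key=lambda i: text[i:])

def count_occurrences(text, sa, pattern):
    # Two binary searches over the suffix array bound the contiguous
    # block of suffixes starting with `pattern`: O(m log n) comparisons.
    m = len(pattern)
    lo, hi = 0, len(sa)
    while lo < hi:  # leftmost suffix whose m-prefix is >= pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    left = lo
    lo, hi = left, len(sa)
    while lo < hi:  # leftmost suffix whose m-prefix is > pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return lo - left

corpus = "the cat sat on the mat"
sa = build_suffix_array(corpus)
count_occurrences(corpus, sa, "the")  # counts substring occurrences
```

Because the corpus itself is the index, next-token probabilities fall out of counting occurrences of a context versus the context extended by each candidate token.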
A mathematical framework that treats language models as algebraic objects with compositional structure.
The evolution of neural sequence prediction, and how it connects to classical methods.
The bias-data trade-off in sequential prediction: when to use CTW, n-grams, or neural language models.
The classical approach to sequence prediction: counting and smoothing
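The counting-and-smoothing recipe can be sketched concretely. This is a generic illustration of add-alpha (Laplace) smoothed bigram estimation, not code from the note; the function name and toy corpus are my own.

```python
from collections import Counter

def bigram_prob(tokens, vocab_size, w1, w2, alpha=1.0):
    # Add-alpha (Laplace) smoothed bigram probability:
    #   P(w2 | w1) = (count(w1 w2) + alpha) / (count(w1) + alpha * V)
    # Smoothing keeps unseen bigrams from getting probability zero.
    bigrams = Counter(zip(tokens, tokens[1:]))
    unigrams = Counter(tokens)
    return (bigrams[(w1, w2)] + alpha) / (unigrams[w1] + alpha * vocab_size)

tokens = "the cat sat on the mat".split()
V = len(set(tokens))                     # vocabulary size: 5
p = bigram_prob(tokens, V, "the", "cat")  # (1 + 1) / (2 + 5) = 2/7
```

Neural language models replace these explicit counts with learned parameters, but the quantity being estimated, the conditional probability of the next token, is the same.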