September 15, 2025
KL-Threshold Routing Between LLMs: What Speculative Decoding Already Solved
In 2023 I drafted a paper on routing between a large and a small LLM via KL-divergence thresholds. Speculative decoding had already solved the same problem more rigorously. Here is the post-mortem.