Infini-gram: LLM-scale *-gram models
Recently, I watched a presentation on Infini-grams, which use a suffix array over the training corpus to avoid precomputing n-gram count tables, so counts for n-grams of unbounded length can be looked up on the fly at query time.
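To make the idea concrete, here is a minimal sketch (my own illustration, not the Infini-gram implementation) of why a suffix array removes the need for precomputed count tables: all occurrences of a query n-gram form one contiguous block of the sorted suffixes, so counting any n-gram, of any length, is two binary searches.

```python
import bisect

def build_suffix_array(tokens):
    """Sort all suffix start positions lexicographically.

    Naive O(N^2 log N) construction for brevity; real systems
    build the suffix array in (near-)linear time.
    """
    return sorted(range(len(tokens)), key=lambda i: tokens[i:])

def ngram_count(tokens, sa, query):
    """Count occurrences of `query` (any length) via binary search.

    Every suffix that starts with `query` sits in one contiguous
    block of the suffix array; the block width is the count.
    Requires Python 3.10+ for bisect's `key` argument.
    """
    key = lambda i: tuple(tokens[i:i + len(query)])
    q = tuple(query)
    lo = bisect.bisect_left(sa, q, key=key)
    hi = bisect.bisect_right(sa, q, key=key)
    return hi - lo

corpus = "the cat sat on the mat the cat ran".split()
sa = build_suffix_array(corpus)
print(ngram_count(corpus, sa, ["the", "cat"]))  # -> 2
```

Note that nothing about `ngram_count` depends on a fixed n, which is the point: the same index answers 2-gram and 200-gram queries alike.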
This sparked my interest, as I had worked on a similar project for an LLM talk I gave for SLUUG at https://www.stllinux.org (see my GitHub repo https://github.com/queelius/sluug-talk-llm and the video of the talk at https://www.sluug.org/resources/presentations/media/2024/STLLINUX/2024-02-22_STLLINUX_2560x1440.mp4), where part of the talk demonstrated arbitrary-order n-gram models.
Since my data was sparse and synthetic (expression trees and their evaluations), even a relatively inefficient approach sufficed to compute very large n-grams.
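For context, the kind of inefficient approach I mean is a plain hash-map counter over every n-gram up to some large maximum order (a sketch, not the talk's actual code). This is hopeless for web-scale corpora but perfectly workable when the corpus is small and sparse:

```python
from collections import Counter

def count_all_ngrams(tokens, max_n):
    """Count every n-gram of order 1..max_n with a hash map.

    Memory grows with the number of distinct n-grams, which is
    what makes this viable only for small, sparse data.
    """
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts

corpus = "( 1 + 2 ) = 3".split()
counts = count_all_ngrams(corpus, max_n=5)
print(counts[("1", "+", "2")])  # -> 1
```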
I started a project, n-gram projections, to work on concepts related to n-grams and how projections of the input onto the training data (for instance, backing off to the longest suffix of the context that actually occurs in the training data) may be a way of thinking about OOD generalization and inductive biases. See …
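As a rough illustration of one concrete notion of "projection" (my framing here, assuming the longest-suffix-match view, which is also what Infini-gram's unbounded backoff does), the following extends the suffix-array sketch above, reusing `build_suffix_array` and `ngram_count` from it:

```python
def project(tokens, sa, context):
    """Return the longest suffix of `context` present in the data.

    Out-of-distribution prefixes are dropped one token at a time
    until the remaining suffix occurs in the training data; an
    empty list means no part of the context was ever seen.
    """
    for start in range(len(context)):
        suffix = context[start:]
        if ngram_count(tokens, sa, suffix) > 0:
            return suffix
    return []

corpus = "the cat sat on the mat the cat ran".split()
sa = build_suffix_array(corpus)
print(project(corpus, sa, ["dog", "the", "cat"]))  # -> ['the', 'cat']
```

The appeal of this framing is that an OOD input is never answered from nothing: the model's behavior is determined by whichever fragment of the input survives the projection, which is one way to make its inductive bias explicit.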