Content-Addressable-Memory


Attention Is a Learned Pointer Dereference

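An attention head is a learned content-addressable lookup: a query is scored against the keys and retrieves a weighted blend of the best-matching values, much like dereferencing a pointer whose address is computed from content rather than stored as an index. Depth determines how many such lookups can be composed.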
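To make the analogy concrete, here is a minimal sketch in plain Python, not taken from the post itself. The function name `attend`, the `scale` parameter, and the toy three-slot memory are all assumptions chosen for illustration. It shows a softmax-weighted lookup that behaves like a hard dereference when the match is sharp, and then chains two lookups to follow a pointer twice.

```python
import math

def attend(query, keys, values, scale=1.0):
    """Soft lookup: weight each value by how well its key matches the query."""
    # Dot-product similarity between the query and every key.
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
    # Numerically stable softmax over the scores.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # The output is the weight-averaged value vector.
    dim = len(values[0])
    out = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return out, weights

# Hypothetical toy memory: three one-hot keys act as "addresses".
keys = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
values = [[10.0], [20.0], [30.0]]

# A large scale sharpens the softmax, so this acts like reading slot 1.
out, weights = attend([0.0, 1.0, 0.0], keys, values, scale=50.0)

# Composition: store the *next address* as each slot's value (0 -> 1 -> 2 -> 0),
# then feed the result of one lookup in as the query of the next.
next_ptr = [keys[1], keys[2], keys[0]]
hop1, _ = attend(keys[0], keys, next_ptr, scale=50.0)  # approximately keys[1]
hop2, _ = attend(hop1, keys, next_ptr, scale=50.0)     # approximately keys[2]
```

With a small `scale` the weights spread across several slots and the result is a blend rather than a single value, which is where the pointer analogy stops being exact. The two chained calls stand in for two stacked layers: each extra layer allows one more hop.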