Vectorizing the Trie: Efficient Constrained Decoding for LLM-based Generative Retrieval on Accelerators Paper • 2602.22647 • Published 12 days ago • 3