The harness is mostly retrieval
Laurie Voss says applied-AI iteration has moved off the model and into "the harness". He’s right, and once you strip the new vocabulary, the harness is mostly a retrieval system.
I’m Chris — a software developer based in London. I’m currently Director of Engineering at Vimeo, leading the teams behind the platform that powers millions of videos. This is my personal blog, where I write about programming, search systems, and Elasticsearch.





xAI open-sourced the For You feed algorithm today. Three observations through a search-and-IR lens: two-tower’s quiet dominance, the retrieve/rank split surviving the bitter lesson, and recsys converging with search.
TurboQuant landed as a KV cache result, but the more interesting application might be ColBERT-style late interaction. Here’s the case, and the open questions.
A new quantisation method out of Google Research is making the rounds. Qdrant shipped it. Elastic ran the benchmarks and politely declined. Both responses tell you something useful.
DiskBBQ’s new filtered-search optimisation is the same architectural move I wrote about last week, applied one layer deeper. The pattern is fractal, and that’s what makes it useful.




