Writing on software design, company building, and information retrieval.

Occasional longer-form thoughts on programming, leadership, product design, and more, collected in reverse chronological order.

Asymmetric query quantization in DiskBBQ

Ben Trent and Thomas Veasey shipped another DiskBBQ optimisation today: quantising queries against coarser parent centroids instead of per-document ones. 5x off the quantisation stack with no meaningful recall loss. The insight underneath is that the query path and the document path don’t have to be treated symmetrically.

SMART: late interaction without retraining

A new paper out this week shows the per-token hidden states of off-the-shelf single-vector embedders already carry the information needed for ColBERT-style MaxSim - and you can wire it in at inference time, without retraining. The late-interaction deployment barrier I most underestimated just dropped.

Faster similar-document search in Elasticsearch 9.4

Elasticsearch 9.4 adds query_vector_builder.lookup - a tiny API addition that collapses a two-request vector search into one and runs more than 3x faster. A small change with a big impact, plus a look at where that ratio actually comes from.

SID-1: Train the loop, keep the index

SID AI’s SID-1 is the first retrieval model trained end-to-end with RL. Some observations through a search-and-IR lens: the middle of the retrieval pipeline collapses into one trained model, the NDCG reward gets deliberately bent toward recall, and the agentic-retrieval loop becomes a subagent you hand to a larger system.

The harness is mostly retrieval

Laurie Voss says applied-AI iteration has moved off the model and into "the harness". He’s right - and once you strip the new vocabulary, the harness is mostly a retrieval system.

xAI algorithm through a search lens

xAI open-sourced the For You feed algorithm today. Three observations through a search-and-IR lens: two-tower’s quiet dominance, the retrieve/rank split surviving the bitter lesson, and recsys converging with search.

What to make of TurboQuant

A new quantisation method out of Google Research is making the rounds. Qdrant shipped it. Elastic ran the benchmarks and politely declined. Both responses tell you something useful.

The pattern goes all the way down

DiskBBQ’s new filtered-search optimisation is the same architectural move I wrote about last week, applied one layer deeper. The pattern is fractal - and that’s what makes it useful.

A primer on late interaction

How ColBERT-style token-level matching fits between single-vector dense retrieval and cross-encoders, why MaxSim is the clever bit, and what the storage tax actually looks like in practice.
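
For a quick feel of the scoring rule before reading the full post, here is a minimal sketch of MaxSim in plain Python. It assumes each query and document is already a list of per-token embedding vectors and uses a raw dot product as the similarity; the names `dot` and `maxsim` and the toy vectors are illustrative, not taken from any particular library.

```python
from typing import Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product between two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim(query_tokens: Sequence[Vector], doc_tokens: Sequence[Vector]) -> float:
    """For each query token, take its best match among the document's
    tokens, then sum those best-match scores over the query tokens."""
    # Assumes doc_tokens is non-empty; max() raises ValueError otherwise.
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)


# Toy 2-dimensional token embeddings (made up for illustration).
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0]]  # one token matching each query token
doc_b = [[0.5, 0.5]]              # a single "averaged" token

score_a = maxsim(query, doc_a)
score_b = maxsim(query, doc_b)
```

Here `doc_a` outscores `doc_b` because every query token finds its own strong match. The sketch also hints at the storage tax: the document side keeps one vector per token rather than one per document.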

Today, Vimeo goes public

Vimeo spins out from IAC and begins trading on Nasdaq under the ticker VMEO.

FOSDEM 2015

Another year at FOSDEM — Vimeo's open source talk, the dedicated Open Source Search track, and a closing keynote from a Mars One astronaut candidate.

I've Joined Vimeo

Joining the Vimeo team to work on the search platform after an amazing run at DueDil.

Elasticsearch 1.0 launched: An overview

A run-down of the headline features in Elasticsearch 1.0 — Snapshot/Restore, the cat API, the redesigned percolator, and the new Aggregations framework.

FOSDEM 2014: a retrospective

A weekend in Brussels at FOSDEM — Elasticsearch 1.0 ahead of launch, plus PostgreSQL JSON, Redis, MongoDB, and YARN talks.

Elasticsearch Snapshot Restore Overview

A walkthrough of the new Snapshot/Restore API arriving in Elasticsearch 1.0 — incremental backups for your cluster via a simple REST endpoint.

Elasticsearch Aggregations Overview

A look at the new Aggregations framework arriving in Elasticsearch 1.0 — multi-level, nested calculations that go far beyond what Facets could do.

London Elasticsearch User Group Presentation

A talk at the London Elasticsearch meetup on how DueDil uses Elasticsearch — bulk indexing, and using Facets to add depth to search.

Using Elasticsearch on Amazon EC2

Setting up an Elasticsearch cluster on EC2: installing the AWS cloud plugin, configuring discovery, and watching nodes find each other.

DueDil: Trust the data

DueDil's intro video — a quick look at what we're building.