mixedbread ColBERT

In this documentation, you'll learn all you need to know about the ColBERT architecture, which delivers strong reranking and retrieval performance without the computational cost of traditional cross-encoders.

What are the Traditional Approaches?

The typical search approach uses the same model to encode both documents and queries into single vectors. We then choose a metric, such as cosine similarity, to measure how similar the query is to each document. However, there is an issue with this: the model must place queries and their relevant documents close together in the latent space, yet there is no interaction between the query and the document inside the model.
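To make this concrete, here is a minimal sketch of the encode-then-compare approach. The vectors are hypothetical stand-ins for the single embedding a model would produce per text; only the scoring logic is the point.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two dense vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical single-vector embeddings; a real embedding model produces these.
query = np.array([0.1, 0.9, 0.2])
docs = {
    "doc_a": np.array([0.1, 0.8, 0.3]),
    "doc_b": np.array([0.9, 0.1, 0.0]),
}

# Rank documents by their similarity to the query vector.
ranked = sorted(docs, key=lambda d: cosine_similarity(query, docs[d]), reverse=True)
print(ranked)  # doc_a comes first: its vector points in nearly the same direction
```

Note that the query and each document are embedded completely independently; the model never sees them together, which is exactly the limitation described above.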

On the other hand, we have models like cross-encoders. With cross-encoders, the query and documents are fed to the model together, improving search accuracy. Unfortunately, cross-encoders are extremely compute-intensive, since we need to pass all possible combinations of documents and queries to the model. Therefore, these models are not suitable for large-scale search and are mostly used for reranking.
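A quick back-of-the-envelope calculation (with illustrative, made-up corpus and traffic numbers) shows why scoring every query-document pair with a cross-encoder is impractical at scale:

```python
# Each (query, document) pair needs its own full forward pass through a
# cross-encoder, while a bi-encoder embeds the corpus once and then only
# embeds each incoming query. Corpus/traffic sizes below are hypothetical.
num_docs = 1_000_000
num_queries = 10_000

bi_encoder_passes = num_docs + num_queries       # encode corpus once + one pass per query
cross_encoder_passes = num_docs * num_queries    # one pass per (query, document) pair

print(cross_encoder_passes // bi_encoder_passes)  # roughly 10,000x more model calls
```

This is why cross-encoders are typically applied only to a small candidate set retrieved by a cheaper first stage, i.e. for reranking.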

What is the ColBERT Architecture?

ColBERT stands for Contextualized Late Interaction BERT, and it combines both vector search and cross-encoders. In ColBERT, the queries and the documents are first encoded separately. However, instead of creating a single embedding for the entire document, ColBERT generates contextualized embeddings for each token in the document. To search, the token-level query embeddings are compared with the token-level embeddings of the documents using the lightweight scoring function MaxSim. This allows ColBERT to capture more nuanced matching signals while still being computationally efficient. The resulting scores are then used to rank the documents based on their relevance to the query.
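The MaxSim scoring step can be sketched in a few lines: for every query token embedding, take the maximum similarity over all document token embeddings, then sum those maxima. The toy token embeddings below are random stand-ins for what a ColBERT model would output.

```python
import numpy as np

def maxsim_score(query_emb: np.ndarray, doc_emb: np.ndarray) -> float:
    """Late-interaction MaxSim: for each query token, keep its best-matching
    document token's similarity, then sum over query tokens.

    query_emb: (num_query_tokens, dim), doc_emb: (num_doc_tokens, dim),
    both L2-normalized so the dot product equals cosine similarity."""
    sim = query_emb @ doc_emb.T          # (num_query_tokens, num_doc_tokens)
    return float(sim.max(axis=1).sum())  # best document token per query token

def normalize(x: np.ndarray) -> np.ndarray:
    return x / np.linalg.norm(x, axis=1, keepdims=True)

# Toy token-level embeddings (a real model produces one vector per token).
rng = np.random.default_rng(0)
q = normalize(rng.normal(size=(4, 8)))                    # 4 query tokens
d1 = normalize(rng.normal(size=(10, 8)))                  # unrelated document
d2 = normalize(np.vstack([q, rng.normal(size=(6, 8))]))   # document containing the query tokens

# The document whose tokens match the query tokens scores higher.
print(maxsim_score(q, d2) > maxsim_score(q, d1))  # True
```

Because documents are still encoded independently of queries, their token embeddings can be precomputed and indexed offline; only the cheap MaxSim comparison happens at query time.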

Similarity scoring process of query and document in a ColBERT model

While ColBERT can be used for both reranking and retrieval tasks, we mainly recommend using it for reranking and taking advantage of our embedding models for retrieval-related use cases.

What's Next?

Now that you're familiar with the concept of ColBERT, get started!