# ColBERT Models

We're happy to introduce mixedbread ColBERT, our model family built on the ColBERT architecture. It delivers strong reranking performance together with high computational efficiency.
## What's New in the mixedbread ColBERT Model Family?
With the recent release of the fresh and crunchy ColBERT model mxbai-colbert-large-v1, there's a new model family in our portfolio!
The model family now includes:
| Model | Status | Context Length | Dimension | BEIR Average |
|---|---|---|---|---|
| mxbai-colbert-large-v1 | API unavailable | 512 | 1024 | 50.37 (Reranking) |
## Why mixedbread ColBERT?
mixedbread ColBERT is a powerful, computationally efficient ColBERT model family, fully open-source under the Apache 2.0 license. The new mxbai-colbert-large-v1 model outperforms the other currently available open models on the BEIR benchmark, both on most subsets and on average. Its scores even exceed levels typical of traditional cross-encoder-based rerankers, despite its efficiency advantages.
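The efficiency advantage comes from ColBERT's late-interaction design: documents are encoded into per-token embeddings ahead of time, and at query time only a cheap "MaxSim" comparison is needed, instead of running a full cross-encoder forward pass per query-document pair. A minimal NumPy sketch of MaxSim scoring, using toy embeddings rather than real model output:

```python
import numpy as np

def maxsim_score(query_emb, doc_emb):
    """ColBERT late-interaction score: for each query token embedding,
    take the maximum cosine similarity over all document token
    embeddings, then sum across query tokens."""
    # Normalize token embeddings so dot products are cosine similarities.
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    d = doc_emb / np.linalg.norm(doc_emb, axis=1, keepdims=True)
    sim = q @ d.T                 # shape: (num_query_tokens, num_doc_tokens)
    return sim.max(axis=1).sum()  # best match per query token, summed

# Toy example: 3 query tokens, two candidate documents, embedding dim 8.
rng = np.random.default_rng(0)
query = rng.normal(size=(3, 8))
# doc_a contains near-copies of the query tokens plus one extra token,
# so it should score higher than the unrelated doc_b.
doc_a = np.vstack([query + 0.05 * rng.normal(size=(3, 8)),
                   rng.normal(size=(1, 8))])
doc_b = rng.normal(size=(2, 8))

scores = {"doc_a": maxsim_score(query, doc_a),
          "doc_b": maxsim_score(query, doc_b)}
```

Because the document-side embeddings can be precomputed and cached, only the query encoding and this lightweight similarity step happen at request time.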
Reranking performance (NDCG@10):
| Dataset | ColBERTv2 | Jina-ColBERT-v1 | mxbai-colbert-large-v1 |
|---|---|---|---|
| ArguAna | 29.99 | 33.42 | 33.11 |
| ClimateFEVER | 16.51 | 20.66 | 20.85 |
| DBPedia | 31.80 | 42.16 | 40.61 |
| FEVER | 65.13 | 81.07 | 80.75 |
| FiQA | 23.61 | 35.60 | 35.86 |
| HotPotQA | 63.30 | 68.84 | 67.62 |
| NFCorpus | 33.75 | 36.69 | 36.37 |
| NQ | 30.55 | 51.27 | 51.43 |
| Quora | 78.86 | 85.18 | 86.95 |
| SCIDOCS | 14.90 | 15.39 | 16.98 |
| SciFact | 67.89 | 70.20 | 71.48 |
| TREC-COVID | 59.47 | 75.00 | 81.04 |
| Webis-Touché2020 | 44.22 | 32.12 | 31.70 |
| **Average** | 43.08 | 49.82 | 50.37 |
## How Can You Get Started Using mixedbread ColBERT Yourself?
Since our ColBERT model is not currently available via the API, you'll need to download it from Hugging Face and host it yourself. We recommend using the model with the RAGatouille framework. Please see the mxbai-colbert-large-v1 model page for more information!
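As a sketch of what self-hosted reranking with RAGatouille might look like (assuming `ragatouille` is installed, the model ID `mixedbread-ai/mxbai-colbert-large-v1` on the Hugging Face Hub, and a network connection to download the weights; the exact shape of the returned results may vary between RAGatouille versions):

```python
from ragatouille import RAGPretrainedModel

# Download and load mxbai-colbert-large-v1 from the Hugging Face Hub.
RAG = RAGPretrainedModel.from_pretrained("mixedbread-ai/mxbai-colbert-large-v1")

query = "What is the capital of France?"
documents = [
    "Paris is the capital and most populous city of France.",
    "Baguettes are a staple of French bakeries.",
    "Berlin is the capital of Germany.",
]

# Rerank the candidate documents against the query; the passage about
# Paris should come out on top.
results = RAG.rerank(query=query, documents=documents, k=3)
for result in results:
    print(result)
```

RAGatouille also supports building a ColBERT index over a larger corpus and retrieving from it directly, which is the better fit when you have more candidates than a reranking shortlist.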