mxbai-rerank-xsmall-v1
mxbai-rerank-xsmall-v1 is a highly efficient, open-source reranking model designed to enhance search results with semantic relevance. As the most compact model in the Mixedbread rerank family, it balances solid reranking quality with minimal resource requirements, making it ideal for improving search systems with minimal infrastructure changes.
API Reference
Reranking
Model Reference
mxbai-rerank-xsmall-v1
Blog Post
Boost Your Search With The Crispy Mixedbread Rerank Models
mxbai-rerank-xsmall-v1 is not available via API. Please use mxbai-rerank-large-v1 instead.
Model description
mxbai-rerank-xsmall-v1 is part of the Mixedbread rerank family, a set of best-in-class reranking models that are fully open-source under the Apache 2.0 license. These models are designed to boost search results by adding a semantic layer to existing search systems, making it easier to find relevant results.
The models were trained using a large collection of real-life search queries and the top-10 results from search engines for these queries. First, a large language model ranked the results according to their relevance to the query. These signals were then used to train the rerank models. Experiments show that these models significantly boost search performance, particularly for complex and domain-specific queries.
When used in combination with a keyword-based search engine, such as Elasticsearch, OpenSearch, or Solr, the reranking model can be added to the end of an existing search workflow, allowing users to incorporate semantic relevance into their keyword-based search system without changing the existing infrastructure. This is an easy, low-complexity method of improving search results by introducing semantic search technology into a user's stack with one line of code.
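The workflow described above can be sketched as a small rerank step appended to the candidates an existing keyword search returns. The word-overlap scorer below is only a stand-in so the sketch runs without downloading weights; the commented lines show how the real model would be loaded via the sentence-transformers `CrossEncoder` API (an assumed but commonly documented way to use this model).

```python
# Sketch: add a rerank step after an existing keyword search.
# With the actual model you would score candidates like this (assumed API):
#   from sentence_transformers import CrossEncoder
#   model = CrossEncoder("mixedbread-ai/mxbai-rerank-xsmall-v1")
#   scores = model.predict([(query, doc) for doc in candidates])
# The toy scorer below (query-word overlap) stands in for the model so the
# example is self-contained.

def toy_relevance_score(query: str, document: str) -> float:
    """Stand-in for the reranker: fraction of query words found in the doc."""
    query_words = set(query.lower().split())
    doc_words = set(document.lower().split())
    return len(query_words & doc_words) / max(len(query_words), 1)

def rerank(query: str, candidates: list[str], top_k: int = 3) -> list[str]:
    """Re-order keyword-search candidates by (toy) semantic relevance."""
    scored = [(toy_relevance_score(query, doc), doc) for doc in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]

# Candidates as they might come back from Elasticsearch/OpenSearch/Solr:
candidates = [
    "How to bake sourdough bread at home",
    "Stock market opening hours",
    "Sourdough starter feeding schedule",
]
print(rerank("sourdough bread", candidates, top_k=2))
```

Because the rerank step only consumes a query and a candidate list, it can sit at the end of any existing retrieval pipeline without touching the index itself.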
mxbai-rerank-xsmall-v1 is the smallest and most resource-efficient model in the Mixedbread rerank family. It trades a small amount of ranking quality (slightly higher scores for non-relevant results) for a much smaller footprint. On a subset of 11 BEIR datasets, mxbai-rerank-xsmall-v1 achieves an NDCG@10 score of 43.9 and an Accuracy@3 score of 70.0, outperforming lexical search while being much smaller than other reranking models.
| Recommended Sequence Length | Language |
| --- | --- |
| 512 | English |
Suitable Scoring Methods
- Model Output: The model directly scores the relevance of each document to the query. You can use the model output directly. If you want a score between 0 and 1, you can use the sigmoid function on the scores.
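The mapping to the 0-1 range mentioned above is the standard logistic sigmoid applied to the raw relevance scores. A minimal sketch (the raw score values here are hypothetical, not actual model outputs):

```python
import math

def sigmoid(score: float) -> float:
    """Map a raw relevance score (an unbounded logit) into the (0, 1) range."""
    return 1.0 / (1.0 + math.exp(-score))

# Hypothetical raw scores the reranker might emit for three documents:
raw_scores = [4.2, -1.3, 0.0]
probabilities = [sigmoid(s) for s in raw_scores]
# A raw score of 0.0 maps to exactly 0.5; larger scores approach 1.0.
```

Note that the sigmoid is monotonic, so it changes the scale of the scores but never their ranking order.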
Limitations
- Language: mxbai-rerank-xsmall-v1 is trained on English text and is specifically designed for the English language.
- Sequence Length: The recommended maximum sequence length is 512 tokens. Longer sequences may be truncated, resulting in a loss of information. Note that the limit applies to the query and document combined: `len(query) + len(document)` should not exceed 512 tokens.
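A quick way to guard against silent truncation is to check the combined budget before scoring a pair. The whitespace token count below is a naive stand-in; in practice you would count tokens with the tokenizer shipped with the model (e.g. loaded via `transformers.AutoTokenizer`, an assumption about the setup, not shown here).

```python
MAX_TOKENS = 512  # recommended combined budget for query + document

def count_tokens(text: str) -> int:
    """Naive whitespace token count -- a stand-in for the model's real
    tokenizer, which will generally produce more tokens than this."""
    return len(text.split())

def fits_in_budget(query: str, document: str, budget: int = MAX_TOKENS) -> bool:
    """True if the query and document together stay within the budget."""
    return count_tokens(query) + count_tokens(document) <= budget

query = "what is semantic reranking"
short_doc = "Reranking reorders search results by semantic relevance."
long_doc = "word " * 600  # clearly over budget on its own

print(fits_in_budget(query, short_doc))  # short pair fits
print(fits_in_budget(query, long_doc))   # this pair would be truncated
```

Documents that fail the check can be split into overlapping chunks and each chunk scored separately, keeping the best score per document.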