mixedbread Embeddings

In this document, you'll learn what embeddings are and why they are a crucial building block for AI information retrieval.

What are embeddings?

Embeddings are numerical representations of different types of media, such as text or images, as vectors in an n-dimensional space. A good embedding model places semantically related objects closer to each other in the vector space. The embeddings can then be used to perform a variety of functions that require an understanding of context and semantic relationships, most importantly retrieval-augmented generation (RAG).
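"Closer to each other" is usually measured with cosine similarity, the cosine of the angle between two vectors. The sketch below illustrates the idea with hand-picked 3-dimensional vectors; these are illustrative stand-ins, not output from a real embedding model, which would produce vectors with hundreds or thousands of dimensions.

```python
from math import sqrt

def cosine_similarity(a, b):
    # Cosine of the angle between two vectors: close to 1.0 means the
    # vectors point in nearly the same direction (semantically related),
    # close to 0.0 means they are unrelated.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hand-picked toy "embeddings" for illustration only.
cat = [0.9, 0.8, 0.1]
kitten = [0.85, 0.75, 0.2]
car = [0.1, 0.2, 0.9]

print(cosine_similarity(cat, kitten))  # high: semantically related
print(cosine_similarity(cat, car))     # low: unrelated
```

A good embedding model would place "cat" and "kitten" close together in exactly this sense, while "car" lands in a different region of the vector space.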

Specific use cases that embeddings can help you solve include:

  • Reranking: Reordering a list of documents based on their semantic relevance to a given query.
  • Classification: Assigning text strings to the given label they are most semantically similar to.
  • Clustering: Grouping different text strings based on their similarity to each other.
  • Anomaly detection: Identifying inputs that are semantic outliers relative to the rest of a dataset.
  • Recommendations: Recommending items that contain text strings relevant to a given input.
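Most of the use cases above reduce to the same primitive: embed the query, embed the candidates, and sort by similarity. Here is a minimal sketch of that pattern; the vectors are made up for illustration, where a real pipeline would obtain them from an embedding model.

```python
from math import sqrt

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(x * x for x in b)))

def rank_by_similarity(query_vec, docs):
    # docs: list of (label, embedding) pairs; returns the labels ordered
    # from most to least semantically relevant to the query.
    scored = [(label, cosine_similarity(query_vec, vec)) for label, vec in docs]
    return [label for label, _ in sorted(scored, key=lambda p: p[1], reverse=True)]

# Made-up embeddings for illustration; a real pipeline would call an
# embedding model to produce these vectors.
query = [0.9, 0.1, 0.2]
documents = [
    ("baking bread", [0.88, 0.15, 0.25]),
    ("fixing a bike", [0.2, 0.9, 0.1]),
    ("sourdough starters", [0.8, 0.2, 0.3]),
]

print(rank_by_similarity(query, documents))
```

Swap in different inputs and the same loop implements reranking, recommendations, or (with a similarity threshold) anomaly detection.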

What's next?

Now that you know about the power of embeddings, it's time to dive in and learn more about our powerful, size-efficient embedding model: .