
LlamaIndex

Integrate Mixedbread's powerful embedding and reranking capabilities into your LlamaIndex projects. This guide covers installation, quick start examples, advanced usage scenarios, and links to detailed documentation for seamless integration with your natural language processing workflows.

Quick Start

  1. Install the packages:
pip install llama-index llama-index-embeddings-mixedbreadai llama-index-postprocessor-mixedbreadai-rerank
  2. Set up your API key (see the snippet after this list):
export MXBAI_API_KEY=your_api_key_here
  3. Start using Mixedbread in your LlamaIndex projects!
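
If you prefer not to hard-code the key in your scripts, you can pass it in from the environment variable set in step 2. A minimal sketch (the model name shown is the one used throughout this guide):

import os

from llama_index.embeddings.mixedbreadai import MixedbreadAIEmbedding

# Read the key set in step 2 instead of hard-coding it
embed_model = MixedbreadAIEmbedding(
    api_key=os.environ["MXBAI_API_KEY"],
    model_name="mixedbread-ai/mxbai-embed-large-v1",
)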

Embeddings

Generate text embeddings for queries and documents.

from llama_index.core import Document, Settings, VectorStoreIndex
from llama_index.embeddings.mixedbreadai import MixedbreadAIEmbedding
 
# Set up Mixedbread embeddings as the global embedding model
Settings.embed_model = MixedbreadAIEmbedding(
    api_key="your_api_key_here",
    model_name="mixedbread-ai/mxbai-embed-large-v1"
)
 
# Create and index a document
document = Document(text="The true source of happiness.", id_="bread")
index = VectorStoreIndex.from_documents([document])
 
# Query the index
query_engine = index.as_query_engine()
query = "Represent this sentence for searching relevant passages: What is bread?"
results = query_engine.query(query)
print(results)
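
The same model can also be used directly, without building an index, through LlamaIndex's standard embedding methods. The sketch below assumes the Settings.embed_model configured above:

# Embed a query and a passage directly with the configured model
query_embedding = Settings.embed_model.get_query_embedding(
    "Represent this sentence for searching relevant passages: What is bread?"
)
passage_embedding = Settings.embed_model.get_text_embedding(
    "Bread is a staple food prepared from flour and water."
)
print(len(query_embedding), len(passage_embedding))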

Reranking

Reorder documents based on relevance to a query.

from llama_index.core import Document, Settings, VectorStoreIndex
from llama_index.embeddings.mixedbreadai import MixedbreadAIEmbedding
from llama_index.llms.openai import OpenAI
from llama_index.postprocessor.mixedbreadai_rerank import MixedbreadAIRerank
 
# Set up the LLM and Mixedbread embeddings
Settings.llm = OpenAI(model="gpt-3.5-turbo", temperature=0.1)
Settings.embed_model = MixedbreadAIEmbedding(
    api_key="your_api_key_here",
    model_name="mixedbread-ai/mxbai-embed-large-v1"
)
 
# Create and index a document
document = Document(text="This is a sample document.", id_="sampleDoc")
index = VectorStoreIndex.from_documents([document])
 
# Set up the Mixedbread reranker
node_postprocessor = MixedbreadAIRerank(
    api_key="your_api_key_here",
    top_n=4
)
 
# Create a query engine that retrieves 5 candidates and reranks them down to 4
query_engine = index.as_query_engine(
    similarity_top_k=5,
    node_postprocessors=[node_postprocessor]
)
 
# Query
response = query_engine.query("Where did the author grow up?")
print(response)
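
To check which nodes survived reranking and with what relevance scores, you can inspect the response's source nodes (standard LlamaIndex response attributes, shown here as a short sketch):

# Inspect the reranked nodes and their relevance scores
for node_with_score in response.source_nodes:
    print(node_with_score.score, node_with_score.node.get_content()[:80])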

Advanced Usage

Custom Embedding Parameters

You can customize the embedding process with additional parameters:

from llama_index.embeddings.mixedbreadai import MixedbreadAIEmbedding
 
embeddings = MixedbreadAIEmbedding(
    api_key="your_api_key_here",
    model_name="mixedbread-ai/mxbai-embed-large-v1",
    embed_batch_size=64,
    normalized=True,
    dimensions=512,
    encoding_format="binary"
)
 
texts = ["Bread is life", "Bread is love"]
result = embeddings.get_text_embedding_batch(texts)
print(result)
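
To reuse the customized model across your project, it can be registered globally, just as in the quick start examples. A short sketch, assuming your downstream vector store works with the chosen encoding format:

from llama_index.core import Settings

# Make the customized embedder the default for subsequent index builds
Settings.embed_model = embeddings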

Reranking with Objects

For more complex reranking scenarios:

from llama_index.core.schema import NodeWithScore, TextNode
from llama_index.postprocessor.mixedbreadai_rerank import MixedbreadAIRerank
 
reranker = MixedbreadAIRerank(
    api_key="your_api_key_here",
    model_name="mixedbread-ai/mxbai-rerank-large-v1",
    top_n=5,
    rank_fields=["title", "content"],
    return_input=True,
    max_retries=5
)
 
# Wrap the structured documents as nodes; rank_fields tells the
# reranker which fields to consider
documents = [
    {"title": "Bread Recipe", "content": "To bake bread you need flour"},
    {"title": "Bread Recipe", "content": "To bake bread you need yeast"},
]
nodes = [
    NodeWithScore(node=TextNode(text=doc["content"], metadata=doc))
    for doc in documents
]
 
query = "What do you need to bake bread?"
result = reranker.postprocess_nodes(nodes, query_str=query)
print(result)
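
Since the postprocessor returns standard NodeWithScore objects, the reranked order and scores can be inspected directly; a small sketch continuing the example above:

# Print the reranked documents together with their relevance scores
for node_with_score in result:
    print(node_with_score.score, node_with_score.node.metadata)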

Documentation

For detailed information on using Mixedbread with LlamaIndex, check out:

Need Help?

Happy baking with Mixedbread and LlamaIndex! 🍞🚀
