LlamaIndex

Integrate Mixedbread's powerful embedding and reranking capabilities into your LlamaIndex projects. This guide covers installation, quick start examples for both Python and TypeScript, advanced usage scenarios, and links to detailed documentation for seamless integration with your natural language processing workflows.
Install the required packages:
pip install llama-index llama-index-embeddings-mixedbreadai llama-index-postprocessor-mixedbreadai-rerank
Set up your API key:
export MXBAI_API_KEY=your_api_key_here
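If you prefer not to hard-code the key in your scripts, you can read the exported MXBAI_API_KEY back from the environment in Python. A minimal sketch, assuming the namespaced import used throughout this guide:
import os
from llama_index.embeddings.mixedbreadai import MixedbreadAIEmbedding
# Pass the exported key explicitly instead of writing it into the code
embed_model = MixedbreadAIEmbedding(
    api_key=os.environ["MXBAI_API_KEY"],
    model_name="mixedbread-ai/mxbai-embed-large-v1",
)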
Start using Mixedbread in your LlamaIndex projects!
Generate text embeddings for queries and documents.
from llama_index.embeddings.mixedbreadai import MixedbreadAIEmbedding
from llama_index.core import Settings
from llama_index.core import Document, VectorStoreIndex
# Set up Mixedbread embeddings
Settings.embed_model = MixedbreadAIEmbedding(
    api_key="your_api_key_here",
    model_name="mixedbread-ai/mxbai-embed-large-v1",
)
# Create and index a document
document = Document(text="The true source of happiness.", id_="bread")
index = VectorStoreIndex.from_documents([document])
# Query the index
query_engine = index.as_query_engine()
query = "Represent this sentence for searching relevant passages: What is bread?"
results = query_engine.query(query)
print(results)
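You can also call the embedding model directly, without building an index. A short sketch that continues the example above and uses LlamaIndex's standard get_query_embedding and get_text_embedding methods, with the same query prefix recommended for mxbai-embed-large-v1:
embed_model = MixedbreadAIEmbedding(
    api_key="your_api_key_here",
    model_name="mixedbread-ai/mxbai-embed-large-v1",
)
# Embed a search query and a document passage separately
query_embedding = embed_model.get_query_embedding(
    "Represent this sentence for searching relevant passages: What is bread?"
)
doc_embedding = embed_model.get_text_embedding("The true source of happiness.")
print(len(query_embedding), len(doc_embedding))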
Learn more about Mixedbread Embeddings
Reorder documents based on relevance to a query.
from llama_index.core import Document, Settings, VectorStoreIndex
from llama_index.embeddings.mixedbreadai import MixedbreadAIEmbedding
from llama_index.postprocessor.mixedbreadai_rerank import MixedbreadAIRerank
from llama_index.llms.openai import OpenAI
# Set up the OpenAI LLM and Mixedbread embeddings
Settings.llm = OpenAI(model="gpt-3.5-turbo", temperature=0.1)
Settings.embed_model = MixedbreadAIEmbedding(
    api_key="your_api_key_here",
    model_name="mixedbread-ai/mxbai-embed-large-v1",
)
# Create and index a document
document = Document(text="This is a sample document.", id_="sampleDoc")
index = VectorStoreIndex.from_documents([document])
# Set up the reranker
node_postprocessor = MixedbreadAIRerank(
    api_key="your_api_key_here",
    top_n=4,
)
# Create a query engine that retrieves the top 5 nodes and reranks them
query_engine = index.as_query_engine(
    similarity_top_k=5,
    node_postprocessors=[node_postprocessor],
)
# Query
response = query_engine.query("Where did the author grow up?")
print(response)
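To see which nodes the reranker kept and how it scored them, you can inspect the response's source nodes. A minimal sketch using LlamaIndex's standard response object:
# Each source node carries the relevance score assigned during reranking
for node_with_score in response.source_nodes:
    print(node_with_score.score, node_with_score.node.get_content())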
Learn more about Mixedbread Reranking
You can customize the embedding process with additional parameters:
from llama_index.embeddings.mixedbreadai import MixedbreadAIEmbedding
embeddings = MixedbreadAIEmbedding(
    api_key="your_api_key_here",
    model_name="mixedbread-ai/mxbai-embed-large-v1",
    batch_size=64,
    normalized=True,
    dimensions=512,
    encoding_format="binary",
)
texts = ["Bread is life", "Bread is love"]
result = embeddings.get_text_embedding_batch(texts)
print(result)
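Because normalized=True requests unit-length vectors, the dot product of two embeddings equals their cosine similarity. A minimal sketch that continues the example above but assumes float output (i.e. it leaves out the encoding_format="binary" option):
float_embeddings = MixedbreadAIEmbedding(
    api_key="your_api_key_here",
    model_name="mixedbread-ai/mxbai-embed-large-v1",
    normalized=True,
)
vec_a = float_embeddings.get_text_embedding("Bread is life")
vec_b = float_embeddings.get_text_embedding("Bread is love")
# With unit-length vectors, the dot product is the cosine similarity
similarity = sum(a * b for a, b in zip(vec_a, vec_b))
print(similarity)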
For more complex reranking scenarios:
from llama_index.core.schema import NodeWithScore, TextNode
from llama_index.postprocessor.mixedbreadai_rerank import MixedbreadAIRerank
reranker = MixedbreadAIRerank(
    api_key="your_api_key_here",
    model_name="mixedbread-ai/mxbai-rerank-large-v1",
    top_n=5,
    rank_fields=["title", "content"],
    return_input=True,
    max_retries=5,
)
# Wrap the raw documents as nodes so the postprocessor can score them
nodes = [
    NodeWithScore(
        node=TextNode(text="To bake bread you need flour", metadata={"title": "Bread Recipe"})
    ),
    NodeWithScore(
        node=TextNode(text="To bake bread you need yeast", metadata={"title": "Bread Recipe"})
    ),
]
query = "What do you need to bake bread?"
result = reranker.postprocess_nodes(nodes, query_str=query)
print(result)
For detailed information on using Mixedbread with LlamaIndex, check out the Mixedbread Embeddings and Reranking guides linked above, as well as the LlamaIndex documentation.
Happy baking with Mixedbread and LlamaIndex! 🍞🚀