Mixedbread

Unstructured

Seamlessly integrate the Mixedbread AI API with Unstructured. Learn how to get started with embeddings in Python.

Integrate Mixedbread AI's embedding capabilities into your projects.

Quick Start

  1. Install the package:
pip install unstructured "unstructured[embed-mixedbreadai]"
  2. Set up your API key:
import os
os.environ["MXBAI_API_KEY"] = "your_api_key_here"
  3. Start using Mixedbread AI in your Unstructured projects!
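Writing the key into source code, as in step 2, is convenient for a quick test but easy to leak. A minimal sketch of reading the key from the environment and failing early when it is missing follows; the helper name `require_api_key` is hypothetical and not part of Unstructured or the Mixedbread AI SDK.

```python
import os


def require_api_key(name: str = "MXBAI_API_KEY") -> str:
    """Return the API key stored in the environment variable `name`.

    Hypothetical helper: raises a clear error instead of letting an
    empty key reach the encoder configuration.
    """
    key = os.environ.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {name} environment variable before creating the encoder."
        )
    return key
```

The returned string could then be passed as `api_key=require_api_key()` when building the configuration.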

Embeddings

Generate text embeddings for queries and documents.

import os
from unstructured.embed.mixedbreadai import (
    MixedbreadAIEmbeddingConfig,
    MixedbreadAIEmbeddingEncoder,
)
from unstructured.documents.elements import Text
 
embedding_encoder = MixedbreadAIEmbeddingEncoder(
    config=MixedbreadAIEmbeddingConfig(
        api_key=os.getenv("MXBAI_API_KEY"),
        model_name="mixedbread-ai/mxbai-embed-large-v1",
    )
)
 
elements = embedding_encoder.embed_documents(
    elements=[Text("Bread is life"), Text("Bread is love")]
)
 
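# mxbai-embed-large-v1 expects retrieval queries to carry this prompt prefix.
query = "Represent this sentence for searching relevant passages: What is bread?"
query_embedding = embedding_encoder.embed_query(query)
 
# Each returned element carries its vector in the `embedding` attribute.
for element in elements:
    print(element.embedding)
print(query_embedding)
print(embedding_encoder.is_unit_vector, embedding_encoder.num_of_dimensions)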
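Once the query and the documents are embedded, a common next step is to rank the documents by similarity to the query. The sketch below uses plain Python and cosine similarity; the function names `cosine_similarity` and `rank_by_similarity` are illustrative, and toy 3-dimensional vectors stand in for real embeddings so the snippet runs without an API call. It assumes the embeddings are lists of floats.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("Vectors must have the same length.")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction; treat as unrelated
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    query_embedding: Sequence[float],
    doc_embeddings: Sequence[Sequence[float]],
) -> List[Tuple[int, float]]:
    """Return (document index, score) pairs, most similar first."""
    scores = [
        (index, cosine_similarity(query_embedding, embedding))
        for index, embedding in enumerate(doc_embeddings)
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy vectors in place of real embeddings.
toy_query = [1.0, 0.0, 0.0]
toy_docs = [[0.0, 1.0, 0.0], [0.8, 0.6, 0.0]]
for index, score in rank_by_similarity(toy_query, toy_docs):
    print(index, round(score, 3))
```

With real results, the same call would look like `rank_by_similarity(query_embedding, [element.embedding for element in elements])`.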

Advanced Usage

Custom Embedding Parameters

You can customize the embedding process with additional parameters:

import os
from unstructured.embed.mixedbreadai import (
    MixedbreadAIEmbeddingConfig,
    MixedbreadAIEmbeddingEncoder,
)
from unstructured.documents.elements import Text
 
embedding_encoder = MixedbreadAIEmbeddingEncoder(
    config=MixedbreadAIEmbeddingConfig(
        api_key=os.getenv("MXBAI_API_KEY"),
        model_name="mixedbread-ai/mxbai-embed-large-v1",
        batch_size=64,
        normalized=True,
        dimensions=512,
        encoding_format="binary"
    )
)
 
elements = embedding_encoder.embed_documents(
    elements=[Text("Bread is life"), Text("Bread is love")]
)
 
query = "Represent this sentence for searching relevant passages: What is bread?"
query_embedding = embedding_encoder.embed_query(query)
 
for element in elements:
    print(element.embedding)
print(query_embedding)
print(embedding_encoder.is_unit_vector, embedding_encoder.num_of_dimensions)
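To give a feel for what a binary encoding trades away, here is a sketch of the general idea behind binary quantization: keep only the sign of each component, pack the bits into bytes, and compare vectors by Hamming distance. This illustrates the concept only and is not a description of the exact bytes the Mixedbread AI API returns; the helpers `binarize`, `pack_bits`, and `hamming_distance` are hypothetical.

```python
from typing import List, Sequence


def binarize(vector: Sequence[float]) -> List[int]:
    """Map each component to 1 if it is positive, otherwise 0."""
    return [1 if x > 0 else 0 for x in vector]


def pack_bits(bits: Sequence[int]) -> bytes:
    """Pack bits into bytes, 8 per byte, most significant bit first.

    A short final chunk is left-aligned and padded with zero bits.
    """
    packed = bytearray()
    for start in range(0, len(bits), 8):
        chunk = bits[start:start + 8]
        byte = 0
        for bit in chunk:
            byte = (byte << 1) | bit
        byte <<= 8 - len(chunk)
        packed.append(byte)
    return bytes(packed)


def hamming_distance(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    if len(a) != len(b):
        raise ValueError("Inputs must have the same length.")
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


# Toy 8-dimensional vectors in place of real embeddings.
first = pack_bits(binarize([0.3, -0.2, 0.9, -0.5, 0.1, 0.0, -0.7, 0.4]))
second = pack_bits(binarize([0.2, 0.1, 0.8, -0.4, -0.3, 0.0, -0.6, 0.5]))
print(hamming_distance(first, second))
```

Under this packing scheme, a 512-dimensional vector occupies 64 bytes, compared with 2048 bytes for the same vector stored as 32-bit floats.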

Documentation

For detailed information on using Mixedbread AI with Unstructured, check out the official Unstructured and Mixedbread AI documentation.

Need Help?

Happy baking with Mixedbread AI and Unstructured! 🍞🚀
