LangChain Embeddings

Embedding Functions

Chroma and LangChain both offer embedding functions, which are wrappers around popular embedding models.

Unfortunately, Chroma’s and LangChain’s (LC’s) embedding functions are not compatible with each other. Below we offer two adapters that convert Chroma’s embedding functions to LC’s and vice versa.
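
To make the mismatch concrete, here is a minimal, dependency-free sketch of the two call conventions. The Protocol classes and the toy_vector helper are hypothetical stand-ins written for illustration; they are not the real chromadb or langchain_core types, only an approximation of their shapes.

```python
from typing import List, Protocol, Sequence


class ChromaStyleEmbeddingFunction(Protocol):
    """Chroma-style: the object itself is called with a batch of documents."""

    def __call__(self, input: Sequence[str]) -> List[List[float]]: ...


class LangChainStyleEmbeddings(Protocol):
    """LangChain-style: separate methods for documents and for a single query."""

    def embed_documents(self, texts: List[str]) -> List[List[float]]: ...

    def embed_query(self, text: str) -> List[float]: ...


def toy_vector(text: str) -> List[float]:
    # Hypothetical stand-in for a real model: two deterministic numbers per text.
    return [float(len(text)), float(sum(map(ord, text)) % 97)]


class ToyChromaEF:
    def __call__(self, input: Sequence[str]) -> List[List[float]]:
        return [toy_vector(t) for t in input]


class ToyLangChainEmbeddings:
    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        return [toy_vector(t) for t in texts]

    def embed_query(self, text: str) -> List[float]:
        return toy_vector(text)


# Same vectors, but reached through different entry points.
chroma_vectors = ToyChromaEF()(["foo", "bar"])
lc_vectors = ToyLangChainEmbeddings().embed_documents(["foo", "bar"])
```

Both toy classes produce identical vectors, yet neither can be passed where the other is expected: one has no embed_query method and the other is not callable. That gap is what the adapters bridge.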

Chroma Built-in LangChain Adapter

As of version 0.5.x, Chroma offers a built-in two-way adapter that wraps a LangChain embedding function in an object usable by both LC and Chroma. It is exposed as create_langchain_embedding in chromadb.utils.embedding_functions.

    # pip install chromadb langchain langchain-huggingface langchain-chroma
    import chromadb
    from chromadb.utils.embedding_functions import create_langchain_embedding
    from langchain_huggingface import HuggingFaceEmbeddings

    langchain_embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
    ef = create_langchain_embedding(langchain_embeddings)
    client = chromadb.PersistentClient(path="/test_folder_1")
    collection = client.get_or_create_collection(name="name_1", embedding_function=ef)

Custom Adapter

Here is the adapter to convert Chroma’s embedding functions to LC’s:

    from langchain_core.embeddings import Embeddings
    from chromadb.api.types import EmbeddingFunction

    class ChromaEmbeddingsAdapter(Embeddings):
        """Expose a Chroma embedding function through LangChain's Embeddings interface."""

        def __init__(self, ef: EmbeddingFunction):
            self.ef = ef

        def embed_documents(self, texts):
            return self.ef(texts)

        def embed_query(self, query):
            # Chroma embedding functions take a batch, so wrap the query and unwrap the result.
            return self.ef([query])[0]
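
Depending on the Chroma version, an embedding function may hand back NumPy arrays rather than plain Python lists, while LangChain’s interface is typed to return lists of floats. The sketch below is one defensive variant of the adapter above, under that assumption. It is duck-typed and dependency-free so it can run on its own: ListReturningAdapter, to_float_list, FakeArray and fake_ef are hypothetical names, and in real use the class would subclass langchain_core.embeddings.Embeddings as shown above.

```python
from typing import Any, Callable, List, Sequence


def to_float_list(vector: Any) -> List[float]:
    # Array-like objects (e.g. NumPy arrays) expose tolist(); plain sequences pass through.
    if hasattr(vector, "tolist"):
        vector = vector.tolist()
    return [float(x) for x in vector]


class ListReturningAdapter:
    """Duck-typed variant of ChromaEmbeddingsAdapter that always returns lists of floats."""

    def __init__(self, ef: Callable[[Sequence[str]], Any]):
        self.ef = ef

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        return [to_float_list(v) for v in self.ef(texts)]

    def embed_query(self, query: str) -> List[float]:
        return to_float_list(self.ef([query])[0])


class FakeArray:
    """Hypothetical stand-in for an array type that has a tolist() method."""

    def __init__(self, values):
        self.values = list(values)

    def tolist(self):
        return list(self.values)


def fake_ef(input: Sequence[str]) -> List[FakeArray]:
    # Hypothetical embedding function: text length plus a constant component.
    return [FakeArray([len(text), 1]) for text in input]


adapter = ListReturningAdapter(fake_ef)
doc_vectors = adapter.embed_documents(["foo", "quux"])
query_vector = adapter.embed_query("foo")
```

If the wrapped function already returns lists, the conversion is a harmless copy, so the variant is safe to use either way.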

Here is the adapter to convert LC’s embedding functions to Chroma’s:

    # Alias LangChain's class so it does not shadow Chroma's Embeddings return type.
    from langchain_core.embeddings import Embeddings as LangChainEmbeddings
    from chromadb.api.types import Documents, EmbeddingFunction, Embeddings

    class LangChainEmbeddingAdapter(EmbeddingFunction[Documents]):
        def __init__(self, ef: LangChainEmbeddings):
            self.ef = ef

        def __call__(self, input: Documents) -> Embeddings:
            # LC EFs also have embed_query but Chroma doesn't support that so we just use embed_documents
            # TODO: better type checking
            return self.ef.embed_documents(input)
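
As a quick sanity check, the two adapters should compose: wrapping a Chroma-style function for LangChain and then wrapping the result back for Chroma should give the same vectors as calling the original function directly. The sketch below shows this with duck-typed stand-ins so it runs without either library installed. ToLangChain, ToChroma and toy_chroma_ef are hypothetical names that mirror the shapes of ChromaEmbeddingsAdapter and LangChainEmbeddingAdapter; they are not the classes themselves.

```python
from typing import List, Sequence


def toy_chroma_ef(input: Sequence[str]) -> List[List[float]]:
    # Hypothetical stand-in for a real Chroma embedding function.
    return [[float(len(text)), float(text.count("a"))] for text in input]


class ToLangChain:
    """Same shape as ChromaEmbeddingsAdapter, without the LangChain base class."""

    def __init__(self, ef):
        self.ef = ef

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        return self.ef(texts)

    def embed_query(self, query: str) -> List[float]:
        # Embed a one-element batch and take its only vector.
        return self.ef([query])[0]


class ToChroma:
    """Same shape as LangChainEmbeddingAdapter, without the Chroma base class."""

    def __init__(self, ef):
        self.ef = ef

    def __call__(self, input: Sequence[str]) -> List[List[float]]:
        return self.ef.embed_documents(list(input))


texts = ["foo", "bar", "baz"]
lc_side = ToLangChain(toy_chroma_ef)
round_trip = ToChroma(lc_side)

direct = toy_chroma_ef(texts)
via_adapters = round_trip(texts)
query_vector = lc_side.embed_query("bar")
```

Here via_adapters equals direct, and query_vector equals the vector computed for "bar" in the batch, which is the behaviour both adapters rely on.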

Example Usage

Using Chroma Embedding Functions with LangChain:

    # pip install chromadb langchain langchain-huggingface langchain-chroma
    from langchain_chroma import Chroma
    from chromadb.utils.embedding_functions import SentenceTransformerEmbeddingFunction

    # Assumes ChromaEmbeddingsAdapter from the Custom Adapter section is defined.
    texts = ["foo", "bar", "baz"]
    docs_vectorstore = Chroma.from_texts(
        texts=texts,
        collection_name="docs_store",
        embedding=ChromaEmbeddingsAdapter(SentenceTransformerEmbeddingFunction(model_name="all-MiniLM-L6-v2")),
    )

Using LangChain Embedding Functions with Chroma:

    # pip install chromadb langchain langchain-huggingface langchain-chroma
    from langchain_huggingface import HuggingFaceEmbeddings
    import chromadb

    # Assumes LangChainEmbeddingAdapter from the Custom Adapter section is defined.
    client = chromadb.Client()
    collection = client.get_or_create_collection(
        "test",
        embedding_function=LangChainEmbeddingAdapter(HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")),
    )
    collection.add(ids=["1", "2", "3"], documents=["foo", "bar", "baz"])

July 15, 2024