Chroma Clients

Chroma Settings Object

This page covers only some of Chroma's configuration options. For the full list, see chromadb.config.Settings in the code or the ChromaDB Configuration page.

Persistent Client

To create a local persistent client, use the PersistentClient class. This client stores all data locally in a directory on your machine at the path you specify.

    import chromadb
    from chromadb.config import DEFAULT_TENANT, DEFAULT_DATABASE, Settings

    client = chromadb.PersistentClient(
        path="test",
        settings=Settings(),
        tenant=DEFAULT_TENANT,
        database=DEFAULT_DATABASE,
    )

Parameters:

  1. path - parameter must be a local path on the machine where Chroma is running. If the path does not exist, it will be created. The path can be relative or absolute. If the path is not specified, the default is ./chroma in the current working directory.
  2. settings - Chroma settings object.
  3. tenant - the tenant to use. Default is default_tenant.
  4. database - the database to use. Default is default_database.
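The path rules above (relative or absolute, created when missing) can be sketched with the standard library alone. The helper name resolve_persist_path is hypothetical and is not part of chromadb; it only mirrors the behavior described in the list.

```python
import tempfile
from pathlib import Path


def resolve_persist_path(path: str = "./chroma") -> Path:
    """Hypothetical helper (not part of chromadb): mimic the documented path rules."""
    resolved = Path(path).resolve()              # relative paths resolve against the CWD
    resolved.mkdir(parents=True, exist_ok=True)  # create the directory if it is missing
    return resolved


# Usage: a nested directory that does not exist yet is created on demand.
with tempfile.TemporaryDirectory() as tmp:
    data_dir = resolve_persist_path(str(Path(tmp) / "chroma" / "data"))
    print(data_dir.is_dir(), data_dir.is_absolute())
```

A directory prepared this way could then be passed as the path argument of chromadb.PersistentClient.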

Positional Parameters

Chroma PersistentClient parameters are positional, unless keyword arguments are used.
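To illustrate the difference, here is a small sketch built around a stand-in function with the same parameter order as the example above. make_client is hypothetical and does not call Chroma; it only shows how Python binds positional and keyword arguments.

```python
def make_client(path="./chroma", settings=None,
                tenant="default_tenant", database="default_database"):
    """Hypothetical stand-in with the parameter order shown above; returns the bound values."""
    return {"path": path, "settings": settings, "tenant": tenant, "database": database}


# Positional: values are bound strictly by order, so the third argument is the tenant.
by_position = make_client("test", None, "my_tenant")

# Keyword: order no longer matters, and skipped parameters keep their defaults.
by_keyword = make_client(database="my_database", path="test")

print(by_position["tenant"], by_keyword["database"])
```

Keyword arguments are usually the safer choice, since they keep working if you skip a parameter or reorder the call.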

Uses of Persistent Client

The persistent client is useful for:

  • Local development: You can use the persistent client to develop locally and test out ChromaDB.
  • Embedded applications: You can use the persistent client to embed ChromaDB in your application. This means that you can ship Chroma bundled with your product or services, thus simplifying the deployment process.
  • Simplicity: You avoid the complexity of setting up and operating a Chroma server (arguably, Hosted Chroma will resolve this).
  • Data privacy: If you are working with sensitive data and do not want to store it on a remote server.
  • Performance: You can reduce latency by avoiding network round trips.

The right tool for the job

When evaluating the local PersistentClient, always factor in the scale of the application. As with SQLite vs Postgres/MySQL, the choice between PersistentClient and HttpClient with a Chroma server should weigh the application's architectural characteristics (such as complexity, scale, and performance).

HTTP Client

Chroma also provides an HTTP client, suitable for use in client-server mode. This client can be used to connect to a remote ChromaDB server. The HTTP client can operate in synchronous or asynchronous mode (see the examples below).

Python Sync

    import chromadb
    from chromadb.config import DEFAULT_TENANT, DEFAULT_DATABASE, Settings

    client = chromadb.HttpClient(
        host="localhost",
        port=8000,
        ssl=False,
        headers=None,
        settings=Settings(),
        tenant=DEFAULT_TENANT,
        database=DEFAULT_DATABASE,
    )

Parameters:

  1. host - The host of the remote server. If not specified, the default is localhost.
  2. port - The port of the remote server. If not specified, the default is 8000.
  3. ssl - If True, the client will use HTTPS. If not specified, the default is False.
  4. headers - (optional): The headers to be sent to the server. The setting can be used to pass additional headers to the server. An example of this can be auth headers.
  5. settings - Chroma settings object.
  6. tenant - the tenant to use. Default is default_tenant.
  7. database - the database to use. Default is default_database.
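As a sketch of the auth-header case mentioned for headers, the helper below builds a headers dictionary for token authentication. The function name build_auth_headers is hypothetical, and the two header styles (Authorization: Bearer and X-Chroma-Token) are assumptions about how the server is configured; check your server's auth settings before relying on either.

```python
from typing import Dict


def build_auth_headers(token: str, use_bearer: bool = True) -> Dict[str, str]:
    """Hypothetical helper: build headers for a token-protected server."""
    if not token:
        raise ValueError("token must be a non-empty string")
    if use_bearer:
        # Assumed style 1: standard Authorization header with a Bearer prefix
        return {"Authorization": f"Bearer {token}"}
    # Assumed style 2: a dedicated token header
    return {"X-Chroma-Token": token}


headers = build_auth_headers("your_token_here")
print(headers)
```

The resulting dictionary can be passed as the headers argument, for example chromadb.HttpClient(host="localhost", port=8000, headers=headers).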

Positional Parameters

Chroma HttpClient parameters are positional, unless keyword arguments are used.

Python Async

    import asyncio

    import chromadb
    from chromadb.config import DEFAULT_TENANT, DEFAULT_DATABASE, Settings

    # In a Jupyter notebook, uncomment the next two lines to allow nested event loops
    # import nest_asyncio
    # nest_asyncio.apply()


    async def list_collections():
        # The async client itself must be awaited before use
        client = await chromadb.AsyncHttpClient(
            host="localhost",
            port=8000,
            ssl=False,
            headers=None,
            settings=Settings(),
            tenant=DEFAULT_TENANT,
            database=DEFAULT_DATABASE,
        )
        return await client.list_collections()


    result = asyncio.run(list_collections())
    print(result)

Parameters:

  1. host - The host of the remote server. If not specified, the default is localhost.
  2. port - The port of the remote server. If not specified, the default is 8000.
  3. ssl - If True, the client will use HTTPS. If not specified, the default is False.
  4. headers - (optional): The headers to be sent to the server. The setting can be used to pass additional headers to the server. An example of this can be auth headers.
  5. settings - Chroma settings object.
  6. tenant - the tenant to use. Default is default_tenant.
  7. database - the database to use. Default is default_database.

Positional Parameters

Chroma AsyncHttpClient parameters are positional, unless keyword arguments are used.

JavaScript

    import { ChromaClient } from "chromadb";

    const client = new ChromaClient({
        path: "http://localhost:8000",
        auth: {
            provider: "token",
            credentials: "your_token_here",
            tokenHeaderType: "AUTHORIZATION",
        },
        tenant: "default_tenant",
        database: "default_database",
    });

Parameters:

  • path - The Chroma endpoint
  • auth - Chroma authentication object
  • tenant - the tenant to use. Default is default_tenant.
  • database - the database to use. Default is default_database.

GoLang

Install the package:

    go get github.com/amikos-tech/chroma-go

Create a client and a collection:

    package main

    import (
        "context"
        "fmt"
        "log"
        "os"

        chroma "github.com/amikos-tech/chroma-go"
        "github.com/amikos-tech/chroma-go/collection"
        openai "github.com/amikos-tech/chroma-go/pkg/embeddings/openai"
        "github.com/amikos-tech/chroma-go/types"
    )

    func main() {
        // Create new OpenAI embedding function
        openaiEf, err := openai.NewOpenAIEmbeddingFunction(os.Getenv("OPENAI_API_KEY"))
        if err != nil {
            log.Fatalf("Error creating OpenAI embedding function: %s \n", err)
        }
        // Create a new Chroma client
        client := chroma.NewClient(
            "localhost:8000",
            chroma.WithTenant(types.DefaultTenant),
            chroma.WithDatabase(types.DefaultDatabase),
            chroma.WithAuth(types.NewTokenAuthCredentialsProvider("my-token", types.AuthorizationTokenHeader)),
        )
        // Create a new collection with options
        newCollection, err := client.NewCollection(
            context.TODO(),
            "test-collection",
            collection.WithMetadata("key1", "value1"),
            collection.WithEmbeddingFunction(openaiEf),
            collection.WithHNSWDistanceFunction(types.L2),
        )
        if err != nil {
            log.Fatalf("Error creating collection: %s \n", err)
        }
        // Print the collection so that fmt and newCollection are used
        // (Go rejects unused imports and variables)
        fmt.Printf("Created collection: %v\n", newCollection)
    }

Parameters:

  • Chroma endpoint - the chroma endpoint URL e.g. http://localhost:8000. This is a required parameter.
  • WithAuth() - Chroma authentication provider.
  • WithTenant() - the tenant to use. Default is default_tenant or constant types.DefaultTenant.
  • WithDatabase() - the database to use. Default is default_database or constant types.DefaultDatabase.

Uses of HTTP Client

The HTTP client is ideal when you want to scale your application or move off local machine storage. Note that using the HTTP client involves trade-offs:

  • Network latency - The time it takes to send a request to the server and receive a response.
  • Serialization and deserialization overhead - The time it takes to convert data to a format that can be sent over the network and then convert it back to its original format.
  • Security - The data is sent over the network, so it is important to ensure that the connection is secure (we recommend using both HTTPS and authentication).
  • Availability - The server must be available for the client to connect to it.
  • Bandwidth usage - The amount of data sent over the network.
  • Data privacy and compliance - Storing data on a remote server may require compliance with data protection laws and regulations.
  • Difficulty in debugging - Debugging network issues can be more difficult than debugging local issues. The same applies to server-side issues.
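To get a rough feel for the serialization and bandwidth costs listed above, the sketch below encodes a batch of embeddings as JSON and reports the payload size and round-trip time. The batch shape (100 vectors of 384 dimensions) and the helper name json_payload_stats are illustrative assumptions, and the request body a real client sends may differ.

```python
import json
import random
import time
from typing import List


def json_payload_stats(embeddings: List[List[float]]) -> dict:
    """Illustrative helper: size and encode/decode time of a JSON-encoded batch."""
    start = time.perf_counter()
    encoded = json.dumps({"embeddings": embeddings}).encode("utf-8")  # serialize
    decoded = json.loads(encoded)                                     # deserialize
    elapsed = time.perf_counter() - start
    return {
        "bytes": len(encoded),
        "seconds": elapsed,
        "round_trip_ok": decoded["embeddings"] == embeddings,
    }


random.seed(0)  # deterministic sample data
batch = [[random.random() for _ in range(384)] for _ in range(100)]
stats = json_payload_stats(batch)
print(f"{stats['bytes']} bytes, {stats['seconds'] * 1000:.2f} ms to encode and decode")
```

Numbers like these cover only the local encoding cost; network latency comes on top and depends on where the server runs.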

Host parameter special cases (Python-only)

The host parameter supports a more advanced syntax than just the hostname. You can specify the whole endpoint URL (without the API paths), e.g. https://chromadb.example.com:8000/my_server/path/. This is useful when you want to use a reverse proxy or load balancer in front of your ChromaDB server.
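The sketch below uses Python's urllib.parse to show which parts such an endpoint URL carries. It is not Chroma's own parsing code; it only illustrates what a full host value encodes.

```python
from urllib.parse import urlparse

endpoint = "https://chromadb.example.com:8000/my_server/path/"
parts = urlparse(endpoint)

# The components a reverse proxy or load balancer setup typically cares about.
print(parts.scheme)    # https
print(parts.hostname)  # chromadb.example.com
print(parts.port)      # 8000
print(parts.path)      # /my_server/path/
```

Under the syntax described above, the whole string would be passed as the host argument, for example chromadb.HttpClient(host=endpoint).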

Ephemeral Client

The ephemeral client does not store any data on disk, which makes it useful for fast prototyping and testing. To get started, use the EphemeralClient class.

    import chromadb
    from chromadb.config import DEFAULT_TENANT, DEFAULT_DATABASE, Settings

    client = chromadb.EphemeralClient(
        settings=Settings(),
        tenant=DEFAULT_TENANT,
        database=DEFAULT_DATABASE,
    )

Parameters:

  1. settings - Chroma settings object.
  2. tenant - the tenant to use. Default is default_tenant.
  3. database - the database to use. Default is default_database.

Positional Parameters

Chroma EphemeralClient parameters are positional, unless keyword arguments are used.

Environment Variable Configured Client

You can also configure the client through environment variables, which lets you set any of the client configurations listed above outside of your code.

    import chromadb
    from chromadb.config import DEFAULT_TENANT, DEFAULT_DATABASE, Settings

    client = chromadb.Client(
        settings=Settings(),
        tenant=DEFAULT_TENANT,
        database=DEFAULT_DATABASE,
    )

Parameters:

  1. settings - Chroma settings object.
  2. tenant - the tenant to use. Default is default_tenant.
  3. database - the database to use. Default is default_database.

Positional Parameters

Chroma Client parameters are positional, unless keyword arguments are used.
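As a complement to the environment-driven setup in this section, the sketch below reads connection details from the environment by hand and turns them into keyword arguments. This is a manual alternative, not how Settings itself loads values. The helper name client_kwargs_from_env is hypothetical, and the variable names (CHROMA_SERVER_HOST, CHROMA_SERVER_HTTP_PORT, CHROMA_SERVER_SSL_ENABLED) are assumed to mirror fields on chromadb.config.Settings; verify them against your installed version.

```python
import os
from typing import Mapping


def client_kwargs_from_env(env: Mapping[str, str] = os.environ) -> dict:
    """Hypothetical helper: derive host/port/ssl keyword arguments from the environment."""
    return {
        "host": env.get("CHROMA_SERVER_HOST", "localhost"),        # assumed variable name
        "port": int(env.get("CHROMA_SERVER_HTTP_PORT", "8000")),   # assumed variable name
        "ssl": env.get("CHROMA_SERVER_SSL_ENABLED", "false").lower() in ("1", "true", "yes"),
    }


# Usage with an explicit mapping, so the example does not depend on the real environment.
kwargs = client_kwargs_from_env(
    {"CHROMA_SERVER_HOST": "chroma.internal", "CHROMA_SERVER_HTTP_PORT": "9000"}
)
print(kwargs)
```

The resulting dictionary could be expanded into a client call such as chromadb.HttpClient(**kwargs).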

July 29, 2024