Multi-User Basic Auth

Why Multi-user Auth?

Multi-user authentication matters for several reasons. Let’s go through them.

Security - The primary concern is the security of your deployments. You need to control who can access your data and ensure they are authorized to do so. You may wonder: since Chroma already offers basic and token-based authentication, why is multi-user authentication necessary?

Without it, giving several people or apps access means handing them the same credentials. Should you share your Chroma access credentials with your users or with any app that depends on Chroma? The answer is a categorical NO.

Another reason to consider multi-user authentication is to differentiate access to your data. However, the solution presented here doesn’t provide this. It’s a stepping stone towards our upcoming article on multi-tenancy and securing Chroma data.

Last but not least is auditing. While we acknowledge this is not for everybody, there is increasing pressure to provide visibility into your app via auditable events.

Multi-user experiences - Not all GenAI apps are intended to be private or individual. This is another reason to consider and implement multi-user authentication and authorization.

Dive right in.

Let’s get straight to the point and build multi-user authentication on top of basic auth. Here’s our goal:

  • Develop a server-side credentials provider that can read multiple users from a .htpasswd file
  • Generate a multi-user .htpasswd file with several test users
  • Package our plugin with the Chroma base image and execute it using Docker Compose

Auth CIP

Chroma documents in detail how its authentication and authorization are implemented. If you want to learn more, read the CIP (Chroma Improvement Proposal) document.

The Plugin

    import importlib
    import logging
    from typing import Dict, cast, TypeVar, Optional

    from chromadb.auth import (
        ServerAuthCredentialsProvider,
        AbstractCredentials,
        SimpleUserIdentity,
    )
    from chromadb.auth.registry import register_provider
    from chromadb.config import System
    from chromadb.telemetry.opentelemetry import (
        OpenTelemetryGranularity,
        trace_method,
        add_attributes_to_current_span,
    )
    from pydantic import SecretStr
    from overrides import override

    T = TypeVar("T")

    logger = logging.getLogger(__name__)


    @register_provider("multi_user_htpasswd_file")
    class MultiUserHtpasswdFileServerAuthCredentialsProvider(ServerAuthCredentialsProvider):
        _creds: Dict[str, SecretStr]  # contains user:password-hash

        def __init__(self, system: System) -> None:
            super().__init__(system)
            # bcrypt is imported lazily so a missing package yields a clear error
            try:
                self.bc = importlib.import_module("bcrypt")
            except ImportError:
                raise ValueError(
                    "The bcrypt python package is not installed. "
                    "Please install it with `pip install bcrypt`"
                )
            system.settings.require("chroma_server_auth_credentials_file")
            _file = str(system.settings.chroma_server_auth_credentials_file)
            self._creds = dict()
            # One user per line, in the form <username>:<bcrypt hash>
            with open(_file, "r") as f:
                for line in f:
                    _raw_creds = [v for v in line.strip().split(":")]
                    if len(_raw_creds) != 2:
                        raise ValueError(
                            "Invalid Htpasswd credentials found in "
                            f"[{str(system.settings.chroma_server_auth_credentials_file)}]. "
                            "Must be <username>:<bcrypt passwd>."
                        )
                    self._creds[_raw_creds[0]] = SecretStr(_raw_creds[1])

        @trace_method(  # type: ignore
            "MultiUserHtpasswdFileServerAuthCredentialsProvider.validate_credentials",
            OpenTelemetryGranularity.ALL,
        )
        @override
        def validate_credentials(self, credentials: AbstractCredentials[T]) -> bool:
            _creds = cast(Dict[str, SecretStr], credentials.get_credentials())
            if len(_creds) != 2 or "username" not in _creds or "password" not in _creds:
                logger.error(
                    "Returned credentials did not match expected format: "
                    "dict[username:SecretStr, password: SecretStr]"
                )
                add_attributes_to_current_span(
                    {
                        "auth_succeeded": False,
                        "auth_error": "Returned credentials did not match expected format: "
                        "dict[username:SecretStr, password: SecretStr]",
                    }
                )
                return False  # early exit on wrong format
            # Look up the stored hash; None means the user is unknown
            _user_pwd_hash = (
                self._creds[_creds["username"].get_secret_value()]
                if _creds["username"].get_secret_value() in self._creds
                else None
            )
            validation_response = _user_pwd_hash is not None and self.bc.checkpw(
                _creds["password"].get_secret_value().encode("utf-8"),
                _user_pwd_hash.get_secret_value().encode("utf-8"),
            )
            add_attributes_to_current_span(
                {
                    "auth_succeeded": validation_response,
                    "auth_error": f"Failed to validate credentials for user {_creds['username'].get_secret_value()}"
                    if not validation_response
                    else "",
                }
            )
            return validation_response

        @override
        def get_user_identity(
            self, credentials: AbstractCredentials[T]
        ) -> Optional[SimpleUserIdentity]:
            _creds = cast(Dict[str, SecretStr], credentials.get_credentials())
            return SimpleUserIdentity(_creds["username"].get_secret_value())

In fewer than 100 lines of code, we have our plugin. Let’s walk through the key points of the code above:

  • __init__ - Here, we dynamically import bcrypt, which we’ll use to check user credentials. We also read the configured credentials file (server.htpasswd) line by line to retrieve each user; we assume each line contains one user and its bcrypt hash.
  • validate_credentials - This is where the magic happens. We initially perform some lightweight validations on the credentials parsed by Chroma and passed to the plugin. Then, we attempt to retrieve the user and its hash from the _creds dictionary. The final step is to verify the hash. We’ve also added some attributes to monitor our authentication process in our observability layer (we have an upcoming article about this).
  • get_user_identity - Constructs a simple user identity, which the authorization plugin uses to verify permissions. Although not needed for now, each authentication plugin must implement this, as user identities are crucial for authorization.
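
The parse-then-verify flow described above can be sketched outside Chroma as follows. This is a minimal illustration, not the plugin itself: parse_htpasswd and check_user are hypothetical helper names, and the checkpw argument is assumed to be a callable with the same shape as bcrypt.checkpw(password_bytes, hash_bytes). Unlike the plugin, this sketch skips blank lines instead of rejecting them.

```python
from typing import Callable, Dict


def parse_htpasswd(text: str) -> Dict[str, str]:
    """Parse '<username>:<bcrypt hash>' lines into a dict (hypothetical helper)."""
    creds: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # assumption: tolerate blank lines; the plugin raises on them
        parts = line.split(":")
        if len(parts) != 2:
            raise ValueError("Must be <username>:<bcrypt passwd>.")
        creds[parts[0]] = parts[1]
    return creds


def check_user(
    creds: Dict[str, str],
    username: str,
    password: str,
    checkpw: Callable[[bytes, bytes], bool],
) -> bool:
    """Return True only if the user exists and the password matches its hash."""
    pwd_hash = creds.get(username)
    if pwd_hash is None:
        return False  # unknown user: fail without calling checkpw
    return checkpw(password.encode("utf-8"), pwd_hash.encode("utf-8"))


if __name__ == "__main__":
    users = parse_htpasswd("admin:example-hash\n\nuser1:another-hash\n")
    print(sorted(users))
```

With bcrypt installed, passing bcrypt.checkpw as the last argument of check_user would run the real hash verification; any other callable with the same shape works for testing.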

We’ll store our plugin in __init__.py within the following directory structure - chroma_auth/authn/basic/__init__.py (refer to the repository for details).

Password file

Now that we have our plugin, let’s create a password file with a few users:

Initial user:

    echo "password123" | htpasswd -iBc server.htpasswd admin

The command above creates (-c flag) a new server.htpasswd file with an initial user, admin. The password is read from stdin (-i flag) and saved as a bcrypt hash (-B flag).

Let’s add another user:

    echo "password123" | htpasswd -iB server.htpasswd user1

Now our server.htpasswd file will look like this:

    admin:$2y$05$vkBK4b1Vk5O98jNHgr.uduTJsTOfM395sKEKe48EkJCVPH/MBIeHK
    user1:$2y$05$UQ0kC2x3T2XgeN4WU12BdekUwCJmLjJNhMaMtFNolYdj83OqiEpVu
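
Before mounting the file into the container, a quick format check can catch entries that the plugin would reject at startup. The sketch below is an illustrative helper, not part of htpasswd or Chroma: find_bad_lines is a hypothetical name, and the pattern assumes the usual bcrypt layout of a $2a$, $2b$ or $2y$ prefix, a two-digit cost, and 53 characters of salt and hash.

```python
import re
from typing import List

# Assumed shape of one entry: <username>:<bcrypt hash>
BCRYPT_ENTRY = re.compile(r"^[^:\s]+:\$2[aby]\$\d{2}\$[./A-Za-z0-9]{53}$")


def find_bad_lines(text: str) -> List[int]:
    """Return 1-based numbers of lines that do not look like user:bcrypt-hash."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if not BCRYPT_ENTRY.match(line.strip())
    ]


if __name__ == "__main__":
    with open("server.htpasswd", "r") as f:
        print(find_bad_lines(f.read()))
```

An empty list means every line has the expected shape. Blank lines are reported as bad on purpose, since the plugin's parser raises on them as well.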

Moving on to the Docker setup.

Docker compose setup

Let’s create a Dockerfile to bundle our plugin with the official Chroma image:

    ARG CHROMA_VERSION=0.4.24
    FROM ghcr.io/chroma-core/chroma:${CHROMA_VERSION} as base
    COPY chroma_auth/ /chroma/chroma_auth

This will pick up the official docker image for Chroma and will add our plugin directory structure so that we can use it.

Now let’s create an .env file to load our plugin:

    CHROMA_SERVER_AUTH_PROVIDER="chromadb.auth.basic.BasicAuthServerProvider"
    CHROMA_SERVER_AUTH_CREDENTIALS_FILE="server.htpasswd"
    CHROMA_SERVER_AUTH_CREDENTIALS_PROVIDER="chroma_auth.authn.basic.MultiUserHtpasswdFileServerAuthCredentialsProvider"
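
Both provider settings are dotted Python paths, so a typo in a module or class name only surfaces when the server starts. A small check along the following lines can be run wherever the chroma_auth package is importable, for example inside the container. It is only a sketch: resolve_dotted_path is a hypothetical helper, and it is not claimed to be how Chroma itself loads providers.

```python
import importlib
from typing import Any


def resolve_dotted_path(path: str) -> Any:
    """Import 'package.module.ClassName' and return the named attribute."""
    module_name, _, attr_name = path.rpartition(".")
    if not module_name:
        raise ValueError(f"Not a dotted path: {path!r}")
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


if __name__ == "__main__":
    # Standard-library stand-in; inside the container the value of
    # CHROMA_SERVER_AUTH_CREDENTIALS_PROVIDER could be passed instead.
    print(resolve_dotted_path("collections.OrderedDict"))
```

A wrong module name raises ImportError and a wrong class name raises AttributeError, which pinpoints the misconfigured part of the path.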

And finally our docker-compose.yaml:

    version: '3.9'

    networks:
      net:
        driver: bridge

    services:
      server:
        image: chroma-server
        build:
          context: .
          dockerfile: Dockerfile
        volumes:
          - ./chroma-data:/chroma/chroma
          - ./server.htpasswd:/chroma/server.htpasswd
        command: "--workers 1 --host 0.0.0.0 --port 8000 --proxy-headers --log-config chromadb/log_config.yml --timeout-keep-alive 30"
        environment:
          - IS_PERSISTENT=TRUE
          - CHROMA_SERVER_AUTH_PROVIDER=${CHROMA_SERVER_AUTH_PROVIDER}
          - CHROMA_SERVER_AUTH_CREDENTIALS_FILE=${CHROMA_SERVER_AUTH_CREDENTIALS_FILE}
          - CHROMA_SERVER_AUTH_CREDENTIALS=${CHROMA_SERVER_AUTH_CREDENTIALS}
          - CHROMA_SERVER_AUTH_CREDENTIALS_PROVIDER=${CHROMA_SERVER_AUTH_CREDENTIALS_PROVIDER}
          - CHROMA_SERVER_AUTH_TOKEN_TRANSPORT_HEADER=${CHROMA_SERVER_AUTH_TOKEN_TRANSPORT_HEADER}
          - PERSIST_DIRECTORY=${PERSIST_DIRECTORY:-/chroma/chroma}
          - CHROMA_OTEL_EXPORTER_ENDPOINT=${CHROMA_OTEL_EXPORTER_ENDPOINT}
          - CHROMA_OTEL_EXPORTER_HEADERS=${CHROMA_OTEL_EXPORTER_HEADERS}
          - CHROMA_OTEL_SERVICE_NAME=${CHROMA_OTEL_SERVICE_NAME}
          - CHROMA_OTEL_GRANULARITY=${CHROMA_OTEL_GRANULARITY}
          - CHROMA_SERVER_NOFILE=${CHROMA_SERVER_NOFILE}
        restart: unless-stopped # possible values are: "no", "always", "on-failure", "unless-stopped"
        ports:
          - "8000:8000"
        healthcheck:
          # Adjust below to match your container port
          test: [ "CMD", "curl", "-f", "http://localhost:8000/api/v1/heartbeat" ]
          interval: 30s
          timeout: 10s
          retries: 3
        networks:
          - net

The test

Let’s run our docker compose setup:

    docker compose --env-file ./.env up --build

You should see the following log message if the plugin was successfully loaded:

    server-1 | DEBUG: [01-04-2024 14:10:13] Starting component MultiUserHtpasswdFileServerAuthCredentialsProvider
    server-1 | DEBUG: [01-04-2024 14:10:13] Starting component BasicAuthServerProvider
    server-1 | DEBUG: [01-04-2024 14:10:13] Starting component FastAPIChromaAuthMiddleware

Once our container is up and running, let’s see if our multi-user auth works:

    import chromadb
    from chromadb.config import Settings

    client = chromadb.HttpClient(
        settings=Settings(chroma_client_auth_provider="chromadb.auth.basic.BasicAuthClientProvider", chroma_client_auth_credentials="admin:password123"))
    client.heartbeat()  # this should work with or without authentication - it is a public endpoint
    client.get_or_create_collection("test_collection")  # this is a protected endpoint and requires authentication
    client.list_collections()  # this is a protected endpoint and requires authentication

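The code above should return the list of collections, which contains the single collection test_collection that we created. (The session below skips the get_or_create_collection call, so its list is empty.)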

    $ python
    Python 3.11.7 (main, Dec 30 2023, 14:03:09) [Clang 15.0.0 (clang-1500.1.0.2.5)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import chromadb
    >>> from chromadb.config import Settings
    >>>
    >>> client = chromadb.HttpClient(
    ...     settings=Settings(chroma_client_auth_provider="chromadb.auth.basic.BasicAuthClientProvider",chroma_client_auth_credentials="admin:password123"))
    >>> client.heartbeat() # this should work with or without authentication - it is a public endpoint
    1711990302270211007
    >>>
    >>> client.list_collections() # this is a protected endpoint and requires authentication
    []

Great, now let’s test for our other user:

    client = chromadb.HttpClient(
        settings=Settings(chroma_client_auth_provider="chromadb.auth.basic.BasicAuthClientProvider", chroma_client_auth_credentials="user1:password123"))

Works just as well (logs omitted for brevity).
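
Both users authenticate the same way on the wire. Assuming the client follows standard HTTP Basic authentication (RFC 7617), the user:password string is base64-encoded into an Authorization header, and the server side decodes it back into the username and password that reach our plugin. The helper name basic_auth_header below is hypothetical; the sketch only shows the encoding and does not contact the server.

```python
import base64
from typing import Dict


def basic_auth_header(credentials: str) -> Dict[str, str]:
    """Build an HTTP Basic Authorization header from 'user:password'."""
    token = base64.b64encode(credentials.encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


if __name__ == "__main__":
    for creds in ("admin:password123", "user1:password123"):
        print(basic_auth_header(creds))
```

Base64 is an encoding, not encryption, so basic auth credentials should only travel over TLS in a real deployment.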

To ensure that our plugin works as expected, let’s also test with a user that is not in our server.htpasswd file:

    client = chromadb.HttpClient(
        settings=Settings(chroma_client_auth_provider="chromadb.auth.basic.BasicAuthClientProvider", chroma_client_auth_credentials="invalid_user:password123"))

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/Users/tazarov/Library/Caches/pypoetry/virtualenvs/chromadb-multi-user-basic-auth-vIZuPNTE-py3.11/lib/python3.11/site-packages/chromadb/__init__.py", line 197, in HttpClient
        return ClientCreator(tenant=tenant, database=database, settings=settings)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/Users/tazarov/Library/Caches/pypoetry/virtualenvs/chromadb-multi-user-basic-auth-vIZuPNTE-py3.11/lib/python3.11/site-packages/chromadb/api/client.py", line 144, in __init__
        self._validate_tenant_database(tenant=tenant, database=database)
      File "/Users/tazarov/Library/Caches/pypoetry/virtualenvs/chromadb-multi-user-basic-auth-vIZuPNTE-py3.11/lib/python3.11/site-packages/chromadb/api/client.py", line 445, in _validate_tenant_database
        raise e
      File "/Users/tazarov/Library/Caches/pypoetry/virtualenvs/chromadb-multi-user-basic-auth-vIZuPNTE-py3.11/lib/python3.11/site-packages/chromadb/api/client.py", line 438, in _validate_tenant_database
        self._admin_client.get_tenant(name=tenant)
      File "/Users/tazarov/Library/Caches/pypoetry/virtualenvs/chromadb-multi-user-basic-auth-vIZuPNTE-py3.11/lib/python3.11/site-packages/chromadb/api/client.py", line 486, in get_tenant
        return self._server.get_tenant(name=name)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/Users/tazarov/Library/Caches/pypoetry/virtualenvs/chromadb-multi-user-basic-auth-vIZuPNTE-py3.11/lib/python3.11/site-packages/chromadb/telemetry/opentelemetry/__init__.py", line 127, in wrapper
        return f(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^
      File "/Users/tazarov/Library/Caches/pypoetry/virtualenvs/chromadb-multi-user-basic-auth-vIZuPNTE-py3.11/lib/python3.11/site-packages/chromadb/api/fastapi.py", line 200, in get_tenant
        raise_chroma_error(resp)
      File "/Users/tazarov/Library/Caches/pypoetry/virtualenvs/chromadb-multi-user-basic-auth-vIZuPNTE-py3.11/lib/python3.11/site-packages/chromadb/api/fastapi.py", line 649, in raise_chroma_error
        raise chroma_error
    chromadb.errors.AuthorizationError: Unauthorized

As expected, we get an authorization error when trying to connect to Chroma. Client initialization validates the tenant and database, both of which are protected endpoints, and that validation raises the exception above.

April 3, 2024