Implementing OpenFGA Authorization Model In Chroma

Source Code

The source code for this article can be found here.

Preparation

To make things useful, we also introduce an initial tuple set with permissions, which will allow us to test the authorization model.

We define three users:

  • admin: member of the chroma team, as owner
  • user1: member of the chroma team, as reader
  • admin-ext: member of the external team, as owner

We will give these three users and their respective teams enough permissions to create and delete collections and to add, remove, get and query records, in the context of their role within the team: an owner has access to all API actions, while a reader can only list, get and query.

Abbreviated Example

We have removed some of the data from the example below for brevity. The full tuple set can be found under data/data/initial-data.json

    [
      {
        "object": "team:chroma",
        "relation": "owner",
        "user": "user:admin"
      },
      {
        "object": "team:chroma",
        "relation": "reader",
        "user": "user:user1"
      },
      {
        "object": "team:external",
        "relation": "owner",
        "user": "user:admin-ext"
      },
      {
        "object": "server:localhost",
        "relation": "can_get_tenant",
        "user": "team:chroma#owner"
      },
      {
        "object": "tenant:default_tenant-default_database",
        "relation": "can_get_database",
        "user": "team:chroma#owner"
      },
      {
        "object": "database:default_tenant-default_database",
        "relation": "can_create_collection",
        "user": "team:chroma#owner"
      },
      {
        "object": "database:default_tenant-default_database",
        "relation": "can_list_collections",
        "user": "team:chroma#owner"
      },
      {
        "object": "database:default_tenant-default_database",
        "relation": "can_get_or_create_collection",
        "user": "team:chroma#owner"
      },
      {
        "object": "database:default_tenant-default_database",
        "relation": "can_count_collections",
        "user": "team:chroma#owner"
      }
    ]
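
To build some intuition for how a tuple such as "team:chroma#owner" grants access to every owner of the chroma team, here is a minimal sketch of a toy resolver in Python. It is not how OpenFGA is implemented: it ignores the authorization model entirely, does not guard against cyclic tuples, and the check function and TUPLES list are hypothetical names used only for illustration.

```python
from typing import Dict, List

# A few tuples copied from the abbreviated set above.
TUPLES: List[Dict[str, str]] = [
    {"object": "team:chroma", "relation": "owner", "user": "user:admin"},
    {"object": "team:chroma", "relation": "reader", "user": "user:user1"},
    {"object": "team:external", "relation": "owner", "user": "user:admin-ext"},
    {"object": "server:localhost", "relation": "can_get_tenant", "user": "team:chroma#owner"},
]


def check(user: str, relation: str, obj: str, tuples: List[Dict[str, str]]) -> bool:
    """Toy check: direct tuples plus userset ("object#relation") indirection.

    Assumption: the tuple set has no cycles; a real engine also consults the model.
    """
    for t in tuples:
        if t["object"] != obj or t["relation"] != relation:
            continue
        if t["user"] == user:
            return True  # direct grant
        if "#" in t["user"]:
            # Userset grant: anyone holding `set_rel` on `set_obj` is included.
            set_obj, set_rel = t["user"].split("#", 1)
            if check(user, set_rel, set_obj, tuples):
                return True
    return False


if __name__ == "__main__":
    print(check("user:admin", "can_get_tenant", "server:localhost", TUPLES))
    print(check("user:user1", "can_get_tenant", "server:localhost", TUPLES))
```

In this sketch admin is allowed because it is an owner of team:chroma, while user1 (a reader) and admin-ext (an owner of a different team) are not.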

Testing the model

Let’s spin up a quick docker compose to test our setup. In the repo we have provided openfga/docker-compose.openfga-standalone.yaml

    docker compose -f openfga/docker-compose.openfga-standalone.yaml up

For this next part ensure you have FGA CLI installed.

Once the containers are up and running let’s create a store and import the model:

    export FGA_API_URL=http://localhost:8082 # our OpenFGA binds to 8082 on localhost
    fga store create --model data/models/model-article-p4.fga --name chromadb-auth

You should see a response like this:

    {
      "store": {
        "created_at": "2024-04-09T18:37:26.367747Z",
        "id": "01HV3VB347NPY3NMX6VQ5N2E23",
        "name": "chromadb-auth",
        "updated_at": "2024-04-09T18:37:26.367747Z"
      },
      "model": {
        "authorization_model_id": "01HV3VB34JAXWF0F3C00DFBZV4"
      }
    }

Let’s import our initial tuple set. Before that make sure to export FGA_STORE_ID and FGA_MODEL_ID as per the output of the previous command:

    export FGA_STORE_ID=01HV3VB347NPY3NMX6VQ5N2E23
    export FGA_MODEL_ID=01HV3VB34JAXWF0F3C00DFBZV4
    fga tuple write --file data/data/initial-data.json

Let’s test our imported model and tuples:

    fga query check user:admin can_get_preflight server:localhost

If everything is working you should see this:

    {
      "allowed": true,
      "resolution": ""
    }

Implementing Authorization Plumbing in Chroma

First we will start by making a few small changes to the authentication plugin we’ve made. Why, you ask? We need to introduce teams (aka groups). For that we’ll resort to a standard Apache groupfile, as follows:

    chroma: admin, user1
    external: admin-ext

The groupfile will be mounted to our Chroma container and read by the multi-user basic auth plugin. The changes to the authentication plugin are as follows:

    # imports as before
    @register_provider("multi_user_htpasswd_file")
    class MultiUserHtpasswdFileServerAuthCredentialsProvider(ServerAuthCredentialsProvider):
        _creds: Dict[str, SecretStr]  # contains user:password-hash

        def __init__(self, system: System) -> None:
            super().__init__(system)
            try:
                self.bc = importlib.import_module("bcrypt")
            except ImportError:
                raise ValueError(
                    "The bcrypt python package is not installed. "
                    "Please install it with `pip install bcrypt`"
                )
            system.settings.require("chroma_server_auth_credentials_file")
            _file = str(system.settings.chroma_server_auth_credentials_file)
            ...  # as before
            # the groupfile is expected next to the htpasswd file
            _basepath = path.dirname(_file)
            self._user_group_map = dict()
            if path.exists(path.join(_basepath, "groupfile")):
                _groups = dict()
                with open(path.join(_basepath, "groupfile"), "r") as f:
                    for line in f:
                        # each line is <groupname>:<user1>,<user2>,...
                        _raw_group = [v for v in line.strip().split(":")]
                        if len(_raw_group) < 2:
                            raise ValueError(
                                "Invalid Htpasswd group file found in "
                                f"[{path.join(_basepath, 'groupfile')}]. "
                                "Must be <groupname>:<username1>,<username2>,...,<usernameN>."
                            )
                        _groups[_raw_group[0]] = [u.strip() for u in _raw_group[1].split(",")]
                # invert to user -> group; the first group listed for a user wins
                for _group, _users in _groups.items():
                    for _user in _users:
                        if _user not in self._user_group_map:
                            self._user_group_map[_user] = _group

        @trace_method(  # type: ignore
            "MultiUserHtpasswdFileServerAuthCredentialsProvider.validate_credentials",
            OpenTelemetryGranularity.ALL,
        )
        @override
        def validate_credentials(self, credentials: AbstractCredentials[T]) -> bool:
            ...  # as before

        @override
        def get_user_identity(
            self, credentials: AbstractCredentials[T]
        ) -> Optional[SimpleUserIdentity]:
            _creds = cast(Dict[str, SecretStr], credentials.get_credentials())
            if _creds["username"].get_secret_value() in self._user_group_map.keys():
                return SimpleUserIdentity(
                    _creds["username"].get_secret_value(),
                    attributes={
                        "team": self._user_group_map[_creds["username"].get_secret_value()]
                    },
                )
            # users without a group fall back to the "public" team
            return SimpleUserIdentity(_creds["username"].get_secret_value(), attributes={"team": "public"})

Full code

The code can be found under chroma_auth/authn/basic/__init__.py

We read the group file and, for each user, create a key in self._user_group_map that specifies the group (team) of that user. This information is returned as user identity attributes, which are then used by the authz plugin.

Now let’s turn our attention to the authorization plugin. First, let’s start with what we’re trying to achieve with it:
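
The groupfile parsing can also be sketched in isolation, which makes it easy to try out without a running Chroma server. This is a simplified stand-in rather than the plugin code: parse_groupfile is a hypothetical helper, and unlike the plugin it skips blank lines (an assumption made here for convenience).

```python
from typing import Dict


def parse_groupfile(text: str) -> Dict[str, str]:
    """Map each user to its group, given '<group>: <user1>, <user2>' lines."""
    user_group: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # assumption: blank lines are ignored in this sketch
        group, sep, users = line.partition(":")
        if not sep:
            raise ValueError(
                "Must be <groupname>:<username1>,<username2>,...,<usernameN>."
            )
        for user in users.split(","):
            user = user.strip()
            if user:
                # The first group listed for a user wins, as in the plugin.
                user_group.setdefault(user, group.strip())
    return user_group


if __name__ == "__main__":
    print(parse_groupfile("chroma: admin, user1\nexternal: admin-ext\n"))
```

With the groupfile from this article, admin and user1 map to chroma and admin-ext maps to external.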

  • Handle OpenFGA configuration from the import of the model as per the snippet above. This will help us to wire all necessary parts of the code with correct authorization model configuration.
  • Map all existing Chroma authorization actions to our authorization model
  • Adapt any shortcomings or quirks in Chroma authorization to the way OpenFGA works
  • Implement the Enforcement Point (EP) logic
  • Implement OpenFGA Permissions API wrapper - this is a utility class that will help us update and keep updating the OpenFGA tuples throughout collections’ lifecycle.

We’ve split the implementation into three files:

  • chroma_auth/authz/openfga/__init__.py - stores our OpenFGA authorization configuration reader and our authorization plugin, which adapts to the Chroma authz model and enforces authorization decisions
  • chroma_auth/authz/openfga/openfga_permissions.py - holds our OpenFGA permissions update logic
  • chroma_auth/instr/__init__.py - holds our adapted FastAPI server from Chroma 0.4.24. While the authz plugin system in Chroma makes it easy to write the enforcement of authorization decisions, the update of permissions does require us to go down this rabbit hole. Don’t worry, the actual changes are minimal

Let’s cover things in a little more detail.

Reading the configuration.

    @register_provider("openfga_config_provider")
    class OpenFGAAuthorizationConfigurationProvider(
        ServerAuthorizationConfigurationProvider[ClientConfiguration]
    ):
        _config_file: str
        _config: ClientConfiguration

        def __init__(self, system: System) -> None:
            super().__init__(system)
            self._settings = system.settings
            if "FGA_API_URL" not in os.environ:
                raise ValueError("FGA_API_URL not set")
            self._config = self._try_load_from_file()
            # TODO in the future we can also add credentials (preshared) or OIDC

        def _try_load_from_file(self) -> ClientConfiguration:
            store_id = None
            model_id = None
            if "FGA_STORE_ID" in os.environ and "FGA_MODEL_ID" in os.environ:
                return ClientConfiguration(
                    api_url=os.environ.get("FGA_API_URL"),
                    store_id=os.environ["FGA_STORE_ID"],
                    authorization_model_id=os.environ["FGA_MODEL_ID"],
                )
            if "FGA_CONFIG_FILE" not in os.environ and not store_id and not model_id:
                raise ValueError("FGA_CONFIG_FILE or FGA_STORE_ID/FGA_MODEL_ID env vars not set")
            with open(os.environ["FGA_CONFIG_FILE"], "r") as f:
                config = json.load(f)
                return ClientConfiguration(
                    api_url=os.environ.get("FGA_API_URL"),
                    store_id=config["store"]["id"],
                    authorization_model_id=config["model"]["authorization_model_id"],
                )

        @override
        def get_configuration(self) -> ClientConfiguration:
            return self._config

This is a pretty simple and straightforward implementation: it either takes env variables for the FGA server URL, store and model, or it takes only the server URL plus a JSON configuration file (the same JSON as the store creation output above).
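
The precedence between the two configuration sources can be sketched without the OpenFGA SDK. The function below is a hypothetical helper (resolve_fga_ids is not part of the repo) that takes the environment as a plain mapping and returns a dict instead of a ClientConfiguration, so it can be exercised in isolation.

```python
import json
from typing import Dict, Mapping


def resolve_fga_ids(env: Mapping[str, str]) -> Dict[str, str]:
    """Resolve API URL, store ID and model ID: env vars first, JSON file second."""
    if "FGA_API_URL" not in env:
        raise ValueError("FGA_API_URL not set")
    # 1. Explicit IDs in the environment take precedence.
    if "FGA_STORE_ID" in env and "FGA_MODEL_ID" in env:
        return {
            "api_url": env["FGA_API_URL"],
            "store_id": env["FGA_STORE_ID"],
            "authorization_model_id": env["FGA_MODEL_ID"],
        }
    # 2. Otherwise fall back to the JSON printed by `fga store create`.
    if "FGA_CONFIG_FILE" not in env:
        raise ValueError("FGA_CONFIG_FILE or FGA_STORE_ID/FGA_MODEL_ID env vars not set")
    with open(env["FGA_CONFIG_FILE"], "r") as f:
        config = json.load(f)
    return {
        "api_url": env["FGA_API_URL"],
        "store_id": config["store"]["id"],
        "authorization_model_id": config["model"]["authorization_model_id"],
    }


if __name__ == "__main__":
    print(resolve_fga_ids({
        "FGA_API_URL": "http://localhost:8082",
        "FGA_STORE_ID": "01HV3VB347NPY3NMX6VQ5N2E23",
        "FGA_MODEL_ID": "01HV3VB34JAXWF0F3C00DFBZV4",
    }))
```

Passing the environment in as an argument, rather than reading os.environ directly, is a choice made for this sketch only; it keeps the function easy to test.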

Next let’s have a look at our OpenFGAAuthorizationProvider implementation. We’ll start with the constructor where we adapt existing Chroma authorization actions to our model:

    def __init__(self, system: System) -> None:
        # more code here, but we're skipping for brevity
        self._authz_to_model_action_map = {
            AuthzResourceActions.CREATE_DATABASE.value: "can_create_database",
            AuthzResourceActions.GET_DATABASE.value: "can_get_database",
            AuthzResourceActions.CREATE_TENANT.value: "can_create_tenant",
            AuthzResourceActions.GET_TENANT.value: "can_get_tenant",
            AuthzResourceActions.LIST_COLLECTIONS.value: "can_list_collections",
            AuthzResourceActions.COUNT_COLLECTIONS.value: "can_count_collections",
            AuthzResourceActions.GET_COLLECTION.value: "can_get_collection",
            AuthzResourceActions.CREATE_COLLECTION.value: "can_create_collection",
            AuthzResourceActions.GET_OR_CREATE_COLLECTION.value: "can_get_or_create_collection",
            AuthzResourceActions.DELETE_COLLECTION.value: "can_delete_collection",
            AuthzResourceActions.UPDATE_COLLECTION.value: "can_update_collection",
            AuthzResourceActions.ADD.value: "can_add_records",
            AuthzResourceActions.DELETE.value: "can_delete_records",
            AuthzResourceActions.GET.value: "can_get_records",
            AuthzResourceActions.QUERY.value: "can_query_records",
            AuthzResourceActions.COUNT.value: "can_count_records",
            AuthzResourceActions.UPDATE.value: "can_update_records",
            AuthzResourceActions.UPSERT.value: "can_upsert_records",
            AuthzResourceActions.RESET.value: "can_reset",
        }
        self._authz_to_model_object_map = {
            AuthzResourceTypes.DB.value: "database",
            AuthzResourceTypes.TENANT.value: "tenant",
            AuthzResourceTypes.COLLECTION.value: "collection",
        }

The above is located in chroma_auth/authz/openfga/__init__.py

The above is a fairly straightforward mapping between the AuthzResourceActions that are part of Chroma’s auth framework and the relations (aka actions) we’ve defined in our model above. Next we also map the AuthzResourceTypes to OpenFGA objects. This seems pretty simple, right? Wrong. Things are not so perfect, and nothing exhibits this more than our next portion, which takes the action and resource and returns the object and relation to be checked:

    def resolve_resource_action(self, resource: AuthzResource, action: AuthzAction) -> tuple:
        attrs = ""
        tenant = None  # no trailing comma here, or this becomes the tuple (None,)
        database = None
        if "tenant" in resource.attributes:
            attrs += f"{resource.attributes['tenant']}"
            tenant = resource.attributes['tenant']
        if "database" in resource.attributes:
            attrs += f"-{resource.attributes['database']}"
            database = resource.attributes['database']
        # tenant-level actions are checked against the server object
        if action.id == AuthzResourceActions.GET_TENANT.value or action.id == AuthzResourceActions.CREATE_TENANT.value:
            return "server:localhost", self._authz_to_model_action_map[action.id]
        # database-level actions are checked against the tenant object
        if action.id == AuthzResourceActions.GET_DATABASE.value or action.id == AuthzResourceActions.CREATE_DATABASE.value:
            return f"tenant:{attrs}", self._authz_to_model_action_map[action.id]
        if action.id == AuthzResourceActions.CREATE_COLLECTION.value:
            try:
                # if the collection already exists, check "get" on it instead of "create"
                cole_exists = self._api.get_collection(
                    resource.id, tenant=tenant, database=database
                )
                return (
                    f"collection:{attrs}-{cole_exists.name}",
                    self._authz_to_model_action_map[AuthzResourceActions.GET_COLLECTION.value],
                )
            except Exception:
                return (
                    f"{self._authz_to_model_object_map[resource.type]}:{attrs}",
                    self._authz_to_model_action_map[action.id],
                )
        if resource.id == "*":
            return f"{self._authz_to_model_object_map[resource.type]}:{attrs}", self._authz_to_model_action_map[action.id]
        else:
            # return object and relation together as one tuple
            return (
                f"{self._authz_to_model_object_map[resource.type]}:{attrs}-{resource.id}",
                self._authz_to_model_action_map[action.id],
            )

Full code

The above is located in chroma_auth/authz/openfga/__init__.py

The resolve_resource_action function demonstrates the idiosyncrasies of Chroma’s auth. I have only myself to blame. The key takeaway is that there is room for improvement.
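
Stripped of the Chroma-specific special cases, the naming scheme for OpenFGA objects boils down to joining tenant, database and (optionally) the resource ID with dashes. A minimal sketch of just that part follows; build_object is a hypothetical helper, and unlike the plugin it never emits a leading dash when the tenant is missing.

```python
from typing import Optional


def build_object(
    resource_type: str,
    tenant: Optional[str],
    database: Optional[str],
    resource_id: str = "*",
) -> str:
    """Build an OpenFGA object name such as 'collection:<tenant>-<database>-<id>'."""
    parts = [p for p in (tenant, database) if p]
    if resource_id != "*":
        # "*" means "any resource of this type", so no ID is appended.
        parts.append(resource_id)
    return f"{resource_type}:{'-'.join(parts)}"


if __name__ == "__main__":
    print(build_object("database", "default_tenant", "default_database"))
    print(build_object("collection", "default_tenant", "default_database", "my_collection"))
```

The first call yields the same database object name used in the initial tuple set (database:default_tenant-default_database).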

The actual authorization enforcement is then dead simple:

    def authorize(self, context: AuthorizationContext) -> bool:
        with OpenFgaClient(self._authz_config_provider.get_configuration()) as fga_client:
            try:
                obj, act = self.resolve_resource_action(resource=context.resource, action=context.action)
                resp = fga_client.check(body=ClientCheckRequest(
                    user=f"user:{context.user.id}",
                    relation=act,
                    object=obj,
                ))
                # openfga_sdk.models.check_response.CheckResponse
                return resp.allowed
            except Exception as e:
                logger.error(f"Error while authorizing: {str(e)}")
                return False
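
One property of the enforcement code worth calling out is that it fails closed: any error while resolving or checking results in a denial. The sketch below isolates that pattern with a plain callable standing in for the OpenFGA client; authorize_fail_closed and the check signature are hypothetical and exist only for illustration.

```python
import logging
from typing import Callable

logger = logging.getLogger(__name__)


def authorize_fail_closed(
    check: Callable[[str, str, str], bool],
    user: str,
    relation: str,
    obj: str,
) -> bool:
    """Return the check result, or False if the check itself fails."""
    try:
        return bool(check(user, relation, obj))
    except Exception as e:
        # Deny on any error: an unreachable authz server must not grant access.
        logger.error("Error while authorizing: %s", e)
        return False


if __name__ == "__main__":
    def broken_check(user: str, relation: str, obj: str) -> bool:
        raise ConnectionError("authz server unreachable")

    print(authorize_fail_closed(broken_check, "user:admin", "can_get_tenant", "server:localhost"))
```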

Finally, we’ll look at our permissions API wrapper. While a full-blown solution would implement all possible object lifecycle hooks, we’re content with collections. Therefore we’ll add lifecycle callbacks for creating and deleting a collection (we’re not considering sharing the collection with other users or changing ownership). So what might our create collection hook look like, you ask?

    def create_collection_permissions(self, collection: Collection, request: Request) -> None:
        if not hasattr(request.state, "user_identity"):
            return
        identity = request.state.user_identity  # AuthzUser
        tenant = request.query_params.get("tenant")
        database = request.query_params.get("database")
        _object = f"collection:{tenant}-{database}-{collection.id}"
        _object_for_get_collection = f"collection:{tenant}-{database}-{collection.name}"  # this is a bug in the Chroma Authz that feeds in the name of the collection instead of ID
        _user = f"team:{identity.get_user_attributes()['team']}#owner" if identity.get_user_attributes() and "team" in identity.get_user_attributes() else f"user:{identity.get_user_id()}"
        _user_writer = f"team:{identity.get_user_attributes()['team']}#writer" if identity.get_user_attributes() and "team" in identity.get_user_attributes() else None
        _user_reader = f"team:{identity.get_user_attributes()['team']}#reader" if identity.get_user_attributes() and "team" in identity.get_user_attributes() else None
        with OpenFgaClient(self._fga_configuration) as fga_client:
            fga_client.write_tuples(
                body=[
                    ClientTuple(_user, "can_add_records", _object),
                    ClientTuple(_user, "can_delete_records", _object),
                    ClientTuple(_user, "can_update_records", _object),
                    ClientTuple(_user, "can_get_records", _object),
                    ClientTuple(_user, "can_upsert_records", _object),
                    ClientTuple(_user, "can_count_records", _object),
                    ClientTuple(_user, "can_query_records", _object),
                    ClientTuple(_user, "can_get_collection", _object_for_get_collection),
                    ClientTuple(_user, "can_delete_collection", _object_for_get_collection),
                    ClientTuple(_user, "can_update_collection", _object),
                ]
            )
            if _user_writer:
                fga_client.write_tuples(
                    body=[
                        ClientTuple(_user_writer, "can_add_records", _object),
                        ClientTuple(_user_writer, "can_delete_records", _object),
                        ClientTuple(_user_writer, "can_update_records", _object),
                        ClientTuple(_user_writer, "can_get_records", _object),
                        ClientTuple(_user_writer, "can_upsert_records", _object),
                        ClientTuple(_user_writer, "can_count_records", _object),
                        ClientTuple(_user_writer, "can_query_records", _object),
                        ClientTuple(_user_writer, "can_get_collection", _object_for_get_collection),
                        ClientTuple(_user_writer, "can_delete_collection", _object_for_get_collection),
                        ClientTuple(_user_writer, "can_update_collection", _object),
                    ]
                )
            if _user_reader:
                fga_client.write_tuples(
                    body=[
                        ClientTuple(_user_reader, "can_get_records", _object),
                        ClientTuple(_user_reader, "can_query_records", _object),
                        ClientTuple(_user_reader, "can_count_records", _object),
                        ClientTuple(_user_reader, "can_get_collection", _object_for_get_collection),
                    ]
                )

Full code

You can find the full code in chroma_auth/authz/openfga/openfga_permissions.py

Looks pretty straightforward, but hold on, I hear a thought creeping into your mind: “Why are you adding roles manually?”

You are right, it lacks that DRY-je-ne-sais-quoi, but I’m happy to keep it simple and explicit. A more mature implementation could read the model, figure out what type we’re adding permissions for, and then add the requisite users for each relation, but premature optimization is difficult to put in an article that won’t turn into a book.
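
For the curious, a less repetitive variant could drive the tuple generation from a role-to-relations table. The following is only a sketch under assumptions: it uses plain (user, relation, object) tuples instead of the SDK's ClientTuple, hardcodes the table rather than reading the model, and the names ROLE_RELATIONS, NAME_KEYED_RELATIONS and collection_tuples are all hypothetical.

```python
from typing import Dict, List, Tuple

_ALL_RELATIONS: List[str] = [
    "can_add_records", "can_delete_records", "can_update_records",
    "can_get_records", "can_upsert_records", "can_count_records",
    "can_query_records", "can_get_collection", "can_delete_collection",
    "can_update_collection",
]

# Which relations each team role receives on a newly created collection.
ROLE_RELATIONS: Dict[str, List[str]] = {
    "owner": _ALL_RELATIONS,
    "writer": _ALL_RELATIONS,
    "reader": ["can_get_records", "can_query_records", "can_count_records", "can_get_collection"],
}

# Relations keyed by collection *name* instead of ID (the quirk noted in the hook).
NAME_KEYED_RELATIONS = {"can_get_collection", "can_delete_collection"}


def collection_tuples(
    team: str, tenant: str, database: str, collection_id: str, collection_name: str
) -> List[Tuple[str, str, str]]:
    """Generate (user, relation, object) tuples for every role of a team."""
    by_id = f"collection:{tenant}-{database}-{collection_id}"
    by_name = f"collection:{tenant}-{database}-{collection_name}"
    tuples: List[Tuple[str, str, str]] = []
    for role, relations in ROLE_RELATIONS.items():
        user = f"team:{team}#{role}"
        for relation in relations:
            obj = by_name if relation in NAME_KEYED_RELATIONS else by_id
            tuples.append((user, relation, obj))
    return tuples


if __name__ == "__main__":
    for t in collection_tuples("chroma", "default_tenant", "default_database", "some-id", "my_collection"):
        print(t)
```

The same list could feed both the create and the delete hook, which would keep the two in sync.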

With the above code we assume that the collection doesn’t exist, ergo its permission tuples don’t exist either. (OpenFGA will fail to add tuples that already exist, and there is no way around it other than deleting them first.) Remember: the permission tuple lifecycle is your responsibility when adding authz to your application.

The delete is oddly similar (that’s why we’ve skipped the bulk of it):

    def delete_collection_permissions(self, collection: Collection, request: Request) -> None:
        if not hasattr(request.state, "user_identity"):
            return
        identity = request.state.user_identity
        _object = f"collection:{collection.tenant}-{collection.database}-{collection.id}"
        _object_for_get_collection = f"collection:{collection.tenant}-{collection.database}-{collection.name}"  # this is a bug in the Chroma Authz that feeds in the name of the collection instead of ID
        _user = f"team:{identity.get_user_attributes()['team']}#owner" if identity.get_user_attributes() and "team" in identity.get_user_attributes() else f"user:{identity.get_user_id()}"
        _user_writer = f"team:{identity.get_user_attributes()['team']}#writer" if identity.get_user_attributes() and "team" in identity.get_user_attributes() else None
        _user_reader = f"team:{identity.get_user_attributes()['team']}#reader" if identity.get_user_attributes() and "team" in identity.get_user_attributes() else None
        with OpenFgaClient(self._fga_configuration) as fga_client:
            fga_client.delete_tuples(
                body=[
                    ClientTuple(_user, "can_add_records", _object),
                    ClientTuple(_user, "can_delete_records", _object),
                    ClientTuple(_user, "can_update_records", _object),
                    ClientTuple(_user, "can_get_records", _object),
                    ClientTuple(_user, "can_upsert_records", _object),
                    ClientTuple(_user, "can_count_records", _object),
                    ClientTuple(_user, "can_query_records", _object),
                    ClientTuple(_user, "can_get_collection", _object_for_get_collection),
                    ClientTuple(_user, "can_delete_collection", _object_for_get_collection),
                    ClientTuple(_user, "can_update_collection", _object),
                ]
            )
            # more code in the repo

Full code

You can find the full code in chroma_auth/authz/openfga/openfga_permissions.py

Let’s turn our attention to the last piece of code: the necessary evil of updating the FastAPI server in Chroma to add our Permissions API hooks. We start simple by injecting our component using Chroma’s DI (dependency injection).

    from chroma_auth.authz.openfga.openfga_permissions import OpenFGAPermissionsAPI

    self._permissionsApi: OpenFGAPermissionsAPI = self._system.instance(OpenFGAPermissionsAPI)

Then we add a hook for collection creation:

    def create_collection(
        self,
        request: Request,
        collection: CreateCollection,
        tenant: str = DEFAULT_TENANT,
        database: str = DEFAULT_DATABASE,
    ) -> Collection:
        existing = None
        try:
            existing = self._api.get_collection(collection.name, tenant=tenant, database=database)
        except ValueError as e:
            if "does not exist" not in str(e):
                raise e
        collection = self._api.create_collection(
            name=collection.name,
            metadata=collection.metadata,
            get_or_create=collection.get_or_create,
            tenant=tenant,
            database=database,
        )
        if not existing:
            self._permissionsApi.create_collection_permissions(collection=collection, request=request)
        return collection

Full code

You can find the full code in chroma_auth/instr/__init__.py

And one for collection removal:

    def delete_collection(
        self,
        request: Request,
        collection_name: str,
        tenant: str = DEFAULT_TENANT,
        database: str = DEFAULT_DATABASE,
    ) -> None:
        collection = self._api.get_collection(collection_name, tenant=tenant, database=database)
        resp = self._api.delete_collection(
            collection_name, tenant=tenant, database=database
        )
        self._permissionsApi.delete_collection_permissions(collection=collection, request=request)
        return resp

Full code

You can find the full code in chroma_auth/instr/__init__.py

The key thing to observe about the above snippets is that we invoke the permissions API only when we’re sure things have been persisted in the DB. I know, I know, atomicity here is also important, but that is for another article. Just keep in mind that it is easier to fix broken permissions than broken data.
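
The ordering can be sketched generically as "persist first, grant second". The helper below is hypothetical and goes one step further than the hooks in this article: it assumes you would rather log a failed permission write for later repair than fail the whole request, which is a design choice for illustration, not what the repo does.

```python
import logging
from typing import Callable, TypeVar

logger = logging.getLogger(__name__)
T = TypeVar("T")


def persist_then_grant(persist: Callable[[], T], grant: Callable[[T], None]) -> T:
    """Persist the data first; only then write the permission tuples."""
    result = persist()  # if this raises, no permission tuples are ever written
    try:
        grant(result)
    except Exception:
        # Data is saved but permissions are missing: log it so it can be repaired.
        logger.exception("Permission write failed for %r", result)
    return result


if __name__ == "__main__":
    events = []
    persist_then_grant(
        lambda: events.append("persisted") or "collection-1",
        lambda col: events.append(f"granted:{col}"),
    )
    print(events)
```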

I promise this was the last bit of python code you’ll see in this article.

The Infra

Infrastructure!!! Finally, a sigh of relief.

Let’s draw a diagram:

[Figure: architecture diagram of the Chroma server, OpenFGA and PostgreSQL]

We have our Chroma server, which relies on OpenFGA, which in turn persists data in PostgreSQL. “Ok, but …”, I can see you scratch your head, “… how do I bring this magnificent architecture to life?”. I thought you’d never ask. We’ll rely on our trusty docker compose skills with the following sequence in mind:

[Figure: startup sequence diagram]

“Where is the docker-compose.yaml!”. Voilà, my impatient friends:

    version: '3.9'
    networks:
      net:
        driver: bridge
    services:
      server:
        depends_on:
          openfga:
            condition: service_healthy
          import:
            condition: service_completed_successfully
        image: chroma-server
        build:
          dockerfile: Dockerfile
        volumes:
          - ./chroma-data:/chroma/chroma
          - ./server.htpasswd:/chroma/server.htpasswd
          - ./groupfile:/chroma/groupfile
          - ./data/:/data
        command: "--workers 1 --host 0.0.0.0 --port 8000 --proxy-headers --log-config chromadb/log_config.yml --timeout-keep-alive 30"
        environment:
          - IS_PERSISTENT=TRUE
          - CHROMA_SERVER_AUTH_PROVIDER=${CHROMA_SERVER_AUTH_PROVIDER}
          - CHROMA_SERVER_AUTH_CREDENTIALS_FILE=${CHROMA_SERVER_AUTH_CREDENTIALS_FILE}
          - CHROMA_SERVER_AUTH_CREDENTIALS=${CHROMA_SERVER_AUTH_CREDENTIALS}
          - CHROMA_SERVER_AUTH_CREDENTIALS_PROVIDER=${CHROMA_SERVER_AUTH_CREDENTIALS_PROVIDER}
          - CHROMA_SERVER_AUTH_TOKEN_TRANSPORT_HEADER=${CHROMA_SERVER_AUTH_TOKEN_TRANSPORT_HEADER}
          - PERSIST_DIRECTORY=${PERSIST_DIRECTORY:-/chroma/chroma}
          - CHROMA_OTEL_EXPORTER_ENDPOINT=${CHROMA_OTEL_EXPORTER_ENDPOINT}
          - CHROMA_OTEL_EXPORTER_HEADERS=${CHROMA_OTEL_EXPORTER_HEADERS}
          - CHROMA_OTEL_SERVICE_NAME=${CHROMA_OTEL_SERVICE_NAME}
          - CHROMA_OTEL_GRANULARITY=${CHROMA_OTEL_GRANULARITY}
          - CHROMA_SERVER_NOFILE=${CHROMA_SERVER_NOFILE}
          - CHROMA_SERVER_AUTHZ_PROVIDER=${CHROMA_SERVER_AUTHZ_PROVIDER}
          - CHROMA_SERVER_AUTHZ_CONFIG_PROVIDER=${CHROMA_SERVER_AUTHZ_CONFIG_PROVIDER}
          - FGA_API_URL=http://openfga:8080
          - FGA_CONFIG_FILE=/data/store.json # we expect that the import job will create this file
        restart: unless-stopped # possible values are: "no", "always", "on-failure", "unless-stopped"
        ports:
          - "8000:8000"
        healthcheck:
          # Adjust below to match your container port
          test: [ "CMD", "curl", "-f", "http://localhost:8000/api/v1/heartbeat" ]
          interval: 30s
          timeout: 10s
          retries: 3
        networks:
          - net
      postgres:
        image: postgres:14
        container_name: postgres
        networks:
          - net
        ports:
          - "5432:5432"
        environment:
          - POSTGRES_USER=postgres
          - POSTGRES_PASSWORD=password
        healthcheck:
          test: [ "CMD-SHELL", "pg_isready -U postgres" ]
          interval: 5s
          timeout: 5s
          retries: 5
        volumes:
          - postgres_data_openfga:/var/lib/postgresql/data
      migrate:
        depends_on:
          postgres:
            condition: service_healthy
        image: openfga/openfga:latest
        container_name: migrate
        command: migrate
        environment:
          - OPENFGA_DATASTORE_ENGINE=postgres
          - OPENFGA_DATASTORE_URI=postgres://postgres:password@postgres:5432/postgres?sslmode=disable
        networks:
          - net
      openfga:
        depends_on:
          migrate:
            condition: service_completed_successfully
        image: openfga/openfga:latest
        container_name: openfga
        environment:
          - OPENFGA_DATASTORE_ENGINE=postgres
          - OPENFGA_DATASTORE_URI=postgres://postgres:password@postgres:5432/postgres?sslmode=disable
          - OPENFGA_LOG_FORMAT=json
        command: run
        networks:
          - net
        ports:
          # Needed for the http server
          - "8082:8080"
          # Needed for the grpc server (if used)
          - "8083:8081"
          # Needed for the playground (Do not enable in prod!)
          - "3003:3000"
        healthcheck:
          test: [ "CMD", "/usr/local/bin/grpc_health_probe", "-addr=openfga:8081" ]
          interval: 5s
          timeout: 30s
          retries: 3
      import:
        depends_on:
          openfga:
            condition: service_healthy
        image: fga-cli
        build:
          context: .
          dockerfile: Dockerfile-fgacli
        container_name: import
        volumes:
          - ./data/:/data
        command: |
          /bin/sh -c "/data/create_store_and_import.sh"
        environment:
          - FGA_SERVER_URL=http://openfga:8080
        networks:
          - net
    volumes:
      postgres_data_openfga:
        driver: local

Don’t forget to create an .env file:

    CHROMA_SERVER_AUTH_PROVIDER="chromadb.auth.basic.BasicAuthServerProvider"
    CHROMA_SERVER_AUTH_CREDENTIALS_FILE="server.htpasswd"
    CHROMA_SERVER_AUTH_CREDENTIALS_PROVIDER="chroma_auth.authn.basic.MultiUserHtpasswdFileServerAuthCredentialsProvider"
    CHROMA_SERVER_AUTHZ_PROVIDER="chroma_auth.authz.openfga.OpenFGAAuthorizationProvider"
    CHROMA_SERVER_AUTHZ_CONFIG_PROVIDER="chroma_auth.authz.openfga.OpenFGAAuthorizationConfigurationProvider"

Update your server.htpasswd to include the new user:

    admin:$2y$05$vkBK4b1Vk5O98jNHgr.uduTJsTOfM395sKEKe48EkJCVPH/MBIeHK
    user1:$2y$05$UQ0kC2x3T2XgeN4WU12BdekUwCJmLjJNhMaMtFNolYdj83OqiEpVu
    admin-ext:$2y$05$9.L13wKQTHeXz9IH2UO2RurWEK./Z24qapzyi6ywQGJds2DaC36C2

And the groupfile from before. And don’t forget to take a look at the import script under data/create_store_and_import.sh

Run the following command at the root of the repo and let things fail and burn down (or, in the event that this works, good for you; disclaimer: it worked on my machine):

    docker compose up --build

Tests, who needs tests when you have stable infra!

Authorization is serious stuff, which is why we’ve created a bare minimum set of tests to prove we’re not totally wrong about it!

Real Serious Note

Serious Note: Take these things seriously and write copious amounts of tests before rolling things out to prod. Don’t become an OWASP Top 10 “Hero”. Broken access control is a thing that WILL keep you up at night.

We’ll focus on three areas:

  • Testing admin (owner) access
  • Testing team access for owner and reader roles
  • Testing cross team permissions

Admin Access

Simple check to ensure that whoever created the collection (aka the owner) is allowed all actions.

    import uuid
    import chromadb
    from chromadb.config import Settings

    client = chromadb.HttpClient(
        settings=Settings(chroma_client_auth_provider="chromadb.auth.basic.BasicAuthClientProvider",
                          chroma_client_auth_credentials="admin:password123"))
    client.heartbeat()  # this should work with or without authentication - it is a public endpoint
    client.list_collections()  # this is a protected endpoint and requires authentication
    col = client.get_or_create_collection(f"test_collection-{str(uuid.uuid4())}")
    col.add(ids=["1"], documents=["test doc"])
    col.get()
    col.update(ids=["1"], documents=["test doc 2"])
    col.count()
    col.upsert(ids=["1"], documents=["test doc 3"])
    col.delete(ids=["1"])
    client.delete_collection(col.name)

Full code

You can find the full code in test_auth.ipynb

Team Access

Team access tests whether roles and permissions associated with those roles are correctly enforced.

    import uuid
    import chromadb
    from chromadb.config import Settings

    # admin (owner) creates the collection and adds a record
    client = chromadb.HttpClient(
        settings=Settings(chroma_client_auth_provider="chromadb.auth.basic.BasicAuthClientProvider",
                          chroma_client_auth_credentials="admin:password123"))
    client.heartbeat()  # this should work with or without authentication - it is a public endpoint
    client.list_collections()  # this is a protected endpoint and requires authentication
    col_name = f"test_collection-{str(uuid.uuid4())}"
    col = client.get_or_create_collection(col_name)
    print(f"Creating collection {col.id}")
    col.add(ids=["1"], documents=["test doc"])
    client.get_collection(col_name)

    # user1 (reader) can read but not delete
    client = chromadb.HttpClient(
        settings=Settings(chroma_client_auth_provider="chromadb.auth.basic.BasicAuthClientProvider",
                          chroma_client_auth_credentials="user1:password123"))
    client.heartbeat()  # this should work with or without authentication - it is a public endpoint
    client.list_collections()  # this is a protected endpoint and requires authentication
    client.count_collections()
    print("Getting collection " + col_name)
    col = client.get_collection(col_name)
    col.get()
    col.count()
    try:
        client.delete_collection(col_name)
    except Exception as e:
        print(e)  # expect unauthorized error

    # admin cleans up
    client = chromadb.HttpClient(
        settings=Settings(chroma_client_auth_provider="chromadb.auth.basic.BasicAuthClientProvider",
                          chroma_client_auth_credentials="admin:password123"))
    client.delete_collection(col_name)

Full code

You can find the full code in test_auth.ipynb

Cross-team access

In the cross-team access scenario we’ll create a collection with one team’s owner (admin) and try to access it (aka delete it) with another team’s owner, in a very mano-a-mano (owner-to-owner) way. It is important to observe that all these collections are created within the same database (default_database).

    import uuid
    import chromadb
    from chromadb.config import Settings

    col_name = f"test_collection-{str(uuid.uuid4())}"
    # admin (owner in team chroma) creates a collection
    client = chromadb.HttpClient(
        settings=Settings(chroma_client_auth_provider="chromadb.auth.basic.BasicAuthClientProvider",
                          chroma_client_auth_credentials="admin:password123"))
    client.get_or_create_collection(col_name)

    # admin-ext (owner in team external) creates its own and tries to delete admin's
    client = chromadb.HttpClient(
        settings=Settings(chroma_client_auth_provider="chromadb.auth.basic.BasicAuthClientProvider",
                          chroma_client_auth_credentials="admin-ext:password123"))
    client.get_or_create_collection("external-collection")
    try:
        client.delete_collection(col_name)
    except Exception as e:
        print("Expected error for admin-ext: ", str(e))  # expect unauthorized error

    # admin deletes its own collection and tries to delete admin-ext's
    client = chromadb.HttpClient(
        settings=Settings(chroma_client_auth_provider="chromadb.auth.basic.BasicAuthClientProvider",
                          chroma_client_auth_credentials="admin:password123"))
    client.delete_collection(col_name)
    try:
        client.delete_collection("external-collection")
    except Exception as e:
        print("Expected error for admin: ", str(e))  # expect unauthorized error

Full code

You can find the full code in test_auth.ipynb
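
One caveat with the try/except pattern used in these tests: if the call is unexpectedly allowed, nothing is printed and the test passes silently. A small helper can turn "expected to be denied" into a hard failure. This is a sketch with a hypothetical name (expect_denied), and it assumes the client raises some exception on an unauthorized call, without distinguishing it from other errors.

```python
from typing import Callable


def expect_denied(action: Callable[[], object], label: str) -> None:
    """Run `action` and fail loudly if it does NOT raise."""
    try:
        action()
    except Exception as e:
        # Assumption: any exception counts as a denial in this sketch.
        print(f"Expected error for {label}: {e}")
        return
    raise AssertionError(f"{label} was allowed but should have been denied")


if __name__ == "__main__":
    def denied_call() -> None:
        raise PermissionError("Unauthorized")

    expect_denied(denied_call, "admin-ext")
```

In the notebook it could wrap a call like client.delete_collection(col_name) in a lambda.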

April 15, 2024