Client-Side Field Level Encryption

New in MongoDB 4.2, client-side field level encryption allows an application to encrypt specific data fields in addition to pre-existing MongoDB encryption features such as Encryption at Rest and TLS/SSL (Transport Encryption).

With field level encryption, applications can encrypt fields in documents prior to transmitting data over the wire to the server. Client-side field level encryption supports workloads where applications must guarantee that unauthorized parties, including server administrators, cannot read the encrypted data.

See also

The MongoDB documentation on Client Side Field Level Encryption.

Dependencies

To get started using client-side field level encryption in your project, you will need to install the pymongocrypt library as well as the driver itself. Install both the driver and a compatible version of pymongocrypt like this:

  $ python -m pip install 'pymongo[encryption]'

Note that installing on Linux requires pip 19 or later for manylinux2010 wheel support. For more information about installing pymongocrypt see the installation instructions on the project’s PyPI page.
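After installing, it can be useful to confirm that both packages are importable before configuring encryption. The following is a minimal sketch using only the standard library; the helper name missing_modules is hypothetical and not part of PyMongo.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found."""
    return [name for name in names
            if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # pymongocrypt is installed by the 'pymongo[encryption]' extra.
    missing = missing_modules(["pymongo", "pymongocrypt"])
    if missing:
        print("Missing modules: %s" % ", ".join(missing))
    else:
        print("Encryption dependencies are importable.")
```

This only checks that the modules can be located; it does not verify that the installed versions are compatible with each other.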

mongocryptd

The mongocryptd binary is required for automatic client-side encryption and is included as a component in the MongoDB Enterprise Server package. For detailed installation instructions see the MongoDB documentation on mongocryptd.

mongocryptd performs the following:

  • Parses the automatic encryption rules specified to the database connection. If the JSON schema contains invalid automatic encryption syntax or any document validation syntax, mongocryptd returns an error.

  • Uses the specified automatic encryption rules to mark fields in read and write operations for encryption.

  • Rejects read/write operations that may return unexpected or incorrect results when applied to an encrypted field. For supported and unsupported operations, see Read/Write Support with Automatic Field Level Encryption.

A MongoClient configured with auto encryption will automatically spawn the mongocryptd process from the application’s PATH. Applications can control the spawning behavior as part of the automatic encryption options. For example, to set the path to the mongocryptd process:

  auto_encryption_opts = AutoEncryptionOpts(
      ...,
      mongocryptd_spawn_path='/path/to/mongocryptd')

To control the logging output of mongocryptd, pass options using mongocryptd_spawn_args:

  auto_encryption_opts = AutoEncryptionOpts(
      ...,
      mongocryptd_spawn_args=['--logpath=/path/to/mongocryptd.log', '--logappend'])

If your application wishes to manage the mongocryptd process manually, it is possible to disable spawning mongocryptd:

  auto_encryption_opts = AutoEncryptionOpts(
      ...,
      mongocryptd_bypass_spawn=True,
      # URI of the local mongocryptd process.
      mongocryptd_uri='mongodb://localhost:27020')

mongocryptd is only responsible for supporting automatic client-side field level encryption and does not itself perform any encryption or decryption.
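The spawning options described in this section can be gathered in one place. The sketch below builds a dictionary of keyword arguments that could be passed on to AutoEncryptionOpts; the helper names mongocryptd_options and mongocryptd_on_path, and their parameters, are hypothetical. Only the option names already shown above (mongocryptd_spawn_path, mongocryptd_spawn_args, mongocryptd_bypass_spawn, mongocryptd_uri) come from the driver.

```python
import shutil
from typing import Any, Dict, Optional


def mongocryptd_options(spawn_path: Optional[str] = None,
                        log_path: Optional[str] = None,
                        external_uri: Optional[str] = None) -> Dict[str, Any]:
    """Build mongocryptd-related keyword arguments (illustrative helper)."""
    if external_uri is not None:
        # The application manages mongocryptd itself, so spawning is
        # disabled and the driver is pointed at the running process.
        return {"mongocryptd_bypass_spawn": True,
                "mongocryptd_uri": external_uri}

    opts: Dict[str, Any] = {}
    if spawn_path is not None:
        opts["mongocryptd_spawn_path"] = spawn_path
    if log_path is not None:
        opts["mongocryptd_spawn_args"] = [
            "--logpath=" + log_path, "--logappend"]
    return opts


def mongocryptd_on_path() -> bool:
    """Report whether a mongocryptd executable is found on PATH."""
    return shutil.which("mongocryptd") is not None


if __name__ == "__main__":
    print(mongocryptd_options(log_path="/path/to/mongocryptd.log"))
    print("mongocryptd on PATH: %s" % mongocryptd_on_path())
```

The resulting dictionary would be unpacked alongside the required arguments, for example AutoEncryptionOpts(kms_providers, key_vault_namespace, **opts).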

Automatic Client-Side Field Level Encryption

Automatic client-side field level encryption is enabled by creating a MongoClient with the auto_encryption_opts option set to an instance of AutoEncryptionOpts. The following examples show how to set up automatic client-side field level encryption using ClientEncryption to create a new encryption data key.

Note

Automatic client-side field level encryption requires MongoDB 4.2 Enterprise or a MongoDB 4.2 Atlas cluster. The community version of the server supports automatic decryption as well as Explicit Encryption.

Providing Local Automatic Encryption Rules

The following example shows how to specify automatic encryption rules via the schema_map option. The automatic encryption rules are expressed using a strict subset of the JSON Schema syntax.

Supplying a schema_map provides more security than relying on JSON Schemas obtained from the server. It protects against a malicious server advertising a false JSON Schema, which could trick the client into sending unencrypted data that should be encrypted.

JSON Schemas supplied in the schema_map apply only to configuring automatic client-side field level encryption. Other validation rules in the JSON schema will not be enforced by the driver and will result in an error.

  import os

  from bson.codec_options import CodecOptions
  from bson import json_util
  from pymongo import MongoClient
  from pymongo.encryption import (Algorithm,
                                  ClientEncryption)
  from pymongo.encryption_options import AutoEncryptionOpts


  def create_json_schema_file(kms_providers, key_vault_namespace,
                              key_vault_client):
      client_encryption = ClientEncryption(
          kms_providers,
          key_vault_namespace,
          key_vault_client,
          # The CodecOptions class used for encrypting and decrypting.
          # This should be the same CodecOptions instance you have configured
          # on MongoClient, Database, or Collection. We will not be calling
          # encrypt() or decrypt() in this example so we can use any
          # CodecOptions.
          CodecOptions())

      # Create a new data key and json schema for the encryptedField.
      # https://dochub.mongodb.org/core/client-side-field-level-encryption-automatic-encryption-rules
      data_key_id = client_encryption.create_data_key(
          'local', key_alt_names=['pymongo_encryption_example_1'])
      schema = {
          "properties": {
              "encryptedField": {
                  "encrypt": {
                      "keyId": [data_key_id],
                      "bsonType": "string",
                      "algorithm":
                          Algorithm.AEAD_AES_256_CBC_HMAC_SHA_512_Deterministic
                  }
              }
          },
          "bsonType": "object"
      }
      # Use CANONICAL_JSON_OPTIONS so that other drivers and tools will be
      # able to parse the MongoDB extended JSON file.
      json_schema_string = json_util.dumps(
          schema, json_options=json_util.CANONICAL_JSON_OPTIONS)

      with open('jsonSchema.json', 'w') as file:
          file.write(json_schema_string)


  def main():
      # The MongoDB namespace (db.collection) used to store the
      # encrypted documents in this example.
      encrypted_namespace = "test.coll"

      # This must be the same master key that was used to create
      # the encryption key.
      local_master_key = os.urandom(96)
      kms_providers = {"local": {"key": local_master_key}}

      # The MongoDB namespace (db.collection) used to store
      # the encryption data keys.
      key_vault_namespace = "encryption.__pymongoTestKeyVault"
      key_vault_db_name, key_vault_coll_name = key_vault_namespace.split(".", 1)

      # The MongoClient used to access the key vault (key_vault_namespace).
      key_vault_client = MongoClient()
      key_vault = key_vault_client[key_vault_db_name][key_vault_coll_name]
      # Ensure that two data keys cannot share the same keyAltName.
      key_vault.drop()
      key_vault.create_index(
          "keyAltNames",
          unique=True,
          partialFilterExpression={"keyAltNames": {"$exists": True}})

      create_json_schema_file(
          kms_providers, key_vault_namespace, key_vault_client)

      # Load the JSON Schema and construct the local schema_map option.
      with open('jsonSchema.json', 'r') as file:
          json_schema_string = file.read()
      json_schema = json_util.loads(json_schema_string)
      schema_map = {encrypted_namespace: json_schema}

      auto_encryption_opts = AutoEncryptionOpts(
          kms_providers, key_vault_namespace, schema_map=schema_map)

      client = MongoClient(auto_encryption_opts=auto_encryption_opts)
      db_name, coll_name = encrypted_namespace.split(".", 1)
      coll = client[db_name][coll_name]
      # Clear old data
      coll.drop()

      coll.insert_one({"encryptedField": "123456789"})
      print('Decrypted document: %s' % (coll.find_one(),))
      unencrypted_coll = MongoClient()[db_name][coll_name]
      print('Encrypted document: %s' % (unencrypted_coll.find_one(),))


  if __name__ == "__main__":
      main()
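The schema in the example above covers a single field. When several fields share one data key, the same structure can be generated by a small helper. The following is a minimal sketch using only the standard library: the helper name build_encrypt_schema and the field name "ssn" are hypothetical, a random UUID stands in for the key id returned by create_data_key(), and the algorithm string is assumed to be the value behind Algorithm.AEAD_AES_256_CBC_HMAC_SHA_512_Deterministic. Real code should prefer the constant and a real data key id.

```python
import pprint
import uuid
from typing import Any, Dict

# Assumed string value of
# pymongo.encryption.Algorithm.AEAD_AES_256_CBC_HMAC_SHA_512_Deterministic.
DETERMINISTIC = "AEAD_AES_256_CBC_HMAC_SHA_512-Deterministic"


def build_encrypt_schema(key_id: Any, fields: Dict[str, str],
                         algorithm: str = DETERMINISTIC) -> Dict[str, Any]:
    """Build automatic encryption rules for the given fields.

    ``fields`` maps each field name to its BSON type, for example
    {"encryptedField": "string"}.
    """
    properties = {
        name: {
            "encrypt": {
                "keyId": [key_id],
                "bsonType": bson_type,
                "algorithm": algorithm,
            }
        }
        for name, bson_type in fields.items()
    }
    return {"bsonType": "object", "properties": properties}


if __name__ == "__main__":
    # Placeholder only: a real key id comes from create_data_key().
    placeholder_key_id = uuid.uuid4()
    schema = build_encrypt_schema(
        placeholder_key_id,
        {"encryptedField": "string", "ssn": "string"})
    pprint.pprint(schema)
```

The returned dictionary has the same shape as the schema variable in the example, so it could be used as a value in schema_map under the target namespace.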

Server-Side Field Level Encryption Enforcement

The MongoDB 4.2 server supports using schema validation to enforce encryption of specific fields in a collection. This schema validation will prevent an application from inserting unencrypted values for any fields marked with the "encrypt" JSON schema keyword.

The following example shows how to set up automatic client-side field level encryption using ClientEncryption to create a new encryption data key and create a collection with the Automatic Encryption JSON Schema Syntax:

  import os

  from bson.codec_options import CodecOptions
  from bson.binary import STANDARD
  from pymongo import MongoClient
  from pymongo.encryption import (Algorithm,
                                  ClientEncryption)
  from pymongo.encryption_options import AutoEncryptionOpts
  from pymongo.errors import OperationFailure
  from pymongo.write_concern import WriteConcern


  def main():
      # The MongoDB namespace (db.collection) used to store the
      # encrypted documents in this example.
      encrypted_namespace = "test.coll"

      # This must be the same master key that was used to create
      # the encryption key.
      local_master_key = os.urandom(96)
      kms_providers = {"local": {"key": local_master_key}}

      # The MongoDB namespace (db.collection) used to store
      # the encryption data keys.
      key_vault_namespace = "encryption.__pymongoTestKeyVault"
      key_vault_db_name, key_vault_coll_name = key_vault_namespace.split(".", 1)

      # The MongoClient used to access the key vault (key_vault_namespace).
      key_vault_client = MongoClient()
      key_vault = key_vault_client[key_vault_db_name][key_vault_coll_name]
      # Ensure that two data keys cannot share the same keyAltName.
      key_vault.drop()
      key_vault.create_index(
          "keyAltNames",
          unique=True,
          partialFilterExpression={"keyAltNames": {"$exists": True}})

      client_encryption = ClientEncryption(
          kms_providers,
          key_vault_namespace,
          key_vault_client,
          # The CodecOptions class used for encrypting and decrypting.
          # This should be the same CodecOptions instance you have configured
          # on MongoClient, Database, or Collection. We will not be calling
          # encrypt() or decrypt() in this example so we can use any
          # CodecOptions.
          CodecOptions())

      # Create a new data key and json schema for the encryptedField.
      data_key_id = client_encryption.create_data_key(
          'local', key_alt_names=['pymongo_encryption_example_2'])
      json_schema = {
          "properties": {
              "encryptedField": {
                  "encrypt": {
                      "keyId": [data_key_id],
                      "bsonType": "string",
                      "algorithm":
                          Algorithm.AEAD_AES_256_CBC_HMAC_SHA_512_Deterministic
                  }
              }
          },
          "bsonType": "object"
      }

      auto_encryption_opts = AutoEncryptionOpts(
          kms_providers, key_vault_namespace)
      client = MongoClient(auto_encryption_opts=auto_encryption_opts)

      db_name, coll_name = encrypted_namespace.split(".", 1)
      db = client[db_name]
      # Clear old data
      db.drop_collection(coll_name)
      # Create the collection with the encryption JSON Schema.
      db.create_collection(
          coll_name,
          # uuid_representation=STANDARD is required to ensure that any
          # UUIDs in the $jsonSchema document are encoded to BSON Binary
          # with the standard UUID subtype 4. This is only needed when
          # running the "create" collection command with an encryption
          # JSON Schema.
          codec_options=CodecOptions(uuid_representation=STANDARD),
          write_concern=WriteConcern(w="majority"),
          validator={"$jsonSchema": json_schema})
      coll = client[db_name][coll_name]

      coll.insert_one({"encryptedField": "123456789"})
      print('Decrypted document: %s' % (coll.find_one(),))
      unencrypted_coll = MongoClient()[db_name][coll_name]
      print('Encrypted document: %s' % (unencrypted_coll.find_one(),))
      try:
          unencrypted_coll.insert_one({"encryptedField": "123456789"})
      except OperationFailure as exc:
          print('Unencrypted insert failed: %s' % (exc.details,))


  if __name__ == "__main__":
      main()

Explicit Encryption

Explicit encryption is a MongoDB community feature and does not use the mongocryptd process. Explicit encryption is provided by the ClientEncryption class, for example:

  import os

  from pymongo import MongoClient
  from pymongo.encryption import (Algorithm,
                                  ClientEncryption)


  def main():
      # This must be the same master key that was used to create
      # the encryption key.
      local_master_key = os.urandom(96)
      kms_providers = {"local": {"key": local_master_key}}

      # The MongoDB namespace (db.collection) used to store
      # the encryption data keys.
      key_vault_namespace = "encryption.__pymongoTestKeyVault"
      key_vault_db_name, key_vault_coll_name = key_vault_namespace.split(".", 1)

      # The MongoClient used to read/write application data.
      client = MongoClient()
      coll = client.test.coll
      # Clear old data
      coll.drop()

      # Set up the key vault (key_vault_namespace) for this example.
      key_vault = client[key_vault_db_name][key_vault_coll_name]
      # Ensure that two data keys cannot share the same keyAltName.
      key_vault.drop()
      key_vault.create_index(
          "keyAltNames",
          unique=True,
          partialFilterExpression={"keyAltNames": {"$exists": True}})

      client_encryption = ClientEncryption(
          kms_providers,
          key_vault_namespace,
          # The MongoClient to use for reading/writing to the key vault.
          # This can be the same MongoClient used by the main application.
          client,
          # The CodecOptions class used for encrypting and decrypting.
          # This should be the same CodecOptions instance you have configured
          # on MongoClient, Database, or Collection.
          coll.codec_options)

      # Create a new data key for the encryptedField.
      data_key_id = client_encryption.create_data_key(
          'local', key_alt_names=['pymongo_encryption_example_3'])

      # Explicitly encrypt a field:
      encrypted_field = client_encryption.encrypt(
          "123456789",
          Algorithm.AEAD_AES_256_CBC_HMAC_SHA_512_Deterministic,
          key_id=data_key_id)
      coll.insert_one({"encryptedField": encrypted_field})
      doc = coll.find_one()
      print('Encrypted document: %s' % (doc,))

      # Explicitly decrypt the field:
      doc["encryptedField"] = client_encryption.decrypt(doc["encryptedField"])
      print('Decrypted document: %s' % (doc,))

      # Cleanup resources.
      client_encryption.close()
      client.close()


  if __name__ == "__main__":
      main()
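The examples on this page generate a fresh local master key with os.urandom(96) on every run, so data keys created in one run cannot be used in a later one. The sketch below shows one way to keep the same 96-byte key between runs by storing it in a file. It uses only the standard library; the helper name load_or_create_master_key and the file name are hypothetical, and a key file on local disk is an assumption suitable for development only, not a recommendation for production key management.

```python
import os

# The local master key used throughout this page is 96 bytes long.
MASTER_KEY_SIZE = 96


def load_or_create_master_key(path: str) -> bytes:
    """Return the master key stored at ``path``, creating it if absent."""
    try:
        with open(path, "rb") as key_file:
            key = key_file.read()
    except FileNotFoundError:
        key = os.urandom(MASTER_KEY_SIZE)
        # O_EXCL avoids overwriting a key created concurrently, and
        # mode 0o600 restricts the file to the current user (on POSIX).
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
        with os.fdopen(fd, "wb") as key_file:
            key_file.write(key)
        return key
    if len(key) != MASTER_KEY_SIZE:
        raise ValueError(
            "expected a %d-byte master key, found %d bytes"
            % (MASTER_KEY_SIZE, len(key)))
    return key


if __name__ == "__main__":
    # Hypothetical file name, chosen for illustration only.
    local_master_key = load_or_create_master_key("master-key.bin")
    kms_providers = {"local": {"key": local_master_key}}
    print("Loaded a %d-byte local master key." % len(local_master_key))
```

With a persistent key, the calls to key_vault.drop() in the examples would also need to be removed for previously created data keys to remain usable.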

Explicit Encryption with Automatic Decryption

Although automatic encryption requires MongoDB 4.2 Enterprise or a MongoDB 4.2 Atlas cluster, automatic decryption is supported for all users. To configure automatic decryption without automatic encryption, set bypass_auto_encryption=True in AutoEncryptionOpts:

  import os

  from pymongo import MongoClient
  from pymongo.encryption import (Algorithm,
                                  ClientEncryption)
  from pymongo.encryption_options import AutoEncryptionOpts


  def main():
      # This must be the same master key that was used to create
      # the encryption key.
      local_master_key = os.urandom(96)
      kms_providers = {"local": {"key": local_master_key}}

      # The MongoDB namespace (db.collection) used to store
      # the encryption data keys.
      key_vault_namespace = "encryption.__pymongoTestKeyVault"
      key_vault_db_name, key_vault_coll_name = key_vault_namespace.split(".", 1)

      # bypass_auto_encryption=True disables automatic encryption but keeps
      # the automatic decryption behavior. bypass_auto_encryption will
      # also disable spawning mongocryptd.
      auto_encryption_opts = AutoEncryptionOpts(
          kms_providers, key_vault_namespace, bypass_auto_encryption=True)

      client = MongoClient(auto_encryption_opts=auto_encryption_opts)
      coll = client.test.coll
      # Clear old data
      coll.drop()

      # Set up the key vault (key_vault_namespace) for this example.
      key_vault = client[key_vault_db_name][key_vault_coll_name]
      # Ensure that two data keys cannot share the same keyAltName.
      key_vault.drop()
      key_vault.create_index(
          "keyAltNames",
          unique=True,
          partialFilterExpression={"keyAltNames": {"$exists": True}})

      client_encryption = ClientEncryption(
          kms_providers,
          key_vault_namespace,
          # The MongoClient to use for reading/writing to the key vault.
          # This can be the same MongoClient used by the main application.
          client,
          # The CodecOptions class used for encrypting and decrypting.
          # This should be the same CodecOptions instance you have configured
          # on MongoClient, Database, or Collection.
          coll.codec_options)

      # Create a new data key for the encryptedField.
      data_key_id = client_encryption.create_data_key(
          'local', key_alt_names=['pymongo_encryption_example_4'])

      # Explicitly encrypt a field:
      encrypted_field = client_encryption.encrypt(
          "123456789",
          Algorithm.AEAD_AES_256_CBC_HMAC_SHA_512_Deterministic,
          key_alt_name='pymongo_encryption_example_4')
      coll.insert_one({"encryptedField": encrypted_field})
      # Automatically decrypts any encrypted fields.
      doc = coll.find_one()
      print('Decrypted document: %s' % (doc,))
      unencrypted_coll = MongoClient().test.coll
      print('Encrypted document: %s' % (unencrypted_coll.find_one(),))

      # Cleanup resources.
      client_encryption.close()
      client.close()


  if __name__ == "__main__":
      main()