Liberating cassandra.yaml Parameters’ Names from Their Units

Objective

Three big things happened as part of CASSANDRA-15234:

1) Renaming of parameters in cassandra.yaml to follow the form noun_verb.

2) Liberating cassandra.yaml parameters from their units (DataStorage, DataRate, and Duration) and introducing a temporary smallest accepted unit per parameter (only for DataStorage and Duration parameters).

3) A backward compatibility framework that supports the old names and unitless values until at least the next major release.

Renamed Parameters

The community has decided to allow operators to specify units for Cassandra parameters of types duration, data storage, and data rate. All parameters that had a particular unit (most of the time added as a suffix to their name) can now be set using the format [value][unit]. The unit suffix has been removed from their names. Supported units:

| Parameter Type | Units Supported |
|---|---|
| Duration | d, h, m, s, ms, us, µs, ns |
| Data Storage | B, KiB, MiB, GiB |
| Data Rate | B/s, KiB/s, MiB/s |

Example:

Old name and value format:

  permissions_update_interval_in_ms: 0

New name and possible value formats:

  permissions_update_interval: 0ms
  permissions_update_interval: 0s
  permissions_update_interval: 0d
  permissions_update_interval: 0us
  permissions_update_interval: 0µs
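
To make the [value][unit] format concrete, here is a minimal Python sketch that splits such a string into its number and unit. It is only an illustration, not Cassandra's parser: the function name, the regular expression, and the choice to reject anything that is not a non-negative integer followed directly by a unit are assumptions. The unit sets are copied from the list of supported units above.

```python
import re

# Unit sets copied from the "Units Supported" list above.
DURATION_UNITS = {"d", "h", "m", "s", "ms", "us", "µs", "ns"}
DATA_STORAGE_UNITS = {"B", "KiB", "MiB", "GiB"}
DATA_RATE_UNITS = {"B/s", "KiB/s", "MiB/s"}

# Assumption for this sketch: a non-negative integer followed directly by a unit.
_VALUE_UNIT = re.compile(r"^(\d+)(\D\S*)$")


def split_value_and_unit(text, allowed_units):
    """Split a string such as '10ms' into (10, 'ms') or raise ValueError."""
    match = _VALUE_UNIT.match(text.strip())
    if match is None:
        raise ValueError(f"expected [value][unit], got {text!r}")
    value, unit = int(match.group(1)), match.group(2)
    if unit not in allowed_units:
        raise ValueError(f"unsupported unit {unit!r} in {text!r}")
    return value, unit
```

For instance, split_value_and_unit("0ms", DURATION_UNITS) returns (0, "ms"), while a bare "0" without a unit raises ValueError in this sketch.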

The work in CASSANDRA-15234 was already quite large, so we decided to introduce the notion of a smallest allowed unit per parameter for duration and data storage parameters. What does this mean? Cassandra’s internals still use the old units for parameters. If, for example, seconds are used internally but you set a value in nanoseconds in cassandra.yaml, you will get a configuration exception that contains the following information:

  Accepted units: seconds, minutes, hours, days.

Why was this needed? Because converting a value to a larger internal unit can lose precision. The full solution is to have Cassandra handle all parameter values internally in the smallest unit it supports. A series of tickets to assess, and possibly migrate, our parameters to the smallest unit (incrementally, post CASSANDRA-15234) will be opened in the future.
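
The precision problem and the smallest-unit rule can be illustrated with a short Python sketch. The function names and the error message are hypothetical and do not reflect Cassandra's implementation; the sketch only assumes that values are stored as integers in some internal unit.

```python
# Nanoseconds per duration unit.
NANOS_PER_UNIT = {
    "ns": 1,
    "us": 1_000,
    "µs": 1_000,
    "ms": 1_000_000,
    "s": 1_000_000_000,
    "m": 60 * 1_000_000_000,
    "h": 3_600 * 1_000_000_000,
    "d": 86_400 * 1_000_000_000,
}


def unchecked_convert(value, unit, internal_unit):
    """Convert with integer division; any remainder is silently dropped."""
    return value * NANOS_PER_UNIT[unit] // NANOS_PER_UNIT[internal_unit]


def checked_convert(value, unit, internal_unit):
    """Reject units smaller than the internal one, then convert exactly."""
    if NANOS_PER_UNIT[unit] < NANOS_PER_UNIT[internal_unit]:
        raise ValueError(
            f"{value}{unit}: unit is smaller than the accepted unit {internal_unit!r}"
        )
    # Every larger unit is a whole multiple of the internal one, so nothing is lost.
    return value * NANOS_PER_UNIT[unit] // NANOS_PER_UNIT[internal_unit]
```

With seconds as the internal unit, unchecked_convert(1500, "ms", "s") yields 1 and silently drops half a second, whereas checked_convert(1500, "ms", "s") raises an error and checked_convert(2, "m", "s") returns 120.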

| Old Name | New Name | Smallest Supported Unit |
|---|---|---|
| permissions_validity_in_ms | permissions_validity | ms |
| permissions_update_interval_in_ms | permissions_update_interval | ms |
| roles_validity_in_ms | roles_validity | ms |
| roles_update_interval_in_ms | roles_update_interval | ms |
| credentials_validity_in_ms | credentials_validity | ms |
| credentials_update_interval_in_ms | credentials_update_interval | ms |
| max_hint_window_in_ms | max_hint_window | ms |
| native_transport_idle_timeout_in_ms | native_transport_idle_timeout | ms |
| request_timeout_in_ms | request_timeout | ms |
| read_request_timeout_in_ms | read_request_timeout | ms |
| range_request_timeout_in_ms | range_request_timeout | ms |
| write_request_timeout_in_ms | write_request_timeout | ms |
| counter_write_request_timeout_in_ms | counter_write_request_timeout | ms |
| cas_contention_timeout_in_ms | cas_contention_timeout | ms |
| truncate_request_timeout_in_ms | truncate_request_timeout | ms |
| streaming_keep_alive_period_in_secs | streaming_keep_alive_period | s |
| cross_node_timeout | internode_timeout | - |
| slow_query_log_timeout_in_ms | slow_query_log_timeout | ms |
| memtable_heap_space_in_mb | memtable_heap_space | MiB |
| memtable_offheap_space_in_mb | memtable_offheap_space | MiB |
| repair_session_space_in_mb | repair_session_space | MiB |
| internode_max_message_size_in_bytes | internode_max_message_size | B |
| internode_send_buff_size_in_bytes | internode_socket_send_buffer_size | B |
| internode_socket_send_buffer_size_in_bytes | internode_socket_send_buffer_size | B |
| internode_socket_receive_buffer_size_in_bytes | internode_socket_receive_buffer_size | B |
| internode_recv_buff_size_in_bytes | internode_socket_receive_buffer_size | B |
| internode_application_send_queue_capacity_in_bytes | internode_application_send_queue_capacity | B |
| internode_application_send_queue_reserve_endpoint_capacity_in_bytes | internode_application_send_queue_reserve_endpoint_capacity | B |
| internode_application_send_queue_reserve_global_capacity_in_bytes | internode_application_send_queue_reserve_global_capacity | B |
| internode_application_receive_queue_capacity_in_bytes | internode_application_receive_queue_capacity | B |
| internode_application_receive_queue_reserve_endpoint_capacity_in_bytes | internode_application_receive_queue_reserve_endpoint_capacity | B |
| internode_application_receive_queue_reserve_global_capacity_in_bytes | internode_application_receive_queue_reserve_global_capacity | B |
| internode_tcp_connect_timeout_in_ms | internode_tcp_connect_timeout | ms |
| internode_tcp_user_timeout_in_ms | internode_tcp_user_timeout | ms |
| internode_streaming_tcp_user_timeout_in_ms | internode_streaming_tcp_user_timeout | ms |
| native_transport_max_frame_size_in_mb | native_transport_max_frame_size | MiB |
| max_value_size_in_mb | max_value_size | MiB |
| column_index_size_in_kb | column_index_size | KiB |
| column_index_cache_size_in_kb | column_index_cache_size | KiB |
| batch_size_warn_threshold_in_kb | batch_size_warn_threshold | KiB |
| batch_size_fail_threshold_in_kb | batch_size_fail_threshold | KiB |
| compaction_throughput_mb_per_sec | compaction_throughput | MiB/s |
| compaction_large_partition_warning_threshold_mb | compaction_large_partition_warning_threshold | MiB |
| min_free_space_per_drive_in_mb | min_free_space_per_drive | MiB |
| stream_throughput_outbound_megabits_per_sec | stream_throughput_outbound | MiB/s |
| inter_dc_stream_throughput_outbound_megabits_per_sec | inter_dc_stream_throughput_outbound | MiB/s |
| commitlog_total_space_in_mb | commitlog_total_space | MiB |
| commitlog_sync_group_window_in_ms | commitlog_sync_group_window | ms |
| commitlog_sync_period_in_ms | commitlog_sync_period | ms |
| commitlog_segment_size_in_mb | commitlog_segment_size | MiB |
| periodic_commitlog_sync_lag_block_in_ms | periodic_commitlog_sync_lag_block | ms |
| max_mutation_size_in_kb | max_mutation_size | KiB |
| cdc_total_space_in_mb | cdc_total_space | MiB |
| cdc_free_space_check_interval_ms | cdc_free_space_check_interval | ms |
| dynamic_snitch_update_interval_in_ms | dynamic_snitch_update_interval | ms |
| dynamic_snitch_reset_interval_in_ms | dynamic_snitch_reset_interval | ms |
| hinted_handoff_throttle_in_kb | hinted_handoff_throttle | KiB |
| batchlog_replay_throttle_in_kb | batchlog_replay_throttle | KiB |
| hints_flush_period_in_ms | hints_flush_period | ms |
| max_hints_file_size_in_mb | max_hints_file_size | MiB |
| trickle_fsync_interval_in_kb | trickle_fsync_interval | KiB |
| sstable_preemptive_open_interval_in_mb | sstable_preemptive_open_interval | MiB |
| key_cache_size_in_mb | key_cache_size | MiB |
| row_cache_size_in_mb | row_cache_size | MiB |
| counter_cache_size_in_mb | counter_cache_size | MiB |
| networking_cache_size_in_mb | networking_cache_size | MiB |
| file_cache_size_in_mb | file_cache_size | MiB |
| index_summary_capacity_in_mb | index_summary_capacity | MiB |
| index_summary_resize_interval_in_minutes | index_summary_resize_interval | m |
| gc_log_threshold_in_ms | gc_log_threshold | ms |
| gc_warn_threshold_in_ms | gc_warn_threshold | ms |
| tracetype_query_ttl | trace_type_query_ttl | s |
| tracetype_repair_ttl | trace_type_repair_ttl | s |
| prepared_statements_cache_size_mb | prepared_statements_cache_size | MiB |
| enable_user_defined_functions | user_defined_functions_enabled | - |
| enable_scripted_user_defined_functions | scripted_user_defined_functions_enabled | - |
| enable_materialized_views | materialized_views_enabled | - |
| enable_transient_replication | transient_replication_enabled | - |
| enable_sasi_indexes | sasi_indexes_enabled | - |
| enable_drop_compact_storage | drop_compact_storage_enabled | - |
| enable_user_defined_functions_threads | user_defined_functions_threads_enabled | - |
| enable_legacy_ssl_storage_port | legacy_ssl_storage_port_enabled | - |
| user_defined_function_fail_timeout | user_defined_functions_fail_timeout | ms |
| user_defined_function_warn_timeout | user_defined_functions_warn_timeout | ms |
| cache_load_timeout_seconds | cache_load_timeout | s |

Another TO DO is to add JMX methods supporting the new format. However, we may abandon this if virtual tables support configuration changes in the near future.

Notes for Cassandra Developers:

  • Most of our parameters were already moved to the new framework as part of CASSANDRA-15234. @Replaces is the annotation to use when you change any configuration parameter in the Config class and cassandra.yaml and want to keep backward compatibility with previous Cassandra versions. The Converters class enumerates the different methods used for backward compatibility. IDENTITY is the one used for a name change only. For more information about the other Converters, please check the JavaDoc in the class.

  • For backward compatibility, the Settings virtual table contains both the old and the new parameters, with the old and the new value format. The only exceptions at the moment are the following three parameters: key_cache_save_period, row_cache_save_period, and counter_cache_save_period, which appear only once, with the new value format.

  • The old names and value format can still be used at least until the next major release. A deprecation warning is emitted on startup. If the parameter is of type duration, data rate, or data storage, its value must be accompanied by a unit when the new name is used.

  • Please follow the new format noun_verb when adding new configuration parameters.

  • Please consider adding any new parameters with the smallest unit supported by Cassandra when possible. Our new types also support long and integer upper bounds, depending on your needs. All options for configuration parameters’ types are nested classes in our three main abstract classes: DurationSpec, DataStorageSpec, and DataRateSpec.

  • If for some reason you think the smallest unit for a new parameter should not be the smallest one Cassandra supports, you can use the other nested classes in DurationSpec and DataStorageSpec. The smallest allowed unit is the one we use internally for the property, so we never have to convert to a bigger unit, which would lead to precision problems. This is a concern only for DurationSpec and DataStorageSpec; DataRateSpec is handled internally as a double.

  • New parameters should be added as non-negative numbers. For parameters where you would have set -1 to disable a feature in the past, consider a separate flag parameter or a null value. If you use the null value, please ensure that any default value introduced in DatabaseDescriptor to handle it is also duplicated in any related setters.

  • Parameters of type data storage, duration, and data rate cannot be set to Long.MAX_VALUE (former parameters of long type) or Integer.MAX_VALUE (former parameters of int type). Those values are used during conversion between units to prevent an overflow from happening.

  • Any time you add @Replaces with a name change, we need to add an entry to the corresponding Python dictionary in CCM to support the same backward compatibility as SnakeYAML.

Please follow the instructions in requirements.txt in the DTest repo on how to retag CCM after committing any changes. You might also want to test with a tag in your own repo to ensure that there are no surprises after retagging the official CCM. Please be sure to run a full CI after any changes, as CCM affects a few of our testing suites.

  • Some configuration parameters are not listed in cassandra.yaml but are present in the Config class for advanced users. Those should also use the new framework and naming conventions.

  • As we have backward compatibility, we didn’t have to rework all Python DTests to set config in the new format, and we exercise the backward compatibility while testing. Please consider adding any new tests using the new names and value format, though.

  • In-JVM upgrade tests do not support per-version configuration at the moment, so we have to keep the old names and value format. Currently, if we try to use the new config for a newer version, it is silently ignored and the default config is used.

  • SnakeYAML supports overloading of parameters. This means that if you add a configuration parameter more than once in your cassandra.yaml, the last occurrence is the one loaded into Config during Cassandra startup. To make upgrades as non-disruptive as possible, we continue to support that behavior when both the old and the new name of a parameter are added to cassandra.yaml.

  • Please ensure that any JMX setters/getters update the Config class properties and not some local copies. The Settings virtual table reports the configuration loaded at any time from the Config class.
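
As a rough illustration of the backward compatibility idea from the first note (an old name mapped to a new name plus a converter for the old value format), here is a Python sketch. It is not the @Replaces or Converters implementation: identity, append_unit, REPLACEMENTS, and translate are hypothetical names, and the real converters may do more than append a unit, so the JavaDoc in the Converters class remains the authority. The three entries are taken from the table of renamed parameters.

```python
def identity(old_value):
    """Name change only: the value is passed through untouched."""
    return old_value


def append_unit(unit):
    """Build a converter that turns a bare number into [value][unit]."""
    def convert(old_value):
        return f"{old_value}{unit}"
    return convert


# Hypothetical registry: old name -> (new name, converter).
REPLACEMENTS = {
    "permissions_validity_in_ms": ("permissions_validity", append_unit("ms")),
    "key_cache_size_in_mb": ("key_cache_size", append_unit("MiB")),
    "enable_materialized_views": ("materialized_views_enabled", identity),
}


def translate(name, value):
    """Return (new_name, new_value, deprecated) for one cassandra.yaml entry."""
    if name in REPLACEMENTS:
        new_name, converter = REPLACEMENTS[name]
        return new_name, converter(value), True
    return name, value, False
```

In this sketch, translate("permissions_validity_in_ms", 2000) gives ("permissions_validity", "2000ms", True), where the final True stands in for the deprecation warning, and an entry that already uses the new name passes through unchanged.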

Example of parameter overloading:

If you add the following to cassandra.yaml:

  hinted_handoff_enabled: true
  enabled_hinted_handoff: false

the following will be loaded in Config:

  hinted_handoff_enabled: false
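
The "last occurrence wins" behavior in this example can be mimicked with a few lines of Python. This is a simplified model, not SnakeYAML or Cassandra code: the ALIASES table and the load function are hypothetical, and the entries are passed as an ordered list of pairs to stand in for the order of lines in cassandra.yaml.

```python
# Hypothetical alias table: old name -> new name.
ALIASES = {"enabled_hinted_handoff": "hinted_handoff_enabled"}


def load(entries):
    """Apply (key, value) pairs in file order; a later entry overwrites an earlier one."""
    config = {}
    for key, value in entries:
        config[ALIASES.get(key, key)] = value
    return config


loaded = load([
    ("hinted_handoff_enabled", True),
    ("enabled_hinted_handoff", False),
])
print(loaded)  # {'hinted_handoff_enabled': False}
```

Because both keys resolve to the same property, the value from the later line is the one that ends up in the loaded configuration.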

CASSANDRA-17379 was opened to improve the user experience and deprecate the overloading. By default, Cassandra refuses to start with a config that contains both the old and the new key for the same parameter; start Cassandra with -Dcassandra.allow_new_old_config_keys=true to override this. For historical reasons, duplicate config keys in cassandra.yaml are allowed by default; start Cassandra with -Dcassandra.allow_duplicate_config_keys=false to disallow them. Please note that key_cache_save_period, row_cache_save_period, and counter_cache_save_period are affected only by -Dcassandra.allow_duplicate_config_keys.
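
The two startup checks can be summarized in a small Python model. It is a sketch under assumptions, not Cassandra's code: check_keys and the ALIASES table are hypothetical, and the two keyword arguments simply mirror the defaults stated above for the two system properties.

```python
# Hypothetical alias table: old name -> new name.
ALIASES = {"enabled_hinted_handoff": "hinted_handoff_enabled"}


def check_keys(keys, aliases, allow_new_old_config_keys=False,
               allow_duplicate_config_keys=True):
    """Raise ValueError if the list of config keys violates the rules above."""
    seen_literal = set()
    names_by_param = {}
    for key in keys:
        # Same literal key written more than once.
        if key in seen_literal and not allow_duplicate_config_keys:
            raise ValueError(f"duplicate config key: {key}")
        seen_literal.add(key)
        # Group old and new spellings under the new parameter name.
        names_by_param.setdefault(aliases.get(key, key), set()).add(key)
    if not allow_new_old_config_keys:
        for param, names in names_by_param.items():
            if len(names) > 1:
                raise ValueError(
                    f"both old and new keys present for {param}: {sorted(names)}"
                )
```

A parameter whose name did not change, such as key_cache_save_period, has no alias in this model, so only the duplicate-key check can ever apply to it.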