Liberating cassandra.yaml Parameters’ Names from Their Units
Objective
Three big things happened as part of CASSANDRA-15234:
1) Renaming of parameters in cassandra.yaml
to follow the form noun_verb
.
2) Liberating cassandra.yaml
parameters from their units (DataStorage, DataRate and Duration) and introducing temporary smallest accepted unit per parameter (only for DataStorage and Duration ones)
3) Backward compatibility framework to support the old names and lack of units support until at least the next major release.
Renamed Parameters
The community has decided to allow operators to specify units for Cassandra parameters of types duration, data storage, and data rate. All parameters which had a particular unit (most of the time added as a suffix to their name) can be now set by using the format [value][unit]. The unit suffix has been removed from their names. Supported units:
Parameter Type | Units Supported |
---|---|
Duration | d, h, m, s, ms, us, µs, ns |
Data Storage | B, KiB, MiB, GiB |
Data Rate | B/s, MiB/s, KiB/s |
Example:
Old name and value format:
permissions_update_interval_ms: 0
New name and possible value formats:
permissions_update_interval: 0ms
permissions_update_interval: 0s
permissions_update_interval: 0d
permissions_update_interval: 0us
permissions_update_interval: 0µs
The work in CASSANDRA-15234 was already quite big, so we decided to introduce the notion of the smallest allowed unit per parameter for duration and data storage parameters. What does this mean? Cassandra’s internals still use the old units for parameters. If, for example, seconds are used internally, but you want to add a value in nanoseconds in cassandra.yaml
, you will get a configuration exception that contains the following information:
Accepted units: seconds, minutes, hours, days.
Why was this needed? Because we can run into precision issues. The full solution to the problem is to convert internally all parameters’ values to be manipulated with the smallest supported by Cassandra unit. A series of tickets to assess and maybe migrate to the smallest unit our parameters (incrementally, post CASSANDRA-15234) will be opened in the future.
Old Name | New Name | The Smallest Supported Unit |
---|---|---|
permissions_validity_in_ms | permissions_validity | ms |
permissions_update_interval_in_ms | permissions_update_interval | ms |
roles_validity_in_ms | roles_validity | ms |
roles_update_interval_in_ms | roles_update_interval | ms |
credentials_validity_in_ms | credentials_validity | ms |
credentials_update_interval_in_ms | credentials_update_interval | ms |
max_hint_window_in_ms | max_hint_window | ms |
native_transport_idle_timeout_in_ms | native_transport_idle_timeout | ms |
request_timeout_in_ms | request_timeout | ms |
read_request_timeout_in_ms | read_request_timeout | ms |
range_request_timeout_in_ms | range_request_timeout | ms |
write_request_timeout_in_ms | write_request_timeout | ms |
counter_write_request_timeout_in_ms | counter_write_request_timeout | ms |
cas_contention_timeout_in_ms | cas_contention_timeout | ms |
truncate_request_timeout_in_ms | truncate_request_timeout | ms |
streaming_keep_alive_period_in_secs | streaming_keep_alive_period | s |
cross_node_timeout | internode_timeout | - |
slow_query_log_timeout_in_ms | slow_query_log_timeout | ms |
memtable_heap_space_in_mb | memtable_heap_space | MiB |
memtable_offheap_space_in_mb | memtable_offheap_space | MiB |
repair_session_space_in_mb | repair_session_space | MiB |
internode_max_message_size_in_bytes | internode_max_message_size | B |
internode_send_buff_size_in_bytes | internode_socket_send_buffer_size | B |
internode_socket_send_buffer_size_in_bytes | internode_socket_send_buffer_size | B |
internode_socket_receive_buffer_size_in_bytes | internode_socket_receive_buffer_size | B |
internode_recv_buff_size_in_bytes | internode_socket_receive_buffer_size | B |
internode_application_send_queue_capacity_in_bytes | internode_application_send_queue_capacity | B |
internode_application_send_queue_reserve_endpoint_capacity_in_bytes | internode_application_send_queue_reserve_endpoint_capacity | B |
internode_application_send_queue_reserve_global_capacity_in_bytes | internode_application_send_queue_reserve_global_capacity | B |
internode_application_receive_queue_capacity_in_bytes | internode_application_receive_queue_capacity | B |
internode_application_receive_queue_reserve_endpoint_capacity_in_bytes | internode_application_receive_queue_reserve_endpoint_capacity | B |
internode_application_receive_queue_reserve_global_capacity_in_bytes | internode_application_receive_queue_reserve_global_capacity | B |
internode_tcp_connect_timeout_in_ms | internode_tcp_connect_timeout | ms |
internode_tcp_user_timeout_in_ms | internode_tcp_user_timeout | ms |
internode_streaming_tcp_user_timeout_in_ms | internode_streaming_tcp_user_timeout | ms |
native_transport_max_frame_size_in_mb | native_transport_max_frame_size | MiB |
max_value_size_in_mb | max_value_size | MiB |
column_index_size_in_kb | column_index_size | KiB |
column_index_cache_size_in_kb | column_index_cache_size | KiB |
batch_size_warn_threshold_in_kb | batch_size_warn_threshold | KiB |
batch_size_fail_threshold_in_kb | batch_size_fail_threshold | KiB |
compaction_throughput_mb_per_sec | compaction_throughput | MiB/s |
compaction_large_partition_warning_threshold_mb | compaction_large_partition_warning_threshold | MiB |
min_free_space_per_drive_in_mb | min_free_space_per_drive | MiB |
stream_throughput_outbound_megabits_per_sec | stream_throughput_outbound | MiB/s |
inter_dc_stream_throughput_outbound_megabits_per_sec | inter_dc_stream_throughput_outbound | MiB/s |
commitlog_total_space_in_mb | commitlog_total_space | MiB |
commitlog_sync_group_window_in_ms | commitlog_sync_group_window | ms |
commitlog_sync_period_in_ms | commitlog_sync_period | ms |
commitlog_segment_size_in_mb | commitlog_segment_size | MiB |
periodic_commitlog_sync_lag_block_in_ms | periodic_commitlog_sync_lag_block | ms |
max_mutation_size_in_kb | max_mutation_size | KiB |
cdc_total_space_in_mb | cdc_total_space | MiB |
cdc_free_space_check_interval_ms | cdc_free_space_check_interval | ms |
dynamic_snitch_update_interval_in_ms | dynamic_snitch_update_interval | ms |
dynamic_snitch_reset_interval_in_ms | dynamic_snitch_reset_interval | ms |
hinted_handoff_throttle_in_kb | hinted_handoff_throttle | KiB |
batchlog_replay_throttle_in_kb | batchlog_replay_throttle | KiB |
hints_flush_period_in_ms | hints_flush_period | ms |
max_hints_file_size_in_mb | max_hints_file_size | MiB |
trickle_fsync_interval_in_kb | trickle_fsync_interval | KiB |
sstable_preemptive_open_interval_in_mb | sstable_preemptive_open_interval | MiB |
key_cache_size_in_mb | key_cache_size | MiB |
row_cache_size_in_mb | row_cache_size | MiB |
counter_cache_size_in_mb | counter_cache_size | MiB |
networking_cache_size_in_mb | networking_cache_size | MiB |
file_cache_size_in_mb | file_cache_size | MiB |
index_summary_capacity_in_mb | index_summary_capacity | MiB |
index_summary_resize_interval_in_minutes | index_summary_resize_interval | m |
gc_log_threshold_in_ms | gc_log_threshold | ms |
gc_warn_threshold_in_ms | gc_warn_threshold | ms |
tracetype_query_ttl | trace_type_query_ttl | s |
tracetype_repair_ttl | trace_type_repair_ttl | s |
prepared_statements_cache_size_mb | prepared_statements_cache_size | MiB |
enable_user_defined_functions | user_defined_functions_enabled | - |
enable_scripted_user_defined_functions | scripted_user_defined_functions_enabled | - |
enable_materialized_views | materialized_views_enabled | - |
enable_transient_replication | transient_replication_enabled | - |
enable_sasi_indexes | sasi_indexes_enabled | - |
enable_drop_compact_storage | drop_compact_storage_enabled | - |
enable_user_defined_functions_threads | user_defined_functions_threads_enabled | - |
enable_legacy_ssl_storage_port | legacy_ssl_storage_port_enabled | - |
user_defined_function_fail_timeout | user_defined_functions_fail_timeout | ms |
user_defined_function_warn_timeout | user_defined_functions_warn_timeout | ms |
cache_load_timeout_seconds | cache_load_timeout | s |
Another TO DO is to add JMX methods supporting the new format. However, we may abandon this if virtual tables support configuration changes in the near future.
Notes for Cassandra Developers:
Most of our parameters are already moved to the new framework as part of CASSANDRA-15234.
@Replaces
is the annotation to be used when you make changes to any configuration parameters inConfig
class andcassandra.yaml
, and you want to add backward compatibility with previous Cassandra versions.Converters
class enumerates the different methods used for backward compatibility.IDENTITY
is the one used for name change only. For more information about the other Converters, please, check the JavaDoc in the class. For backward compatibility virtual tableSettings
contains both the old and the new parameters with the old and the new value format. Only exception at the moment are the following three parameters:key_cache_save_period
,row_cache_save_period
andcounter_cache_save_period
which appear only once with the new value format. The old names and value format still can be used at least until the next major release. Deprecation warning is emitted on startup. If the parameter is of type duration, data rate or data storage, its value should be accompanied by a unit when new name is used.Please follow the new format
noun_verb
when adding new configuration parameters.Please consider adding any new parameters with the lowest supported by Cassandra unit when possible. Our new types also support long and integer upper bound, depending on your needs. All options for configuration parameters’ types are nested classes in our three main abstract classes -
DurationSpec
,DataStorageSpec
,DataRateSpec
.If for some reason you consider the smallest unit for a new parameter shouldn’t be the one that is supported as such in Cassandra, you can use the rest of the nested classes in
DurationSpec
,DataStorageSpec
. The smallest allowed unit is the one we use internally for the property, so we don’t have to do conversions to bigger units which will lead to precision problems. This is a problem only withDurationSpec
andDataStorageSpec
.DataRateSpec
is handled internally in double.New parameters should be added as non-negative numbers. For parameters where you would have set -1 to disable in the past, you might want to consider a separate flag parameter or null value. In case you use the null value, please, ensure that any default value introduced in the DatabaseDescriptor to handle it is also duplicated in any related setters.
Parameters of type data storage, duration and data rate cannot be set to Long.MAX_VALUE (former parameters of long type) and Integer.MAX_VALUE (former parameters of int type). That numbers are used during conversion between units to prevent an overflow from happening.
Any time you add @Replaces with a name change, we need to add an entry in this Python dictionary in CCM to support the same backward compatibility as SnakeYAML.
Please follow the instructions in requirements.txt in the DTest repo how to retag CCM after committing any changes. You might want to test also with tagging in your repo to ensure that there will be no surprise after retagging the official CCM. Please be sure to run a full CI after any changes as CCM affects a few of our testing suites.
Some configuration parameters are not announced in cassandra.yaml, but they are presented in the Config class for advanced users. Those also should be using the new framework and naming conventions.
As we have backward compatibility, we didn’t have to rework all python DTests to set config in the new format, and we exercise the backward compatibility while testing. Please consider adding any new tests using the new names and value format though.
In-JVM upgrade tests do not support per-version configuration at the moment, so we have to keep the old names and value format. Currently, if we try to use the new config for a newer version, that will be silently ignored and default config will be used.
SnakeYAML supports overloading of parameters. This means that if you add a configuration parameter more than once in your
cassandra.yaml
- the latest occasion will be the one to load in Config during Cassandra startup. In order to make upgrades as less disruptive as possible, we continue supporting that behavior also with adding old and new names of a parameter intocassandra.yaml
.Please ensure that any JMX setters/getters update the Config class properties and not some local copies. Settings Virtual Table reports the configuration loaded at any time from the Config class.
Example:
If you add the following to cassandra.yaml
:
hinted_handoff_enabled: true
enabled_hinted_handolff: false
you will get loaded in Config
:
hinted_handoff_enabled: false
CASSANDRA-17379 was opened to improve the user experience and deprecate the overloading. By default, we refuse starting Cassandra with a config containing both old and new config keys for the same parameter. Start Cassandra with -Dcassandra.allow_new_old_config_keys=true
to override. For historical reasons duplicate config keys in cassandra.yaml
are allowed by default, start Cassandra with -Dcassandra.allow_duplicate_config_keys=false
to disallow this. Please note that key_cache_save_period
, row_cache_save_period
, counter_cache_save_period
will be affected only by -Dcassandra.allow_duplicate_config_keys
.