Key Generate Algorithm
Background
In traditional database software development, automatic primary key generation is a basic requirement, and most databases support it, for example through MySQL’s auto-increment keys or Oracle’s sequences.
After data sharding, generating globally unique primary keys across different data nodes becomes a tricky problem. The actual tables that make up one logical table are unaware of each other, so their self-incrementing keys produce duplicate primary keys.
Although collisions can be avoided by constraining the initial value and step size of self-incrementing primary keys, this requires additional O&M rules, which leaves the solution incomplete and hard to scale.
There are many third-party solutions to this problem, such as UUID, which relies on a specific algorithm to generate non-duplicate keys, or a dedicated primary key generation service.
To meet the requirements of different users in different scenarios, Apache ShardingSphere provides built-in distributed primary key generators, such as UUID and SNOWFLAKE, and also abstracts the distributed primary key generator interface so that users can implement their own customized generators.
Parameters
Snowflake
Type: SNOWFLAKE
Attributes:
Name | DataType | Description | Default Value |
---|---|---|---|
worker-id (?) | long | The unique ID of the worker machine | 0 |
max-tolerate-time-difference-milliseconds (?) | long | The maximum tolerated time difference between servers’ clocks, in milliseconds | 10 milliseconds |
max-vibration-offset (?) | int | The upper limit of the vibration offset, range [0, 4096). Note: if the key generated by this algorithm is used as a sharding value, configuring this property is recommended. Without it, for keys generated in different milliseconds, the result of key mod 2^n (2^n is usually the number of sharded tables or databases) is always 0 or 1. To prevent this sharding problem, set the value to (2^n)-1 | 1 |
Note: worker-id is optional.
- In standalone mode, user-defined configuration is supported; if the user does not configure it, the default value 0 is used.
- In cluster mode, it is generated automatically by the system, and no duplicate values are generated within the same namespace.
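The attributes above are supplied as properties of the generator. The following is a minimal sketch, assuming they nest under a props key of the generator entry; the worker-id value and the 8-table layout are illustrative, not recommendations.

```yaml
keyGenerators:
  snowflake:
    type: SNOWFLAKE
    props:
      # Illustrative value for standalone mode; in cluster mode the system assigns it.
      worker-id: 123
      # Assumes 8 sharded tables (2^3), so the offset is (2^3)-1 = 7.
      max-vibration-offset: 7
```

Properties left out fall back to the default values listed in the table.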
UUID
Type: UUID
Attributes: None
Procedure
- The distributed primary key strategy is configured per column, as part of the data sharding rule configuration.
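As a sketch of how a column might be bound to a key generator inside a sharding rule, assuming the keyGenerateStrategy layout of the YAML rule configuration in ShardingSphere 5.x (verify against the reference for your version). The table name t_order and the column name order_id are hypothetical.

```yaml
rules:
- !SHARDING
  tables:
    t_order:                          # hypothetical logical table
      keyGenerateStrategy:
        column: order_id              # hypothetical primary key column
        keyGeneratorName: snowflake   # must match an entry under keyGenerators
  keyGenerators:
    snowflake:
      type: SNOWFLAKE
```

Other parts of the table rule, such as data nodes and sharding strategies, are omitted here for brevity.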
Sample
- Snowflake Algorithms
keyGenerators:
  snowflake:
    type: SNOWFLAKE
- UUID
keyGenerators:
  uuid:
    type: UUID