TiDB Lightning Requirements for the Target Database

Before using TiDB Lightning, check whether your environment meets the requirements. This helps reduce errors during import and ensures a successful import.

Privileges of the target database

Depending on the import mode and the features enabled, the target database user must be granted different privileges. The following table provides a reference.

| Importance | Feature | Scope | Required privileges | Remarks |
|---|---|---|---|---|
| Mandatory | Basic functions | Target table | CREATE, SELECT, INSERT, UPDATE, DELETE, DROP, ALTER | DROP is required only when tidb-lightning-ctl runs the checkpoint-destroy-all command |
| Mandatory | Basic functions | Target database | CREATE | |
| Mandatory | Logical Import Mode | information_schema.columns | SELECT | |
| Mandatory | Physical Import Mode | mysql.tidb | SELECT | |
| Mandatory | Physical Import Mode | - | SUPER | |
| Mandatory | Physical Import Mode | - | RESTRICTED_VARIABLES_ADMIN, RESTRICTED_TABLES_ADMIN | Required when the target TiDB enables SEM |
| Recommended | Conflict detection, max-error | Schema configured for lightning.task-info-schema-name | SELECT, INSERT, UPDATE, DELETE, CREATE, DROP | If not required, the value must be set to "" |
| Optional | Parallel import | Schema configured for lightning.meta-schema-name | SELECT, INSERT, UPDATE, DELETE, CREATE, DROP | If not required, the value must be set to "" |
| Optional | checkpoint.driver = "mysql" | checkpoint.schema setting | SELECT, INSERT, UPDATE, DELETE, CREATE, DROP | Required when checkpoint information is stored in databases, instead of files |
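As a minimal sketch of the basic-function privileges above, you can create a dedicated user and grant it the required privileges on the target schema. The user name `lightning` and the target schema `test` are placeholders for illustration; adjust them to your deployment:

```sql
-- Hypothetical user and schema names; replace with your own.
CREATE USER 'lightning'@'%' IDENTIFIED BY '<password>';

-- Basic functions: privileges on the target database and its tables
GRANT CREATE, SELECT, INSERT, UPDATE, DELETE, DROP, ALTER ON test.* TO 'lightning'@'%';

-- Physical Import Mode additionally reads the mysql.tidb table
GRANT SELECT ON mysql.tidb TO 'lightning'@'%';
```

If you use Physical Import Mode without SEM enabled, the user also needs SUPER, which is granted globally (`GRANT SUPER ON *.* TO ...`).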

Storage space of the target database

The target TiKV cluster must have enough disk space to store the imported data. In addition to the standard hardware requirements, the storage space of the target TiKV cluster must be larger than the size of the data source × the number of replicas × 2. For example, if the cluster uses the default of 3 replicas, the target TiKV cluster must have storage space larger than 6 times the size of the data source. The formula includes the × 2 factor because:

  • Indexes might take extra space.
  • RocksDB has a space amplification effect.
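As a sketch of this calculation against a MySQL source, you can estimate the required TiKV storage directly from information_schema. This assumes the default 3 replicas; the factors 3 and 2 are the replica count and the safety multiplier from the formula above, and FORMAT_BYTES requires MySQL 8.0 or later:

```sql
-- Rough estimate of required target TiKV storage:
-- source size x replicas (3) x safety factor (2)
SELECT
    FORMAT_BYTES(SUM(DATA_LENGTH + INDEX_LENGTH) * 3 * 2) AS 'Estimated TiKV Space'
FROM
    information_schema.tables;
```

Note that this sums every schema on the instance, including system schemas, so it slightly overestimates; filter on TABLE_SCHEMA if you only import specific databases.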

It is difficult to calculate the exact volume of data that Dumpling exports from MySQL. However, you can estimate it by summing the DATA_LENGTH and INDEX_LENGTH fields in the information_schema.tables table with the following SQL statements:

```sql
-- Calculate the size of all schemas
SELECT
    TABLE_SCHEMA,
    FORMAT_BYTES(SUM(DATA_LENGTH)) AS 'Data Size',
    FORMAT_BYTES(SUM(INDEX_LENGTH)) AS 'Index Size'
FROM
    information_schema.tables
GROUP BY
    TABLE_SCHEMA;

-- Calculate the 5 largest tables
SELECT
    TABLE_NAME,
    TABLE_SCHEMA,
    FORMAT_BYTES(SUM(DATA_LENGTH)) AS 'Data Size',
    FORMAT_BYTES(SUM(INDEX_LENGTH)) AS 'Index Size',
    FORMAT_BYTES(SUM(DATA_LENGTH + INDEX_LENGTH)) AS 'Total Size'
FROM
    information_schema.tables
GROUP BY
    TABLE_NAME,
    TABLE_SCHEMA
ORDER BY
    SUM(DATA_LENGTH + INDEX_LENGTH) DESC
LIMIT 5;
```