Reparo User Guide

Reparo is a TiDB Binlog tool used to recover incremental data. To back up incremental data, use Drainer of TiDB Binlog to output the binlog data to files in the protobuf format. To restore incremental data, use Reparo to parse the binlog data in those files and apply it to TiDB or MySQL.

The Reparo installation package (reparo) is included in the TiDB Toolkit. To download the TiDB Toolkit, see Download TiDB Tools.

Reparo usage

Description of command line parameters

    Usage of Reparo:
    -L string
        The level of the output information of logs
        Value: "debug"/"info"/"warn"/"error"/"fatal" ("info" by default)
    -V Prints the version.
    -c int
        The concurrency level used to write to the downstream during replication (`16` by default). A higher value generally gives better replication throughput.
    -config string
        The path of the configuration file
        If the configuration file is specified, Reparo reads the configuration data in this file.
        If the same configuration item is also set in the command line parameters, the command line value overrides the value in the configuration file.
    -data-dir string
        The storage directory for the binlog file in the protobuf format that Drainer outputs ("data.drainer" by default)
    -dest-type string
        The downstream service type
        Value: "print"/"mysql" ("print" by default)
        If it is set to "print", the data is parsed and printed to standard output while the SQL statement is not executed.
        If it is set to "mysql", you need to configure the "host", "port", "user" and "password" information in the configuration file.
    -log-file string
        The path of the log file
    -log-rotate string
        How often log files are rotated
        Value: "hour"/"day"
    -start-datetime string
        Specifies the time point for starting recovery.
        Format: "2006-01-02 15:04:05"
        If it is not set, the recovery process starts from the earliest binlog file.
    -stop-datetime string
        Specifies the time point for finishing recovery.
        Format: "2006-01-02 15:04:05"
        If it is not set, the recovery process ends with the last binlog file.
    -safe-mode bool
        Specifies whether to enable safe mode. When enabled, the same binlog data can be replicated repeatedly.
    -txn-batch int
        The number of SQL statements in a transaction that is output to the downstream database (`20` by default).
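
The configuration file comments in the next section describe safe mode as splitting each `UPDATE` statement into a `DELETE` statement plus a `REPLACE` statement. The following is a minimal sketch of why that makes repeated replication harmless. It is not Reparo's own code: the `split_update` helper is hypothetical, table and column names are assumed to be trusted simple identifiers, and an in-memory SQLite database stands in for the MySQL/TiDB downstream.

```python
import sqlite3


def split_update(table, old_row, new_row, key_columns):
    """Express one UPDATE as a DELETE by key plus a REPLACE of the new row.

    Hypothetical helper for illustration only; Reparo's internal statement
    generation is not shown in this guide.
    """
    where = " AND ".join(f"{col} = ?" for col in key_columns)
    delete = (
        f"DELETE FROM {table} WHERE {where}",
        [old_row[col] for col in key_columns],
    )
    columns = list(new_row)
    column_list = ", ".join(columns)
    placeholders = ", ".join("?" for _ in columns)
    replace = (
        f"REPLACE INTO {table} ({column_list}) VALUES ({placeholders})",
        [new_row[col] for col in columns],
    )
    return [delete, replace]


# SQLite is only a stand-in for the downstream database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT)")
conn.execute("INSERT INTO t VALUES (1, 'a')")

statements = split_update("t", {"id": 1, "v": "a"}, {"id": 1, "v": "b"}, ["id"])

# Apply the same change twice, as a repeated replication would.
for _ in range(2):
    for sql, params in statements:
        conn.execute(sql, params)

rows = conn.execute("SELECT id, v FROM t").fetchall()
```

After both passes the table holds the single updated row, because the `DELETE` removes whatever version of the row is present and the `REPLACE` writes the new values regardless of prior state.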

Description of the configuration file

    # The storage directory for the binlog file in the protobuf format that Drainer outputs
    data-dir = "./data.drainer"

    # The level of the output information of logs
    # Value: "debug"/"info"/"warn"/"error"/"fatal" ("info" by default)
    log-level = "info"

    # Uses `start-datetime` and `stop-datetime` to specify the time range in which
    # the binlog files are to be recovered.
    # Format: "2006-01-02 15:04:05"
    # start-datetime = ""
    # stop-datetime = ""

    # `start-tso` and `stop-tso` correspond to `start-datetime` and `stop-datetime` respectively.
    # They are used to specify the time range in which the binlog files are to be recovered.
    # If `start-datetime` and `stop-datetime` are set, there is no need to set `start-tso` and `stop-tso`.
    # When you perform a full recovery or resume an incremental recovery, set `start-tso` to tso + 1 or stop-tso + 1, respectively.
    # start-tso = 0
    # stop-tso = 0

    # The downstream service type
    # Value: "print"/"mysql" ("print" by default)
    # If it is set to "print", the data is parsed and printed to standard output
    # while the SQL statement is not executed.
    # If it is set to "mysql", you need to configure `host`, `port`, `user` and `password` in [dest-db].
    dest-type = "mysql"

    # The number of SQL statements in a transaction that is output to the downstream database (`20` by default).
    txn-batch = 20

    # The concurrency level used to write to the downstream during replication (`16` by default).
    # A higher value generally gives better replication throughput.
    worker-count = 16

    # Safe-mode configuration
    # Value: true/false (false by default)
    # If it is set to true, Reparo splits the `UPDATE` statement into a `DELETE` statement plus a `REPLACE` statement.
    safe-mode = false

    # `replicate-do-db` and `replicate-do-table` specify the database and table to be recovered.
    # `replicate-do-db` has priority over `replicate-do-table`.
    # You can use a regular expression for configuration. The regular expression should start with "~".
    # The configuration method for `replicate-do-db` and `replicate-do-table` is
    # the same as that for `replicate-do-db` and `replicate-do-table` of Drainer.
    # replicate-do-db = ["~^b.*","s1"]
    # [[replicate-do-table]]
    # db-name = "test"
    # tbl-name = "log"
    # [[replicate-do-table]]
    # db-name = "test"
    # tbl-name = "~^a.*"

    # If `dest-type` is set to `mysql`, `dest-db` needs to be configured.
    [dest-db]
    host = "127.0.0.1"
    port = 3309
    user = "root"
    password = ""
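
The configuration comments above say that `start-tso`/`stop-tso` and `start-datetime`/`stop-datetime` describe the same time range in two formats. As a rough illustration of how the two relate, the sketch below relies on the TiDB TSO layout, in which the high bits hold a physical timestamp in milliseconds and the low 18 bits hold a logical counter. The function names are hypothetical, and treating the datetime string as UTC is an assumption made to keep the example deterministic; Reparo may interpret the string in the local time zone of the machine it runs on.

```python
from datetime import datetime, timedelta, timezone

DATETIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # same shape as "2006-01-02 15:04:05"
LOGICAL_BITS = 18  # low bits of a TSO carry a logical counter
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def datetime_to_tso(text):
    """Convert a datetime string (assumed UTC) to the smallest TSO at that instant."""
    moment = datetime.strptime(text, DATETIME_FORMAT).replace(tzinfo=timezone.utc)
    physical_ms = (moment - EPOCH) // timedelta(milliseconds=1)
    return physical_ms << LOGICAL_BITS


def tso_to_datetime(tso):
    """Convert a TSO back to a datetime string (UTC), dropping the logical part."""
    physical_ms = tso >> LOGICAL_BITS
    return (EPOCH + timedelta(milliseconds=physical_ms)).strftime(DATETIME_FORMAT)


start_tso = datetime_to_tso("2019-08-01 12:00:00")
round_trip = tso_to_datetime(start_tso)
```

A value produced this way could be placed in `start-tso` or `stop-tso` instead of setting the datetime options. The `+ 1` adjustment mentioned in the comments is applied to the integer TSO itself.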

Start example

    ./reparo -config reparo.toml
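
When Reparo is launched from a script, it can help to check the arguments before starting it. The sketch below assembles a command line using only the flags documented in this guide and loosely checks the datetime values. The format string "2006-01-02 15:04:05" is Go's reference layout, meaning year-month-day hour:minute:second. The `reparo_args` helper is hypothetical and not part of Reparo.

```python
from datetime import datetime

# strptime pattern with the same shape as the "2006-01-02 15:04:05" layout.
REPARO_DATETIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def reparo_args(config=None, dest_type=None, start=None, stop=None):
    """Build an argument list for ./reparo (hypothetical helper)."""
    args = ["./reparo"]
    if config is not None:
        args += ["-config", config]
    if dest_type is not None:
        if dest_type not in ("print", "mysql"):
            raise ValueError(f"unsupported dest-type: {dest_type!r}")
        args += ["-dest-type", dest_type]
    for flag, value in (("-start-datetime", start), ("-stop-datetime", stop)):
        if value is not None:
            # Raises ValueError if the string does not match the expected shape.
            datetime.strptime(value, REPARO_DATETIME_FORMAT)
            args += [flag, value]
    return args


args = reparo_args(
    config="reparo.toml",
    dest_type="print",
    start="2019-08-01 00:00:00",
)
```

The resulting list can be passed to `subprocess.run(args, check=True)`. Starting with `dest-type` set to "print" is a low-risk way to inspect the parsed binlog data before applying it to a database.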

Note

  • data-dir specifies the directory for the binlog file that Drainer outputs.

  • Both start-datetime and start-tso specify the time point at which recovery starts; they differ only in the time format. If neither is set, the recovery process starts from the earliest binlog file by default.

  • Both stop-datetime and stop-tso specify the time point at which recovery finishes; they differ only in the time format. If neither is set, the recovery process ends with the last binlog file by default.

  • dest-type specifies the destination type. Its value can be “mysql” or “print”.

    • When it is set to mysql, the data is recovered to MySQL, TiDB, or another database that is compatible with the MySQL protocol. In this case, you need to specify the database information in [dest-db] of the configuration file.
    • When it is set to print, only the binlog information is printed. It is generally used for debugging and checking the binlog information. In this case, there is no need to specify [dest-db].
  • replicate-do-db specifies the databases to recover. If it is not set, all databases are recovered.

  • replicate-do-table specifies the tables to recover. If it is not set, all tables are recovered.
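
The filtering rules for replicate-do-db and replicate-do-table can be sketched as follows. This is one reading of the rules stated in this guide (a leading "~" marks a regular expression, and replicate-do-db has priority over replicate-do-table), not Reparo's actual implementation. The function names are hypothetical, Python's `re` module stands in for the regular expression engine Reparo uses, and details such as case sensitivity are assumptions.

```python
import re


def name_matches(pattern, name):
    """A pattern starting with "~" is a regular expression; otherwise an exact name."""
    if pattern.startswith("~"):
        return re.search(pattern[1:], name) is not None
    return pattern == name


def should_recover(db, table, do_dbs, do_tables):
    """Decide whether db.table passes the filters (illustrative reading of the rules)."""
    if not do_dbs and not do_tables:
        return True  # no filter set: everything is recovered
    if any(name_matches(pattern, db) for pattern in do_dbs):
        return True  # replicate-do-db takes priority
    return any(
        name_matches(db_pattern, db) and name_matches(tbl_pattern, table)
        for db_pattern, tbl_pattern in do_tables
    )


# Values taken from the commented-out example in the configuration file.
do_dbs = ["~^b.*", "s1"]
do_tables = [("test", "log"), ("test", "~^a.*")]

decisions = {
    "blog.posts": should_recover("blog", "posts", do_dbs, do_tables),
    "s1.anything": should_recover("s1", "anything", do_dbs, do_tables),
    "test.log": should_recover("test", "log", do_dbs, do_tables),
    "test.audit": should_recover("test", "audit", do_dbs, do_tables),
    "test.users": should_recover("test", "users", do_dbs, do_tables),
}
```

With these example filters, every table in a database whose name starts with "b" or equals "s1" passes, while in the "test" database only "log" and tables whose names start with "a" pass.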