taosdump

Introduction

taosdump is a tool for backing up data from a running TDengine cluster and restoring the backed-up data to the same or another running TDengine cluster.

taosdump can back up a database, a super table, or a normal table as a logical data unit, or back up the data records in databases, super tables, and normal tables. When using taosdump, you can specify the directory path for the backup. If you do not specify a directory, taosdump backs up the data to the current directory by default.

If the specified location already has data files, taosdump will prompt the user and exit immediately to avoid data overwriting. This means that the same path can only be used for one backup.

If you see this prompt, proceed carefully and follow your best practices and relevant SOPs for data integrity, backup, and data security.
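
Because a path can only be used for one backup, one way to avoid the prompt is to create a new, empty directory for every run. The following is a minimal Python sketch of that idea. The function name new_backup_dir, the label, and the timestamp naming scheme are illustrative assumptions and are not part of taosdump.

```python
from datetime import datetime
from pathlib import Path


def new_backup_dir(root: str, label: str = "taosdump") -> Path:
    """Create and return a fresh, empty directory under root for one backup run.

    The name and the timestamp layout are hypothetical conventions.
    """
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = Path(root) / f"{label}-{stamp}"
    # exist_ok=False raises FileExistsError instead of silently reusing
    # a directory that may already hold backup files.
    target.mkdir(parents=True, exist_ok=False)
    return target
```

The returned path can then be passed to taosdump as the output path (the -o parameter).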

taosdump is a logical backup tool. Do not use it to back up raw data, environment settings, hardware information, server configuration, or cluster topology. taosdump uses Apache Avro as the data file format for storing backup data.

Installation

There are two ways to install taosdump:

  • Install the official taosTools installer. Find taosTools on the Release History page, then download and install it.

  • Compile taos-tools separately and install it. Please refer to the taos-tools repository for details.

Common usage scenarios

taosdump backup data

  1. Back up all databases: specify the -A or --all-databases parameter.
  2. Back up multiple specified databases: use the -D db1,db2,... parameter.
  3. Back up some super tables or normal tables in a specified database: use the dbname stbname1 stbname2 tbname1 tbname2 ... parameters. The first parameter is the database name, and only one database is supported. The second and subsequent parameters are the names of super tables or normal tables in that database, separated by spaces.
  4. Back up the system log database: TDengine clusters usually contain a system database named log, which holds data generated by TDengine's own operation. taosdump does not back up the log database by default. To back it up, use the -a or --allow-sys command-line parameter.
  5. Loose mode backup: taosdump version 1.4.1 onwards provides the -n and -L parameters for backing up data without escape characters (-n) and in “loose” mode (-L). If table names, column names, and tag names do not use escape characters, these parameters can reduce the backup time and the size of the backup data. If you are unsure whether the conditions for -n and -L are met, use the default parameters for a “strict” mode backup. See the official documentation for a description of escape characters.
Tip
  • taosdump versions after 1.4.1 provide the -I argument for parsing the schema and data of an Avro file. If -s is also specified, taosdump parses only the schema.
  • In taosdump versions after 1.4.2, backups use the batch count specified by the -B parameter. The default value is 16384. If low network speed or poor disk performance causes “Error actual dump … batch …” in some environments, try setting the -B parameter to a smaller value.
  • The export of taosdump does not support resuming from an interruption. Therefore, if the taosdump process terminates unexpectedly, delete all related files that have been exported or generated.
  • The import of taosdump supports resuming from an interruption. When the process resumes, you will receive some “table already exists” messages, which can be ignored.
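
The backup scenarios above can be sketched as a small Python helper that assembles the argument list for each case. This is only an illustration: the function build_backup_cmd and its parameter names are hypothetical, and it assumes the -o, -A, -D, -a, and -B options can be combined on one command line as the parameter list suggests. It builds the command but does not run it.

```python
from typing import List, Optional, Sequence


def build_backup_cmd(
    outpath: str,
    databases: Optional[Sequence[str]] = None,
    tables: Optional[Sequence[str]] = None,
    all_databases: bool = False,
    allow_sys: bool = False,
    batch: Optional[int] = None,
) -> List[str]:
    """Assemble a taosdump backup command line (hypothetical helper)."""
    cmd = ["taosdump", "-o", outpath]
    if allow_sys:
        cmd.append("-a")  # also dump the system database
    if batch is not None:
        cmd += ["-B", str(batch)]  # smaller values may help on slow links or disks
    if all_databases:
        if databases or tables:
            raise ValueError("-A cannot be combined with database or table names")
        cmd.append("-A")
    elif tables:
        # Table backup: exactly one database name, followed by table names.
        if not databases or len(databases) != 1:
            raise ValueError("table backup requires exactly one database")
        cmd += [databases[0], *tables]
    elif databases:
        cmd += ["-D", ",".join(databases)]
    else:
        raise ValueError("nothing selected for backup")
    return cmd
```

For example, build_backup_cmd("/backup/run1", databases=["db1", "db2"]) yields the arguments for a -D db1,db2 backup, which can be passed to subprocess.run.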

taosdump recover data

Restore the data files in a specified path: use the -i parameter followed by the path to the data files. Do not use the same directory to back up different data sets, and do not back up the same data set multiple times to the same path. Otherwise, the backup data may be overwritten or duplicated.
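
As a minimal sketch, a restore command can be assembled in Python after a quick sanity check on the input path. The helper build_restore_cmd is hypothetical, and it assumes the path given to -i is the directory that an earlier backup wrote to.

```python
from pathlib import Path
from typing import List


def build_restore_cmd(inpath: str) -> List[str]:
    """Assemble a taosdump restore command line (hypothetical helper)."""
    path = Path(inpath)
    # Assumption: the backup lives in a directory; refuse an empty or missing one.
    if not path.is_dir() or not any(path.iterdir()):
        raise FileNotFoundError(f"no backup files found in {inpath}")
    return ["taosdump", "-i", str(path)]
```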

Tip

taosdump internally uses the TDengine stmt binding API to write restored data, with a default batch size of 16384 for better restore performance. If the backup data has many columns, this may cause a “WAL size exceeds limit” error. In that case, try setting a smaller batch size with the -B parameter.
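
One way to apply this advice is to retry a restore with a progressively smaller -B value. The sketch below is hedged accordingly: the function run_with_shrinking_batch, the halving strategy, and the floor of 256 are illustrative choices, and it assumes taosdump exits with a non-zero status when the restore fails. It retries on any failure, not only on the WAL error, so check the output before relying on it.

```python
import subprocess
from typing import Callable, List, Optional, Sequence

DEFAULT_BATCH = 16384  # default -B value stated in the documentation


def run_with_shrinking_batch(
    base_cmd: Sequence[str],
    run: Optional[Callable[[List[str]], int]] = None,
    start: int = DEFAULT_BATCH,
    floor: int = 256,  # hypothetical lower bound, not a taosdump limit
) -> int:
    """Run base_cmd with -B, halving the batch size after each failure.

    Returns the batch size that succeeded. Assumes a non-zero exit
    status means the run failed.
    """
    if run is None:
        run = lambda cmd: subprocess.run(cmd).returncode
    batch = start
    while batch >= floor:
        if run([*base_cmd, "-B", str(batch)]) == 0:
            return batch
        batch //= 2
    raise RuntimeError(f"restore still failing at batch size {floor}")
```

A call such as run_with_shrinking_batch(["taosdump", "-i", "/backup/run1"]) would try 16384 first, then 8192, and so on. This pattern suits restores, which can resume after an interruption. An interrupted backup requires deleting the exported files before trying again.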

Detailed command-line parameter list

The following is a detailed list of taosdump command-line arguments.

    Usage: taosdump [OPTION...] dbname [tbname ...]
      or:  taosdump [OPTION...] --databases db1,db2,...
      or:  taosdump [OPTION...] --all-databases
      or:  taosdump [OPTION...] -i inpath
      or:  taosdump [OPTION...] -o outpath

      -h, --host=HOST              Server host from which to dump data. Default is
                                   localhost.
      -p, --password               User password to connect to server. Default is
                                   taosdata.
      -P, --port=PORT              Port to connect
      -u, --user=USER              User name used to connect to server. Default is
                                   root.
      -c, --config-dir=CONFIG_DIR  Configure directory. Default is /etc/taos
      -i, --inpath=INPATH          Input file path.
      -o, --outpath=OUTPATH        Output file path.
      -r, --resultFile=RESULTFILE  DumpOut/In Result file path and name.
      -a, --allow-sys              Allow to dump system database
      -A, --all-databases          Dump all databases.
      -D, --databases=DATABASES    Dump listed databases. Use comma to separate
                                   database names.
      -N, --without-property       Dump database without its properties.
      -s, --schemaonly             Only dump table schemas.
      -y, --answer-yes             Input yes for prompt. It will skip data file
                                   checking!
      -d, --avro-codec=snappy      Choose an avro codec among null, deflate, snappy,
                                   and lzma.
      -S, --start-time=START_TIME  Start time to dump. Either epoch or
                                   ISO8601/RFC3339 format is acceptable. ISO8601
                                   format example: 2017-10-01T00:00:00.000+0800 or
                                   2017-10-0100:00:00:000+0800 or '2017-10-01
                                   00:00:00.000+0800'
      -E, --end-time=END_TIME      End time to dump. Either epoch or ISO8601/RFC3339
                                   format is acceptable. ISO8601 format example:
                                   2017-10-01T00:00:00.000+0800 or
                                   2017-10-0100:00:00.000+0800 or '2017-10-01
                                   00:00:00.000+0800'
      -B, --data-batch=DATA_BATCH  Number of data per query/insert statement when
                                   backup/restore. Default value is 16384. If you see
                                   'error actual dump .. batch ..' when backup or if
                                   you see 'WAL size exceeds limit' error when
                                   restore, please adjust the value to a smaller one
                                   and try. The workable value is related to the
                                   length of the row and type of table schema.
      -I, --inspect                inspect avro file content and print on screen
      -L, --loose-mode             Use loose mode if the table name and column name
                                   use letter and number only. Default is NOT.
      -n, --no-escape              No escape char '`'. Default is using it.
      -T, --thread-num=THREAD_NUM  Number of thread for dump in file. Default is
                                   8.
      -C, --cloud=CLOUD_DSN        specify a DSN to access TDengine cloud service
      -R, --restful                Use RESTful interface to connect TDengine
      -t, --timeout=SECONDS        The timeout seconds for websocket to interact.
      -g, --debug                  Print debug info.
      -?, --help                   Give this help list
          --usage                  Give a short usage message
      -V, --version                Print program version

    Mandatory or optional arguments to long options are also mandatory or optional
    for any corresponding short options.
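
The -S and -E parameters accept timestamps in the ISO8601 form shown in the list above, for example 2017-10-01T00:00:00.000+0800. As a small sketch, a timezone-aware Python datetime can be rendered in that form as follows. The helper name to_taosdump_time is an assumption for illustration only.

```python
from datetime import datetime, timedelta, timezone


def to_taosdump_time(dt: datetime) -> str:
    """Format dt like 2017-10-01T00:00:00.000+0800 (hypothetical helper)."""
    if dt.tzinfo is None:
        raise ValueError("a timezone-aware datetime is required")
    millis = dt.microsecond // 1000  # the example format uses milliseconds
    return f"{dt.strftime('%Y-%m-%dT%H:%M:%S')}.{millis:03d}{dt.strftime('%z')}"


# Example: midnight on 2017-10-01 in UTC+8, usable as a -S or -E value.
start = to_taosdump_time(datetime(2017, 10, 1, tzinfo=timezone(timedelta(hours=8))))
```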