Exporting data from Vitess
Since VTGate supports the MySQL protocol, in many cases it is possible to use existing client utilities when connecting to Vitess. This includes using logical dump tools such as mysqldump
, in certain cases.
This guide provides instructions on the required options when using these tools against a VTGate server for the purposes of exporting data from Vitess. It is recommended to follow the Backup and Restore guide for regular backups, since this method is performed directly on the tablet servers and is more efficient and safer for databases of any significant size. The dump methods that follow are typically not suitable for production backups, because Vitess does not implement all the locking constructs across a sharded database that are necessary to do a consistent logical backup while writing to the database. As a result, you will only be guaranteed to get a 100% consistent dump using these tools if you are sure that you are not writing to the database while running the dump.
mysqldump
The default invocation of mysqldump
attempts to execute statements which are not supported by Vitess, such as attempting to lock tables and dump GTID coordinates. The following options are required when using the mysqldump
binary from MySQL 5.7 to export data from the commerce
keyspace:
--lock-tables=off
: VTGate currently prohibits the syntaxLOCK TABLES
andUNLOCK TABLES
.--set-gtid-purged=OFF
:mysqldump
attemps to dump GTID coordinates of a server, but in the case of VTGate this does not make sense since it could be routing to multiple servers.--no-tablespaces
: This option disables dumping InnoDB tables by tablespace. This functionality is not yet supported by Vitess.....
: Additional mysqldump options like:-u <user>
,-p <password>
,-h <database server hostname>
.
For example to export the commerce
keyspace using the mysqldump
binary from MySQL 5.7:
$ mysqldump --set-gtid-purged=OFF --no-tablespaces .... commerce > commerce.sql
Vitess’ support for LOCK and UNLOCK statements is currently syntax-only. As a result, Vitess will simply ignore LOCK and UNLOCK statements without taking any underlying action. It is therefore unsafe to perform a locking mysqldump against a database that is actively being written to, and you should pause writes completely while performing the dump; or be willing to deal with any data inconsistencies that result.
NOTE: You will be limited by the Vitess row limits in the size of the tables that you can dump using this method. The default Vitess row limit is determined by the vttablet option -queryserver-config-max-result-size
and defaults to 10000 rows. So for an unsharded database, you will not be able to dump tables with more than 10000 rows, or N x 10000 rows if the table is fully sharded across N shards. Note that you should not blindly raise your row limits just because of this, it is an important Vitess operability and reliability feature. If you have large tables to dump, look into using go-mydumper instead.
To restore dump files created by mysqldump
, replay it against a Vitess server or other MySQL server using the mysql
command line client.
go-mydumper
Alternatively, you can use a slight modification of the go-mydumper
tool to export logical dumps of a Vitess keyspace. go-mydumper
has the advantage of being multi-threaded, and so can run faster on a database that has many tables. For a database with just one or a handful of large tables, go-mydumper
may not be that much faster than mysqldump
.
For information on the Vitess-compatible fork of go-mydumper
, see https://github.com/aquarapid/go-mydumper . Examples and instructions are available in the README.md in that repo. You will need to be able to compile golang binaries to use this tool.
go-mydumper
creates multiple files for each backup. To restore a backup, you can use the mysql
commandline client, but using the myloader
tool as described in the go-mydumper
repo above is easier and can be faster, since the loader is also multithreaded.