Managed, Online Schema Changes
<< Unmanaged Schema Changes Online DDL strategies >>
Note: this feature is EXPERIMENTAL.
Vitess offers managed, online schema migrations (aka Online DDL), transparently to the user. Vitess Onine DDL offers:
- Non-blocking migrations
- Migrations are asyncronously auto-scheduled, queued and executed by tablets
- Migration state is trackable
- Migrations are cancellable
- Migrations are retry-able
- Lossless, revertible migrations
- Support for declarative migrations
As general overview:
- User chooses a strategy for online DDL (online DDL is opt in)
- User submits one or more schema change queries, using the standard MySQL
CREATE TABLE
,ALTER TABLE
andDROP TABLE
syntax. - Vitess responds with a Job ID for each schema change query.
- Vitess resolves affected shards.
- A shard’s
primary
tablet schedules the migration to run when possible. - Tablets will independently run schema migrations:
ALTER TABLE
statements run viaVReplication
,gh-ost
orpt-online-schema-change
, as per selected strategyCREATE TABLE
statements run directly.DROP TABLE
statements run safely and lazily.
- Vitess provides the user a mechanism to view migration status, cancel or retry migrations, based on the job ID.
Syntax
The standard MySQL syntax for CREATE
, ALTER
and DROP
is supported.
ALTER TABLE
Use the standard MySQL ALTER TABLE
syntax to run online DDL. Whether your schema migration runs synchronously (the default MySQL behavior) or asynchronously (aka online), is determined by ddl_strategy
.
We assume we have a keyspace (schema) called commerce
, with a table called demo
, that has the following definition:
CREATE TABLE `demo` (
`id` int NOT NULL,
`status` varchar(32) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB
Consider the following schema migration statement:
ALTER TABLE demo MODIFY id bigint UNSIGNED;
This statement can be executed as:
- a
VReplication
, managed online migration - a
gh-ost
, managed online migration - a
pt-online-schema-change
, managed online migration - a synchronous, unmanaged schema change
See DDL Strategies for discussion around the different options.
CREATE TABLE
Use the standard MySQL CREATE TABLE
syntax. The query goes through the same migration flow as ALTER TABLE
does. The tablets eventually run the query directly on the MySQL backends.
DROP TABLE
Use the standard MySQL DROP TABLE
syntax. The query goes through the same migration flow as ALTER TABLE
does. Tables are not immediately dropped. Instead, they are renamed into special names, recognizable by the table lifecycle mechanism, to then slowly and safely transition through lifecycle states into finally getting dropped.
Statement transformations
Vitess may modify your queries to qualify for online DDL statement. Modifications include:
- A multi-table
DROP
statement is replaced by multipleDROP
statements, each operating on a single table (and each tracked by its own job ID). - A
CREATE INDEX
statement is replaced by the equivalentALTER TABLE
statement.
ddl_strategy
You will set either @@ddl_strategy
session variable, or -ddl_strategy
command line flag, to control your schema migration strategy, and specifically, to enable and configure online DDL. Details in DDL Strategies. A quick overview:
- The value
"direct"
, means not an online DDL. The empty value (""
) is also interpreted asdirect
. A query is immediately pushed and applied on backend servers. This is the default strategy. - The value
"online"
instructs Vitess to run anALTER TABLE
online DDL viaVReplication
. - The value
"gh-ost"
instructs Vitess to run anALTER TABLE
online DDL viagh-ost
. - The value
"pt-osc"
instructs Vitess to run anALTER TABLE
online DDL viapt-online-schema-change
. - You may specify arguments for your tool of choice, e.g.
"gh-ost --max-load Threads_running=200"
. Details follow.
CREATE
and DROP
statements run in the same way for "online"
, "gh-ost"
and "pt-osc"
strategies, and we consider them all to be online.
Running, tracking and controlling Online DDL
Vitess provides two interfaces to interacting with Online DDL:
- SQL commands, via
VTGate
- Command line interface, via
vtctl
Supported interactions are:
- Running migrations (submitting Online DDL requests)
- Tracking migrations
- Cancelling a migration
- Cancelling all pending migrations
- Retrying a migration
- Reverting a migration
See Audit and Control for a detailed breakdown. As quick examples:
Executing an Online DDL via VTGate/SQL
mysql> set @@ddl_strategy='online';
mysql> alter table corder add column ts timestamp not null default current_timestamp;
+--------------------------------------+
| uuid |
+--------------------------------------+
| bf4598ab_8d55_11eb_815f_f875a4d24e90 |
+--------------------------------------+
mysql> drop table customer;
+--------------------------------------+
| uuid |
+--------------------------------------+
| 6848c1a4_8d57_11eb_815f_f875a4d24e90 |
+--------------------------------------+
Executing an Online DDL via vtctl/ApplySchema
$ vtctlclient ApplySchema -ddl_strategy "online" -sql "ALTER TABLE demo MODIFY id bigint UNSIGNED" commerce
a2994c92_f1d4_11ea_afa3_f875a4d24e90
Migration flow and states
We highlight how Vitess manages migrations internally, and explain what states a migration goes through.
- Whether via
vtctlclient ApplySchema
or viaVTGate
as described above, a migration request entry is persisted in globaltopo
(e.g. the globaletcd
cluster). vtctld
periodically checks on new migration requests.vtctld
resolves the relevant shards, and theprimary
tablet for each shard.vtctld
pushes the request to all relevantprimary
tablets.- If not all shards confirm receipt,
vtctld
periodically keeps retrying pushing the request to the shards until all approve. - Internally, tablets persist the request in a designated table in the
_vt
schema. Do not manipulate that table directly as that can cause inconsistencies. - A shard’s
primary
tablet owns running the migration. It is independent of other shards. It will schedule the migration to run when possible. A tablet will not run two migrations at the same time. - A migration is first created in
queued
state. - If the tablet sees queued migration, and assuming there’s no reason to wait, it picks the oldest requested migration in
queued
state, and moves it toready
state. - For
ALTER TABLE
statements:- Tablet prepares for the migration. It creates a MySQL account with a random password, to be used by this migration only. It creates the command line invocation, and extra scripts if possible.
- The tablet then runs the migration. Whether
gh-ost
orpt-online-schema-change
, it first runs in dry run mode, and, if successful, in actual execute mode. The migration is then inrunning
state. - The migration will either run to completion, fail, or be interrupted. If successful, it transitions into
complete
state, which is the end of the road for that migration. If failed or interrupted, it transitions tofailed
state. The user may choose to retry failed migrations (see below). - The user is able to cancel a migration (details below). If the migration hasn’t started yet, it transitions to
cancelled
state. If the migration isrunning
, then it is interrupted, and is expected to transition intofailed
state.
- For
CREATE TABLE
statements:- Tablet runs the statement directly on the MySQL backend
- For
DROP TABLE
statements:- A multi-table
DROP TABLE
statement is converted to multiple single-tableDROP TABLE
statements - Each
DROP TABLE
is internally replaced with aRENAME TABLE
- Table is renamed using a special naming convention, identified by the Table Lifecycle mechanism
- As result, a single
DROP TABLE
statement may generate multiple distinct migrations with distinct migration UUIDs.
- A multi-table
By way of illustration, suppose a migration is now in running
state, and is expected to keep on running for the next few hours. The user may initiate a new ALTER TABLE...
statement. It will persist in global topo
. vtctld
will pick it up and advertise it to the relevant tablets. Each will persist the migration request in queued
state. None will run the migration yet, since another migration is already in progress. In due time, and when the executing migration completes (whether successfully or not), and assuming no other migrations are queued
, the primary
tablets, each in its own good time, will execute the new migration.
At this time, the user is responsible to track the state of all migrations. VTTablet does not report back to vtctld
. This may change in the future.
At this time, there are no automated retries. For example, a failover on a shard causes the migration to fail, and Vitess will not try to re-run the migration on the new primary
. It is the user’s responsibility to issue a retry
. This may change in the future.
Configuration
-retain_online_ddl_tables
: (vttablet
) determines how long vttablet should keep an old migrated table before purging it. Type: duration. Default: 24 hours.Example:
vttablet -retain_online_ddl_tables 48h
-migration_check_interval
: (vttablet
) interval between checks for submitted migrations. Type: duration. Default: 1 hour.Example:
vttablet -migration_check_interval 30s
-enable_online_ddl
: (vtgate
) whether Online DDL operations are at all possible throughVTGate
. Type: boolean. Default:true
Example:
vtgate -enable_online_ddl=false
to disable access to Online DDL viaVTGate
.
Auto resume after failure
VReplication based migrations (ddl_strategy="online"
) are resumable across failure and across primary failovers.
Auto retry after failure
Neither gh-ost
and pt-osc
are able to resume from point of failure, or after a failover. However, Vitess management can issue an automated retry (starting the migration afresh).
- which
vttablet
initiated the migration - how many times a migration has been retried
- whether a migration failed due to a
vttablet
failure (as is the case in a failover scenario)
Vitess will auto-retry a failed migration when:
- The migration failed due to a
vttablet
failure, and - it has not been retried (this is a temporary restriction)
The migration will be transitioned into queued
state, as if the user requested a retry
operation. Note that this takes place on a per-shard basis.
The primary use case is a primary failure and failover. The newly promoted tablet will be able to retry the migration that broke during the previous primary failure. To clarify, the migration will start anew, as at this time there is no mechanism to resume a broken migration.
Throttling
All three strategies: online
, gh-ost
and pt-osc
utilize the tablet throttler, which is a cooperative throttler service based on replication lag. The tablet throttler automatically detects topology REPLICA
tablets and adapts to changes in the topology. See Tablet throttler.
online
strategy uses the throttler by the fact VReplication natively uses the throttler on both source and target ends (for both reads and writes)gh-ost
uses the throttler via--throttle-http
, which is automatically provided by Vitesspt-osc
uses the throttler by replication lag plugin, automatically injected by Vitess.
NOTE that at this time (and subject to change) the tablet throttler is disabled by default. Enable it with vttablet
’s -enable-lag-throttler
flag. If the tablet throttler is disabled, schema migrations will not throttle on replication lag.
Table cleanup
All ALTER
strategies leave artifacts behind. Whether successful or failed, either the original table or the ghost table is left still populated at the end of the migration. Vitess explicitly makes sure the tables are not dropped at the end of the migration. This is for two reasons:
- Make the table/data still available for a while, and
- in MySQL pre
8.0.23
, aDROP TABLE
operation can be dangerous in production as it commonly locks the buffer pool for a substantial period.
The tables are kept for 24 hours after migration completion. Vitess automatically cleans up those tables as soon as a migration completes (either successful or failed). You will normally not need to do anything.
Artifact tables are identifiable via SELECT artifacts FROM _vt.schema_migrations
in a VExec
command, see below. You should generally not touch these tables. It’s possible to DROP
those tables with direct
DDL strategy. Note that dropping tables in production can be risky and lock down your database for a substantial period of time.