Serializable Transactions

In contrast to most databases, CockroachDB always uses SERIALIZABLE isolation, which is the strongest of the four transaction isolation levels defined by the SQL standard and is stronger than the SNAPSHOT isolation level developed later. SERIALIZABLE isolation guarantees that even though transactions may execute in parallel, the result is the same as if they had executed one at a time, without any concurrency. This ensures data correctness by preventing all "anomalies" allowed by weaker isolation levels.

In this tutorial, you'll work through a hypothetical scenario that demonstrates the importance of SERIALIZABLE isolation for data correctness.

  • You'll start by reviewing the scenario and its schema.
  • You'll then execute the scenario at one of the weaker isolation levels, READ COMMITTED, observing the write skew anomaly and its implications. Because CockroachDB always uses SERIALIZABLE isolation, you'll run this portion of the tutorial on Postgres, which defaults to READ COMMITTED.
  • You'll finish by executing the scenario at SERIALIZABLE isolation, observing how it guarantees correctness. You'll use CockroachDB for this portion.

Note:

For a deeper discussion of transaction isolation and the write skew anomaly, see the Real Transactions are Serializable and What Write Skew Looks Like blog posts.

Overview

Scenario

  • A hospital has an application for doctors to manage their on-call shifts.
  • The hospital has a rule that at least one doctor must be on call at any one time.
  • Two doctors are on call for a particular shift, and both of them try to request leave for the shift at approximately the same time.
  • In Postgres, with the default READ COMMITTED isolation level, the write skew anomaly results in both doctors successfully booking leave and the hospital having no doctors on call for that particular shift.
  • In CockroachDB, with the SERIALIZABLE isolation level, write skew is prevented: one doctor is allowed to book leave, the other is left on call, and lives are saved.

Write skew

When write skew happens, a transaction reads something, makes a decision based on the value it saw, and writes the decision to the database. However, by the time the write is made, the premise of the decision is no longer true. Only SERIALIZABLE and some implementations of REPEATABLE READ isolation prevent this anomaly.
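
The pattern can be illustrated without a database. The following is a minimal sketch in Python, not part of the tutorial's SQL steps: the dictionary stands in for the schedules rows for one day, and the function names (request_leave, run_interleaved, run_serially) are hypothetical. It contrasts two transactions that both read before either one writes with the same two transactions executed one at a time.

```python
# Toy model of write skew. This is not a real database, and every name
# here is hypothetical.

def request_leave(read_view, db, doctor):
    """Book leave for `doctor` if `read_view` shows another doctor on call.

    `read_view` is the state the transaction reads from; `db` is the
    state it writes to. Returns True if leave was booked.
    """
    others_on_call = sum(
        1 for d, on_call in read_view.items() if on_call and d != doctor
    )
    if others_on_call >= 1:
        db[doctor] = False
        return True
    return False


def run_interleaved():
    """Both transactions read before either one writes (write skew)."""
    db = {"Abe": True, "Betty": True}
    view_abe = dict(db)    # Abe's transaction reads the table
    view_betty = dict(db)  # Betty's transaction reads the table
    request_leave(view_abe, db, "Abe")
    request_leave(view_betty, db, "Betty")
    return db


def run_serially():
    """Each transaction runs to completion before the next one starts."""
    db = {"Abe": True, "Betty": True}
    request_leave(db, db, "Abe")
    request_leave(db, db, "Betty")
    return db


if __name__ == "__main__":
    print("interleaved:", run_interleaved())
    print("serial:     ", run_serially())
```

In the interleaved run, both doctors end up off call because each decision was based on a read that the other transaction's write invalidated. In the serial run, the second request sees the first one's write and is declined, which is the outcome SERIALIZABLE isolation guarantees.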

Schema

[Figure: schema for the serializable transaction tutorial]

Scenario on Postgres

Step 1. Start Postgres

  • If you haven't already, install Postgres locally. On Mac, you can use Homebrew:
    $ brew install postgres
    $ postgres -D /usr/local/var/postgres &

Step 2. Create the schema

  • Open a SQL connection to Postgres:
    $ psql
  • Create the doctors table:
    > CREATE TABLE doctors (
        id INT PRIMARY KEY,
        name TEXT
      );
  • Create the schedules table:
    > CREATE TABLE schedules (
        day DATE,
        doctor_id INT REFERENCES doctors (id),
        on_call BOOL,
        PRIMARY KEY (day, doctor_id)
      );

Step 3. Insert data

  • Add two doctors to the doctors table:
    > INSERT INTO doctors VALUES
        (1, 'Abe'),
        (2, 'Betty');
  • Insert one week's worth of data into the schedules table:
    > INSERT INTO schedules VALUES
        ('2018-10-01', 1, true),
        ('2018-10-01', 2, true),
        ('2018-10-02', 1, true),
        ('2018-10-02', 2, true),
        ('2018-10-03', 1, true),
        ('2018-10-03', 2, true),
        ('2018-10-04', 1, true),
        ('2018-10-04', 2, true),
        ('2018-10-05', 1, true),
        ('2018-10-05', 2, true),
        ('2018-10-06', 1, true),
        ('2018-10-06', 2, true),
        ('2018-10-07', 1, true),
        ('2018-10-07', 2, true);
  • Confirm that at least one doctor is on call each day of the week:
    > SELECT day, count(*) AS doctors_on_call FROM schedules
        WHERE on_call = true
        GROUP BY day
        ORDER BY day;

        day     | doctors_on_call
    ------------+-----------------
     2018-10-01 |               2
     2018-10-02 |               2
     2018-10-03 |               2
     2018-10-04 |               2
     2018-10-05 |               2
     2018-10-06 |               2
     2018-10-07 |               2
    (7 rows)

Step 4. Doctor 1 requests leave

Doctor 1, Abe, starts to request leave for 10/5/18 using the hospital's schedule management application.

  • The application starts a transaction:
    > BEGIN;
  • The application checks to make sure at least one other doctor is on call for the requested date:
    > SELECT count(*) FROM schedules
        WHERE on_call = true
        AND day = '2018-10-05'
        AND doctor_id != 1;

     count
    -------
         1
    (1 row)

Step 5. Doctor 2 requests leave

Around the same time, doctor 2, Betty, starts to request leave for the same day using the hospital's schedule management application.

  • In a new terminal, start a second SQL session:
    $ psql
  • The application starts a transaction:
    > BEGIN;
  • The application checks to make sure at least one other doctor is on call for the requested date:
    > SELECT count(*) FROM schedules
        WHERE on_call = true
        AND day = '2018-10-05'
        AND doctor_id != 2;

     count
    -------
         1
    (1 row)

Step 6. Leave is incorrectly booked for both doctors

  • In the terminal for doctor 1, since the previous check confirmed that another doctor is on call for 10/5/18, the application tries to update doctor 1's schedule:
    > UPDATE schedules SET on_call = false
        WHERE day = '2018-10-05'
        AND doctor_id = 1;
  • In the terminal for doctor 2, since the previous check confirmed the same thing, the application tries to update doctor 2's schedule:
    > UPDATE schedules SET on_call = false
        WHERE day = '2018-10-05'
        AND doctor_id = 2;
  • In the terminal for doctor 1, the application commits the transaction, despite the fact that the previous check (the SELECT query) is no longer true:
    > COMMIT;
  • In the terminal for doctor 2, the application commits the transaction, despite the fact that the previous check (the SELECT query) is no longer true:
    > COMMIT;

Step 7. Check data correctness

So what just happened? Each transaction started by reading a value that, before the end of the transaction, became incorrect. Despite that fact, each transaction was allowed to commit. This is known as write skew, and the result is that 0 doctors are scheduled to be on call on 10/5/18.

To check this, in either terminal, run:

    > SELECT * FROM schedules WHERE day = '2018-10-05';

        day     | doctor_id | on_call
    ------------+-----------+---------
     2018-10-05 |         1 | f
     2018-10-05 |         2 | f
    (2 rows)

Again, this anomaly is the result of Postgres' default isolation level of READ COMMITTED, but note that this would happen with any isolation level except SERIALIZABLE and some implementations of REPEATABLE READ:

    > SHOW TRANSACTION_ISOLATION;

     transaction_isolation
    -----------------------
     read committed
    (1 row)

Step 8. Stop Postgres

Exit each SQL shell with \q and then stop the Postgres server:

    $ pkill -9 postgres

Scenario on CockroachDB

When you repeat the scenario on CockroachDB, you'll see that the anomaly is prevented by CockroachDB's SERIALIZABLE transaction isolation.

Step 1. Start CockroachDB

  • If you haven't already, install CockroachDB locally.

  • Start a one-node CockroachDB cluster in insecure mode:

    $ cockroach start \
        --insecure \
        --store=serializable-demo \
        --listen-addr=localhost \
        --background

Step 2. Create the schema

  • Open the built-in SQL client:
    $ cockroach sql --insecure --host=localhost
  • Create the doctors table:
    > CREATE TABLE doctors (
        id INT PRIMARY KEY,
        name TEXT
      );
  • Create the schedules table:
    > CREATE TABLE schedules (
        day DATE,
        doctor_id INT REFERENCES doctors (id),
        on_call BOOL,
        PRIMARY KEY (day, doctor_id)
      );

Step 3. Insert data

  • Add two doctors to the doctors table:
    > INSERT INTO doctors VALUES
        (1, 'Abe'),
        (2, 'Betty');
  • Insert one week's worth of data into the schedules table:
    > INSERT INTO schedules VALUES
        ('2018-10-01', 1, true),
        ('2018-10-01', 2, true),
        ('2018-10-02', 1, true),
        ('2018-10-02', 2, true),
        ('2018-10-03', 1, true),
        ('2018-10-03', 2, true),
        ('2018-10-04', 1, true),
        ('2018-10-04', 2, true),
        ('2018-10-05', 1, true),
        ('2018-10-05', 2, true),
        ('2018-10-06', 1, true),
        ('2018-10-06', 2, true),
        ('2018-10-07', 1, true),
        ('2018-10-07', 2, true);
  • Confirm that at least one doctor is on call each day of the week:
    > SELECT day, count(*) AS on_call FROM schedules
        WHERE on_call = true
        GROUP BY day
        ORDER BY day;

                 day            | on_call
    +---------------------------+---------+
      2018-10-01 00:00:00+00:00 |       2
      2018-10-02 00:00:00+00:00 |       2
      2018-10-03 00:00:00+00:00 |       2
      2018-10-04 00:00:00+00:00 |       2
      2018-10-05 00:00:00+00:00 |       2
      2018-10-06 00:00:00+00:00 |       2
      2018-10-07 00:00:00+00:00 |       2
    (7 rows)

Step 4. Doctor 1 requests leave

Doctor 1, Abe, starts to request leave for 10/5/18 using the hospital's schedule management application.

  • The application starts a transaction:
    > BEGIN;
  • The application checks to make sure at least one other doctor is on call for the requested date:
    > SELECT count(*) FROM schedules
        WHERE on_call = true
        AND day = '2018-10-05'
        AND doctor_id != 1;

Press enter a second time to have the server return the result:

      count
    +-------+
          1
    (1 row)

Step 5. Doctor 2 requests leave

Around the same time, doctor 2, Betty, starts to request leave for the same day using the hospital's schedule management application.

  • In a new terminal, start a second SQL session:
    $ cockroach sql --insecure --host=localhost
  • The application starts a transaction:
    > BEGIN;
  • The application checks to make sure at least one other doctor is on call for the requested date:
    > SELECT count(*) FROM schedules
        WHERE on_call = true
        AND day = '2018-10-05'
        AND doctor_id != 2;

Press enter a second time to have the server return the result:

      count
    +-------+
          1
    (1 row)

Step 6. Leave is booked for only 1 doctor

  • In the terminal for doctor 1, since the previous check confirmed that another doctor is on call for 10/5/18, the application tries to update doctor 1's schedule:
    > UPDATE schedules SET on_call = false
        WHERE day = '2018-10-05'
        AND doctor_id = 1;
  • In the terminal for doctor 2, since the previous check confirmed the same thing, the application tries to update doctor 2's schedule:
    > UPDATE schedules SET on_call = false
        WHERE day = '2018-10-05'
        AND doctor_id = 2;
  • In the terminal for doctor 1, the application tries to commit the transaction:
    > COMMIT;

Since CockroachDB uses SERIALIZABLE isolation, the database detects that the previous check (the SELECT query) is no longer true due to a concurrent transaction. It therefore prevents the transaction from committing, returning a retry error that indicates that the transaction must be attempted again:

    pq: restart transaction: HandledRetryableTxnError: TransactionRetryError: retry txn (RETRY_SERIALIZABLE): "sql txn" id=57dd0454 key=/Table/53/1/17809/1/0 rw=true pri=0.00710012 iso=SERIALIZABLE stat=PENDING epo=0 ts=1539116499.676097000,2 orig=1539115078.961557000,0 max=1539115078.961557000,0 wto=false rop=false seq=4

Tip:

For this kind of error, CockroachDB recommends a client-side transaction retry loop that would transparently observe that the one doctor cannot take time off because the other doctor already succeeded in asking for it. You can find generic transaction retry functions for various languages in our Build an App tutorials.
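
As a rough illustration of such a retry loop, here is a minimal Python sketch. It is not taken from the Build an App tutorials. It assumes the driver raises an exception that carries the SQLSTATE code in a pgcode attribute (psycopg2 does this) and that retry errors use SQLSTATE 40001. The names run_with_retry, FakeSerializationFailure, and make_flaky are hypothetical, and the fake transaction exists only so the loop can be exercised without a database.

```python
import random
import time


def run_with_retry(txn_fn, max_retries=5, base_delay=0.01, sleep=time.sleep):
    """Call txn_fn() until it succeeds or the retries are exhausted.

    txn_fn is expected to run one whole transaction per call: BEGIN,
    the SELECT check, the UPDATE, and COMMIT. An exception counts as
    retryable only if its `pgcode` attribute is "40001" (an assumption
    about the driver in use).
    """
    for attempt in range(max_retries + 1):
        try:
            return txn_fn()
        except Exception as exc:
            retryable = getattr(exc, "pgcode", None) == "40001"
            if not retryable or attempt == max_retries:
                raise
            # Exponential backoff with jitter before the next attempt.
            sleep(base_delay * (2 ** attempt) * random.random())


class FakeSerializationFailure(Exception):
    """Stand-in for a driver error with SQLSTATE 40001."""
    pgcode = "40001"


def make_flaky(failures):
    """Return a fake transaction that fails `failures` times, then succeeds."""
    state = {"calls": 0}

    def txn():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise FakeSerializationFailure("restart transaction")
        return state["calls"]  # number of attempts it took

    return txn


if __name__ == "__main__":
    print(run_with_retry(make_flaky(2)))
```

Because the whole transaction is re-run on each attempt, the SELECT check is repeated against current data. In the scenario above, doctor 1's retried check would find no other doctor on call, so the application would decline the leave request instead of updating the schedule.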

  • In the terminal for doctor 2, the application tries to commit the transaction:
    > COMMIT;

Since the transaction for doctor 1 failed, the transaction for doctor 2 can commit without causing any data correctness problems.
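
To build intuition for why only one of the two commits can succeed, the following Python sketch models a database that validates a transaction's reads at commit time. It is a toy under stated assumptions and not CockroachDB's actual mechanism: ToyDB, ToyTxn, RetryError, and demo are hypothetical names, and rows are versioned with a simple counter. In this toy the first transaction to commit wins and the second is rejected, whereas in the session above the first COMMIT was the one rejected. Which transaction loses depends on the database's internals, but in both cases only one of them is allowed to commit.

```python
# Toy commit-time validation model. NOT CockroachDB's implementation;
# all names are hypothetical.

class RetryError(Exception):
    """Raised when a row the transaction read has changed before commit."""


class ToyDB:
    def __init__(self, rows):
        self.rows = dict(rows)                # key -> value
        self.versions = {k: 0 for k in rows}  # key -> number of commits


class ToyTxn:
    def __init__(self, db):
        self.db = db
        self.read_versions = {}  # key -> version observed when read
        self.writes = {}         # key -> value, buffered until commit

    def read(self, key):
        self.read_versions[key] = self.db.versions[key]
        return self.db.rows[key]

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # Reject the commit if anything this transaction read has changed.
        for key, seen in self.read_versions.items():
            if self.db.versions[key] != seen:
                raise RetryError(f"row {key!r} changed since it was read")
        for key, value in self.writes.items():
            self.db.rows[key] = value
            self.db.versions[key] += 1


def demo():
    db = ToyDB({1: True, 2: True})  # doctor_id -> on_call for 2018-10-05
    abe, betty = ToyTxn(db), ToyTxn(db)
    # Each transaction checks that the other doctor is on call...
    abe.read(2)
    betty.read(1)
    # ...and then takes its own doctor off call.
    abe.write(1, False)
    betty.write(2, False)
    betty.commit()  # nothing Betty read has changed, so this succeeds
    try:
        abe.commit()  # Betty's row changed after Abe read it
    except RetryError as err:
        return db.rows, str(err)
    return db.rows, None


if __name__ == "__main__":
    print(demo())
```

The final state of the toy leaves doctor 1 on call and doctor 2 off call, matching the result checked in the next step.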

Step 7. Check data correctness

In either terminal, confirm that one doctor is still on call for 10/5/18:

    > SELECT * FROM schedules WHERE day = '2018-10-05';

                 day            | doctor_id | on_call
    +---------------------------+-----------+---------+
      2018-10-05 00:00:00+00:00 |         1 |  true
      2018-10-05 00:00:00+00:00 |         2 |  false
    (2 rows)

Again, the write skew anomaly was prevented by CockroachDB using the SERIALIZABLE isolation level:

    > SHOW TRANSACTION_ISOLATION;

      transaction_isolation
    +-----------------------+
      serializable
    (1 row)

Step 8. Stop CockroachDB

Once you're done with your test cluster, exit each SQL shell with \q and then stop the node:

    $ cockroach quit --insecure --host=localhost

If you do not plan to restart the cluster, you may want to remove the node's data store:

    $ rm -rf serializable-demo
