How Pony avoids lost updates
Lower isolation levels increase the ability of many users to access data at the same time, but it also can lead to database anomalies such as lost updates.
Let’s consider an example. Say we have two accounts. We need to provide a function which can transfer money from one account to another. During the transfer we check if the account has enough funds.
Let’s say we are using Django ORM for this task. Below if one of the possible ways of implementing such a function:
@transaction.atomic
def transfer_money(account_id1, account_id2, amount):
account1 = Account.objects.get(pk=account_id1)
account2 = Account.objects.get(pk=account_id2)
if amount > account1.amount: # validation
raise ValueError("Not enough funds")
account1.amount -= amount
account1.save()
account2.amount += amount
account2.save()
By default in Django, each save()
is performed in a separate transaction. If after the first save()
there will be a failure, the amount will just disappear. Even if there will be no failure, if another transaction will try to get the account statement in between of two save()
operations, the result will be wrong. In order to avoid such problems, both operations should be combined in one transaction. We can do that by decorating the function with the @transaction.atomic
decorator.
But even in this case we can encounter a problem. If two bank branches will try to transfer the full amount to different accounts at the same time, both operations will be performed. Each function will pass the validation and finally one transaction will override the results of another one. This anomaly is called “lost update”.
There are three ways to prevent such anomaly:
Use the SERIALIZABLE isolation level
Use SELECT FOR UPDATE instead SELECT
Use optimistic checks
If you use the SERIALIZABLE isolation level, the database will not allow to commit the second transaction by throwing an exception during commit. The disadvantage of such approach is that this level requires more system resources.
If you use SELECT FOR UPDATE then the transaction which hits the database first will lock the row and another transaction will wait.
The optimistic check doesn’t require more system resources and doesn’t lock the database rows. It eliminates the lost update anomaly by ensuring that the data wasn’t changed between the moment when we read it from the database and the commit operation.
The only way to avoid the lost update anomaly in Django is using the SELECT FOR UPDATE and you should use it explicitly. If you forget to do that or if you don’t realize that the problem of lost update exists with your business logic, your data can be lost.
Pony allows using all three approaches, having the third one, optimistic checks, turned on by default. This way Pony avoids the lost update anomaly completely. Also using the optimistic checks allows the highest concurrency because it doesn’t lock the database and doesn’t require extra resources.
The similar function for transferring money would look this way in Pony:
The SERIALIZABLE approach:
@db_session(serializable=True)
def transfer_money(account_id1, account_id2, amount):
account1 = Account[account_id1]
account2 = Account[account_id2]
if amount > account1.amount:
raise ValueError("Not enough funds")
account1.amount -= amount
account2.amount += amount
The SELECT FOR UPDATE approach:
@db_session
def transfer_money(account_id1, account_id2, amount):
account1 = Account.get_for_update(id=account_id1)
account2 = Account.get_for_update(id=account_id2)
if amount > account1.amount:
raise ValueError("Not enough funds")
account1.amount -= amount
account2.amount += amount
The optimistic check approach:
@db_session
def transfer_money(account_id1, account_id2, amount):
account1 = Account[account_id1]
account2 = Account[account_id2]
if amount > account1.amount:
raise ValueError("Not enough funds")
account1.amount -= amount
account2.amount += amount
The last approach is used by default in Pony and you don’t need to add anything else explicitly.