4

I have 2 models: Product and Order.

Product has an integer field for stock, whereas Order has a status and a foreign key to Product:

class Product(models.Model):
    name = models.CharField(max_length=30)
    stock = models.PositiveSmallIntegerField(default=1)

class Order(models.Model):
    product = models.ForeignKey('Product')
    DRAFT = 'DR'; INPROGRESS = 'PR'; ABORTED = 'AB'
    STATUS = ((INPROGRESS, 'In progress'),(ABORTED, 'Aborted'),)
    status = models.CharField(max_length = 2, choices = STATUS, default = DRAFT)

My goal is to have the product's stock decrease by one for each new order, and increase by one for each order cancellation. In that purpose, I've overloaded the save method of the Order model as such (inspired by Django: When saving, how can you check if a field has changed?):

from django.db.models import F

class Order(models.Model):
    product = models.ForeignKey('Product')
    status = models.CharField(max_length = 2, choices = STATUS, default = DRAFT)

    EXISTING_STATUS = set([INPROGRESS])

    __original_status = None

    def __init__(self, *args, **kwargs):
        super(Order, self).__init__(*args, **kwargs)
        self.__original_status = self.status

    def save(self, *args, **kwargs):
        old_status = self.__original_status
        new_status = self.status
        has_changed_status = old_status != new_status
        if has_changed_status:
            product = self.product
            if not old_status in Order.EXISTING_STATUS and new_status in Order.EXISTING_STATUS:
                product.stock = F('stock') - 1
                product.save(update_fields=['stock'])
            elif old_status in Order.EXISTING_STATUS and not new_status in Order.EXISTING_STATUS:
                product.stock = F('stock') + 1
                product.save(update_fields=['stock'])
        super(Order, self).save(*args, **kwargs)
        self.__original_status = self.status

Using the RestFramework, I've created 2 views, one for creating new orders, one for cancelling existing orders. Both use a straightforward serializer:

class OrderSimpleSerializer(serializers.ModelSerializer):

    class Meta:
        model = Order
        fields = (
            'id',
            'product',
            'status',
        )
        read_only_fields = (
            'status',
        )

class OrderList(generics.ListCreateAPIView):
    model = Order
    serializer_class = OrderSimpleSerializer

    def pre_save(self, obj):
        super(OrderList,self).pre_save(obj)
        product = obj.product
        if not product.stock > 0:
            raise ConflictWithAnotherRequest("Product is not available anymore.")
        obj.status = Order.INPROGRESS

class OrderAbort(generics.RetrieveUpdateAPIView):
    model = Order
    serializer_class = OrderSimpleSerializer

    def pre_save(self, obj):
        obj.status = Order.ABORTED

Here is how to access these two views:

from myapp.views import *

urlpatterns = patterns('',
    url(r'^order/$', OrderList.as_view(), name='order-list'),
    url(r'^order/(?P<pk>[0-9]+)/abort/$', OrderAbort.as_view(), name='order-abort'),
)

I am using Django 1.6b4, Python 3.3, Rest Framework 2.7.3 and PostgreSQL 9.2.

My problem is that concurrent requests can increase the stock of a product higher than original stock!

Here is the script I use to demonstrate that:

import sys
import urllib.request
import urllib.parse
import json

opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor)

def create_order():
    url = 'http://127.0.0.1:8000/order/'
    values = {'product':1}
    data  = urllib.parse.urlencode(values).encode('utf-8')
    request = urllib.request.Request(url, data)
    response = opener.open(request)
    return response

def cancel_order(order_id):
    abort_url = 'http://127.0.0.1:8000/order/{}/abort/'.format(order_id)
    values = {'product':1,'_method':'PUT'}
    data  = urllib.parse.urlencode(values).encode('utf-8')
    request = urllib.request.Request(abort_url, data)
    try:
        response = opener.open(request)
    except Exception as e:
        if (e.code != 403):
            print(e)
    else:
        print(response.getcode())

def main():
    response = create_order()
    print(response.getcode())
    data = response.read().decode('utf-8')
    order_id = json.loads(data)['id']
    time.sleep(1)
    for i in range(2):
        p = Process(target=cancel_order, args=[order_id])
        p.start()

if __name__ == '__main__':
    main()

This script gives the following output, for a product with a stock of 1:

201 # means it creates an order for Product, thus decreasing stock from 1 to 0
200 # means it cancels the order for Product, thus increasing stock from 0 to 1
200 # means it cancels the order for Product, thus increasing stock from 1 to 2 (shouldn't happen)

EDIT

I've added a sample project to reproduce the bug: https://github.com/ThinkerR/django-concurrency-demo

Community
  • 1
  • 1
Benjamin Toueg
  • 10,511
  • 7
  • 48
  • 79

3 Answers3

5

Take a look at django-concurrency. It handles concurrent editing using optimistic concurrency control pattern.

chappjc
  • 30,359
  • 6
  • 75
  • 132
anonymous
  • 51
  • 2
3

I think the problem is not about updating the product count atomically -- the F() expression for Django ORM should handle that correctly. However the combined operation of:

  1. Checking order status (does product count need update?)
  2. Updating product count
  3. Updating order status (to cancelled)

is not an atomic operation. It is possible to have the following sequence of events for two threads A and B (both handling cancel request for the same order):

A: Check order status: new is cancel, differs from previous one
B: Check order status: new is cancel, differs from previous one
A: Update product count atomically from 0 to 1
B: Update product count atomically from 1 to 2
A: Update order status to cancelled
B: Update order status to cancelled

What you need to do is one of the following:

  1. Change the whole operation to occur within a transaction. You didn't mention your database or transaction mode setup. Django defaults to autocommit mode where each database operation is commited separately. To change to transactions (probably tied to the HTTP request), see https://docs.djangoproject.com/en/dev/topics/db/transactions/
  2. Create a synchronization barrier to prevent both threads from updating the product count. You could do this with an atomic compare-and-swap operation on the database layer, but I'm not sure whether there's a ready F or similar primitive to do this. (The main idea is to change the ordering of order and product data updates so that you first update and test atomically the order status.)
  3. Use some other form of synchronization mechanism, using a distributed lock system (you can use Redis and many other systems for this). I think this would be overkill for this case, though.

In summary: Unless you are using HTTP-level transactions for your application already, try setting ATOMIC_REQUESTS = True in your Django configuration file (settings.py).

If you don't or can't do that please note that the alternative methods do not give you order-product pair consistency. Try to think what will happen if the Django server crashes between updates to the product and the order -- only one will be updated. (This is where database transactions are a must where the database engine notices that the client aborted -- due to broken network connection -- and rolls back the transaction.)

  • I've tried `ATOMIC_REQUESTS = True`, but it didn't work. I don't understand why though. That's why I came here crying for help. As said in the comment of the answer of @esauro, `select_for_update(nowait=True)` satisfies my need. However I am not sure what the behaviour would be in case of a server crash. – Benjamin Toueg Sep 23 '13 at 17:23
  • What's the database backend you are using? MySQL, Postgres, something other ...? – Santeri Paavolainen Sep 23 '13 at 19:07
  • Check the OP: PostgreSQL 9.2 – Benjamin Toueg Sep 24 '13 at 09:25
  • I think my problem comes from a confusion between transactions and locks. This is still kind of blurry to me at the moment... I fixed my problem by making an upstream transaction with `select_for_update` on **Order**, which has the property of locking the row in the database. – Benjamin Toueg Sep 29 '13 at 00:21
0

As you mention you have a race condition over concurrent request. To get rid of this you should made the operations atomic. What I would do is make orders operations atomic using Redis. Then writing back to regular database whenever I could.

http://redis.io/

EDIT:

After some comments, it seems that the best approach is to include select_for_update(wait=True)

esauro
  • 1,276
  • 10
  • 17
  • Would it be possible to make it work only by using Postgre ? The redis solution seems a bit overkill to me at the moment. – Benjamin Toueg Sep 23 '13 at 11:48
  • 1
    You can do it this in a number of ways. You could even implement a semaphore (i'm currently doing this way in one project). Doing with postgres through locks could be another option, but I think django doens't support it itself. Take a look at http://stackoverflow.com/questions/320096/django-how-can-i-protect-against-concurrent-modification-of-database-entries – esauro Sep 23 '13 at 11:56
  • 1
    Hey, thanks for the link. Here is what worked : `select_for_update(nowait=True)`. I have tried `nowait=False` by myself but it didn't work, but I've just tested `nowait=True`and it works. Will continue testing and post an answer. – Benjamin Toueg Sep 23 '13 at 12:18