Consider a Django application with a single RESTful API that creates objects (using Django REST Framework). As part of this API, I do some validation to make sure creation calls are idempotent: if you call the creation API twice with the same data, the first call succeeds and the second fails with a custom error code.
I have a scenario for testing this API which intermittently fails in the following way:
- First API call, success, returns 201 -> object has supposedly been created
- Immediately after response, second API call is made
- Validation logic calls MyModel.objects.get(some_field=some_value) to check whether this is a duplicate call. No such object is found, despite being created in step 1, so a duplicate object is created
- When inspecting the admin/querying the model, both objects can be seen.
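For reference, the intended behavior of the validation in the steps above can be sketched roughly as follows. This is a stand-in that uses sqlite3 from the standard library, not my actual Django/DRF code: the table name my_model, the function create_once, the 400 status, and the error code DUPLICATE_OBJECT are all placeholders chosen for illustration.

```python
import sqlite3

# Placeholder for the custom error code; the real value is not shown here.
DUPLICATE_OBJECT = "duplicate_object"


def create_once(conn, some_value):
    """Look up first, insert only if nothing matches (check-then-create)."""
    row = conn.execute(
        "SELECT id FROM my_model WHERE some_field = ?", (some_value,)
    ).fetchone()
    if row is not None:
        # Stand-in for MyModel.objects.get(...) finding the earlier object.
        # The 400 status is an assumption; only the error code matters here.
        return 400, DUPLICATE_OBJECT
    conn.execute("INSERT INTO my_model (some_field) VALUES (?)", (some_value,))
    conn.commit()
    return 201, None


# In-memory database standing in for the real PostgreSQL table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_model (id INTEGER PRIMARY KEY, some_field TEXT)")

first = create_once(conn, "abc")   # expected: created
second = create_once(conn, "abc")  # expected: rejected as a duplicate
count = conn.execute("SELECT COUNT(*) FROM my_model").fetchone()[0]
print(first, second, count)
```

In the failing runs, the second call behaves as if the lookup returned nothing, so it takes the creation branch again and a second row appears.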
Some more data:
- There is no explicit caching on this model, or any other caching involved in this process.
- I am unable to reproduce this locally.
- On my deployment setup, this scenario fails about 5% of the time, which is why I suspect a race condition.
- Both local and deployment use PostgreSQL.
- The deployment environment does have general caching enabled, but enabling the cache locally still does not reproduce the problem.
What might be causing this race condition? Does the Django ORM have any failure modes in which I might be getting stale data? Is there any way I can defensively protect the validation against stale data?