I have code that should work under concurrent requests and heavy load.
I wrote an example to give a better understanding of what I'm trying to do:
from django.db import transaction

def add_tag():
    with transaction.atomic():
        image = Image.objects.get(pk=2)
        tag = Tag.objects.get(pk=6)
        image.tags.add(tag)  # concurrent insert
    return 'done'
from concurrent.futures import ProcessPoolExecutor, as_completed

from django.core.management.base import BaseCommand

class Command(BaseCommand):
    def handle(self, *args, **options):
        with ProcessPoolExecutor(max_workers=3) as executor:
            futures = []
            for _ in range(3):
                futures.append(executor.submit(add_tag))
            for future in as_completed(futures):
                print(future.result())
And here are my models:
class Image(models.Model):
    title = models.CharField(max_length=255)
    tags = models.ManyToManyField('ov_tags.Tag')

class Tag(models.Model):
    title = models.CharField(max_length=255)
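For context: the implicit through table that Django creates for `Image.tags` carries a unique constraint over `(image_id, tag_id)`, and that constraint is what the concurrent inserts trip over. A minimal sqlite3 stand-in (table and column names assumed to match Django's defaults):

```python
import sqlite3

# Rough stand-in for the through table Django generates for Image.tags.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE image_tags ("
    " id INTEGER PRIMARY KEY,"
    " image_id INTEGER NOT NULL,"
    " tag_id INTEGER NOT NULL,"
    " UNIQUE (image_id, tag_id))"
)
conn.execute("INSERT INTO image_tags (image_id, tag_id) VALUES (2, 6)")
try:
    # Same pair again -> violates the unique constraint, like the
    # concurrent add() calls do under load.
    conn.execute("INSERT INTO image_tags (image_id, tag_id) VALUES (2, 6)")
    raised = False
except sqlite3.IntegrityError:
    raised = True
print(raised)  # True
```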
I'm trying to insert into the ManyToMany through table in parallel. Obviously, this causes an error because of the READ COMMITTED isolation level:
django.db.utils.IntegrityError: duplicate key value violates unique constraint
That's absolutely fine, but how do I get rid of this error completely?
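One option I'm considering (not sure it's the right one): make the insert itself idempotent at the SQL level with PostgreSQL's `INSERT ... ON CONFLICT DO NOTHING`, which I believe Django can emit via `bulk_create(..., ignore_conflicts=True)` on the through model. A sqlite3 sketch of the SQL-level behavior (sqlite supports the same clause since 3.24):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE image_tags ("
    " image_id INTEGER NOT NULL,"
    " tag_id INTEGER NOT NULL,"
    " UNIQUE (image_id, tag_id))"
)
conn.execute("INSERT INTO image_tags VALUES (2, 6)")
# The same insert again would raise IntegrityError; DO NOTHING absorbs it.
conn.execute("INSERT INTO image_tags VALUES (2, 6) ON CONFLICT DO NOTHING")
count = conn.execute("SELECT COUNT(*) FROM image_tags").fetchone()[0]
print(count)  # 1 -- the duplicate insert became a no-op
```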
To protect my image, I tried using select_for_update() on the Image select:
image = Image.objects.select_for_update().get(pk=2)
And... it works! I ran it several times: no more errors, and the row is inserted correctly. But I don't know why.
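My working theory (just a guess): select_for_update() takes a row lock on the image row itself, so the three atomic blocks run one at a time; by the time the second transaction acquires the lock, the first one's insert is already committed and visible under READ COMMITTED, and add() sees the existing row and skips its INSERT. A toy in-memory simulation of that serialization, with a threading.Lock standing in for the row lock and a set for image_tags (an analogy only, not actual Django/PostgreSQL internals):

```python
import threading

row_lock = threading.Lock()  # plays the role of the lock on the image row
m2m_table = set()            # plays the role of the image_tags table

def add_tag_locked(image_id, tag_id):
    with row_lock:  # ~ SELECT ... FOR UPDATE on the image row
        # ~ add() skipping tags that are already linked
        if (image_id, tag_id) not in m2m_table:
            m2m_table.add((image_id, tag_id))  # ~ INSERT INTO image_tags

threads = [threading.Thread(target=add_tag_locked, args=(2, 6)) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(m2m_table)  # {(2, 6)} -- exactly one "row", no duplicate insert
```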
Does select_for_update lock the through table somehow? Or does it happen on the application side? Is there a right way to achieve this behavior?
Can I use an empty SELECT like this to lock for the insert?
SELECT "image_tags"."tag_id" FROM "image_tags" WHERE ("image_tags"."tag_id" IN (6) AND "image_tags"."image_id" = 2) FOR UPDATE