0

I have these two related Django models:

from django.db import models

class Item(models.Model):
    item_nbr = models.IntegerField(primary_key=True)
    created = models.DateTimeField(auto_now_add=True)
    item_nbr_desc = models.CharField(max_length=155)

class SetItem(models.Model):
    set_id = models.CharField(primary_key=True, default='', max_length=12)
    set_nbr = models.IntegerField()
    items = models.ForeignKey(Item, on_delete=models.CASCADE)

I'm running a script (periodically) that reads the Item table from another database and uses that DataFrame to update the Item table in the Django database. I'm using the Django ORM as the interface to the Django database.

script.py

import pandas as pd

from app.models import Item

item_table = pd.read_sql(__)
item_table = some_transformations(item_table.copy())
# remove all the Items that will be updated
Item.objects.filter(item_nbr__in=item_table.item_nbr.unique()).delete()
item_table_records = item_table.to_dict('records')
item_instances = []
fields = item_table.keys()  # DataFrame column names, matching the Item model fields
for record in item_table_records:
    kwargs = {field: record[field] for field in fields}
    item_instances.append(Item(**kwargs))
Item.objects.bulk_create(item_instances)  # insert the replacement rows

The problem is that the SetItem rows are deleted every time I delete the related Items (because of the on_delete=models.CASCADE behavior). I want to update the Items without erasing the related SetItem rows, but I don't want to change the default on_delete behavior, because it's only in this script that I need to upload a whole table; I may still want to delete an Item in another context, and there I do want the cascade to happen. What can I do? Is there a bulk update function that performs a non-destructive update of the table?
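To make the intent concrete, here is a sketch of the kind of non-destructive update I'm after, written with update_or_create purely as a hypothetical example (I suspect a row-by-row loop like this is too slow for a large table, but it shows the behaviour I want: no DELETE is issued, so the related SetItem rows are never cascaded away):

from app.models import Item

# hypothetical sketch: update existing Items in place, create the missing ones
for record in item_table.to_dict('records'):
    Item.objects.update_or_create(
        item_nbr=record['item_nbr'],
        defaults={field: value for field, value in record.items() if field != 'item_nbr'},
    )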

Pablo

2 Answers

1

You can update the items one by one:

items_by_pk = Item.objects.in_bulk(list(item_table.item_nbr))
for record in item_table.to_dict('records'):
    item = items_by_pk[record['item_nbr']]
    for field, value in record.items():
        setattr(item, field, value)
    item.save()
emulbreh
  • the `items_by_pk = Item.objects.in_bulk(item_table.item_nbr)` gives me a `ValueError`. `ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().` – Pablo Jan 22 '18 at 12:32
  • the performance is really poor!! considering that my `Item` table has more than 300K records. I need a "bulk" solution :( – Pablo Jan 22 '18 at 12:43
  • I just fixed the ValueError. You need to make sure to batch the queries together in transactions, but it's still going to be slow. Can you write a raw SQL query that would be faster? – emulbreh Jan 22 '18 at 18:32
  • I'm trying that approach, reading the answers to this question: https://stackoverflow.com/questions/12661253/how-to-bulk-update-with-django – Pablo Jan 23 '18 at 11:50
0

By adding the @transaction.atomic decorator and using the save() method, you get a "batch" update without needing to delete the Item table first.

from django.db import transaction

from app.models import Item

@transaction.atomic
def item_update(item_table):
    items_by_pk = Item.objects.in_bulk(list(item_table.item_nbr))
    for record in item_table.to_dict('records'):
        item = items_by_pk[record['item_nbr']]
        for field, value in record.items():
            setattr(item, field, value)
        item.save()

source
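For reference, this is roughly how the function is called from script.py (same placeholders as in the question):

import pandas as pd

from app.models import Item

item_table = pd.read_sql(__)
item_table = some_transformations(item_table.copy())
# note: item_update assumes every item_nbr in the DataFrame already exists as an Item
item_update(item_table)  # one UPDATE per row, all inside a single transaction; SetItem rows are untouched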

Pablo