5

I'm trying to update many records inside a table using Peewee library. Inside a for loop, i fetch a single record and then I update it but this sounds awful in terms of performance so I need to do the update in bulk. Current code look like this:

usernames_to_update = get_target_usernames()
for username in usernames_to_update:
    user = User.get(User.username == username) # username is primary key
    if user.type == 'type_1':
        user.some_attr_1 = some_value_1
    elif user.type == 'type_2':
        user.some_attr_2 = some_value_2
    # elif ....
    user.save()

In the documentation, there is insert_many function but nothing like update_many. Searching around i came up with these solutions:

  1. Executing raw query using CASE: Link
  2. Using replace_many: Link
  3. Using update : Link

But i couldn't find any examples of how to use the second or third solution. Can somebody clarify how cases 2 and 3 can be used?

Offofue
  • 159
  • 1
  • 10

2 Answers2

5

You want the .update() method:

query = User.update(validated=True).where(User.username.in_(usernames_to_update))
query.execute()

Edit: so you want to conditionally set the value during an update. You can use the Case helper. Untested:

some_value_1 = 'foo'
some_value_2 = 'bar'
case_stmt = Case(User.type, [
    ('type_1', some_value_1),
    ('type_2', some_value_2)])
query = User.update(some_field=case_stmt).where(User.username.in_(list_of_usernames))
query.execute()

Docs can be found here: http://docs.peewee-orm.com/en/latest/peewee/api.html#Case

coleifer
  • 24,887
  • 6
  • 60
  • 75
  • I simplified my code but I think it went wrong. I edited the question to reflect the issue, sorry , my fault. How answer changes now? – Offofue Aug 30 '18 at 18:31
  • Thanks for the answer. Any note about `replace_many` method? Can be it used too? just to know – Offofue Aug 31 '18 at 04:33
  • BTW, it seems that `case` query can only work on a single column, right? So to update multiple columns i should create separate update queries right? Does these multiple update queries still need optimizations? – Offofue Aug 31 '18 at 14:39
  • You can use ValuesList() to update multiple values, but you probably ought to be comfortable with sql. – coleifer Aug 31 '18 at 17:05
4

The new best answer is to use the bulk_update() method found here:

with database.atomic():
    User.bulk_update(user_list, fields=['username'], batch_size=50)
David Bernat
  • 324
  • 2
  • 11
  • As the creator of Peewee, you know. Two questions that suggest I am missing something: 1. Under what conditions does bulk_update() outperform update()? 2. The Peewee 3.9.6 documents specify: "In addition, Peewee also offers Model.bulk_update(), which can efficiently update one or more columns on a list of models." What does this specify? This was the source of my original updating of this answer. – David Bernat Aug 01 '19 at 00:34
  • It is more efficient that iterating over all the items in a loop and updating them singly. The point of bulk_update is to update fields which may have multiple different values, where an ordinary UPDATE query would not suffice. e.g., with Model.update() you specify a list of fields to new values, whereas with .bulk_update() each model can have different values for the columns, and they are resolved using a CASE statement. – coleifer Aug 08 '19 at 13:55
  • bulk_update exists only from 3.0+ version – mrkiril Dec 02 '20 at 16:07
  • If I want to set the same value for all elements, I still need to iterate over my user_list to set all values in the objects to my value, and then use bulk update to perform only one SQL UPDATE query, is that correct ? – gdelab Mar 19 '21 at 18:36