3

I've got a list of dictionaries that is already sorted by a key id.

y = [{'id': 0, 'name': 'Frank'},
     {'id': 5, 'name': 'Hank'},
     {'id': 8, 'name': 'Fred'},
     {'id': 30, 'name': 'Jill'}]

I want to insert a new element into the list.

y.append({'id': 6, 'name': 'Jenkins'})

How do I avoid sorting the list again as follows once the new element is added?

y = sorted(y, key=lambda x: x['id'])

The ideal outcome is:

y = [{'id': 0, 'name': 'Frank'},
     {'id': 5, 'name': 'Hank'},
     {'id': 6, 'name': 'Jenkins'},
     {'id': 8, 'name': 'Fred'},
     {'id': 30, 'name': 'Jill'}]

Edit:

Using bisect.insort(y, {'id': 6, 'name': 'Jenkins'}) will work only for the first key, if the dict is sorted by name, it will fail.

Bamcclur
  • 1,949
  • 2
  • 15
  • 19
  • Do yourself a favor and create a class with id and name properties. You can then provide a method for appropriate ordering by the \_\_cmp\_\_ method. – guidot Jun 28 '16 at 14:41
  • After your edit the question a completely new requirement is added, that the field to sort the list is variable. This is more of a database problem... – guidot Jun 28 '16 at 14:46

2 Answers2

5

Since a insertion in a list is in O(n) anyway, any clever bisect algorithm is not that useful, so you can simply loop the list to find the position where it should be inserted, and then insert it. Something like:

new_value = {'id': 6, 'name': 'Jenkins'}

for index, value in enumerate(y):
    # Assuming y is in increasing order.
    if value['id'] > new_value['id']:
        y.insert(index, new_value)
        break
Community
  • 1
  • 1
Seb D.
  • 5,046
  • 1
  • 28
  • 36
  • 7
    You might want to be sure that you add the new value at least somewhere (maybe at the very end) – Wikunia Jan 31 '17 at 09:59
  • 1
    I realize this is an old post, but when has an insertion in a sorted list ever been O(n)? An insertion sort of an unsorted list is O(n), but the OP has explicitly stated the list to be given a new element is already sorted. If you have a sorted list of 10 million items, are you going to traverse through the entire list to find the insertion point, or bisect it, determine which half contains the appropriate location, then repeat? Such a strategy would be O(log(n)). – soporific312 Aug 23 '20 at 17:33
  • @soporific312 that's to find the position yes, but that doesn't change the fact that you need to call `.insert` after, and that alone is O(n) in Python, hence my answer. If you want to optimize that, you need to use another data structure than a `list`. – Seb D. Oct 21 '20 at 17:07
  • 1
    @SébastienDeprez Thanks for the follow up. I confused insertion and search. – soporific312 Oct 24 '20 at 14:25
5

I recommend to change your dict to a class, overload the comparison operator, and then use bisect.insort(). I don't recommend to iterate over all items to find the insert location as proposed in the other reply. From theory, the complexity is still O(n), but, but searching the item takes much longer than inserting a new element. I guess inserting will just do a few memcopies, while for the search you have to compare every single element.

From my example, iterating to find the location, then insert takes about 2 minutes to complete. The same data with bisect.insert() takes about 10 seconds.

Clemens
  • 81
  • 1
  • 2
  • 1
    I feel the Idea you present here is really good. Thanks, @Clemens. It would be better if you post an example for better understanding. – Rishabh Batra Oct 08 '20 at 06:35
  • I have the same issue and don't know how to `bisect.insert()` when the dictionary element has a key and a value. You seem to have written code that you timed; could you add it in the answer? – miguelmorin Aug 17 '22 at 15:26