
I'm trying to add a unique ID to every item in a list and struggling with some strange behaviour I don't understand from Python.

I have this function:

    def add_IDs(d):

        for x in range(len(d)):
            var = d.pop(x)
            var['list_id'] = x
            d.insert(x, var)
        return d

Into which I input this data:

[{'db_number': 1, 'quantity': 15, 'quality': 1},
 {'db_number': 1, 'quantity': 20, 'quality': 0},
 {'db_number': 1, 'quantity': 20, 'quality': 0},
 {'db_number': 1, 'quantity': 80, 'quality': 0},
 {'db_number': 2, 'quantity': 4, 'quality': 0}]

I expect this output:

[{'db_number': 1, 'quantity': 15, 'quality': 1, 'list_id': 0},
 {'db_number': 1, 'quantity': 20, 'quality': 0, 'list_id': 1},
 {'db_number': 1, 'quantity': 20, 'quality': 0, 'list_id': 2},
 {'db_number': 1, 'quantity': 80, 'quality': 0, 'list_id': 3},
 {'db_number': 2, 'quantity': 4, 'quality': 0, 'list_id': 4}]

But instead the second dict in the list gets 'list_id': 2 instead of 'list_id': 1:

[{'db_number': 1, 'quantity': 15, 'quality': 1, 'list_id': 0},
 {'db_number': 1, 'quantity': 20, 'quality': 0, 'list_id': 2},
 {'db_number': 1, 'quantity': 20, 'quality': 0, 'list_id': 2},
 {'db_number': 1, 'quantity': 80, 'quality': 0, 'list_id': 3},
 {'db_number': 2, 'quantity': 4, 'quality': 0, 'list_id': 4}]

As a test I wrote this:

    def add_IDs(d):

        for x in range(len(d)):
            var = d.pop(x)
            var['list_id'] = x
            d.insert(x, var)
        return d

    data2 = [{'db_number': 1, 'quantity': 15, 'quality': 1},
             {'db_number': 1, 'quantity': 20, 'quality': 0},
             {'db_number': 1, 'quantity': 20, 'quality': 0},
             {'db_number': 1, 'quantity': 80, 'quality': 0},
             {'db_number': 2, 'quantity': 4, 'quality': 0}]

    print(data)
    print(data2)

    l1 = add_IDs(data)
    l2 = add_IDs(data2)
    print(l1)
    print(l2)

    print("")
    print('Does data = data2?')
    print(data == data2)
    print('Does l1 = l2?')
    print(l1 == l2)

Which gives this output:


[{'db_number': 1, 'quantity': 15, 'quality': 1}, {'db_number': 1, 'quantity': 20, 'quality': 0}, {'db_number': 1, 'quantity': 20, 'quality': 0}, {'db_number': 1, 'quantity': 80, 'quality': 0}, {'db_number': 2, 'quantity': 4, 'quality': 0}] 
[{'db_number': 1, 'quantity': 15, 'quality': 1}, {'db_number': 1, 'quantity': 20, 'quality': 0}, {'db_number': 1, 'quantity': 20, 'quality': 0}, {'db_number': 1, 'quantity': 80, 'quality': 0}, {'db_number': 2, 'quantity': 4, 'quality': 0}] 
[{'db_number': 1, 'quantity': 15, 'quality': 1, 'list_id': 0}, {'db_number': 1, 'quantity': 20, 'quality': 0, 'list_id': 2}, {'db_number': 1, 'quantity': 20, 'quality': 0, 'list_id': 2}, {'db_number': 1, 'quantity': 80, 'quality': 0, 'list_id': 3}, {'db_number': 2, 'quantity': 4, 'quality': 0, 'list_id': 4}] 
[{'db_number': 1, 'quantity': 15, 'quality': 1, 'list_id': 0}, {'db_number': 1, 'quantity': 20, 'quality': 0, 'list_id': 1}, {'db_number': 1, 'quantity': 20, 'quality': 0, 'list_id': 2}, {'db_number': 1, 'quantity': 80, 'quality': 0, 'list_id': 3}, {'db_number': 2, 'quantity': 4, 'quality': 0, 'list_id': 4}]

Does data = data2? 
False 
Does l1 = l2? 
False

Process finished with exit code 0

As far as I can see, the input data is identical for both, and the inbuilt comparison tool tells me the printed values are identical. Yet the output is still different, and the equality checks say the two lists are different. Can someone shed some light on what I'm missing?

F1rools22
  • You are iterating over the length of `d` but also popping the elements off and inserting them into the same list. – Chrispresso Sep 10 '20 at 18:32
  • @Chrispresso That's correct. I just wanted to take the item out of the list and replace it in the same position with the added ID tag. At this point I'm sure I could find an alternative method to solve the issue but I'd quite like to understand why this is happening like it is anyway. – F1rools22 Sep 10 '20 at 18:34

1 Answer

First off, you can simplify your logic substantially:

def add_ids(items):
    for index, item in enumerate(items):
        item['list_id'] = index

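I was unable to reproduce your issue in Python 2 or Python 3.

Note that the len issue mentioned in a comment does not come into play: len(d) is evaluated once, when the range is created, not on each iteration. Each pop is also immediately followed by an insert at the same index, so the list's length and order never change between iterations.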
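To check that claim, here is a minimal sketch that runs the pop/insert loop from the question on a small list of distinct dicts. The name `add_ids_pop_insert` and the sample `rows` are made up for illustration; they are not from the question.

```python
def add_ids_pop_insert(d):
    # range(len(d)) is built once, before the first iteration runs
    for x in range(len(d)):
        var = d.pop(x)        # take the item out of position x
        var['list_id'] = x    # tag it with its position
        d.insert(x, var)      # put it back at the same position
    return d

# Hypothetical sample data: three separate dict objects
rows = [{'n': 10}, {'n': 20}, {'n': 30}]
result = add_ids_pop_insert(rows)
print(result)
```

With distinct dicts every item gets its own index, so the loop itself is clumsy but not the cause of the duplicated ID.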


With the additional information from your comment that the approach above gave the same behavior, I know your issue: the same dict object appears at two positions in your list.

x, y, z = {}, {}, {}

items = [x, y, y, z]
for index, item in enumerate(items):
    print(index, item, id(item))

Note that indexes 1 and 2 have the same id:

0 {} 4446764960
1 {} 4446790512
2 {} 4446790512
3 {} 4430894656

Then running

add_ids(items)

sets list_id on y twice, first to 1 (at index 1) and then to 2 (at index 2), so the second write wins.

assert items == [{'list_id': 0}, {'list_id': 2}, {'list_id': 2}, {'list_id': 3}]

Any change to y will show up in both items[1] and items[2], since they are the same object.
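If it helps with the bug hunt, the following is a rough sketch of two things you could try: finding which positions share an object, and tagging independent copies instead of the originals. The helper names `find_shared` and `add_ids_copy` are my own invention, and copying is only one possible fix; whether it suits your program depends on whether the shared dict was intentional.

```python
import copy
from collections import defaultdict

def find_shared(items):
    """Return groups of indexes that refer to the very same object."""
    positions = defaultdict(list)
    for index, item in enumerate(items):
        positions[id(item)].append(index)
    return [indexes for indexes in positions.values() if len(indexes) > 1]

def add_ids_copy(items):
    """Return a new list of independent copies, each tagged with its index."""
    result = []
    for index, item in enumerate(items):
        new_item = copy.deepcopy(item)  # a separate copy per position
        new_item['list_id'] = index
        result.append(new_item)
    return result

x, y, z = {}, {}, {}
items = [x, y, y, z]

shared = find_shared(items)   # positions 1 and 2 point at y
tagged = add_ids_copy(items)  # every entry is now its own dict
print(shared)
print(tagged)
```

Because each position gets its own copy, writing `list_id` at index 2 no longer overwrites the value written at index 1, and the original `items` list is left untouched.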

Cireo
  • I've just done this but unfortunately the same disparity in the output remains. – F1rools22 Sep 10 '20 at 18:38
  • Hah. Then I know your problem. Will update – Cireo Sep 10 '20 at 18:39
  • Cheers. I'm not sure why the question has been marked as a duplicate. While adding items in my dict to a fresh list might solve the issue, it doesn't explain the behaviour. – F1rools22 Sep 10 '20 at 18:44
  • Ignore the closing, your problem is different, I've raised a flag. – Cireo Sep 10 '20 at 18:46
  • You are correct. It looks like both entries in the list are the same. I have absolutely no idea how that has happened tho, time to go bug hunting. Thanks! – F1rools22 Sep 10 '20 at 19:04
  • I managed to find the issue and solve it with just 3 lines of code, despite it needing 3 hours to diagnose and fault find. Thanks for the help :D – F1rools22 Sep 10 '20 at 20:00
  • Glad I could be of use! Until next time =) – Cireo Sep 10 '20 at 20:12