3

I have a list of dictionaries:

data = [
    {'name': 'foo', 'scores': [2]},
    {'name': 'bar', 'scores': [4, 9, 3]},
    {'name': 'baz', 'scores': [6, 1]}
]

I want to create a new list which has each individual score separated out like this:

list = [
    {'name': 'foo', 'scores': [2], 'score': 2},
    {'name': 'bar', 'scores': [4, 9, 3], 'score': 4},
    {'name': 'bar', 'scores': [4, 9, 3], 'score': 9},
    {'name': 'bar', 'scores': [4, 9, 3], 'score': 3},
    {'name': 'baz', 'scores': [6, 1], 'score': 6},
    {'name': 'baz', 'scores': [6, 1], 'score': 1}
]

I can then loop through each row, and each score, to create a new dictionary:

for row in data:
    scores = row['scores']  # list of values
    for score in scores:
        new_row = row
        new_row['score'] = score
        print(new_row)

Which gives me exactly what I want:

{'name': 'foo', 'scores': [2], 'score': 2}
{'name': 'bar', 'scores': [4, 9, 3], 'score': 4}
{'name': 'bar', 'scores': [4, 9, 3], 'score': 9}
{'name': 'bar', 'scores': [4, 9, 3], 'score': 3}
{'name': 'baz', 'scores': [6, 1], 'score': 6}
{'name': 'baz', 'scores': [6, 1], 'score': 1}

However, I'm having trouble adding these dictionaries to a list. When I use the append() method to add each dictionary to a new list:

list = []

for row in data:
    scores = row['scores']  # list of values
    for score in scores:
        new_row = row
        new_row['score'] = score
        list.append(new_row)

    print(list)

It seems to overwrite some of the previous items:

[
{'name': 'foo', 'scores': [2], 'score': 2},
{'name': 'bar', 'scores': [4, 9, 3], 'score': 3},
{'name': 'bar', 'scores': [4, 9, 3], 'score': 3},
{'name': 'bar', 'scores': [4, 9, 3], 'score': 3},
{'name': 'baz', 'scores': [6, 1], 'score': 1},
{'name': 'baz', 'scores': [6, 1], 'score': 1}
]

What's going on here? Why is it printing the rows correctly, but overwriting previous items when adding to a list? I thought append() simply adds new items to the end of a list without altering other items?

kmario23
Alan
  • `new_row = row` does not copy the data, it just creates a reference that points to the same data. You might want to look at https://docs.python.org/2/library/copy.html – MFisherKDX May 02 '19 at 04:54
  • Please don't use variable names such as `list`, which shadows the built-in type – kmario23 May 02 '19 at 05:36

3 Answers

4

Here new_row always references the current row object, which is the same object for every score in that row. You need to create a new object by copying the current row. Use deepcopy from the copy module.

from copy import deepcopy
for row in data:
    scores = row['scores']  # list of values
    for score in scores:
        new_row = deepcopy(row)
        ...
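For completeness, here is a minimal runnable sketch of the whole loop with the copy in place. The output list name `rows` is my own choice (it avoids shadowing the built-in `list`); everything else follows the question's code.

```python
from copy import deepcopy

data = [
    {'name': 'foo', 'scores': [2]},
    {'name': 'bar', 'scores': [4, 9, 3]},
    {'name': 'baz', 'scores': [6, 1]},
]

rows = []  # hypothetical name, chosen to avoid shadowing the built-in list
for row in data:
    for score in row['scores']:
        new_row = deepcopy(row)   # independent copy, including the nested 'scores' list
        new_row['score'] = score  # only the copy is modified
        rows.append(new_row)

for r in rows:
    print(r)
```

A shallow copy (`row.copy()` or `dict(row)`) would also stop the overwriting here, since only the top-level key `score` is assigned. `deepcopy` additionally duplicates the nested `scores` list, which matters if that list is modified later.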
Miguel Garcia
4

How about a simple list comprehension to achieve all of this in a single step:

In [269]: [{**d, **{'score': v}} for d in data for v in d['scores']]
Out[269]: 
[{'name': 'foo', 'score': 2, 'scores': [2]},
 {'name': 'bar', 'score': 4, 'scores': [4, 9, 3]},
 {'name': 'bar', 'score': 9, 'scores': [4, 9, 3]},
 {'name': 'bar', 'score': 3, 'scores': [4, 9, 3]},
 {'name': 'baz', 'score': 6, 'scores': [6, 1]},
 {'name': 'baz', 'score': 1, 'scores': [6, 1]}]

Explanation/Clarification:

This list comprehension does what the OP ultimately needs. We iterate over each dictionary d in data and, for each d, over each value v in d['scores'], using this nested for loop:

for d in data for v in d['scores']  # order goes from left to right

For each (d, v) pair we build a new dictionary: **d unpacks the keys and values of the current dictionary, since the OP needs those as well, and **{'score': v} unpacks a one-item dictionary that adds the key score with the value v. Writing both inside a single pair of braces, {**d, **{'score': v}}, merges them into one new dictionary, which is exactly what we need.
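If the nested clauses are hard to read, the comprehension can be unrolled into explicit loops. This is only a sketch of the equivalence; the names `result` and `merged` are mine, not part of the answer.

```python
data = [
    {'name': 'foo', 'scores': [2]},
    {'name': 'bar', 'scores': [4, 9, 3]},
    {'name': 'baz', 'scores': [6, 1]},
]

result = []  # hypothetical name for the output list
for d in data:                # first `for` clause of the comprehension
    for v in d['scores']:     # second `for` clause, nested inside the first
        merged = {**d, **{'score': v}}  # a new dict: d's items, then 'score'
        result.append(merged)

# The explicit loops build the same list as the one-liner.
assert result == [{**d, **{'score': v}} for d in data for v in d['scores']]
```

Because a new dictionary is built on every pass, nothing is overwritten, unlike the `new_row = row` version in the question.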

The merge works with either { } or dict(), because both accept the unpacked keys and values of d and {'score': v} (the dict() form requires every key to be a string, which holds here). Thus, an alternative is:

In [3]: [dict(**d, **{'score': v}) for d in data for v in d['scores']]
Out[3]: 
[{'name': 'foo', 'score': 2, 'scores': [2]},
 {'name': 'bar', 'score': 4, 'scores': [4, 9, 3]},
 {'name': 'bar', 'score': 9, 'scores': [4, 9, 3]},
 {'name': 'bar', 'score': 3, 'scores': [4, 9, 3]},
 {'name': 'baz', 'score': 6, 'scores': [6, 1]},
 {'name': 'baz', 'score': 1, 'scores': [6, 1]}]

For more details and examples of dictionary unpacking, please refer to PEP 448 (peps/pep-0448/)
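One caveat worth checking for yourself: unpacking with ** makes a shallow copy. The small sketch below (variable names `d` and `merged` are just for illustration) shows that the merged dictionary is a new object, but its nested scores list is still shared with the original.

```python
d = {'name': 'bar', 'scores': [4, 9, 3]}

# For a literal one-item dict, this is equivalent to {**d, **{'score': 4}}.
merged = {**d, 'score': 4}

print(merged is d)                      # False: a new dict was created
print('score' in d)                     # False: the original is untouched
print(merged['scores'] is d['scores'])  # True: the nested list is shared
```

That sharing is harmless for the question as asked, because the scores lists are never modified. If they were, `copy.deepcopy` would be the safer choice.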

Community
kmario23
  • This does not directly answer the OP's question. – b-fg May 02 '19 at 05:10
  • Could you please explain how this comprehension works? – Devesh Kumar Singh May 02 '19 at 05:14
  • @kmario23 Thanks! I'd never previously given list comprehensions much thought. Seems like they're a great way to avoid 'for' loops when unpacking nested lists and dictionaries. Do you mind please explaining how the following concatenation works and what the asterisks do: {**d, **{'score': v}} – Alan May 03 '19 at 00:58
  • @Alan Yes, list comprehensions are indeed a very convenient and concise way to quickly construct sequences. Please see the updated info! – kmario23 May 03 '19 at 07:28
  • @Alan Also added a clearer way, if that helps :) – kmario23 May 03 '19 at 07:35
0

The answers above are great. Thanks! Here I'll just explain the cause of the bug in a simple way. I added two print() calls:

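list = []
for row in data:
    scores = row['scores']
    for score in scores:
        print(row)                # row as it is before this iteration's assignment
        new_row = row             # same object as row, not a copy
        new_row['score'] = score  # this also changes row
        list.append(new_row)
        print(list)               # list contents after the append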

Part of the output:

......
{'name': 'bar', 'scores': [4, 9, 3]}
[{'name': 'foo', 'scores': [2], 'score': 2}, {'name': 'bar', 'scores': [4, 9, 3], 'score': 4}]
{'name': 'bar', 'scores': [4, 9, 3], 'score': 4}
[{'name': 'foo', 'scores': [2], 'score': 2}, {'name': 'bar', 'scores': [4, 9, 3], 'score': 9}, {'name': 'bar', 'scores': [4, 9, 3], 'score': 9}]
{'name': 'bar', 'scores': [4, 9, 3], 'score': 9}
[{'name': 'foo', 'scores': [2], 'score': 2}, {'name': 'bar', 'scores': [4, 9, 3], 'score': 3}, {'name': 'bar', 'scores': [4, 9, 3], 'score': 3}, {'name': 'bar', 'scores': [4, 9, 3], 'score': 3}]
......

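So now we can see that after new_row = row, both names refer to the same object. When new_row changes, row also changes, and so does every entry in the list that points to that object. That is why, for each row, every appended entry ends up showing the last score assigned in the inner loop.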
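The same effect can be reproduced in a few lines without the loops. This is a small illustrative sketch; the list name `rows` is mine.

```python
row = {'name': 'bar', 'scores': [4, 9, 3]}

new_row = row          # binds a second name to the same dict; nothing is copied
print(new_row is row)  # True

new_row['score'] = 4
print(row)             # {'name': 'bar', 'scores': [4, 9, 3], 'score': 4}

# Appending the same object three times stores three references to one dict.
rows = [row, row, row]
row['score'] = 3
print([r['score'] for r in rows])  # [3, 3, 3]
```

The `is` operator compares object identity, so it is a quick way to check whether two names refer to one and the same dictionary.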

  • Thanks for the clarification. On the 2nd loop I see it correctly adds the first item in the row, but on the 3rd loop it overwrites the preceding item in the row. Hence the need for deepcopy() to copy the object rather than the reference. – Alan May 03 '19 at 01:08
  • You're welcome! I also learned something from your question. Thanks for sharing! – Juliecodestack May 03 '19 at 03:55