Iterate over two lists of dicts and create list of tuples without loop

Question

I have two lists of dicts: list1 and list2.

print(list1)
[{'name': 'fooa', 'desc': 'bazv', 'city': 1, 'ID': 1},
 {'name': 'bard', 'desc': 'besd', 'city': 2, 'ID': 1},
 {'name': 'baer', 'desc': 'bees', 'city': 2, 'ID': 1},
 {'name': 'aaaa', 'desc': 'bnbb', 'city': 1, 'ID': 2},
 {'name': 'cgcc', 'desc': 'dgdd', 'city': 1, 'ID': 2}]

print(list2)
[{'name': 'foo', 'desc': 'baz', 'city': 1, 'ID': 1},
 {'name': 'bar', 'desc': 'bes', 'city': 1, 'ID': 1},
 {'name': 'bar', 'desc': 'bes', 'city': 2, 'ID': 1},
 {'name': 'aaa', 'desc': 'bbb', 'city': 1, 'ID': 2},
 {'name': 'ccc', 'desc': 'ddd', 'city': 1, 'ID': 2}]

I need a list of tuples that will hold two paired dicts (one dict from each list) with the same city and ID.

I did it with a double loop:

list_of_tuples = []
for i in list1:
    for j in list2:
        if i['ID'] == j['ID'] and i['city'] == j['city']:
            list_of_tuples.append((i, j))
print(list_of_tuples)

[({'name': 'fooa', 'desc': 'bazv', 'city': 1, 'ID': 1},
  {'name': 'foo', 'desc': 'baz', 'city': 1, 'ID': 1}),
 ({'name': 'fooa', 'desc': 'bazv', 'city': 1, 'ID': 1},
  {'name': 'bar', 'desc': 'bes', 'city': 1, 'ID': 1}),
 ({'name': 'bard', 'desc': 'besd', 'city': 2, 'ID': 1},
  {'name': 'bar', 'desc': 'bes', 'city': 2, 'ID': 1}),
 ({'name': 'baer', 'desc': 'bees', 'city': 2, 'ID': 1},
  {'name': 'bar', 'desc': 'bes', 'city': 2, 'ID': 1}),
 ({'name': 'aaaa', 'desc': 'bnbb', 'city': 1, 'ID': 2},
  {'name': 'aaa', 'desc': 'bbb', 'city': 1, 'ID': 2}),
 ({'name': 'aaaa', 'desc': 'bnbb', 'city': 1, 'ID': 2},
  {'name': 'ccc', 'desc': 'ddd', 'city': 1, 'ID': 2}),
 ({'name': 'cgcc', 'desc': 'dgdd', 'city': 1, 'ID': 2},
  {'name': 'aaa', 'desc': 'bbb', 'city': 1, 'ID': 2}),
 ({'name': 'cgcc', 'desc': 'dgdd', 'city': 1, 'ID': 2},
  {'name': 'ccc', 'desc': 'ddd', 'city': 1, 'ID': 2})]

Question: How can I do this in a more Pythonic way (without loops)?

@ScottHunter Correct. But you must agree there are more pythonic ways to do this — ritlew, Mar 28 '19 at 14:28
You could write a *list comprehension*, that's more idiomatic (and faster) than repeatedly appending to a list, but there'll still be a loop. — jonrsharpe, Mar 28 '19 at 14:28
I'm sorry I put it wrong. I meant without using loops. Maybe using a list comprehension. — lemon, Mar 28 '19 at 14:29
@lemon you realise that a list comprehension is just different syntax for a loop, right? There's a lot of pushing for 1-liners here but often they become a mess and won't run much faster, if at all — roganjosh, Mar 28 '19 at 14:31
@roganjosh, Of course, I just want the code to be more laconic — lemon, Mar 28 '19 at 14:32
Did you try just ```list_of_tuples = list(zip(list1, list2))```? — accdias, Mar 28 '19 at 14:38

Jacques Gaudin · Accepted Answer · 2019-03-28T15:05:05.453

You can use itertools.product and filter:

from itertools import product


list1 = [{'name': 'fooa', 'desc': 'bazv', 'city': 1, 'ID': 1},
         {'name': 'bard', 'desc': 'besd', 'city': 2, 'ID': 1},
         {'name': 'baer', 'desc': 'bees', 'city': 2, 'ID': 1},
         {'name': 'aaaa', 'desc': 'bnbb', 'city': 1, 'ID': 2},
         {'name': 'cgcc', 'desc': 'dgdd', 'city': 1, 'ID': 2}]

list2 = [{'name': 'foo', 'desc': 'baz', 'city': 1, 'ID': 1},
         {'name': 'bar', 'desc': 'bes', 'city': 1, 'ID': 1},
         {'name': 'bar', 'desc': 'bes', 'city': 2, 'ID': 1},
         {'name': 'aaa', 'desc': 'bbb', 'city': 1, 'ID': 2},
         {'name': 'ccc', 'desc': 'ddd', 'city': 1, 'ID': 2}]

def condition(x):
    return x[0]['ID'] == x[1]['ID'] and x[0]['city'] == x[1]['city']

list_of_tuples = list(filter(condition, product(list1, list2)))
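
A possible variation, assuming the match keys are always ID and city: operator.itemgetter can build the key tuple once, so the condition does not repeat each field name. The names match_key and same_key are hypothetical, and the two short lists are trimmed from the question's data.

```python
from itertools import product
from operator import itemgetter

# Trimmed sample data (an assumption; the full lists are in the question).
list1 = [{'name': 'fooa', 'desc': 'bazv', 'city': 1, 'ID': 1},
         {'name': 'bard', 'desc': 'besd', 'city': 2, 'ID': 1}]
list2 = [{'name': 'foo', 'desc': 'baz', 'city': 1, 'ID': 1},
         {'name': 'bar', 'desc': 'bes', 'city': 1, 'ID': 1},
         {'name': 'bar', 'desc': 'bes', 'city': 2, 'ID': 1}]

# Hypothetical helper: extracts the (ID, city) tuple from one dict.
match_key = itemgetter('ID', 'city')

def same_key(pair):
    # pair is one (dict_from_list1, dict_from_list2) tuple from product()
    left, right = pair
    return match_key(left) == match_key(right)

list_of_tuples = list(filter(same_key, product(list1, list2)))
print(len(list_of_tuples))  # 3 matching pairs for this trimmed data
```

Note that product still visits every combination, so this performs len(list1) * len(list2) comparisons, the same as the nested loops.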

I retract my earlier statement. You actually managed to make it readable. — Aran-Fey, Mar 28 '19 at 14:49

score 3 · Answer 2 · answered Mar 28 '19 at 14:48

This is a problem well suited for pandas. If you convert the lists to DataFrames, matching the records on ID and city is the same as an inner join of the two DataFrames.

import pandas as pd

# convert lists to DataFrames
df1 = pd.DataFrame(list1)
df2 = pd.DataFrame(list2)

# merge the two DataFrames
print(df1.merge(df2, on=["ID", "city"]))
#   ID  city desc_x name_x desc_y name_y
#0   1     1   bazv   fooa    baz    foo
#1   1     1   bazv   fooa    bes    bar
#2   1     2   besd   bard    bes    bar
#3   1     2   bees   baer    bes    bar
#4   2     1   bnbb   aaaa    bbb    aaa
#5   2     1   bnbb   aaaa    ddd    ccc
#6   2     1   dgdd   cgcc    bbb    aaa
#7   2     1   dgdd   cgcc    ddd    ccc

Now you have the matched records in each row. Since the desc and name columns were present in both (and not used for the merge), they get suffixed with _x and _y to differentiate between the two source DataFrames.

You just need to reshape it into your desired output format. You can achieve this using to_dict and a list comprehension:

list_of_tuples = [
    (
        {"name": r["name_x"], "desc": r["desc_x"], "city": r["city"], "ID": r["ID"]},
        {"name": r["name_y"], "desc": r["desc_y"], "city": r["city"], "ID": r["ID"]}
    ) for r in df1.merge(df2, on=["ID", "city"]).to_dict(orient="records")
]

print(list_of_tuples)
#[({'ID': 1, 'city': 1, 'desc': 'bazv', 'name': 'fooa'},
#  {'ID': 1, 'city': 1, 'desc': 'baz', 'name': 'foo'}),
# ({'ID': 1, 'city': 1, 'desc': 'bazv', 'name': 'fooa'},
#  {'ID': 1, 'city': 1, 'desc': 'bes', 'name': 'bar'}),
# ({'ID': 1, 'city': 2, 'desc': 'besd', 'name': 'bard'},
#  {'ID': 1, 'city': 2, 'desc': 'bes', 'name': 'bar'}),
# ({'ID': 1, 'city': 2, 'desc': 'bees', 'name': 'baer'},
#  {'ID': 1, 'city': 2, 'desc': 'bes', 'name': 'bar'}),
# ({'ID': 2, 'city': 1, 'desc': 'bnbb', 'name': 'aaaa'},
#  {'ID': 2, 'city': 1, 'desc': 'bbb', 'name': 'aaa'}),
# ({'ID': 2, 'city': 1, 'desc': 'bnbb', 'name': 'aaaa'},
#  {'ID': 2, 'city': 1, 'desc': 'ddd', 'name': 'ccc'}),
# ({'ID': 2, 'city': 1, 'desc': 'dgdd', 'name': 'cgcc'},
#  {'ID': 2, 'city': 1, 'desc': 'bbb', 'name': 'aaa'}),
# ({'ID': 2, 'city': 1, 'desc': 'dgdd', 'name': 'cgcc'},
#  {'ID': 2, 'city': 1, 'desc': 'ddd', 'name': 'ccc'})]
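
For readers who would rather avoid the pandas dependency, the inner join on ID and city can be sketched in plain Python by indexing one list on its key. This is only an illustration of the join idea, not a description of how pandas implements merge; by_key and joined are hypothetical names and the data is trimmed from the question.

```python
from collections import defaultdict

# Trimmed sample data (an assumption; the full lists are in the question).
list1 = [{'name': 'fooa', 'desc': 'bazv', 'city': 1, 'ID': 1},
         {'name': 'bard', 'desc': 'besd', 'city': 2, 'ID': 1},
         {'name': 'aaaa', 'desc': 'bnbb', 'city': 1, 'ID': 2}]
list2 = [{'name': 'foo', 'desc': 'baz', 'city': 1, 'ID': 1},
         {'name': 'bar', 'desc': 'bes', 'city': 2, 'ID': 1},
         {'name': 'aaa', 'desc': 'bbb', 'city': 1, 'ID': 2},
         {'name': 'ccc', 'desc': 'ddd', 'city': 1, 'ID': 2}]

# Index list2 by its (ID, city) key so each lookup is a dict access.
by_key = defaultdict(list)
for row in list2:
    by_key[(row['ID'], row['city'])].append(row)

# Pair every list1 row with the list2 rows sharing its key (inner join).
# .get() avoids adding empty entries to the defaultdict for missing keys.
joined = [(left, right)
          for left in list1
          for right in by_key.get((left['ID'], left['city']), [])]

print(len(joined))  # 4 pairs for this trimmed data
```

Each list is scanned once to build and probe the index, so only rows that actually share a key are ever paired, and the original dicts are kept as-is rather than rebuilt from columns.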

"This is a problem well suited for pandas". Unless he's already using pandas for this project, adding pandas as a dependency (and all its transitive dependencies) isn't a good solution for this problem ;-) — Guybrush, Mar 28 '19 at 14:50
@Guybrush by "this problem" I meant the problem of merging datasets using primary keys — pault, Mar 28 '19 at 14:51

score 1 · Answer 3 · answered Mar 28 '19 at 14:30


Having nested loops is not "not pythonic". However, you can achieve the same result with a list comprehension. I don't think it's more readable though:

[(i, j) for i in list1 for j in list2 if i['ID'] == j['ID'] and i['city'] == j['city']]
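
The order of the for clauses in a comprehension determines the order of the resulting pairs: clauses read left to right as outer loop to inner loop. A small sketch comparing the variants, using data trimmed from the question (the names looped, comprehended and swapped are hypothetical):

```python
# Trimmed sample data (an assumption; the full lists are in the question).
list1 = [{'name': 'aaaa', 'desc': 'bnbb', 'city': 1, 'ID': 2},
         {'name': 'cgcc', 'desc': 'dgdd', 'city': 1, 'ID': 2}]
list2 = [{'name': 'aaa', 'desc': 'bbb', 'city': 1, 'ID': 2},
         {'name': 'ccc', 'desc': 'ddd', 'city': 1, 'ID': 2}]

# Nested loops from the question: list1 outer, list2 inner.
looped = []
for i in list1:
    for j in list2:
        if i['ID'] == j['ID'] and i['city'] == j['city']:
            looped.append((i, j))

# Comprehension with the for clauses in the same outer-to-inner order.
comprehended = [(i, j) for i in list1 for j in list2
                if i['ID'] == j['ID'] and i['city'] == j['city']]

# Swapping the clauses keeps the same pairs but iterates list2 first.
swapped = [(i, j) for j in list2 for i in list1
           if i['ID'] == j['ID'] and i['city'] == j['city']]

print(looped == comprehended)  # True
print(looped == swapped)       # False: same pairs, different order
```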

Answered by Guybrush

Using list comprehension is not always more "pythonic", in this case that just made the nested loops less readable imo. – Delgan Mar 28 '19 at 14:31
I couldn't agree more with you @Delgan. In this specific case, I would personally go for the nested loops solution that is more readable. – Guybrush Mar 28 '19 at 14:33
What do you mean list comprehensions isn't more pythonic? – ritlew Mar 28 '19 at 14:33
@ritlew First, what does "Pythonic" exactly mean? I think readability and simplicity are better goals than "Pythonicity" :-) – Guybrush Mar 28 '19 at 14:34
