I have two lists containing dictionaries:

list_a:

[{'id': 1, 'name': 'test'}, {'id': 2, 'name': 'test1'},....]

list_b:

[{'id': 1, 'age': 10}, {'id': 2, 'age': 20}, ....]

I want to merge these two lists with the result being:

[{'id': 1, 'name': 'test', 'age': 10}, {'id': 2, 'name': 'test1', 'age': 20}....]

I want to use a nested loop to do it:

result = []
for i in list_a:
    for j in list_b:
        if i['id'] == j['id']:
            i['age'] = j['age']
            result.append(i)

But list_a has 2000 elements. Every id in list_b also appears in list_a, but list_b may have fewer than 2000 elements. The time complexity of this method is too high. Is there a better way to merge them?

pangpang
  • 1
    I guess you wanted `"name": "test1"` in your second merged dict? – Eli Korvigo Apr 06 '16 at 14:09
  • 2
    I'm voting to close this question as off-topic because it's a performance question, not a debugging question, meaning it belongs on Code Review, not Stack Overflow. http://codereview.stackexchange.com/ – ArtOfWarfare Apr 06 '16 at 14:15
  • 3
    Possible duplicate of [join two lists of dictionaries on a single key](http://stackoverflow.com/questions/5501810/join-two-lists-of-dictionaries-on-a-single-key) – Yaron Apr 06 '16 at 14:17
  • @ArtOfWarfare Stack Overflow is for specific questions, not necessarily only for debugging. That said, I consider this question as "too broad" for Stack Overflow because there's too many possible answers, and am voting to close it as such. – Simon Forsberg Apr 06 '16 at 14:19
  • Your intention is not clear to me, because you say nothing about handling common fields upon merge. – Eli Korvigo Apr 06 '16 at 14:25

5 Answers

1

Not really, but dict.setdefault and dict.update probably are your friends for this.

data = {}
lists = [
   [{'id': 1, 'name': 'test'}, {'id': 2, 'name': 'test1'},],
   [{'id': 1, 'age': 10}, {'id': 2, 'age': 20},]
]

for each_list in lists:
   for each_dict in each_list:
       data.setdefault(each_dict['id'], {}).update(each_dict)

Result:

>>> data
{1: {'age': 10, 'id': 1, 'name': 'test'},
 2: {'age': 20, 'id': 2, 'name': 'test1'}}

This way you can look up by id (or just take data.values() if you want a plain list). It's been 20 years since I took my algorithms class, but I guess this is close to O(n), while your sample is closer to O(n²). This solution has some interesting properties: it does not mutate the original lists, it works for any number of lists, and it works for uneven lists containing distinct sets of "id".
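As a minimal sketch, the loop above can be wrapped in a small helper that returns a plain list. The name `merge_by_id` and its `key` parameter are hypothetical, chosen only for illustration, and the extra `id: 3` entry is added to show the uneven-list case. It assumes Python 3.7+, where dicts keep insertion order.

```python
def merge_by_id(*lists, key="id"):
    """Merge dicts that share the same `key` value across any number of lists.

    Hypothetical helper for illustration; not part of the original answer.
    """
    merged = {}
    for each_list in lists:
        for each_dict in each_list:
            # Create an empty dict the first time an id is seen, then fold
            # the current fields into it. The input dicts are never modified.
            merged.setdefault(each_dict[key], {}).update(each_dict)
    return list(merged.values())


list_a = [
    {"id": 1, "name": "test"},
    {"id": 2, "name": "test1"},
    {"id": 3, "name": "test2"},  # no matching entry in list_b
]
list_b = [{"id": 1, "age": 10}, {"id": 2, "age": 20}]

result = merge_by_id(list_a, list_b)
print(result)
```

Each dict is visited once and each dictionary lookup is constant time on average, so the work grows with the total number of dicts rather than with the product of the list lengths.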

Paulo Scardine
0
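answer = {}
# Index list_a by id.
for d in list_a:
    answer[d['id']] = d

for d in list_b:
    # Ids that appear only in list_b get their own entry.
    if d['id'] not in answer:
        answer[d['id']] = d
        continue
    # Otherwise copy the fields into the existing entry
    # (this mutates the dicts from list_a).
    for k, v in d.items():
        answer[d['id']][k] = v

# Back to a list, ordered by id.
answer = [d for k, d in sorted(answer.items(), key=lambda s: s[0])]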
inspectorG4dget
0

No, I think this is the best way because you are joining all data in the simplest data structure.

You can see how to implement it here

I hope my answer will be helpful for you.

Community
iblancasa
0

It can be done in one line, provided the items in list1 and list2 are in the same order by id. Note that this relies on dict.update returning None and that it mutates the dicts in list1.

result = [item1 for item1, item2 in zip(list1, list2) if not item1.update(item2)]

Or, more verbosely:

for item1, item2 in zip(list1, list2):
    item1.update(item2)
    # list1 will be mutated to the result
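Since zip pairs items purely by position, a shorter or differently ordered second list would be merged incorrectly without any error. The following is a rough sketch of a guarded variant that checks the ids before merging. The function name `merge_aligned` and its `key` parameter are hypothetical, and building new dicts instead of calling update is a deliberate change from the code above so that the inputs are left untouched.

```python
def merge_aligned(list1, list2, key="id"):
    """Pairwise merge that refuses to run when the ids do not line up.

    Hypothetical helper for illustration only.
    """
    if len(list1) != len(list2):
        raise ValueError("lists differ in length; zip() would drop the extras")
    merged = []
    for item1, item2 in zip(list1, list2):
        if item1[key] != item2[key]:
            raise ValueError(f"id mismatch: {item1[key]!r} vs {item2[key]!r}")
        # Build a new dict so neither input list is mutated.
        merged.append({**item1, **item2})
    return merged


list1 = [{"id": 1, "name": "test"}, {"id": 2, "name": "test1"}]
list2 = [{"id": 1, "age": 10}, {"id": 2, "age": 20}]

result = merge_aligned(list1, list2)
print(result)
```

If the lists are not guaranteed to be aligned, as in the question where list_b may be shorter, a lookup keyed by id is the safer choice.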
C Panda
0

To find a better way, one needs to know how the values are generated and how they will be used.

For example, if you have them as CSV files, you can use a table-oriented library like pandas (I'll create the frames from your lists, but pandas also provides read_csv):

import pandas as pd
# The DataFrame constructor accepts a list of dicts directly.
df1 = pd.DataFrame([{'id': 1, 'name': 'test'}, {'id': 2, 'name': 'test1'}])
df2 = pd.DataFrame([{'id': 1, 'age': 10}, {'id': 2, 'age': 20}])
# Inner join on the shared 'id' column.
pd.merge(df1, df2, on='id')

Or, if they come from a database, most databases already support JOIN ... ON (MySQL, for example).
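To make the database route concrete, here is a small sketch using Python's built-in sqlite3 module with an in-memory database. SQLite is swapped in for MySQL only because it needs no server, and the table names `a` and `b` are assumptions made for illustration. The JOIN syntax shown is standard SQL and should carry over to other databases.

```python
import sqlite3

list_a = [{"id": 1, "name": "test"}, {"id": 2, "name": "test1"}]
list_b = [{"id": 1, "age": 10}, {"id": 2, "age": 20}]

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets each row be converted to a dict

# Hypothetical tables standing in for wherever the data really lives.
conn.execute("CREATE TABLE a (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE b (id INTEGER PRIMARY KEY, age INTEGER)")
conn.executemany("INSERT INTO a VALUES (:id, :name)", list_a)
conn.executemany("INSERT INTO b VALUES (:id, :age)", list_b)

# Inner join: only ids present in both tables are returned.
rows = conn.execute(
    "SELECT a.id, a.name, b.age FROM a JOIN b ON a.id = b.id ORDER BY a.id"
).fetchall()
result = [dict(row) for row in rows]
conn.close()

print(result)
```

If the data already sits in a database, letting the database do the join avoids pulling both lists into Python at all.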

MSeifert