Remove duplicates from list of dictionary using specific keys

Question

I have a list of dictionaries as follows:

a = [{'author':'John','country':'us','gender':'male'},
     {'author':'Sean','country':'uk','gender':'male'},
     {'author':'Sean','country':'russia','gender':'male'},
     {'author':'Mike','country':'japan','gender':'male'}]

Now I want to remove duplicates from this list based only on the author, irrespective of the other key values. The output should be as follows, with entry no. 3 removed (its author is repeated):

    a = [{'author':'John','country':'us','gender':'male'},
         {'author':'Sean','country':'uk','gender':'male'},
         {'author':'Mike','country':'japan','gender':'male'}]

Please suggest the shortest way possible!

What did you search for, and what did you find? Based on that, what did you try, and how did it fail? — tripleee, Jan 07 '20 at 19:05

score 2 · Answer 1 · answered Jan 07 '20 at 19:10


The first thing that comes to mind, and it should do the trick:

list(dict([(elem['author'], elem) for elem in a]).values())

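There may well be a cleaner and/or shorter way. Note that a later entry overwrites an earlier one with the same author, so this keeps the last Sean entry (russia) rather than the first one shown in the expected output.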
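If the first entry per author has to survive, as in the question's expected output, a plain loop with a set of already-seen keys is one option. This is only a minimal sketch: the function name dedupe_by and its keys parameter are made up for illustration, and it assumes every dictionary contains the listed keys with hashable values.

```python
def dedupe_by(records, keys=('author',)):
    """Return a new list keeping the first record for each combination of `keys`.

    `dedupe_by` and `keys` are hypothetical names used for this sketch only.
    """
    seen = set()      # key combinations encountered so far
    unique = []       # records kept, in their original order
    for rec in records:
        marker = tuple(rec[k] for k in keys)
        if marker not in seen:
            seen.add(marker)
            unique.append(rec)
    return unique


a = [{'author': 'John', 'country': 'us', 'gender': 'male'},
     {'author': 'Sean', 'country': 'uk', 'gender': 'male'},
     {'author': 'Sean', 'country': 'russia', 'gender': 'male'},
     {'author': 'Mike', 'country': 'japan', 'gender': 'male'}]

result = dedupe_by(a)
print(result)
```

Passing more than one key, e.g. dedupe_by(a, keys=('author', 'country')), treats two entries as duplicates only when all of the listed keys match.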

answered Jan 07 '20 at 19:10

tyrrr


score 1 · Answer 2 · answered Jan 07 '20 at 19:09

I think pandas should do this for us:

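import pandas as pd

df = pd.DataFrame(a, index=None)
# drop_duplicates keeps the first row for each author by default;
# the orient must be spelled 'records' (newer pandas rejects 'record')
a = df.drop_duplicates(['author']).to_dict(orient='records')
print(a)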

Outputs:

[{'author': 'John', 'country': 'us', 'gender': 'male'},
 {'author': 'Sean', 'country': 'uk', 'gender': 'male'},
 {'author': 'Mike', 'country': 'japan', 'gender': 'male'}]

Or, if you care about memory and don't want to keep both a and df around, assign the DataFrame back to a (a = pd.DataFrame(a, index=None)).
