-3

I have a list of dictionaries as follows:

    a = [{'author':'John','country':'us','gender':'male'},
         {'author':'Sean','country':'uk','gender':'male'},
         {'author':'Sean','country':'russia','gender':'male'},
         {'author':'Mike','country':'japan','gender':'male'}]

Now I want to remove duplicates from this list based only on the author, regardless of the values of the other keys. The output should be as follows, with entry no. 3 removed because its author is repeated:

    a = [{'author':'John','country':'us','gender':'male'},
         {'author':'Sean','country':'uk','gender':'male'},
         {'author':'Mike','country':'japan','gender':'male'}]

Please suggest the shortest way possible!

Prasad Ostwal

2 Answers

2

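The first thing that comes to mind, which should do the trick:

    list(dict([(elem['author'], elem) for elem in a]).values())

Note that when an author appears more than once, this keeps the last matching entry (Sean's 'russia' record here) rather than the first, because later dictionary keys overwrite earlier ones. There might also be a cleaner and/or shorter way.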
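If the first entry per author has to survive, as in the expected output from the question, something along these lines might work. This is only a sketch: `dedupe_first` is a hypothetical helper name, and it assumes Python 3.7+ so that the dict preserves insertion order.

```python
def dedupe_first(records, key):
    """Return records with duplicates of records[i][key] dropped, keeping the first one seen."""
    first_seen = {}
    for record in records:
        # setdefault only stores the record if this key value is not present yet,
        # so later duplicates are ignored instead of overwriting earlier entries
        first_seen.setdefault(record[key], record)
    return list(first_seen.values())


a = [{'author': 'John', 'country': 'us', 'gender': 'male'},
     {'author': 'Sean', 'country': 'uk', 'gender': 'male'},
     {'author': 'Sean', 'country': 'russia', 'gender': 'male'},
     {'author': 'Mike', 'country': 'japan', 'gender': 'male'}]

result = dedupe_first(a, 'author')
print(result)
```

With the sample list this should leave John, Sean (uk) and Mike, in that order. It is not as short as the one-liner, but it makes the "keep the first one" rule explicit.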

tyrrr
1

I think pandas should do this for us:

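    import pandas as pd

    # Build a DataFrame from the list of dicts
    df = pd.DataFrame(a, index=None)

    # Keep the first row for each author, then convert back to a list of dicts
    # (the orient value is 'records'; the abbreviated 'record' is rejected by newer pandas)
    a = df.drop_duplicates(['author']).to_dict(orient='records')

    print(a)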

Outputs:

    [{'author': 'John', 'country': 'us', 'gender': 'male'},
     {'author': 'Sean', 'country': 'uk', 'gender': 'male'},
     {'author': 'Mike', 'country': 'japan', 'gender': 'male'}]

Or, if you care about memory and don't want to keep both a and df around, assign the DataFrame directly to a (a = pd.DataFrame(a, index=None)).

emremrah