I have a list of dictionaries that includes some duplicates. What I would like to do is iterate through this list, pick out all of the duplicate items, and then do something with them.

For example, if I have the following list of dictionaries:

animals = [
{'name': 'aardvark', 'value': 1}, 
{'name': 'badger', 'value': 2}, 
{'name': 'cat', 'value': 3},
{'name': 'aardvark', 'value': 4},
{'name': 'cat', 'value': 5}]

I would like to go through the list "animals" and extract the dictionary entries for aardvark and cat (two each), and then do something with them.

For example, something along these lines:

duplicates = []
for duplicate in animals:
    duplicates.append(duplicate)

The output I would like is for the list 'duplicates' to contain:

{'name': 'aardvark', 'value': 1},
{'name': 'cat', 'value': 3},
{'name': 'aardvark', 'value': 4},
{'name': 'cat', 'value': 5}

As always, any help is greatly appreciated and will hopefully go a long way toward helping me learn more about Python.

VXMH

5 Answers

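This collects the repeated occurrences: every item whose name has already been seen earlier in the list. Note that the first occurrence of each name is not included.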

animals = [
{'name': 'aardvark', 'value': 1}, 
{'name': 'badger', 'value': 2}, 
{'name': 'cat', 'value': 3},
{'name': 'aardvark', 'value': 4},
{'name': 'cat', 'value': 5},
{'name': 'lion', 'value': 6}, 
{'name': 'lion', 'value': 6}, 
]

seen = set()      # names encountered so far
dup_list = []     # items whose name was already seen

for item in animals:
    if item["name"] not in seen:
        seen.add(item["name"])
    else:
        dup_list.append(item)

print(dup_list)
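
If the first occurrence of each repeated name is wanted as well, as in the question's desired output, a two-pass variant is one option. This is only a sketch; the names name_counts and all_dups are made up for illustration.

```python
from collections import Counter

animals = [
    {'name': 'aardvark', 'value': 1},
    {'name': 'badger', 'value': 2},
    {'name': 'cat', 'value': 3},
    {'name': 'aardvark', 'value': 4},
    {'name': 'cat', 'value': 5},
]

# First pass: count how many times each name appears.
name_counts = Counter(item['name'] for item in animals)

# Second pass: keep every item whose name appears more than once,
# preserving the original order of the list.
all_dups = [item for item in animals if name_counts[item['name']] > 1]

print(all_dups)
```

Both passes are linear, so this should stay O(n) overall.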
Dinesh Kumar

You can sort the names of all the animals so that duplicates end up next to each other. Sorting takes O(n log n) time. Note that this collects the duplicated names, not the dictionaries themselves.

names = [a['name'] for a in animals]
names.sort()
duplicates = []
prev, curr = None, None
for n in names:
    if prev is None:
        prev = n
        continue
    curr = n
    if curr == prev:
        duplicates.append(n)
    prev = curr
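
To get the dictionaries rather than just the names, the same sort-then-compare-neighbours idea can be sketched with itertools.groupby. This is an assumption about what is wanted, not part of the approach above; by_name and duplicate_items are illustrative names.

```python
from itertools import groupby
from operator import itemgetter

animals = [
    {'name': 'aardvark', 'value': 1},
    {'name': 'badger', 'value': 2},
    {'name': 'cat', 'value': 3},
    {'name': 'aardvark', 'value': 4},
    {'name': 'cat', 'value': 5},
]

# Sort by name so equal names are adjacent. sorted() is stable, so items
# sharing a name keep their original relative order.
by_name = sorted(animals, key=itemgetter('name'))

duplicate_items = []
for name, group in groupby(by_name, key=itemgetter('name')):
    group = list(group)
    if len(group) > 1:  # this name occurs more than once
        duplicate_items.extend(group)

print(duplicate_items)
```

The result is grouped by name rather than kept in the original list order.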
Gianluca Micchi

For this you can iterate through the list with two nested for loops, checking every possible pair and comparing the names to see whether they match. This makes O(n^2) comparisons. Something like this:

animals = [
{'name': 'aardvark', 'value': 1}, 
{'name': 'badger', 'value': 2}, 
{'name': 'cat', 'value': 3},
{'name': 'aardvark', 'value': 4},
{'name': 'cat', 'value': 5}
]

duplicates = []
for i in range(len(animals)):
    for j in range(i+1, len(animals)):
        if animals[i]['name'] == animals[j]['name']:
            duplicates.extend([animals[i], animals[j]])

print(duplicates)
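
One caveat with the pairwise loop: if a name appears three or more times, the same item is added once for every pair it belongs to. A sketch of one way around that is to collect indices in a set instead. The third aardvark entry below is hypothetical data added only to show the case.

```python
animals = [
    {'name': 'aardvark', 'value': 1},
    {'name': 'badger', 'value': 2},
    {'name': 'aardvark', 'value': 4},
    {'name': 'aardvark', 'value': 6},  # hypothetical third entry, not in the question
]

dup_indices = set()
for i in range(len(animals)):
    for j in range(i + 1, len(animals)):
        if animals[i]['name'] == animals[j]['name']:
            # Record positions rather than items, so each item is kept once.
            dup_indices.update((i, j))

# Rebuild the result in original list order.
duplicates = [animals[k] for k in sorted(dup_indices)]
print(duplicates)
```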
Toni Sredanović

With good old defaultdict:

from collections import defaultdict
import pprint

d = defaultdict(list)
animals = [
    {'name': 'aardvark', 'value': 1}, {'name': 'badger', 'value': 2},
    {'name': 'cat', 'value': 3}, {'name': 'aardvark', 'value': 4},
    {'name': 'cat', 'value': 5}]

for an in animals:
    d[an['name']].append(an)

dups = [v for k,v in d.items() if len(v) > 1]
pprint.pprint(dups)

The output (list of lists/dups):

[[{'name': 'aardvark', 'value': 1}, {'name': 'aardvark', 'value': 4}],
 [{'name': 'cat', 'value': 3}, {'name': 'cat', 'value': 5}]]
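
If a single flat list is preferred over a list of lists, the groups can be chained together. A minimal sketch, assuming the same grouping as above; flat_dups is an illustrative name.

```python
from collections import defaultdict
from itertools import chain

animals = [
    {'name': 'aardvark', 'value': 1}, {'name': 'badger', 'value': 2},
    {'name': 'cat', 'value': 3}, {'name': 'aardvark', 'value': 4},
    {'name': 'cat', 'value': 5}]

# Group the items by name, as in the defaultdict approach.
d = defaultdict(list)
for an in animals:
    d[an['name']].append(an)

# Keep only groups with more than one member, then flatten them.
dups = [v for v in d.values() if len(v) > 1]
flat_dups = list(chain.from_iterable(dups))

print(flat_dups)
```

The flat list is ordered group by group (all aardvarks, then all cats), not by original position.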
RomanPerekhrest

To achieve what you want, you can transform your animals data into a pandas DataFrame, like this:

import pandas as pd
df = pd.DataFrame(animals)

You'll obtain a table like this:

       name  value
0  aardvark      1
1    badger      2
2       cat      3
3  aardvark      4
4       cat      5

Pandas DataFrames are structures that help you manipulate data. (https://pandas.pydata.org/pandas-docs/stable/getting_started/index.html)

You can perform a lot of operations on them, for instance detecting duplicates as follows:

# Using the duplicated() method; keep=False marks every occurrence as a duplicate
df.duplicated(subset=['name'], keep=False)
# It returns a boolean Series indexed like the DataFrame:
0     True
1    False
2     True
3     True
4     True

Once you know which rows are duplicates, you can filter your data like this and obtain the desired result:

duplicates = df[df.duplicated(subset=['name'], keep = False)]
# Gives you the following output:

       name  value
0  aardvark      1
2       cat      3
3  aardvark      4
4       cat      5

Good luck learning Python!

Clem G.