
I have a list of dictionaries in which the value of one key is not unique:

arr = [{'host': '144.217.103.15', 'port': 3000}, 
       {'host': '158.69.115.201', 'port': 8080},
       {'host': '144.217.103.15', 'port': 1020},]

I want to make the list unique with regard to the 'host' key, so that the final output would be:

result = [{'host': '158.69.115.201', 'port': 8080}, 
          {'host': '144.217.103.15', 'port': 1020},]

or it could be:

result = [{'host': '144.217.103.15', 'port': 3000}, 
          {'host': '158.69.115.201', 'port': 8080},]

What is the Pythonic way of doing so?

etayluz

2 Answers

8

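You could just build a dict keyed on 'host' and extract the values. Later entries overwrite earlier ones with the same host, so the last entry for each host wins. In Python 2: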

>>> { d['host']: d for d in arr }.values()
[{'host': '144.217.103.15', 'port': 1020}, {'host': '158.69.115.201', 'port': 8080}]

In Python 3, `values()` returns a view rather than a list, so convert it explicitly:

>>> list({d['host']: d for d in arr}.values())
[{'host': '144.217.103.15', 'port': 1020}, {'host': '158.69.115.201', 'port': 8080}]

If you want to keep the original order (minus the host duplicates), you could use an OrderedDict:

>>> from collections import OrderedDict
>>> list(OrderedDict( (d['host'], d) for d in arr).values())
[{'host': '144.217.103.15', 'port': 1020}, {'host': '158.69.115.201', 'port': 8080}]
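
On Python 3.7 and later, a plain dict also preserves insertion order, so the same result should be obtainable without OrderedDict. Both variants keep the last entry seen for each host, placed where that host first appeared. A minimal sketch, assuming Python 3.7+; the function names `unique_by_host_last` and `unique_by_host_first` are hypothetical and only for illustration:

```python
arr = [{'host': '144.217.103.15', 'port': 3000},
       {'host': '158.69.115.201', 'port': 8080},
       {'host': '144.217.103.15', 'port': 1020}]

def unique_by_host_last(entries):
    """Keep the last entry per host, ordered by each host's first appearance."""
    # Relies on dicts preserving insertion order (Python 3.7+).
    return list({d['host']: d for d in entries}.values())

def unique_by_host_first(entries):
    """Keep the first entry per host, in original order."""
    by_host = {}
    for d in entries:
        # setdefault stores d only if the host has not been seen yet.
        by_host.setdefault(d['host'], d)
    return list(by_host.values())

last_wins = unique_by_host_last(arr)
first_wins = unique_by_host_first(arr)
print(last_wins)
print(first_wins)
```

The `setdefault` variant is one way to get the question's second acceptable output (port 3000 kept) instead of the first.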

Finally, if you want a list of dictionaries with unique (host, port) pairs, you could use a tuple as the key:

>>> list(OrderedDict(((d['host'], d['port']), d) for d in arr).values())
[{'host': '144.217.103.15', 'port': 3000}, {'host': '158.69.115.201', 'port': 8080}, {'host': '144.217.103.15', 'port': 1020}]
Eric Duminil
  • Thanks @Eric! Now what if I wanted to do this on arr in place, so that the output is back on arr? – etayluz Apr 10 '17 at 15:45
  • @etayluz: It's not really in place because it creates a new list and assigns it back to `arr`, but you could just write `arr = list(OrderedDict( (d['host'], d) for d in arr).values())`. – Eric Duminil Apr 10 '17 at 15:48
  • What if I need to filter by max and min of port? – Siva Sankar May 06 '20 at 10:29
  • @SivaSankar: Something like `[d for d in arr if 2000 < d['port'] < 4000]` possibly? It outputs `[{'host': '144.217.103.15', 'port': 3000}]` – Eric Duminil May 06 '20 at 11:29
  • Assume I don't know the port value; I just need to store according to max/min/order-by first – Siva Sankar May 06 '20 at 13:36
  • @SivaSankar: Sorry, I'm not sure I understand your question. You could write an official question on StackOverflow. If you do, be sure to show what you already tried and what you want to achieve. – Eric Duminil May 06 '20 at 20:22
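
Regarding the in-place question in the comments: reassigning `arr` binds the name to a new list. If other references to the same list object need to see the deduplicated contents, slice assignment is one possible option, since it replaces the contents of the existing list. A minimal sketch; the name `alias` is hypothetical and only there to show that the list object is unchanged:

```python
from collections import OrderedDict

arr = [{'host': '144.217.103.15', 'port': 3000},
       {'host': '158.69.115.201', 'port': 8080},
       {'host': '144.217.103.15', 'port': 1020}]
alias = arr  # second reference to the same list object

# The right-hand side is fully built before the slice is replaced,
# so reading from arr while assigning to arr[:] is safe here.
arr[:] = OrderedDict((d['host'], d) for d in arr).values()

print(arr)
print(alias is arr)
```

A temporary dict is still created, so this is in place only in the sense that the original list object is mutated rather than replaced.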
1

Keeping the first entry:

arr = [{'host': '144.217.103.15', 'port': 3000}, 
       {'host': '158.69.115.201', 'port': 8080},
       {'host': '144.217.103.15', 'port': 1020},]
hosts = set()
out = []
for entry in arr:
    if entry['host'] not in hosts:
        out.append(entry)
        hosts.add(entry['host'])
print(out)

#[{'host': '144.217.103.15', 'port': 3000}, {'host': '158.69.115.201', 'port': 8080}]
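
The loop above could be wrapped in a reusable helper that takes the uniqueness key as a function. This is a sketch rather than part of the original answer; `unique_by` is a hypothetical name, and it assumes the key values are hashable:

```python
def unique_by(items, key):
    """Return items with duplicates (as judged by key(item)) removed,
    keeping the first occurrence and the original order."""
    seen = set()
    out = []
    for item in items:
        k = key(item)
        if k not in seen:
            seen.add(k)
            out.append(item)
    return out

arr = [{'host': '144.217.103.15', 'port': 3000},
       {'host': '158.69.115.201', 'port': 8080},
       {'host': '144.217.103.15', 'port': 1020}]

by_host = unique_by(arr, key=lambda d: d['host'])
by_host_and_port = unique_by(arr, key=lambda d: (d['host'], d['port']))
print(by_host)
print(by_host_and_port)
```

Passing a tuple-returning key, as in the second call, deduplicates on the (host, port) pair instead of the host alone.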
Thierry Lathuille