
I would like to get the unique elements from a list of dictionaries, based on the value of one field, while retaining the other fields.

The data I have is in the following format:

[ {id:"1000", text: "abc", time_stamp: "10:30"},
  {id:"1001", text: "abc", time_stamp: "10:31"},
  {id:"1002", text: "bcd", time_stamp: "10:32"} ]

I would like the following output (unique based on text, but retaining the other fields):

[ {id:"1000", text: "abc", time_stamp: "10:30"}, # earlier time stamp
  {id:"1002", text: "bcd", time_stamp: "10:32"} ]

Note that uniqueness is based on text only, and I would like to retain the id and time_stamp values as well. This question is different from the previously asked question "Python - List of unique dictionaries".

I tried:

Method 1: I collected only the text values from the dictionaries into a list and passed that list to a set to get the unique text values, but this lost the id and time_stamp.

Method 2: I traversed the list of dictionaries and checked whether each text value was already present in unique_list_of_text; if not, I appended the dictionary to a list of unique dictionaries. This code takes a long time, because I am working with a data set of 350,000 records. Is there a better way to do it? Code for method 2:

def find_unique_elements(list_of_elements):
    no_of_elements = len(list_of_elements)
    unique_list_of_text = []
    unique_list_of_elements = []
    for iterator in range(0, no_of_elements):
        # Membership test on a list is a linear scan, so this is slow for large inputs
        if list_of_elements[iterator]['text'] not in unique_list_of_text:
            unique_list_of_text.append(list_of_elements[iterator]['text'])
            unique_list_of_elements.append(list_of_elements[iterator])
    return unique_list_of_elements
Yash Tibrewal

2 Answers


You could build a new list and check whether each item's text has already been seen.

To make it faster, I'd use a better data structure for the lookup, such as a set:

$ cat unique.py

id = 'id'
text = 'text'
time_stamp = 'time_stamp'

data = [ {id:"1000", text: "abc", time_stamp: "10:30"},
   {id:"1001", text: "abc", time_stamp: "10:31"},
   {id:"1002", text: "bcd", time_stamp: "10:32"} ]

keys = set()
unique_items = []
for item in data:
    if item['text'] not in keys:
        unique_items.append(item)
    keys.add(item['text'])

print(unique_items)

$ python unique.py
[{'text': 'abc', 'id': '1000', 'time_stamp': '10:30'}, {'text': 'bcd', 'id': '1002', 'time_stamp': '10:32'}]
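
The same idea can be wrapped in a small reusable helper. This is only a minimal sketch: the name unique_by and its key parameter are made up for illustration and are not part of the code above. It keeps the first dictionary seen for each distinct value, so it returns the earliest time stamp only if the input is already sorted by time.

```python
def unique_by(items, key):
    """Return the first item seen for each distinct item[key], in input order.

    Hypothetical helper, named here for illustration only.
    """
    seen = set()  # set membership is O(1) on average, unlike a list scan
    result = []
    for item in items:
        value = item[key]
        if value not in seen:
            seen.add(value)
            result.append(item)
    return result


data = [
    {"id": "1000", "text": "abc", "time_stamp": "10:30"},
    {"id": "1001", "text": "abc", "time_stamp": "10:31"},
    {"id": "1002", "text": "bcd", "time_stamp": "10:32"},
]

unique_items = unique_by(data, "text")
print(unique_items)
```

Because every lookup goes through the set, the loop makes a single pass over the data, which should scale much better to 350,000 records than checking membership in a list.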
han solo

You can create a dictionary from the reversed list and get values from that dictionary:

id, text, time_stamp = 'id', 'text', 'time_stamp'

l = [ {id:"1000", text: "abc", time_stamp: "10:30"},
  {id:"1001", text: "abc", time_stamp: "10:31"},
  {id:"1002", text: "bcd", time_stamp: "10:32"} ]

# iterating in reverse means the earliest item for each text is written last and wins
d = {i[text]: i for i in reversed(l)}
new_l = list(d.values())
print(new_l)
# [{'id': '1002', 'text': 'bcd', 'time_stamp': '10:32'}, {'id': '1000', 'text': 'abc', 'time_stamp': '10:30'}]

# if the order should be preserved
new_l.reverse()
print(new_l)
# [{'id': '1000', 'text': 'abc', 'time_stamp': '10:30'}, {'id': '1002', 'text': 'bcd', 'time_stamp': '10:32'}]

If the order of the final list is important, use OrderedDict instead of dict in Python 3.6 and below, since a plain dict is only guaranteed to preserve insertion order from Python 3.7 onwards.
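
As a rough sketch of the OrderedDict variant (not the exact code from this answer), setdefault can be used so that only the first dictionary for each text is stored. This avoids reversing the list at all. The variable names here are chosen for illustration.

```python
from collections import OrderedDict

data = [
    {"id": "1000", "text": "abc", "time_stamp": "10:30"},
    {"id": "1001", "text": "abc", "time_stamp": "10:31"},
    {"id": "1002", "text": "bcd", "time_stamp": "10:32"},
]

first_seen = OrderedDict()
for item in data:
    # setdefault stores the item only if this text has no entry yet,
    # so the first occurrence of each text is the one that is kept
    first_seen.setdefault(item["text"], item)

unique_items = list(first_seen.values())
print(unique_items)
```

The result follows the order in which each text first appears in the input, on any Python version that provides collections.OrderedDict.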

Mykola Zotko