Checking for duplicates in a list in Python

Question

dataset:
raw_data = [[1, "John", 23, 32], [1, "Jane", 10, 20], [1, "Max", 90, 70], [2, "Harry", 32, 56]]

list = []
for i in raw_data:
    if i[0] in list:
        x = i[0] + 0.1
        list.append(x)
    else:
        list.append(i[0])

I would actually like to obtain list = [1, 1.1, 1.2, 2]

However, my code is giving me list = [1, 1.1, 1.1, 2]

How can I loop over my list so that each further duplicate of a number gets another 0.1 added?

Mureinik · Answer 1 · 2017-09-04T06:47:17.877

You could use a dictionary to cache the increments:

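cache = {}
result = []
for i in raw_data:
    key = i[0]
    if key in cache:
        # Seen before: bump the last value issued for this key by 0.1.
        cache[key] += 0.1
    else:
        # First occurrence: start from the key itself.
        cache[key] = key

    result.append(cache[key])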

EDIT:
Using a defaultdict would save the condition inside the loop. Whether or not it's more elegant is in the eye of the beholder, though:

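from collections import defaultdict

# Number of times each key has been seen so far; unseen keys start at 0.
seen = defaultdict(int)
result = []
for i in raw_data:
    key = i[0]
    # Offset each repeat of the key by a further 0.1.
    result.append(key + 0.1 * seen[key])
    seen[key] += 1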

edited Sep 04 '17 at 06:47

answered Sep 04 '17 at 06:41


I read through your code, but I can't quite understand what you are doing. Also, the output of both snippets is not quite right. Output: `[1, 1.1, 1.2000000000000002, 1]` – Tim Ong Sep 04 '17 at 07:22
@TimOng Essentially, I'm keeping a cache for each value encountered, and incrementing it by `0.1` each time it's encountered. The result you're seeing is just the built-in inaccuracy of floating-point math (try, e.g., just evaluating `1.1+0.1` in Python). You could check out the solutions to [this question](https://stackoverflow.com/q/783897/2422776) regarding truncating it to one decimal place. – Mureinik Sep 04 '17 at 09:02
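To illustrate the floating-point remark above, here is a minimal sketch that counts occurrences per key and rounds each suffixed value to one decimal place with the built-in `round`. The names `seen`, `row`, `key`, and `n` are illustrative and not from the original answer, and the name strings in `raw_data` are quoted here so the list is valid Python.

```python
# Sample data from the question (name strings quoted so the literal is valid).
raw_data = [[1, "John", 23, 32], [1, "Jane", 10, 20], [1, "Max", 90, 70], [2, "Harry", 32, 56]]

# Repeated float addition accumulates representation error:
print(1.1 + 0.1)  # 1.2000000000000002

seen = {}    # key -> number of times the key has appeared so far
result = []
for row in raw_data:
    key = row[0]
    n = seen.get(key, 0)
    # First occurrence keeps the key unchanged; later ones get a rounded 0.1 * n suffix.
    result.append(round(key + 0.1 * n, 1) if n else key)
    seen[key] = n + 1

print(result)  # [1, 1.1, 1.2, 2]
```

One caveat with this scheme, whichever variant is used: once a key has ten or more occurrences, the suffix reaches 1.0 and the value collides with the next integer key, so it is probably only suitable when duplicates are few.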
