How to find first value in a list having no duplicates?

Question

l1 = ['A','B','C','D','A','B']
l2 = []

'C' is the first value in list l1, i want to create a function so that it returns C in l2.

ShadowRanger · Answer 1 · 2018-11-08T05:49:31.617

In 3.6 and higher, this is very easy. Now that dicts preserve insertion order, collections.Counter can be used to efficiently count all elements in a single pass, then you can just scan the resulting Counter in order to find the first element with a count of 1:

from collections import Counter

l1 = ['A','B','C','D','A','B']
l2 = [next(k for k, v in Counter(l1).items() if v == 1)]

Work is strictly O(n), with only one pass of the input required (plus a partial pass of the unique values in the Counter itself), and the code is incredibly simple. In modern Python, Counter even has a C accelerator for counting inputs that pushes all the Counter construction work to the C layer, making it impossible to beat. If you want to account for the possibility that no such element exists, just wrap the l2 initialization to make it:

try:
    l2 = [next(k for k, v in Counter(l1).items() if v == 1)]
except StopIteration:
    l2 = []
    # ... whatever else makes sense for your scenario ...

or avoid exception handling with itertools.islice (so l2 is 0-1 items, and it still short-circuits once a hit is found):

from itertools import islice

l2 = list(islice((k for k, v in Counter(l1).items() if v == 1), 1))

@ShadowRanger Never mind, I think I got the point now (and deleted previous comment). However the insertion order preservation can be rellied upon starting with 3.7 according to this answer https://stackoverflow.com/a/39980744/5378816 — VPfB, Nov 08 '18 at 07:32
@VPfB: Yeah, for 3.6, you can only *rely* on it for CPython (the reference interpreter) and PyPy (which first debuted the `dict` design that was naturally ordered). But as of 3.7 it's a language guarantee, not merely something that works as an implementation detail. — ShadowRanger, Nov 08 '18 at 11:06

Sufiyan Ghori · Answer 2 · 2018-11-08T05:21:18.150

1

You can convert list to string and then compare index of each character from left and right using find and rfind functions of string. It stops counting as soon as the first match is found,

l1 = ['A','B','C','D','A','B']

def i_list(input):
    l1 = ''.join(input)
    for i in l1:
        if l1.find(i) == l1.rfind(i):
            return(i)


print(i_list(l1))

# output
C

edited Nov 08 '18 at 05:21

answered Nov 08 '18 at 05:09

Sufiyan Ghori

18,164
14
82
110

DYZ · Answer 3 · 2018-11-08T05:28:10.173

1

An implementation using a defaultdict:

# Initialize
from collections import defaultdict
counts = defaultdict(int)

# Count all occurrences
for item in l1: 
    counts[item] += 1

# Find the first non-duplicated item
for item in l1:
    if counts[item] == 1: 
        l2 = [item]
        break
else:
    l2 = []

edited Nov 08 '18 at 05:28

answered Nov 08 '18 at 05:15

DYZ

55,249
10
64
93

One liners are more appreciated. – 0xInfection Nov 08 '18 at 05:22
4

@TheInfectedDrake Says who? – DYZ Nov 08 '18 at 05:22
1

Heh. This answer isn't so bad, and back in the Python 2 days, it would be the fastest (`Counter` wasn't C accelerated back then, `defaultdict` is, and neither of them was insertion ordered, so looping over the input a second time was the easiest way to find the first unique item). It's just that `Counter` got faster (specifically for one-liner usage counting a few larger iterables), and `dict`s got insertion ordered (allowing you to avoid reiterating the input iterable), so my solution ends up winning on both performance and simplicity. I'll kick in an upvote though. :-) – ShadowRanger Nov 08 '18 at 05:47
@ShadowRanger I know your solution is faster - I timed all of them :) – DYZ Nov 08 '18 at 05:48

slider · Answer 4 · 2018-11-08T05:51:25.843

1

As a follow up to ShadowRanger's answer, if you're using a lower version of Python, it's not that more complicated to filter the original list so that you don't have to rely on the ordering of the counter items:

from collections import Counter

l1 = ['A','B','C','D','A','B']
c = Counter(l1)
l2 = [x for x in l1 if c[x] == 1][:1]

print(l2)  # ['C']

This is also O(n).

edited Nov 08 '18 at 05:51

answered Nov 08 '18 at 05:35

slider

12,810
1
26
42

1

You don't need `filter`, and you can still short-circuit. `l2 = [next(x for x in l1 if c[x] == 1)]` (with similar `StopIteration` handling if desired). Any use of `filter` that needs a `lambda` should really just be a listcomp or genexpr, so if you don't mind complete iteration a second time, this code could still be improved to `[x for x in l1 if c[x] == 1][:1]`. – ShadowRanger Nov 08 '18 at 05:47
*Any use of filter that needs a lambda should really just be a listcomp or genexpr* That is good to know. Thank you :) I'll skip short circuiting as your comment and answer explain it nicely. – slider Nov 08 '18 at 05:53

score 0 · Answer 5 · answered Nov 08 '18 at 06:45

We can do it also with "numpy".

def find_first_non_duplicate(l1):
    indexes_counts = np.asarray(np.unique(l1, return_counts=True, return_index=True)[1:]).T
    (not_duplicates,) = np.where(indexes_counts[:, 1] == 1)
    if not_duplicates.size > 0:
        return [l1[np.min(indexes_counts[not_duplicates, 0])]]
    else:
        return []

print(find_first_non_duplicate(l1))

# output
['C']

score 0 · Answer 6 · answered Nov 08 '18 at 12:29

A faster way to get the first unique element in a list:

Check each element one by one and store it in a dict.

Loop through the dict and check the first element whose count is

def countElement(a):
    """
    returns a dict of element and its count
    """
    g = {}
    for i in a:
        if i in g: 
            g[i] +=1
        else: 
            g[i] =1
    return g


#List to be processed - input_list
input_list = [1,1,1,2,2,2,2,3,3,4,5,5,234,23,3,12,3,123,12,31,23,13,2,4,23,42,42,34,234,23,42,34,23,423,42,34,23,423,4,234,23,42,34,23,4,23,423,4,23,4] #Input from user


try:
    if input_list: #if list is not empty
        print ([i for i,v in countElement(input_list).items() if v == 1][0]) #get the first element whose count is 1
    else: #if list is empty
        print ("empty list in Input")
except: #if list is empty - IndexError
    print (f"Only duplicate values in list - {input_list}")

This is a slower (thanks to repeated tests for membership and Python level loops, as well as not short-circuiting when it finds the first element with a count of 1), *much* more verbose way of doing what [`collections.Counter` can do for you](https://stackoverflow.com/a/53201925/364696) (and [`defaultdict(int)` could do a lot of it too for that matter](https://stackoverflow.com/a/53201914/364696)), with the same limitations (only works on 3.6+ as it relies on `dict`s being insertion ordered). Generally, if the goal is counting, you want to use `collections.Counter`. — ShadowRanger, Nov 08 '18 at 20:01
@ShadowRanger: Thanks for your valuable suggestions. Short-circuiting is something that I learned in this thread. — Akash Swain, Nov 09 '18 at 03:12

How to find first value in a list having no duplicates?

6 Answers6

A faster way to get the first unique element in a list: