4

I need help with finding the index of the first duplicate element in array. I tried this but it gets me all indices of all duplicate elements.

res = [idx for idx, item in enumerate(input) if item in input[:idx]]
cmaher
  • 5,100
  • 1
  • 22
  • 34
Mark
  • 55
  • 3
  • Please provide more information about your problem with code – Teshie Ethiopia Oct 13 '22 at 16:46
  • @CarlHR Eliminating duplicates and finding the index of the first duplicate are related, but not the same. – CrazyChucky Oct 13 '22 at 16:48
  • Sidenote: you're shadowing `input`. Once you define your own variable named that, you lose access to Python's builtin `input` (and could potentially encounter confusing bugs if you try to use it). – CrazyChucky Oct 13 '22 at 16:50
  • When saying „the first duplicate element“ are you referring to the first duplicate for a specific number or for all different numbers that have duplicates in the array?. – ai2ys Oct 13 '22 at 17:32

4 Answers4

4

Unfortunately your idea is O(n^2), because slicing and checking membership is O(n) and you do this within a loop. It's essentially a nested loop.

Instead, you can do this in O(n) using a single loop:

seen = {}
for i, item in enumerate(my_input):
    if item in seen:
        break
    seen[item] = i
else:
    i = item = None

At the end of the loop, item is the first duplicate item and i is its index. seen[item] will hold the index of the first occurrence of this duplicate item.

wim
  • 338,267
  • 99
  • 616
  • 750
2

tl;dr: use wim's or Mad Physicist's answers. Anything based on the technique you're starting with will do a lot of unnecessary looping. The following isn't so much a solution as some parallel ideas about how your current code could be improved.

The simplest modification to your code would be to simply access the first item of the resulting list:

res = [idx for idx, item in enumerate(values) if item in values[:idx]][0]

But that's kind of silly, since you'll still be traversing the entire list, finding all duplicates, then throwing away all but one. It will also throw a KeyError if the list is empty (that is, if there are no duplicates). A cleaner and more efficient method would be to call next on a version of your list comprehension turned into a generator expression:

res = next((idx for idx, item in enumerate(values) if item in values[:idx]), None)

None is supplied as a default which will be returned if no duplicates are found.

This still has the same efficiency issues that wim points out, but knowing about indexing, next, and generator expressions can be useful!

CrazyChucky
  • 3,263
  • 4
  • 11
  • 25
  • 1
    I figured out how to do it in O(N) if you're willing to initialize a set on a separate line – Mad Physicist Oct 13 '22 at 18:04
  • I usually try to avoid side effects and `or` tricks like that because I come back in a month and have to puzzle out what I did, but that's clever! – CrazyChucky Oct 13 '22 at 18:36
1

You can maintain a set of items you've seen so far:

def first_dup(a):
    seen = set()
    for i, e in enumerate(a):
        if e in seen:
            return i
        seen.add(e)
    return -1

You can isolate the set check into a separate function and combine the rest into an efficient one-liner using next:

def first_dup(a):
    seen = set()
    return next((i for i, e in enumerate(a) if (e in seen or seen.add(e))), -1)

This cheats by using the fact that or will evaluate the second expression any time the first is False, and set.add is an in-place operation that returns None.

Mad Physicist
  • 107,652
  • 25
  • 181
  • 264
0

If you want the first index in the iteration:

array=[2,3,4,5,3]
for i,a in enumerate(array):
    for j in range(i):
        if a==array[j]:
            print(j)
            exit()

If you want the last index in the iteration:

array=[2,3,4,5,3]
for i,a in enumerate(array):
    for j in range(i):
        if a==array[j]:
            print(i)
            exit()