List index has duplicate index numbers

Question

type(train_x)
numpy.ndarray

train_samples = train_x.tolist()

When I print the index of each sample, some index values repeat and appear out of order (see the lines marked `# here` below). Why might this be happening?

This breaks my pipeline downstream, although it sometimes runs fine when the indices happen to come out in order.

for tr in train_samples:
    print(train_samples.index(tr))

...
11
12
13
14 # here
15
...
39
40
41
42
14 # here
...

Proving answer about duplicate entries:

@AndrasDeak now that I understand why it's happening, it is definitely a duplicate... but not even close to that question — Kermit, Apr 10 '20 at 21:42
If the linked question applies to your problem, you should be using `enumerate`. `list.index` has to search your list from the start each time. — Andras Deak -- Слава Україні, Apr 10 '20 at 21:45
Can you clarify your question? _When I print the index of my samples, you can see that there are duplicates that are out of order._ ..... _sometimes it runs fine when the index decides to preserve itself._ What `index`? `index()` is the list method, right? — AMC, Apr 10 '20 at 21:48
@AMC SO auto intelligence wouldn't let me submit the question title as to "why it was duplicated" — Kermit, Apr 10 '20 at 21:49
If you later want to shuffle your items (judging from the variable name) you can generate an index array `np.arange(train_x.size)`, and use a random shuffling index array to shuffle the data and these indices simultaneously. — Andras Deak -- Слава Україні, Apr 10 '20 at 21:50
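
The shuffling idea from the last comment could look roughly like the sketch below. The small `train_x` array and the seed are made up for illustration, and `len(train_x)` is used instead of `train_x.size` on the assumption that each sample is one entry along the first axis.

```python
import numpy as np

# Hypothetical stand-in for train_x; note the repeated value 14.0
train_x = np.array([10.0, 14.0, 12.0, 14.0, 11.0])

# One index per sample, independent of the sample values
indices = np.arange(len(train_x))

# A single random permutation, applied to both arrays
rng = np.random.default_rng(0)  # seed chosen only for repeatability
perm = rng.permutation(len(train_x))

shuffled_x = train_x[perm]
shuffled_indices = indices[perm]

# Each shuffled sample can still be traced back to its original position
assert np.array_equal(train_x[shuffled_indices], shuffled_x)
```

Because the indices travel with the data, duplicate values never need to be looked up by value after the shuffle.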

ApproachingDarknessFish · Accepted Answer · 2020-04-10T21:42:08.323

The index method searches from the front of the list and returns the position of the first match, so if your data contains duplicate values, every duplicate reports the index of its first occurrence.

>>> values = ['a', 'b', 'c', 'a']
>>> for v in values:
...  print("value", v, "occurs at index", values.index(v))
... 
value a occurs at index 0
value b occurs at index 1
value c occurs at index 2
value a occurs at index 0

From the docs for list.index (emphasis added):

Return the index in the list of the _first_ item whose value is x. It is an error if there is no such item.
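
If the goal is the position of each item rather than the position of its first match, `enumerate` (as suggested in the comments on the question) is the usual tool. A minimal sketch, reusing the `values` list from the example above:

```python
values = ['a', 'b', 'c', 'a']

# enumerate yields (position, item) pairs, so no search is performed
positions = [i for i, v in enumerate(values)]
print(positions)  # [0, 1, 2, 3]

# list.index, by contrast, reports the first match for every duplicate
first_matches = [values.index(v) for v in values]
print(first_matches)  # [0, 1, 2, 0]
```

This also avoids rescanning the list from the start on every iteration, which is what `list.index` does.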

edited Apr 10 '20 at 21:42

answered Apr 10 '20 at 21:36

ApproachingDarknessFish

Ah, yeah I checked for dupes but was checking 15 not 14 messing up the effect of 0 – Kermit Apr 10 '20 at 21:39
