
I have the following CSV file loaded as a data frame called key: https://www.dropbox.com/s/vy7bxlh2oyvh141/key.csv?dl=0

When I run:

dup <- key[which(duplicated(key$Genotype)), ]

I get a data frame with 100 rows, most of which actually appear unique:

> head(dup)
       Pot  Genotype
193 142698 PI-177384
194 142700 PI-178900
195 142702 PI-179275
196 142704 PI-179276
197 142706 PI-179277
198 142712 PI-179690

Does anyone know the reason for this?

shbrainard
    `duplicated` only returns TRUE the second time a value appears. For example `duplicated(c(1,2,2))` returns FALSE, FALSE, TRUE. The first 2 hasn't been seen before but the second is a duplicate. – MrFlick Mar 04 '20 at 20:06
  • Possible duplicate (or at least very related): https://stackoverflow.com/questions/12495345/find-indices-of-duplicated-rows – MrFlick Mar 04 '20 at 20:09
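To illustrate MrFlick's point: `duplicated()` flags only the second and later occurrences of each value, so the rows it returns look unique on their own because their earlier twins aren't included. A minimal sketch with a made-up stand-in data frame (the real key.csv isn't reproduced here):

    # Toy stand-in for the real key data frame
    toy <- data.frame(
      Pot      = c(1, 2, 3, 4, 5),
      Genotype = c("PI-177384", "PI-178900", "PI-177384", "PI-179275", "PI-178900")
    )

    # Flags only the *second* (and later) occurrence of each Genotype
    duplicated(toy$Genotype)
    #> FALSE FALSE  TRUE FALSE  TRUE

    # Rows returned by the original code: each has an earlier twin that isn't shown
    toy[duplicated(toy$Genotype), ]

    # To see *every* row whose Genotype occurs more than once (both copies):
    toy[duplicated(toy$Genotype) | duplicated(toy$Genotype, fromLast = TRUE), ]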

1 Answer


If you want a data frame with the duplicates removed (one row per Genotype, keeping the first occurrence), you'll have to alter the code to include a !

This should work:

dup <- key[which(!duplicated(key$Genotype)), ]
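Note that `!duplicated()` keeps the first occurrence of each Genotype, i.e. it de-duplicates the data frame rather than listing the duplicates themselves. A short sketch of how you might check the result, assuming `key` is already loaded (the toy frame above works the same way under its own name):

    # Keep one row per Genotype (first occurrence) -- the which() wrapper is
    # harmless but unnecessary; logical indexing works directly
    dedup <- key[!duplicated(key$Genotype), ]

    nrow(dedup)                    # number of distinct Genotype values
    anyDuplicated(dedup$Genotype)  # 0 -- confirms no repeated genotypes remain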
Matt
  • Awesome! Feel free to click the green checkmark to accept the answer if it solved your question. – Matt Mar 04 '20 at 20:18