Does a function like this exist in any package?

isdup <- function(x) duplicated(x) | duplicated(x, fromLast = TRUE)

My intention is to use it with dplyr to display all rows with duplicated values in a given column. I need the first occurrence of the duplicated element to be shown as well.

In this data.frame for instance

dat <- data.frame(l = c("A", "A", "B", "C"), n = 1:4)

> dat
  l n
1 A 1
2 A 2
3 B 3
4 C 4

I would like to display the rows where column l is duplicated, i.e. those with an A value, by doing:

library(dplyr)
dat %>% filter(isdup(l))

which returns

  l n
1 A 1
2 A 2
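
To illustrate why the helper needs both calls, here is a minimal base R sketch. It assumes only the example data above; the vector name `x` is introduced for illustration. `duplicated()` on its own skips the first occurrence, and `fromLast = TRUE` skips the last one, so combining the two with `|` flags every member of a duplicated set.

```r
x <- c("A", "A", "B", "C")

# Flags the second and later occurrences only
dup_forward <- duplicated(x)

# Scans from the end, so it flags everything except the last occurrence
dup_backward <- duplicated(x, fromLast = TRUE)

# Helper from the question: TRUE for every element that occurs more than once
isdup <- function(x) duplicated(x) | duplicated(x, fromLast = TRUE)
flags <- isdup(x)

# The same row filter without dplyr
dat <- data.frame(l = x, n = 1:4)
dup_rows <- dat[isdup(dat$l), ]
print(dup_rows)
```

The base R subsetting on the last lines selects the same two rows as the dplyr call above.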
dmontaner
  • 3
    Why not just use the one you defined? – Rich Scriven May 20 '16 at 17:40
  • 2
    Take a look at [this post](http://stackoverflow.com/questions/37148567/fastest-way-to-remove-all-duplicates-in-r/37149066#37149066) for alternative methods along with an efficiency analysis. – lmo May 20 '16 at 18:47
  • It is just easier if I do not need to write it every time... Thanks for the hints. – dmontaner May 20 '16 at 22:49

1 Answer

dat %>% group_by(l) %>% filter(n() > 1)

I don't know if it exists in any package, but since you can implement it easily, I'd say just go ahead and implement it yourself.
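
For comparison, a minimal sketch that runs both approaches on the question's data, assuming dplyr is installed. The names `by_group` and `by_helper` are illustrative, and the trailing `ungroup()` is an addition so the result does not stay grouped by `l`.

```r
library(dplyr)

dat <- data.frame(l = c("A", "A", "B", "C"), n = 1:4)

# Grouped approach: keep groups of l with more than one row,
# then drop the grouping so later verbs act on the whole table
by_group <- dat %>%
  group_by(l) %>%
  filter(n() > 1) %>%
  ungroup()

# Helper from the question, applied directly to the column
isdup <- function(x) duplicated(x) | duplicated(x, fromLast = TRUE)
by_helper <- dat %>% filter(isdup(l))

print(by_group)
print(by_helper)
```

Both keep the two A rows in their original order. The grouped version returns a tibble, while the helper version returns a plain data.frame.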

Nick Larsen
  • 1
    Thanks! Your solution is clean and works for me. The answers given in the supposed duplicate question links only appeared to deal with duplicates within a single vector, not returning all columns of a data frame where a single column has duplicates. – jNorris Jun 21 '18 at 18:56
  • This works perfectly. – R.S. Jun 24 '18 at 02:37