
I specifically started to think about this problem while trying to get the values from a vector that are not repeated. `unique` is not good (as far as I could tell from the documentation) because it also returns the repeated elements, just once each. `duplicated` has the same problem, since it returns FALSE the first time it encounters a value that is duplicated. This was my workaround:

> d=c(1,2,4,3,4,6,7,8,5,10,3)
> setdiff(d,unique(d[duplicated(d)]))
[1]  1  2  6  7  8  5 10

The following is a more general approach

> table(d)->g
> as.numeric(names(g[g==1]))
[1]  1  2  5  6  7  8 10

which we can generalize to counts other than 1. But I find this solution a bit clumsy, converting strings back into numbers. Is there a better or more straightforward way to get this vector?
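For example, the same table of counts `g` can be filtered for any other frequency, say the values that appear exactly twice:

> as.numeric(names(g[g==2]))
[1] 3 4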

  • I think that out of all the proposed answers, your `table` one is the least clumsy one. Efficient, less code, no external packages required. – David Arenburg Sep 30 '14 at 15:17

5 Answers


You could sort the values, then use rle to get the values that appear n times consecutively.

rl <- rle(sort(d))

rl$values[rl$lengths==1]
## [1]  1  2  5  6  7  8 10

rl$values[rl$lengths==2]
## [1] 3 4
– James Trimble

You could also do something like this in base R.

as.numeric(levels(factor(d))[tabulate(factor(d)) == 1])
# [1]  1  2  5  6  7  8 10

I've used factor and levels to make the approach more general (so "d" can include negative values and 0s).
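A quick check with a hypothetical vector containing a zero and negative values (this example is not from the original answer):

d2 <- c(-1, 0, 0, 2, 2, 3)
as.numeric(levels(factor(d2))[tabulate(factor(d2)) == 1])
# [1] -1  3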


Of course, even for something like this, you can expect a performance boost from "data.table", with which you can do something like:

library(data.table)
as.data.table(d)[, .N, by = d][N == 1]$d
# [1]  1  2  6  7  8  5 10
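Since the question asks about generalizing to counts other than 1, only the filter needs to change; for instance, a sketch for values seen exactly twice:

as.data.table(d)[, .N, by = d][N == 2]$d
# [1] 4 3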
– A5C1D2H2I1M1N2O1R2T1

The one-liner here is completely unnecessary, but one-liners are always nice.

Say you want to find all the elements that occur exactly 2 times. Using the plyr package:

library(plyr)
count(d)$x[count(d)$freq == 2]
# Output
# [1] 3 4
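As a small variation (not part of the original answer, and assuming plyr is already loaded as above), storing the result of count(d) avoids evaluating it twice:

cnt <- count(d)
cnt$x[cnt$freq == 2]
# [1] 3 4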
– DMT

You can use `duplicated` for the n = 1 case: just call it twice and use the `fromLast` argument.

sort(d[! (duplicated(d) | duplicated(d, fromLast=TRUE))])
# [1]  1  2  5  6  7  8 10
– Matthew Plourde

I prefer the other answers, but this seemed like a good excuse to test my skills with dplyr:

library(dplyr)
as.data.frame(table(d)) %>%
  filter(Freq == 1) %>%
  select(d)
#    d
# 1  1
# 2  2
# 3  5
# 4  6
# 5  7
# 6  8
# 7 10
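If you prefer a plain numeric vector like the other answers return, a sketch (not from the original answer, and assuming dplyr is still loaded); note that table() stores the values as a factor, hence the double conversion:

res <- as.data.frame(table(d)) %>%
  filter(Freq == 1)
as.numeric(as.character(res$d))
# [1]  1  2  5  6  7  8 10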
– Chase