
I have a dataset, which I define, for example, like this:

type <- c(1,1,1,2,2,2,2,2,3,3,4,4,5,5)
val <- c(4,1,1,2,8,2,3,2,3,3,4,4,5,7)
tdt <- data.frame(type, val)

So it looks like this:

   type  val
    1     4
    1     1
    1     1
    2     2
    2     8
    2     2
    2     3
    2     2
    3     3
    3     3
    4     4
    4     4
    5     5
    5     7

I want to find how many unique vals each type has (turnover). So the desired result is:

   type  turnover
    1     2
    2     3
    3     1
    4     1
    5     2

How can I get this? What should the function look like? I know how to count the occurrences of each type, but not the number of unique vals within each type.

french_fries

2 Answers


With n_distinct, we can get the number of unique elements grouped by 'type'

library(dplyr)
tdt %>%
      group_by(type) %>%
      summarise(turnover = n_distinct(val))
# A tibble: 5 x 2
#   type turnover
#  <int>    <int>
#1     1        2
#2     2        3
#3     3        1
#4     4        1
#5     5        2

Or with distinct and count

tdt %>%
    distinct() %>%
    count(type)
#  type n
#1    1 2
#2    2 3
#3    3 1
#4    4 1
#5    5 2

Or using uniqueN from data.table

library(data.table)
setDT(tdt)[, .(turnover = uniqueN(val)), type]

Or with table in base R after getting the unique rows

table(unique(tdt)$type)
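
As a rough sketch of how the base R result above could be turned into the requested type/turnover layout: table returns a named count vector rather than a data frame, so the counts are repackaged here. The name turnover_df is only illustrative, and the data are re-created from the question so the snippet runs on its own.

```r
# Data re-created from the question (14 rows)
tdt <- data.frame(
  type = c(1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 4, 4, 5, 5),
  val  = c(4, 1, 1, 2, 8, 2, 3, 2, 3, 3, 4, 4, 5, 7)
)

# Drop duplicated (type, val) pairs, then count the remaining rows per type
counts <- table(unique(tdt)$type)

# Repackage the table as a data frame (turnover_df is an illustrative name)
turnover_df <- data.frame(
  type     = as.numeric(names(counts)),
  turnover = as.vector(counts)
)
turnover_df
```

Note that unique(tdt) compares whole rows, so this only works as intended because tdt has just the two columns; with extra columns, something like unique(tdt[c("type", "val")]) would probably be needed first.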

data

tdt <- structure(list(type = c(1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 
4L, 4L, 5L, 5L), val = c(4L, 1L, 1L, 2L, 8L, 2L, 3L, 2L, 3L, 
3L, 4L, 4L, 5L, 7L)), class = "data.frame", row.names = c(NA, 
-14L))
akrun

Another base R option is using aggregate

tdtout <- aggregate(val~.,tdt,function(v) length(unique(v)))

such that

> tdtout
  type val
1    1   2
2    2   3
3    3   1
4    4   1
5    5   2
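
If the column should be called turnover as in the question, one possible variant (a sketch, not necessarily what the answer intended) is to name the grouping column explicitly in the formula and rename the result afterwards. The data are re-created from the question so the snippet is self-contained.

```r
# Data re-created from the question (14 rows)
tdt <- data.frame(
  type = c(1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 4, 4, 5, 5),
  val  = c(4, 1, 1, 2, 8, 2, 3, 2, 3, 3, 4, 4, 5, 7)
)

# Count distinct val per type; "val ~ type" names the grouping column explicitly
tdtout <- aggregate(val ~ type, data = tdt, FUN = function(v) length(unique(v)))

# Rename the aggregated column to match the desired output
names(tdtout)[names(tdtout) == "val"] <- "turnover"
tdtout
```

Writing val ~ type instead of val ~ . keeps the grouping limited to type even if tdt later gains more columns.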

data

> dput(tdt)
structure(list(type = c(1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 4, 4, 5, 
5), val = c(4, 1, 1, 2, 8, 2, 3, 2, 3, 3, 4, 4, 5, 7)), class = "data.frame", row.names = c(NA,
-14L))
ThomasIsCoding