I have a dataset like this:

Year Month Day Location Target Perpetrator
1970  5     1  Place1   x      A
1970  7     5  Place2   y      A
1971  2     3  Place3   x      B
1972  10    8  Place4   x      C
1972  12   13  Place2   y      C
1973  1     3  Place5   z      B

I am totally lost on how to do this. I have tried

data <- data %>%
  distinct() %>%
  count(Perpetrator)

but that only gives me the count of each unique value in "Perpetrator" of course.

The output I want is the count of each unique value in "Perpetrator" by Year. How can I do this?

Vadim Kotov
azura
  • try `data %>% group_by(Year) %>% distinct() %>% count(Perpetrator)` – morgan121 Feb 28 '20 at 05:10
  • this is exactly what I was looking for! I have been trying this for hours, thanks so much – azura Feb 28 '20 at 05:12
  • another way (without resulting in a tibble) is `ddply(data, .(Year), summarise, n = n_distinct(Perpetrator))` from the `plyr` package. I personally prefer this way as I hate tibbles :P – morgan121 Feb 28 '20 at 05:14
  • Does this answer your question? [How to add count of unique values by group to R data.frame](https://stackoverflow.com/questions/17421776/how-to-add-count-of-unique-values-by-group-to-r-data-frame) – morgan121 Feb 28 '20 at 05:17
  • You can `count` multiple variables: `data %>% count(Year, Perpetrator)` – Ronak Shah Feb 28 '20 at 05:18
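
The two readings of "count by Year" that come up in the comments can be compared side by side. This is only a sketch, assuming `dplyr` is installed; the data frame is rebuilt from the question's sample rows, and the result names `per_pair` and `per_year` are illustrative, not from the original thread.

```r
library(dplyr)

# Sample rows from the question, reduced to the two columns used here
data <- data.frame(
  Year = c(1970L, 1970L, 1971L, 1972L, 1972L, 1973L),
  Perpetrator = c("A", "A", "B", "C", "C", "B")
)

# Reading 1: number of rows for each Year/Perpetrator combination
per_pair <- data %>% count(Year, Perpetrator)

# Reading 2: number of distinct perpetrators within each Year
per_year <- data %>%
  group_by(Year) %>%
  summarise(n = n_distinct(Perpetrator))

print(per_pair)
print(per_year)
```

Which reading is wanted depends on the analysis; for this sample, every year has exactly one distinct perpetrator, so the second result is 1 for every year.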

1 Answer

In base R we can use `tapply`; with `FUN=length` it returns the number of rows for each year.

with(dat, tapply(Perpetrator, Year, FUN=length))
# 1970 1971 1972 1973 
#    2    1    2    1 
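
If the goal is instead the number of distinct perpetrators per year, or a count for every Year/Perpetrator pair, the same base R tools can be adapted. The following is a minimal sketch under that assumption; the data frame name `events` and the result names are hypothetical, chosen here so the block runs on its own.

```r
# Sample rows from the question, reduced to the two columns used here
# (`events` is an illustrative name, not from the original post)
events <- data.frame(
  Year = c(1970L, 1970L, 1971L, 1972L, 1972L, 1973L),
  Perpetrator = c("A", "A", "B", "C", "C", "B")
)

# Rows per year: FUN = length counts every row, duplicates included
rows_per_year <- with(events, tapply(Perpetrator, Year, FUN = length))

# Distinct perpetrators per year: drop duplicates before counting
distinct_per_year <- with(
  events,
  tapply(Perpetrator, Year, FUN = function(p) length(unique(p)))
)

# Count for each Year/Perpetrator pair as a contingency table
year_by_perp <- with(events, table(Year, Perpetrator))

print(rows_per_year)
print(distinct_per_year)
print(year_by_perp)
```

`table()` fills combinations that never occur with 0, which may or may not be wanted depending on how the counts are used afterwards.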

Data:

dat <- structure(list(Year = c(1970L, 1970L, 1971L, 1972L, 1972L, 1973L
), Month = c(5L, 7L, 2L, 10L, 12L, 1L), Day = c(1L, 5L, 3L, 8L, 
13L, 3L), Location = c("Place1", "Place2", "Place3", "Place4", 
"Place2", "Place5"), Target = c("x", "y", "x", "x", "y", "z"), 
    Perpetrator = c("A", "A", "B", "C", "C", "B")), row.names = c(NA, 
-6L), class = "data.frame")
jay.sf