Counting dates (as a class) in R

Question

Let's say that I have a simple data frame in R, as follows:

#example data frame
a = c("red","red","green")
b = c("01/01/1900","01/02/1950","01/05/1990")
df = data.frame(a,b)
colnames(df)<-c("Color","Dates")

My goal is to count the number of dates (as a class, not individually) for each value in the "Color" column. So, the result would look like this:

#output should look like this:
a = c("red","green")
b = c("2","1")
df = data.frame(a,b)
colnames(df)<-c("Color","Dates")

Red was associated with two dates -- the dates themselves are unimportant, I'd just like to count the aggregate number of dates per color in the data frame.

score 2 · Answer 1 · answered Jan 06 '17 at 16:40

Or in base R:

sapply(split(df, df$Color), nrow)
# green   red 
#     1     2
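If the result should be a data frame with the "Color" and "Dates" columns shown in the question, rather than a named vector, one possible base R sketch uses `aggregate()`. The name `counts` is only illustrative, and the data frame is rebuilt here so the example runs on its own:

```r
# Rebuild the example data from the question
df <- data.frame(
  Color = c("red", "red", "green"),
  Dates = c("01/01/1900", "01/02/1950", "01/05/1990")
)

# Count the number of Dates entries within each Color
counts <- aggregate(Dates ~ Color, data = df, FUN = length)
counts
#   Color Dates
# 1 green     1
# 2   red     2
```

Note that the formula interface of `aggregate()` drops rows with missing values by default, so a true `NA` date would not be counted.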

Ege Rubak

I like this one best. – Mike Wise Jan 06 '17 at 16:55
This is great. Thank you. A complication, however - let's say there is an NA in red, like this: `a=c("red","red","red","green")` `b=c("01/01/1900","01/02/1950","NA","01/05/1990")` `df=data.frame(a,b)` `colnames(df)<-c("Color","Dates")` ...could we not count the NA somehow? – knaslund Jan 06 '17 at 19:02
You could just start by omitting `NA` values: `df <- na.omit(df)` and then continue as before. It just occurred to me that you can simply use `table(df$Color)` to get what you want if you are really just counting the number of times each color occurs in the table (after removing `NA` values). – Ege Rubak Jan 07 '17 at 17:10
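A minimal sketch of the `NA` case discussed in the comments above. One caveat: the quoted `"NA"` in the example is ordinary text, not a missing value, so it has to be converted before `na.omit()` or `is.na()` will see it. The names `df_na`, `counts_omit` and `counts_sum` are illustrative only.

```r
# Example data from the comment, with the quoted string "NA" in one red row
df_na <- data.frame(
  Color = c("red", "red", "red", "green"),
  Dates = c("01/01/1900", "01/02/1950", "NA", "01/05/1990"),
  stringsAsFactors = FALSE
)

# Turn the literal text "NA" into a real missing value
df_na$Dates[df_na$Dates %in% "NA"] <- NA

# Option 1: drop incomplete rows, then count rows per color
counts_omit <- table(na.omit(df_na)$Color)
counts_omit
# green   red 
#     1     2 

# Option 2: keep all rows and count only the non-missing dates per color
counts_sum <- tapply(!is.na(df_na$Dates), df_na$Color, sum)
counts_sum
# green   red 
#     1     2 
```

Option 2 also reports a count of 0 for a color whose dates are all missing, whereas Option 1 would drop such a color from the result when `Color` is a character column.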

score 1 · Accepted Answer · answered Jan 06 '17 at 16:36

We can use data.table. Here `uniqueN` counts the distinct dates within each color:

library(data.table)
setDT(df)[, .(Dates = uniqueN(Dates)) , Color]
#   Color Dates
#1:   red     2
#2: green     1

akrun

This would work, but what if the dates are not unique? So, in red for example, both dates are "01/01/1900" ? – knaslund Jan 06 '17 at 16:40
@knaslund It will be 1 using this answer. What is your expected output for that case? Do you need `setDT(df)[, .(Dates = .N), Color]`? – akrun Jan 06 '17 at 16:40
ah, yes this seems like it will work fabulously! thank you! – knaslund Jan 06 '17 at 16:46
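The difference between the two counts discussed in these comments (distinct dates versus all rows) can be sketched in base R as well, for readers without data.table. The data frame `df_dup` is a made-up variant of the question's data in which both red dates are identical:

```r
# Variant of the example data where both "red" rows share the same date
df_dup <- data.frame(
  Color = c("red", "red", "green"),
  Dates = c("01/01/1900", "01/01/1900", "01/05/1990"),
  stringsAsFactors = FALSE
)

# All rows per color (comparable to .N in data.table)
n_rows <- tapply(df_dup$Dates, df_dup$Color, length)
n_rows
# green   red 
#     1     2 

# Distinct dates per color (comparable to uniqueN(Dates) in data.table)
n_distinct <- tapply(df_dup$Dates, df_dup$Color, function(x) length(unique(x)))
n_distinct
# green   red 
#     1     1 
```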

score 0 · Answer 3 · answered Jan 06 '17 at 16:53

Using the dplyr package from the tidyverse:

library(dplyr)
df %>% group_by(Color) %>% summarise(n())
# # A tibble: 2 × 2
#    Color `n()`
#   <fctr> <int>
# 1  green     1
# 2    red     2

Mike Wise
