Count the Number of Unique Character Types based on Date in R

Question

I have a dataframe titled FilteredData with many columns. Specifically, there are two columns I am interested in: Date and Sale number.

I want to group all Sale number entries by dates. Date is a date-type field, and Sale number is a character-type field. If I'm not mistaken, I think these types are the reason why other Q&As on S.O. haven't been much help to me.

How can I do this?

I've tried the following:

aggregate(FilteredData$`Sale number`, by FilteredData$Date, FUN = count)
group_by(FilteredData$`Sale number`, FilteredData$Date)

Neither worked, and neither did the solution found here when I tried it.

I tried the following:

library(sqldf)
Freq = sqldf('SELECT Date, COUNT('Sale Number') FROM FilteredData GROUP BY Date')

and it surprisingly worked. However, is there a way to obtain this result without having to use SQL syntax, i.e. something "purely" in R?

Maybe just `table(FilteredData$FundedDate)` or `rowSums(table(FilteredData$FundedDate, FilteredData$FundedDate) > 0)` for unique counts by date. — lmo, May 24 '17 at 20:05
@lmo I've been banging my head around trying to get what you just suggested. Thank you — daOnlyBG, May 24 '17 at 20:08
I notice that `by FilteredData$FundedDate` cannot work. Did you have an = in there? — G5W, May 24 '17 at 20:09
or `rowSums(table(FilteredData$FundedDate, FilteredData$\`Sale number\`) > 0)` for unique counts by date. (typo in previous comment). — lmo, May 24 '17 at 20:12
`by FilteredData$FundedDate` won't parse. You need `by = FilteredData$FundedDate ` — G5W, May 24 '17 at 20:13

MichaelChirico · Answer 1 · 2017-05-24T20:09:47.663

1

You can use data.table as follows:

library(data.table)
setDT(FilteredData)

FilteredData[ , uniqueN(`Sale number`), by = Date]

I'm not sure if dplyr has a tailored function for this... you may just want length(unique(`Sale number`)) there.

edited May 24 '17 at 20:09

answered May 24 '17 at 20:06

MichaelChirico

33,841
14
113
198

regarding your note on the bottom- you're correct, it was a typo on my end. Fixed now, thanks – daOnlyBG May 24 '17 at 20:08

score 1 · Answer 2 · answered May 24 '17 at 20:08

1

Your question is a little unclear... So you want to group by date and then count the number of non-duplicate entries within a date?

dplyr can do this:

FilteredData %>% # take filtered data
  group_by(FundedDate) %>% # group by the date
  subset(!duplicated('Sale number')) %>% # remove rows that are duplicated sales numbers 
  count('Sale number') # count sales numbers

answered May 24 '17 at 20:08

Mark White

1,228
2
10
25

Hello Mark, I just edited my question- let me know if my fixing the variable names clear things up a bit. Thanks – daOnlyBG May 24 '17 at 20:10
Gotcha. Does my answer help? That's how I do grouped counts. – Mark White May 24 '17 at 20:11

Count the Number of Unique Character Types based on Date in R

2 Answers2