0

I have a dataframe of userid, events and the date that event occurred and I want to create a new dataframe showing how many events were completed on each date per user. I have put an example below of what I am trying to achieve. Date is in d/m/y.

I have tried using gather() from tidyr but couldn't get it to work, and I have also tried using with().

So my my dataframe, df, looks like this currently:

UserId    Date           Event

123       01-01-2019     Open
123       01-01-2019     Push
123       03-01-2019     Open
234       01-01-2019     Open
234       02-01-2019     Open
345       02-01-2019     Push
345       05-01-2019     Open
345       05-01-2019     Push
345       05-01-2019     Move
456       03-01-2019     Open
567       05-01-2019     Open
567       05-01-2019     Push


I want to create a new one based on df, that looks as follows:

UserId     01-01-2019    02-01-2019    03-01-2019    04-01-2019    05-01-2019

123        2             0             1             0             0
234        1             1             0             0             0
345        0             1             0             0             3
456        0             0             1             0             0
567        0             0             0             0             2

So I am essentially trying to get a dataframe of every day over a 90 day period Any help with this would be great. When using with() any dates that didn't have an event associated with them were omitted which I don't want to happen.

J.Wade2
  • 13
  • 3

1 Answers1

0

We can try

library(tidyverse)
df1 %>%
     count(UserId, Date) %>%
     spread(Date, n, fill  = 0)
# A tibble: 5 x 5
#  UserId `01-01-2019` `02-01-2019` `03-01-2019` `05-01-2019`
#   <int>        <dbl>        <dbl>        <dbl>        <dbl>
#1    123            2            0            1            0
#2    234            1            1            0            0
#3    345            0            1            0            3
#4    456            0            0            1            0
#5    567            0            0            0            2

or with table

table(df1[1:2])
akrun
  • 874,273
  • 37
  • 540
  • 662