Time series with categories manipulation in R

Question

I have a table with three columns: Date, Individual and Category. I do not know how many individuals there are. I also do not know how many categories there are in total. Here is a toy example that I manually created with two individuals and three categories:

Date         Indiv Cat
3/7/2019    IndivA  1
4/7/2019    IndivA  1
5/7/2019    IndivA  1
6/7/2019    IndivA  1
7/7/2019    IndivA  2
8/7/2019    IndivA  2
9/7/2019    IndivA  3
10/7/2019   IndivA  2
11/7/2019   IndivA  2
3/7/2019    IndivB  1
4/7/2019    IndivB  1
5/7/2019    IndivB  1
6/7/2019    IndivB  2
7/7/2019    IndivB  2
8/7/2019    IndivB  3
9/7/2019    IndivB  3
10/7/2019   IndivB  3
11/7/2019   IndivB  2

Here is the expected out:

Date    Cat1  Cat2  Cat3
3/7/2019    2   0   0
4/7/2019    2   0   0
5/7/2019    2   0   0
6/7/2019    1   1   0
7/7/2019    0   2   0
8/7/2019    0   1   1
9/7/2019    0   0   2
10/7/2019   0   1   1
11/7/2019   0   1   0

The number of column will vary depending on the number of category. The number of rows will vary depending on the date length.

Is there an way to transform the initial table to yield the result that I created manually please ?

Try: `df %>% group_by(Date, Cat) %>% summarise(n = n()) %>% spread(Cat, n)` — Pablo Rod, Aug 02 '19 at 09:25
Hello, Thank you very much! I guess it is the same answer? it works well! — Rene Chan, Aug 02 '19 at 09:35
I suggested an attempt. Cett, who answered after my comment, wrote down a different approach., but then edited to what it's pretty much the code in my comment — Pablo Rod, Aug 02 '19 at 09:48

score 0 · Accepted Answer · answered Aug 02 '19 at 09:26

0

you could use the dplyr package and aggregate by date and Cat:

library(dplyr)
df %>% mutate(Cat = paste0("Cat", Cat)) %>%
  group_by(Date, Cat) %>%
  summarise(n = n()) %>%
  spread(Cat, n)

answered Aug 02 '19 at 09:26

Cettt

11,460
7
35
58

Time series with categories manipulation in R

1 Answers1