I have a data.frame which looks more or less like this one but with 195780 observations:
structure(list(fecha = structure(c(17897, 17897, 17897, 17897,
17897, 17897, 17897, 17897, 17897, 17897, 17897, 17897, 17897,
17897, 17897, 18255, 18255, 18255, 18255, 18255, 18255, 18256,
18256, 18256, 18256, 18256, 18256, 18256, 18256, 18256, 18256,
18358, 18358, 18358, 18358, 18358, 18358, 18358, 18358, 18358,
18358, 18358, 18358, 18358, 18358, 18358, 18358, 18900, 18900,
18900, 18900, 18900, 18900, 18900, 18900, 18900, 18900, 18900
), class = "Date"), año = structure(c(1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L), .Label = c("2019", "2020", "2021"), class = "factor"),
mes = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L,
12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 9L, 9L, 9L, 9L,
9L, 9L, 9L, 9L, 9L, 9L, 9L), .Label = c("Enero", "Febrero",
"Marzo", "Abril", "Mayo", "Junio", "Julio", "Agosto", "Septiembre",
"Octubre", "Noviembre", "Diciembre"), class = "factor"),
tipo_dia = c("F", "F", "F", "F", "F", "F", "F", "F", "F",
"F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "F",
"L", "L", "L", "L", "L", "L", "L", "L", "L", "L", "L", "L",
"L", "L", "L", "L", "L", "L", "L", "L", "L", "L", "L", "L",
"L", "L", "L", "L", "L", "L", "L", "L", "L", "L", "L", "L",
"L")), row.names = c(NA, -58L), class = c("tbl_df", "tbl",
"data.frame"))
What I want is to obtain the count of tipo_dia
for year and month but just the count of unique values i.e.: there are N "F" days in Enero 2019, K "L" days in Enero 2020, and so on.
I've tried
df%>%
group_by(año,mes,tipo_dia)%>%
summarise(tipo_dia_conteo=sum(unique(tipo_dia)))
Which gives me this error:
Error: Problem with `summarise()` column `tipo_dia_conteo`.
i `tipo_dia_conteo = sum(unique(tipo_dia))`
And also:
df%>%
group_by(año,mes,tipo_dia)%>%
summarise(tipo_dia_conteo=length(unique(tipo_dia)))
Which gives me a result that I clearly don't want: (Note that here appears some "S" because I'm working with my full database)
# A tibble: 121 x 4
# Groups: año, mes [33]
año mes tipo_dia tipo_dia_conteo
<fct> <fct> <chr> <int>
1 2019 Enero D 1
2 2019 Enero F 1
3 2019 Enero L 1
4 2019 Enero S 1
5 2019 Febrero D 1
6 2019 Febrero F 1
7 2019 Febrero L 1
8 2019 Febrero S 1
9 2019 Marzo D 1
10 2019 Marzo F 1
So basically what I expect is the number of "D" days for Enero 2019, "L" and "S" days for same month and so on to September 2021. I feel like kinda stuck so any help will be much appreciated.