I am trying to count the occurrences of each value in a data frame per row, going from this:
   a  b  c  d  e
1  1  1  0 -1 NA
2  0 -1 -1  1 NA
3 -1  0 NA NA  1
to this:
   a  b  c  d  e count.-1 count.0 count.1 count.NA
1  1  1  0 -1 NA        1       1       2        1
2  0 -1 -1  1 NA        2       1       1        1
3 -1  0 NA NA  1        1       1       1        2
which I am doing like this at the moment:
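library(dplyr)     # provides %>%
library(purrrlyr)  # provides by_row()

# Each by_row() call runs the function once per row (x is a one-row data
# frame) and appends the result as a new column named by .to.
# x[1:8] restricts each count to the first eight columns of the row.
df <- df %>%
  by_row(
    ..f = function(x) {
      sum(is.na(x[1:8]))
    },
    .to = "count_na",
    .collate = "cols"
  ) %>%
  by_row(
    ..f = function(x) {
      sum(x[1:8] == 1, na.rm = TRUE)
    },
    .to = "count_positive",
    .collate = "cols"
  ) %>%
  by_row(
    ..f = function(x) {
      sum(x[1:8] == -1, na.rm = TRUE)
    },
    .to = "count_negative",
    .collate = "cols"
  ) %>%
  by_row(
    ..f = function(x) {
      sum(x[1:8] == 0, na.rm = TRUE)
    },
    .to = "count_neutral",
    .collate = "cols"
  )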
The problem, however, is that for 5 million rows this takes forever to complete (over 3 hours). Is there a better way to do this?
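
For reference, here is a small reproducible version of the example input, together with a vectorized variant that might serve as a starting point. It converts the value columns to a matrix once and counts with base R's rowSums(), instead of calling an R function for every row. The column selection a to e is an assumption taken from the toy example; on the real data it would have to match whatever x[1:8] covers. This is only a sketch and has not been benchmarked on the full 5 million rows.

```r
# Example input from the question (columns a..e).
df <- data.frame(
  a = c(1, 0, -1),
  b = c(1, -1, 0),
  c = c(0, -1, NA),
  d = c(-1, 1, NA),
  e = c(NA, NA, 1)
)

# Assumption: these are the value columns to count over.
value_cols <- c("a", "b", "c", "d", "e")

# Convert once to a numeric matrix so each comparison is vectorized
# over the whole table rather than evaluated row by row.
m <- as.matrix(df[, value_cols])

# m == v yields a logical matrix (NA where m is NA); rowSums() with
# na.rm = TRUE counts the TRUE entries in each row.
df$count_negative <- rowSums(m == -1, na.rm = TRUE)
df$count_neutral  <- rowSums(m == 0,  na.rm = TRUE)
df$count_positive <- rowSums(m == 1,  na.rm = TRUE)

# Missing values are counted separately via is.na().
df$count_na <- rowSums(is.na(m))

print(df)
```

The matrix is built before any count column is added, so the new columns cannot leak into later counts.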