I want to count the occurrences of the three factor levels for each row of mydata, so I thought of using the function table.
A sample of mydata:
          A0AUT     A0AYT     A0AZT     A0B2T     A0B3T
100130426 no_change no_change no_change no_change no_change
100133144 no_change no_change down      no_change no_change
100134869 no_change no_change no_change no_change no_change
10357     no_change up        no_change no_change up
10431     no_change up        no_change no_change no_change
136542    no_change up        no_change no_change no_change
> str(mydata)
'data.frame': 20531 obs. of 518 variables:
$ A0AUT: Factor w/ 3 levels "down","no_change",..: 2 2 2 2 2 2 2 2 2 2 ...
$ A0AYT: Factor w/ 3 levels "down","no_change",..: 2 2 2 3 3 3 2 2 2 3 ...
$ A0AZT: Factor w/ 3 levels "down","no_change",..: 2 1 2 2 2 2 1 2 2 2 ...
$ A0B2T: Factor w/ 3 levels "down","no_change",..: 2 2 2 2 2 2 1 2 2 2 ...
$ A0B3T: Factor w/ 3 levels "down","no_change",..: 2 2 2 3 2 2 2 2 2 2 ...
$ A0B5T: Factor w/ 3 levels "down","no_change",..: 2 2 2 3 2 2 2 2 2 2 ...
$ A0B7T: Factor w/ 3 levels "down","no_change",..: 2 2 2 2 2 2 1 2 2 2 ...
$ A0B8T: Factor w/ 3 levels "down","no_change",..: 2 1 1 2 3 2 2 2 2 2 ...
$ A0BAT: Factor w/ 3 levels "down","no_change",..: 2 2 2 2 2 2 2 2 2 2 ...
$ A0BCT: Factor w/ 3 levels "down","no_change",..: 2 2 2 2 3 2 2 2 2 2 ...
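To make this reproducible without the full data set, here is a minimal sketch that rebuilds only the six rows and five columns shown above. The names toy and lv are made up for illustration and are not part of my real code.

```r
# Hypothetical stand-in for mydata: only the rows and columns shown above.
lv <- c("down", "no_change", "up")
toy <- data.frame(
  A0AUT = c("no_change", "no_change", "no_change", "no_change", "no_change", "no_change"),
  A0AYT = c("no_change", "no_change", "no_change", "up", "up", "up"),
  A0AZT = c("no_change", "down", "no_change", "no_change", "no_change", "no_change"),
  A0B2T = c("no_change", "no_change", "no_change", "no_change", "no_change", "no_change"),
  A0B3T = c("no_change", "no_change", "no_change", "up", "no_change", "no_change"),
  row.names = c("100130426", "100133144", "100134869", "10357", "10431", "136542"),
  stringsAsFactors = FALSE
)
# Give every column the same three levels, as in str(mydata).
toy[] <- lapply(toy, factor, levels = lv)
str(toy)
```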
Now I do:
occurences <- apply(mydata, 1, table)
> occurences[[1]] # 100130426
no_change        up
      508        10
> occurences[[2]] # 100133144
     down no_change        up
       45       446        27
But I want them as a matrix (or at least I think that would be easier to deal with), so I wrote this:
freq <- sapply(occurences, function(x) {
  c(x, rep(0, 3 - length(x)))
})
> freq[,1:5]
          100130426 100133144 100134869 10357 10431
no_change       508        45        14     3     3
up               10       446       411   330   268
                  0        27        93   185   247
However, as you can see, the no_change count for 100133144 (446) ended up in the up row, and its down count (45) ended up in the no_change row!
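As far as I can tell, the misplacement comes from padding by position: the zeros are appended at the end regardless of which name is missing, and sapply seems to take the row names from the first element only. A small sketch on made-up vectors (rows, tabs and freq_bad are hypothetical names) reproduces it:

```r
# Two made-up "rows": the first one has no "down" values at all.
rows <- list(
  r1 = c("no_change", "no_change", "no_change", "no_change", "up"),
  r2 = c("down", "down", "down", "no_change", "up")
)
tabs <- lapply(rows, table)

# Same padding as in my code: zeros go to the end, names are ignored.
freq_bad <- sapply(tabs, function(x) c(x, rep(0, 3 - length(x))))
freq_bad

# The row labels come from r1 ("no_change", "up", ""), so the "down"
# count of r2 (3) is filed under "no_change".
rownames(freq_bad)
```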
My expected output would be:
> freq[,1:5]
          100130426 100133144 100134869 10357 10431
up               10        27        93   185   247
no_change       508       446       411   330   268
down              0        45        14     3     3
How can I make sure each count lands in the right row? As you can see, each table may have anywhere from one to three elements, so doing:
freq <- matrix(unlist(occurences), nrow=3)
does not work, because the total number of values is not a multiple of 3.
I may have taken a bad approach to counting the frequencies in mydata. I would prefer a solution that uses only base R, without loading any packages.
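One base R idea I have been considering, sketched here on a tiny made-up data frame (toy, lv and freq_fixed are hypothetical names, not my real objects): re-create a factor with a fixed set of levels inside the function, so that table always returns three counts, zeros included. I have not checked how this behaves on the full 20531 x 518 data.

```r
lv <- c("down", "no_change", "up")

# Made-up data: 4 "genes" (rows) by 3 "samples" (columns).
toy <- data.frame(
  s1 = c("no_change", "down", "up", "down"),
  s2 = c("no_change", "no_change", "up", "down"),
  s3 = c("up", "no_change", "up", "no_change"),
  row.names = c("g1", "g2", "g3", "g4"),
  stringsAsFactors = TRUE
)

# as.matrix() turns the factors into strings; factor(r, levels = lv)
# restores the full level set for each row, so every table has length 3
# and apply() can bind the results into a 3 x nrow(toy) matrix.
freq_fixed <- apply(as.matrix(toy), 1, function(r) table(factor(r, levels = lv)))
freq_fixed
```

With every table having the same names in the same order, no padding is needed and the row labels should apply to all columns.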