1

This is a continuation of the question [Get frequency using two groupings in a dataframe

I have a dataframe as follows:

Gene    Rptname   freq
Gene1   Cha1      1
Gene1   Cha2      1
Gene1   Cha3      1
Gene1   Cha4      1
Gene2   Cha1      1
Gene2   TH1       2
Gene2   KI2       2
Gene2   Cha4      2
Gene2   Cha5      4

I would like to convert the dataframe into the following format

       Cha1  Cha2  Cha3  Cha4  Cha5  KI2  TH1

Gene1   1     1     1     1      0    0    0
Gene2   1     0     0     2      4    2    2

I've got a feeling this is a reshape data issue but I dont know how to reshape the data-

Community
  • 1
  • 1
Sebastian Zeki
  • 6,690
  • 11
  • 60
  • 125

1 Answers1

7

You could do

xtabs(freq~., df1)
#         Rptname
#Gene    Cha1 Cha2 Cha3 Cha4 Cha5 KI2 TH1
# Gene1    1    1    1    1    0   0   0
# Gene2    1    0    0    2    4   2   2

Or

library(reshape2)
acast(df1, Gene~Rptname, value.var='freq', fill=0)
#       Cha1 Cha2 Cha3 Cha4 Cha5 KI2 TH1
#Gene1    1    1    1    1    0   0   0
#Gene2    1    0    0    2    4   2   2

Or use spread to get a 'data.frame' output

library(tidyr)
spread(df1, Rptname, freq, fill=0)

If your original dataset is 'dat', this could be also done using table

table(dat[c('Gene', 'Rptname')])
akrun
  • 874,273
  • 37
  • 540
  • 662