I'm sorry for the newbie question but I cannot find an answer.

I have a table with 3 columns:

File  Word  Occurrences
  f1   cat            2
  f1   dog            1
  f2   cat            4
  f2   pig            3

I want to convert it into a table where each row corresponds to a file and each column holds the occurrences of a specific word:

File  Cat  Dog  Pig
  f1    2    1    0
  f2    4    0    3
  • The urls are not the exact dupes. – akrun Nov 11 '15 at 12:44
  • @akrun http://stackoverflow.com/a/5890831/3710546, from first link. –  Nov 11 '15 at 12:52
  • @Pascal Thanks for that url. But as mentioned [here](http://stackoverflow.com/questions/33636641/combine-multiple-rows-with-same-field-in-r/33636750#33636750), it is not a valid reason to close the question (this link was also closed and then reopened based on that). – akrun Nov 11 '15 at 12:55
  • @akrun It is your viewpoint that it is not a dupe. It is for me. That's all. –  Nov 11 '15 at 12:56
  • @Pascal I didn't say it is your viewpoint. But, you can close it. I will undupe it. :-) – akrun Nov 11 '15 at 12:57
  • @Pascal It is also not my viewpoint. It is based on the precedent in that link. – akrun Nov 11 '15 at 12:58
  • My point is that if another clear dupe can be unduped, this can be too. – akrun Nov 11 '15 at 13:03
  • @akrun, for what reason? Just because a mistake was possibly made somewhere else (by "unduping" a duplicate), doesn't mean it should be repeated elsewhere. That's at least my opinion.. – talat Nov 11 '15 at 13:06
  • @akrun, I don't see anything put forward by Ananda here and I don't know what other question you referred to. My comment was general – talat Nov 11 '15 at 13:08
  • @docendodiscimus It is there in the discussion `@Frank, would you know that xtabs sums without either trying it out or trying to interpret the help page for xtabs?` – akrun Nov 11 '15 at 15:25

2 Answers


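Try dcast from reshape2 (df1 is the data frame from the question):

library(reshape2)
# Sum Occurrences per File/Word pair; absent pairs are filled with 0
dcast(df1, File~Word, value.var='Occurrences', fun.aggregate=sum)

Or, in base R:

xtabs(Occurrences~File+Word, df1)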
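
The answer does not define df1, so here is a self-contained sketch of the xtabs route. It assumes df1 holds exactly the four rows from the question; the name `wide` and the conversion back to a data frame with a File column are illustrative additions, not part of the original answer.

```r
# Assumption: df1 is the question's table, rebuilt here so the snippet runs on its own.
df1 <- data.frame(
  File = c("f1", "f1", "f2", "f2"),
  Word = c("cat", "dog", "cat", "pig"),
  Occurrences = c(2, 1, 4, 3)
)

# xtabs sums Occurrences for each File/Word pair; pairs that never occur become 0.
tab <- xtabs(Occurrences ~ File + Word, data = df1)

# Optional: turn the contingency table into a plain data frame
# with File as an ordinary column instead of row names.
wide <- as.data.frame.matrix(tab)
wide <- cbind(File = rownames(wide), wide)
rownames(wide) <- NULL
wide
```

The conversion step is only needed if a data frame is required downstream; the xtabs result can be printed or indexed as a table directly.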
akrun

My attempt without additional packages:

d <- read.table(header=TRUE, text=
'File  Word  Occurrences
f1   cat            2
f1   dog            1
f2   cat            4
f2   pig            3')
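# reshape() names the new columns Occurrences.cat, Occurrences.dog, Occurrences.pig
d.w <- reshape(d, direction="wide", idvar="File", timevar="Word")
# File/Word pairs that never occur come out as NA; set them to 0
d.w[is.na(d.w)] <- 0
d.w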
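
reshape() prefixes each new column with the value column's name, so the result has Occurrences.cat rather than cat. If the bare word names from the question's target table are wanted, a possible follow-up (a sketch, not part of the original answer) is to strip that prefix with sub():

```r
# Same input as in the answer above, repeated so the snippet runs on its own.
d <- read.table(header = TRUE, text = "
File  Word  Occurrences
f1    cat   2
f1    dog   1
f2    cat   4
f2    pig   3")

d.w <- reshape(d, direction = "wide", idvar = "File", timevar = "Word")
d.w[is.na(d.w)] <- 0

# Drop the "Occurrences." prefix that reshape() adds to the new columns.
names(d.w) <- sub("^Occurrences\\.", "", names(d.w))

# Reset the row names, which reshape() carries over from the long table.
rownames(d.w) <- NULL
d.w
```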
jogo