I am new to RStudio and have what is probably a simple problem that I have not been able to solve. I want to aggregate the rows of my data.frame by the words contained in its first column. The data.frame has five columns: the first contains words, and the second through fifth contain numeric values.

For example, if the data were:

SecondWord  X Y Z Q
NO          1 2 2 1
NO          0 0 1 0
YES         1 1 1 1

I would expect to see a result like:

SecondWord  X Y Z Q
NO          1 2 3 1
YES         1 1 1 1

How can I do this? I have tried the following:

test <- read.csv2("test.csv")
df<-aggregate(.~Secondword,data=test, FUN = sum, na.rm=TRUE)

But the values were not the ones I expected to see. Thank you in advance for your help, and sorry for the "simple" question.

Silvia

2 Answers

You can also use the tidyverse:

library(tidyverse)
df <- test %>%
  group_by(SecondWord) %>%
  summarize_each(funs(sum))

df
# SecondWord     X     Y     Z     Q
#         NO     1     2     3     1
#        YES     1     1     1     1
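
A note for readers on newer dplyr versions: `summarize_each()` and `funs()` have since been deprecated, and `across()` (available from dplyr 1.0.0) is the usual replacement. Below is a minimal sketch of the same grouping, assuming the data matches the example in the question; the `test` data frame is rebuilt by hand here instead of being read from `test.csv`, and the name `result` is just for illustration.

```r
library(dplyr)

# Example data copied from the question (hand-built, not read from test.csv)
test <- data.frame(
  SecondWord = c("NO", "NO", "YES"),
  X = c(1, 0, 1),
  Y = c(2, 0, 1),
  Z = c(2, 1, 1),
  Q = c(1, 0, 1)
)

# Sum every non-grouping column within each SecondWord group
result <- test %>%
  group_by(SecondWord) %>%
  summarise(across(everything(), sum), .groups = "drop")

result
```

Inside a grouped `summarise()`, `everything()` skips the grouping column, so only X, Y, Z and Q are summed. If the real data contains missing values, `across(everything(), ~ sum(.x, na.rm = TRUE))` may be closer to what the question's `na.rm = TRUE` intended.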
MeetMrMet

ddply should work as well.

For example, something like:

library(plyr)
grouped <- ddply(test, "SecondWord", numcolwise(sum))
paoloqaz
  • Generally, you should identify the packages used in the body of the question and refer to help files like `?ddply` rather than using a link that might break a few years from now. – Frank Apr 05 '17 at 16:50
  • Well, I expect Google to still be alive a few years from now. If the link breaks, search for ddply there. – paoloqaz Apr 05 '17 at 17:06