I have a data frame like this (the first 10 rows and first 6 columns are shown below):

                     Order Ino.RSclayNR1a Ino.RSclayNR1b Ino.RSclayNR1c Ino.RSclayNR2a Ino.RSclayNR2b
OTU107 Gammaproteobacteria              5              4              0              0              0
OTU98       Actinobacteria              0              0              0              0              0
OTU71  Alphaproteobacteria              0              0              0              0              0
OTU79       Actinobacteria              0              5              0              0              0
OTU164 Alphaproteobacteria              0              0              1              0              0
OTU39  Alphaproteobacteria             11              5              5              4              7
OTU45  Alphaproteobacteria              0              0              0              0              0
OTU41       Actinobacteria              0              2              0              1              1
OTU120      Actinobacteria             10             12              5              7              3
OTU110 Alphaproteobacteria              0              0              0              0              0

I want to group the rows that share the same value in the 'Order' column and, for each of the other columns, sum the values within each group.

The result should be a table with 3 rows (Gammaproteobacteria, Actinobacteria, Alphaproteobacteria) and the same number of columns, each column containing the sum of its values over the grouped rows. The row names of the starting table can be dropped.

I have looked through many questions on Stack Overflow and have been trying the dplyr package, but I'm struggling.

Thanks in advance.

Micawber

1 Answer

This should be a bread-and-butter dplyr call:

library(dplyr)
df %>% group_by(Order) %>% summarize_all(sum)
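For anyone who wants to try this on a small example, here is a minimal sketch. The data frame reuses the first five rows of the question, but the column names `NR1a` and `NR1b` are shortened stand-ins I made up for illustration. It also shows the `across()` spelling, which I believe requires dplyr 1.0.0 or later, and a base R alternative with `aggregate()` in case dplyr is not an option.

```r
library(dplyr)

# Illustrative data only: values copied from the first five rows of the
# question; NR1a / NR1b are hypothetical shortened column names.
df <- data.frame(
  Order = c("Gammaproteobacteria", "Actinobacteria", "Alphaproteobacteria",
            "Actinobacteria", "Alphaproteobacteria"),
  NR1a = c(5, 0, 0, 0, 0),
  NR1b = c(4, 0, 0, 5, 0),
  row.names = c("OTU107", "OTU98", "OTU71", "OTU79", "OTU164")
)

# Group by Order and sum every other column.
# Note that the function itself is passed (sum), not a call (sum()).
res <- df %>%
  group_by(Order) %>%
  summarise(across(everything(), sum))

# Base R equivalent: sum all remaining columns within each Order.
res_base <- aggregate(. ~ Order, data = df, FUN = sum)

print(res)
print(res_base)
```

Both results have one row per distinct `Order`, and the OTU row names are dropped, as requested in the question.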

  • Okay, I'm silly: I tried that without realizing I was typing `sum()` instead of `sum`! Thank you. – Micawber Mar 06 '19 at 11:48