I have a dataset that looks like this:

> visascore[239:250,]
# A tibble: 12 x 6
     num article_num paragraph_num date       word          score
   <int>       <int>         <int> <fct>      <chr>         <dbl>
 1    12           2             4 04/12/2017 future        0.228
 2    12           2             4 04/12/2017 priced       -0.280
 3    13           3             1 19/12/2017 summary       0.284
 4    13           3             1 19/12/2017 visa          0.741
 5    13           3             1 19/12/2017 losing       -0.587
 6    13           3             1 19/12/2017 payments      0.238
 7    13           3             1 19/12/2017 ma           -0.275
 8    13           3             1 19/12/2017 visa          0.741
 9    13           3             1 19/12/2017 acquisitions  0.416
10    14           3             2 19/12/2017 ma           -0.275
11    14           3             2 19/12/2017 visa          0.741
12    14           3             2 19/12/2017 access        0.376

What I want to do is sum the values in the "score" column for each paragraph ("paragraph_num") within each article ("article_num"). Is there a way to achieve this? I also thought of working around it by summing over each "num" value (which is basically an uninterrupted sequence number across all paragraphs) and grouping the results into articles afterwards, but I don't know how to do that.

  • I've looked at the question you all said mine was similar to, but the answer provided by @Flo.P was the one I was looking for. I apologize for the duplicate question, though. – TheClutch01 Mar 15 '18 at 12:00

1 Answer

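library(dplyr)

# Sum "score" within each article/paragraph combination
visascore %>%
  group_by(article_num, paragraph_num) %>%
  summarise(sum_by_art_para = sum(score)) %>%
  ungroup()  # drop the grouping by article_num that summarise() leaves behind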
Flo.P
  • This is perfect, exactly what I was looking for, thank you very much. It's also very easy to edit and adapt, so great, thank you again. – TheClutch01 Mar 15 '18 at 11:58
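
For anyone who wants to check the grouped sum without dplyr, here is a minimal base R sketch. It is only an illustration, not part of the accepted answer: `visascore_sample`, `sums_by_paragraph` and `sums_by_num` are names made up here, and the data frame simply retypes the twelve rows printed in the question. The second call covers the workaround mentioned in the question, summing by "num" instead.

```r
# Hypothetical sample: the 12 rows shown in the question, retyped by hand.
visascore_sample <- data.frame(
  num           = c(12, 12, 13, 13, 13, 13, 13, 13, 13, 14, 14, 14),
  article_num   = c(2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3),
  paragraph_num = c(4, 4, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2),
  score         = c(0.228, -0.280, 0.284, 0.741, -0.587, 0.238, -0.275,
                    0.741, 0.416, -0.275, 0.741, 0.376)
)

# Sum score for every (article_num, paragraph_num) combination.
sums_by_paragraph <- aggregate(score ~ article_num + paragraph_num,
                               data = visascore_sample, FUN = sum)

# Sort explicitly so the row order does not depend on aggregate()'s defaults.
sums_by_paragraph <- sums_by_paragraph[
  order(sums_by_paragraph$article_num, sums_by_paragraph$paragraph_num), ]

# The workaround from the question: sum by the running paragraph id "num".
sums_by_num <- aggregate(score ~ num, data = visascore_sample, FUN = sum)

print(sums_by_paragraph)
print(sums_by_num)
```

On these sample rows both approaches should give the same three totals (about -0.052, 1.558 and 0.842), since each "num" value corresponds to exactly one article/paragraph pair here. Whether that holds for the full dataset is an assumption worth checking before relying on "num" alone.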