How can I summarize the multiple columns?

Question

I am trying to summarize multiple columns at once. My raw dataset is below:

   neg_sentiment neu_sentiment pos_sentiment age
1           1270           609           594  39
2            374           150           260  77
3             30            23           138  67
4             20            30           138  59
5             78           194           566  70
6            420           275           609  50
7           1233           447           745  57
8           1096           390          1022  43
9           1688           704           888  61
10           667           563          2051  69

My desired output looks like this:

  new_age         pos_sentiment  neu_sentiment  neg_sentiment
  <chr>              <int>           <int>           <int>
1 age(old) > 60       1000            1000            500
2 age(young) < 60     1000            1000            500

But I could only summarize one column:

df_1 %>%
  group_by(new_age = c('age(old) > 60', 'age(young) < 60')[(age < 60) + 1]) %>%
  summarise(neg_sentiment = sum(neg_sentiment))

Could you help me produce the desired output?

The dput output is below:

    structure(list(neg_sentiment = c(1270L, 374L, 30L, 20L, 78L,  420L, 1233L, 1096L, 1688L, 667L, 260L, 315L, 1060L, 5089L, 647L, 3691L, 3217L, 4155L, 13345L, 6121L, 478L, 2152L, 49863L, 4224L, 267L, 1660L, 1340L, 148L, 2660L, 484L, 2495L, 603L, 379L, 6340L, 951L, 86L, 7389L, 7459L, 1363L, 597L, 12915L, 1642L, 1808L, 352L, 13970L, 2740L, 10351L, 664L), neu_sentiment = c(609L, 150L, 23L, 30L, 194L, 275L, 447L, 390L, 704L, 563L, 197L, 245L, 403L, 2972L, 229L, 1453L, 1012L, 1686L, 4229L, 2429L, 168L, 955L, 18109L, 
1889L, 165L, 984L, 664L, 78L, 967L, 198L, 960L, 208L, 190L, 1672L, 562L, 35L, 2840L, 2477L, 473L, 439L, 6102L, 816L, 881L, 238L, 4292L, 1026L, 4705L, 524L), pos_sentiment = c(594L, 260L, 138L, 138L, 566L, 609L, 745L, 1022L, 888L, 2051L, 375L, 582L, 394L, 6480L, 215L, 1254L, 1049L, 1676L, 3404L, 1890L, 177L, 2621L, 28169L, 2111L, 559L, 4103L, 1348L, 354L, 1556L, 528L, 1256L, 252L, 488L, 3661L, 1740L, 57L, 2381L, 2552L, 491L, 1376L, 2002L, 876L, 2214L, 799L, 5522L, 2417L, 4606L, 2010L), age = c(39, 77, 
67, 59, 70, 50, 57, 43, 61, 69, 57, 49, 51, 63, 69, 63, 74, 77, 43, 56, 54, 80, 70, 49, 67, 73, 69, 70, 57, 48, 55, 54, 67, 60, 54, 68, 66, 87, 74, 67, 64, 66, 62, 91, 83, 60, 40, 74)), class = "data.frame", row.names = c(NA, -48L))

Use `summarise_all()`, i.e. `df %>% group_by(new_age = c('age(old) > 60', 'age(young) < 60')[(age < 60) + 1]) %>% summarise_all(sum)` — Sotos, Jan 15 '20 at 13:12
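
A minimal sketch of the same idea, assuming dplyr 1.0.0 or later, where `across()` is available. It rebuilds only the first ten rows shown in the question as `df_1` and reuses the question's own grouping expression. Note that `summarise_all(sum)` would also sum the `age` column, since `age` is not a grouping variable; selecting the columns with `ends_with("_sentiment")` is one way to restrict the sum to the three sentiment columns. The name `result` is only illustrative.

```r
library(dplyr)

# First ten rows of the question's data, enough to illustrate the approach
df_1 <- data.frame(
  neg_sentiment = c(1270L, 374L, 30L, 20L, 78L, 420L, 1233L, 1096L, 1688L, 667L),
  neu_sentiment = c(609L, 150L, 23L, 30L, 194L, 275L, 447L, 390L, 704L, 563L),
  pos_sentiment = c(594L, 260L, 138L, 138L, 566L, 609L, 745L, 1022L, 888L, 2051L),
  age = c(39, 77, 67, 59, 70, 50, 57, 43, 61, 69)
)

result <- df_1 %>%
  # Same grouping as in the question: age < 60 picks the second label,
  # everything else (including age == 60) picks the first label
  group_by(new_age = c('age(old) > 60', 'age(young) < 60')[(age < 60) + 1]) %>%
  # Sum every column whose name ends in "_sentiment"; age is left out
  summarise(across(ends_with("_sentiment"), sum), .groups = "drop")

print(result)
```

If the column order of the desired output matters, `select(new_age, pos_sentiment, neu_sentiment, neg_sentiment)` can be appended to the pipeline.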
