How can I convert my Data below to my Desired_output?

My Desired_output is the column sum of the n_started and n_finished columns for each district in Data.

Data = read.table(h=TRUE,text="
district grade n_started  n_finished
A        1     5          9
A        2     3          6
B        1     1          2
B        2     2          4
B        3     6          7")

Desired_output=
"
district  n_started  n_finished
A         5+3          9+6
B        1+2+6        2+4+7"
Simon Harmel
  • Base R: `aggregate(cbind(n_started, n_finished) ~ district, Data, FUN = sum)`. Dplyr: `group_by(Data, district) %>% summarize(across(c(n_started, n_finished), ~ sum(.)))`. Data.table: `as.data.table(Data)[, lapply(.SD, sum), .SDcols = c("n_started", "n_finished"), by = .(district)]`. – r2evans Apr 07 '23 at 16:03
  • SimonHarmel, if the dupe-links (and my previous comment) don't work for you, [edit] your question to expand the example so that it reflects why the given solutions don't work, then @-ping me and we'll discuss and optionally reopen. Thanks! – r2evans Apr 07 '23 at 16:21

2 Answers

You can use tidyverse group_by and summarise to accomplish this:

library(tidyverse)
Data %>% 
  group_by(district) %>%
  summarise(n_started = sum(n_started), n_finished = sum(n_finished))
Tyler

You can also use base R's aggregate():

Desired_output = aggregate(cbind(n_started=Data$n_started, n_finished=Data$n_finished), by=list(district=Data$district), FUN=sum)
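As a possible alternative, aggregate() also has a formula interface, which avoids repeating Data$ for every column. This is only a minimal sketch: it rebuilds the Data frame from the question so it runs on its own, and the name `res` is purely illustrative. Note that the left-hand side needs cbind(); writing n_started + n_finished there would add the two columns together into a single total instead of summing each one separately.

```r
# Rebuild the example data from the question so the snippet is self-contained
Data <- read.table(header = TRUE, text = "
district grade n_started n_finished
A        1     5         9
A        2     3         6
B        1     1         2
B        2     2         4
B        3     6         7")

# cbind() on the left-hand side keeps the two columns separate;
# each one is summed within every level of district
res <- aggregate(cbind(n_started, n_finished) ~ district, data = Data, FUN = sum)
res
#   district n_started n_finished
# 1        A         8         15
# 2        B         9         13
```

The grade column is dropped simply because it does not appear in the formula.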