0

I am using the 'esoph' data in R. It is a data frame with records for 88 age/alcohol/tobacco combinations. Here is an excerpt of its data:

Here is an excerpt of its data.

What I am trying to achieve is to pracitcally remove the alcgp column but keep its data for the ncases and ncontrols columns. So for every entry which has the same age group (agegp) and tobacco group (tobgp) but different alcohol group (alcgp) to add up the values for cases and controls and store them in a single row.

For example row 1, 5, 9, 12 would be merged.

schefflaa
  • 33
  • 1
  • 10
  • 1
    Please do not provide your data as an image, but rather as reproducible R code, for example via `dput`. See also: https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – Julian_Hn Dec 19 '20 at 16:42
  • I agree in general, but in this case, this data comes with R and so I don't think it is necessary. – asifzuba Dec 19 '20 at 17:01

2 Answers2

1

This is a one-liner in dplyr. Would highly recommend you take a look at it.

library(dplyr)
esoph %>% 
    group_by(agegp, tobgp) %>% 
    summarize(total_cases=sum(ncases), 
              total_controls = sum(controls))
...
...
   agegp tobgp    total_cases total_controls
 1 25-34 0-9g/day           0             70
 2 25-34 10-19              1             19
 3 25-34 20-29              0             11
 4 25-34 30+                0             16
 5 35-44 0-9g/day           2            109
 6 35-44 10-19              4             46
 7 35-44 20-29              3             27
 8 35-44 30+                0             17
 9 45-54 0-9g/day          14            104
10 45-54 10-19             13             57
asifzuba
  • 450
  • 3
  • 7
0

It can be efficiently done in data.table

esoph[,alcgp := NULL] # remove alcgp column

Merge (sum) ncases and ncontrols by agegp and topgp. If you

esoph[, .(sum(ncases),sum(ncontrols)), .(agegp,topgp)]

Clarification:

you should convert your dataset to data.table first

library(data.table)
setDT(esoph)

Would be nice if you provide reproducible example, its hard to reproduce from screenshot.

Azat
  • 75
  • 6