1

I have the following example data set:

data.frame(SEX=c("M","F","M","F"),COMPLAINT=c("headache","headache", "dizziness", "dizziness"),
           reports=c(5,4,9,12), users = c(1250,3460,2500,1850))

  SEX COMPLAINT reports users
1   M  headache       5  1250
2   F  headache       4  3460
3   M dizziness       9  2500
4   F dizziness      12  1850

My question is how to merge rows 1 and 2 , and 3 and 4 so that my data frame is as follows:

 COMPLAINT reports_male reports_female users_male users_female
1  headache            5              4       1250         3460
2 dizziness            9             12       2500         1850

Anyone got a quick solution that I can use for a (much) larger dataset?

Joep_S
  • 481
  • 4
  • 22

2 Answers2

3

We can use the dcast from data.table which can take multiple value.var columns and is quite efficient on big datasets

library(data.table)
dcast(setDT(df1), COMPLAINT ~ SEX, value.var = c("reports", "users"))
#    COMPLAINT reports_F reports_M users_F users_M
#1: dizziness        12         9    1850    2500
#2:  headache         4         5    3460    1250
akrun
  • 874,273
  • 37
  • 540
  • 662
2

As seen in How to reshape data from long to wide format?, we can use library(reshape2) and then

reshape(df, idvar = "COMPLAINT", timevar = "SEX", direction = "wide").

  COMPLAINT reports.M users.M reports.F users.F
1  headache         5    1250         4    3460
3 dizziness         9    2500        12    1850
Community
  • 1
  • 1
LAP
  • 6,605
  • 2
  • 15
  • 28