Reformatting data frame shape in R

Question

I would like to reformat this data frame:

mydf <- read.table(
  text = "FORM   DOSE   gmean_AUC   mean_AUC   gmean_Cmax   mean_Cmax
  A      100     150         160          50           55
  B       50     70          75           30           32",
  header = TRUE, stringsAsFactors = FALSE)

into the following:

mydfout <-
EXPOSURE    FORM     DOSE   gmean     mean
 AUC        A       100     150       160
 AUC        B       50      70         75
 Cmax       A       100     50         55
 Cmax       B       50      30         32

How can I do this in R. This reformatting would make it heaps easy for me to generate and export my Table in R.

A5C1D2H2I1M1N2O1R2T1 · Accepted Answer · 2018-01-31T07:16:18.180

This is a very standard "wide-to-long" reshape question, so a good starting point would be the reshape() function.

reshape(mydf, direction = "long", idvar = 1:2, varying = 3:ncol(mydf), 
    timevar = "EXPOSURE", sep = "_")
##            FORM DOSE EXPOSURE gmean mean
## A.100.AUC     A  100      AUC   150  160
## B.50.AUC      B   50      AUC    70   75
## A.100.Cmax    A  100     Cmax    50   55
## B.50.Cmax     B   50     Cmax    30   32

Another option would be melt() from "data.table" (as opposed to melt() from "reshape2"):

melt(as.data.table(mydf), measure.vars = patterns("^gmean", "^mean"))

The downside is that you don't get the "AUC" and "Cmax" values, but you can reintroduce those manually with:

melt(as.data.table(mydf), measure.vars = patterns("^gmean", "^mean"))[
  , variable := factor(variable, labels = c("AUC", "Cmax"))][]

To get around this while the "data.table" team works on it, you can also try ReshapeLong_() from this Gist.

The usage would be:

ReshapeLong_(mydf, c(gmean = "^gmean_", mean = "^mean_"), variable.name = "EXPOSURE")
##    DOSE FORM EXPOSURE gmean mean
## 1:  100    A      AUC   150  160
## 2:   50    B      AUC    70   75
## 3:  100    A     Cmax    50   55
## 4:   50    B     Cmax    30   32

score 3 · Answer 2 · answered Jan 31 '18 at 07:00

A solution using dplyr and tidyr.

library(dplyr)
library(tidyr)

mydfout <- mydf %>%
  gather(Type, Value, -FORM, -DOSE) %>%
  separate(Type, into = c("Summary", "EXPOSURE")) %>%
  spread(Summary, Value) %>%
  select(EXPOSURE, FORM, DOSE, gmean, mean) %>%
  arrange(EXPOSURE)
mydfout
#   EXPOSURE FORM DOSE gmean mean
# 1      AUC    A  100   150  160
# 2      AUC    B   50    70   75
# 3     Cmax    A  100    50   55
# 4     Cmax    B   50    30   32

DATA

mydf <- read.table(text = "FORM   DOSE   gmean_AUC   mean_AUC   gmean_Cmax   mean_Cmax
A      100     150         160          50           55
                B       50     70          75           30           32",
                header = TRUE, stringsAsFactors = FALSE)

Pretty neat solution. To make it even less verbose: `mydf %>% gather(var, val, 3:6) %>% separate(var, c('key', 'EXPOSURE')) %>% spread(key, val) %>% select(EXPOSURE, everything()) %>% arrange(EXPOSURE)` — mpalanco, Jan 31 '18 at 09:11

Reformatting data frame shape in R

2 Answers2