1

I have a data set where each 'subject' with a particular 'stim' condition has a certain value for the variable 'FOXP3', 'GATA3' and 'GZMB'. I would like to group the 'subject' and 'stim' together to get the corresponding values for 'FOXP3', 'GATA3' and 'GZMB'.

So the data looks like this:

subject   stim    FOXP3  GATA3  GZMB
TA        no stim   0      0.50   0
SA        11002     1       0     0.9
RK        4512      0.3     0.5   0.9

I want the data to look like this

            TA+no stim      SA+11002       RK+4512   
FOXP3        0                 1             0.3
GATA3        0.5               0             0.5 
GZMB         0                 0.9           0.9

How do I do it?

MrFlick
  • 195,160
  • 17
  • 277
  • 295
Taliman
  • 57
  • 7

2 Answers2

0

This is how I would do it:

library(dplyr)
library(tidyr)
df <- read.table(text = 
"subject   stim    FOXP3  GATA3  GZMB
TA        no-stim   0      0.50   0
SA        11002     1       0     0.9
RK        4512      0.3     0.5   0.9", header = T)
df2 <- df %>% unite(col = subject_stim, subject, stim, sep = "+") %>% 
{as_tibble(cbind(names = names(.), t(.)))} 
colnames(df2)[2:4] <- df2[1,2:4]
df3 <- df2[-1,]

It is not the prettiest but it works. The as_tibble(cbind(names = names(.), t(.))) comes from this question

see24
  • 1,097
  • 10
  • 21
0

Using data.table, you can do in two steps using melt and dcast:

## first melt
df1 <- melt(data = df, id.vars = c('subject','stim'))
df1$col_name = paste(df1$subject,df1$stim, sep = '+')

## then spread
df1 <- dcast(df1, variable ~ col_name, value.var = 'value')
print(df1)

     variable RK+4512 SA+11002 TA+no_stim
1:    FOXP3     0.3      1.0        0.0
2:    GATA3     0.5      0.0        0.5
3:     GZMB     0.9      0.9        0.0
YOLO
  • 20,181
  • 5
  • 20
  • 40
  • actually, these functions are the same you would find in reshape2, you could use them as well. – YOLO Jul 17 '18 at 19:13