0

I am trying to convert a data frame to a certain format. I have been using dplyr, reshape2, etc. to accomplish this. Basically, I begin with a data frame as follows:

library(dplyr)
library(plotly)
library(data.table)
library(reshape2)

set.seed(1)
data <- data.frame(ID = paste0("ID",1:10), A = runif(10), B = runif(10), C = runif(10), D = runif(10), E = runif(10), fill = factor(rep("gray", 10), levels = c("gray", palette)))
data$ID <- as.character(data$ID)

This creates a data frame of the following format.

   ID         A         B         C         D         E fill
1 ID1 0.2655087 0.2059746 0.9347052 0.4820801 0.8209463 gray
2 ID2 0.3721239 0.1765568 0.2121425 0.5995658 0.6470602 gray
3 ID3 0.5728534 0.6870228 0.6516738 0.4935413 0.7829328 gray
4 ID4 0.9082078 0.3841037 0.1255551 0.1862176 0.5530363 gray
5 ID5 0.2016819 0.7698414 0.2672207 0.8273733 0.5297196 gray
6 ID6 0.8983897 0.4976992 0.3861141 0.6684667 0.7893562 gray

I will keep this "data" data frame for future purposes. But I need to rearrange it to create a new data frame (now called "dat_long") as follows:

datt <- data.frame(t(data))
names(datt) <- as.matrix(datt[1, ])
datt <- datt[-1, ]
datt[] <- lapply(datt, function(x) type.convert(as.character(x)))
setDT(datt, keep.rownames = TRUE)[]
colnames(datt)[1] <- "x"
dat_long <- melt(datt, id.vars ="x" )

This leads to the following structure:

head(dat_long,12)
       x variable      value
 1:    A      ID1 0.26550866
 2:    B      ID1  0.2059746
 3:    C      ID1 0.93470523
 4:    D      ID1  0.4820801
 5:    E      ID1  0.8209463
 6: fill      ID1       gray
 7:    A      ID2 0.37212390
 8:    B      ID2  0.1765568
 9:    C      ID2 0.21214252
10:    D      ID2  0.5995658
11:    E      ID2  0.6470602
12: fill      ID2       gray

This is almost the data frame I want. However, what I really want is shown below:

head(dat_long,12)
       x variable      value fill
 1:    A      ID1 0.26550866 gray
 2:    B      ID1  0.2059746 gray
 3:    C      ID1 0.93470523 gray
 4:    D      ID1  0.4820801 gray
 5:    E      ID1  0.8209463 gray
 6:    A      ID2 0.37212390 gray
 7:    B      ID2  0.1765568 gray
 8:    C      ID2 0.21214252 gray
 9:    D      ID2  0.5995658 gray
10:    E      ID2  0.6470602 gray

The fill may not always be "gray". But it should be the same color across all five rows for a given ID variable (ID1, ID2, etc).

What would you kindly advise to me to accomplish this goal?

  • 2
    Try with `melt` and use `id.var` as 'ID" and 'fill' i.e. `library(data.table);melt(setDT(data), id.var = c("ID", "fill"))[order(ID)]` – akrun Nov 03 '16 at 05:47

2 Answers2

0

Untested code: datt %>% gather(key = key, value = value, -fill, -ID).

tchakravarty
  • 10,736
  • 12
  • 72
  • 116
0
library(dplyr)
library(plotly)
library(data.table)
library(reshape2)

set.seed(1)
data <- data.frame(ID = paste0("ID",1:10), A = runif(10), B = runif(10), C =    runif(10),
               D = runif(10), E = runif(10),
               fill = factor(rep("gray", 10), levels = c("gray", palette)))
data$ID <- as.character(data$ID)

install.packages("gtools")
library(gtools)

data2 <- melt(data, id.vars = c("ID", "fill"),measure.vars = c("A","B","C","D","E"))
data3 <- data2[mixedorder(data2$ID),]

> head(data3,15)
     ID fill variable     value
 1: ID1 gray        A 0.2655087
 2: ID1 gray        B 0.2059746
 3: ID1 gray        C 0.9347052
 4: ID1 gray        D 0.4820801
 5: ID1 gray        E 0.8209463
 6: ID2 gray        A 0.3721239
 7: ID2 gray        B 0.1765568
 8: ID2 gray        C 0.2121425
 9: ID2 gray        D 0.5995658
10: ID2 gray        E 0.6470602
11: ID3 gray        A 0.5728534
12: ID3 gray        B 0.6870228
13: ID3 gray        C 0.6516738
14: ID3 gray        D 0.4935413
15: ID3 gray        E 0.7829328

The reason for using gtools for ordering is because, incase of a factor variable with levels ID1,ID2 and ID10- the default ordering is ID1,ID10 ,ID2 because its done character by character. Therefore I had to use mixedorder from gtools.

I hope this answers your question

joel.wilson
  • 8,243
  • 5
  • 28
  • 48