This should be a simple exercise with the reshape2 package in R, but somehow I'm not seeing it.

Imagine I have data:

df <- data.frame(A = rnorm(4), B = rnorm(4))

which looks like:

           A          B
1  2.3729531 -0.9252266
2  0.9848229 -0.1152347
3  2.1234409  0.9035180
4 -0.5771637  1.2755104

long_form <- melt(df)

which looks like

  variable      value
1        A  2.3729531
2        A  0.9848229
3        A  2.1234409
4        A -0.5771637
5        B -0.9252266
6        B -0.1152347
7        B  0.9035180
8        B  1.2755104

How do I transform long_form back into df?

I can do this by adding another column first,

long_form = data.frame(id = c(1:4, 1:4), long_form)
dcast(long_form, id ~ variable)

and then drop the id column to recover df; but it just seems like I should be able to do this without explicitly adding an id column to index the replicate A's and B's.

cboettig

1 Answer

You could do

dcast(melt(df), 1:4 ~ variable)

which is somewhat shorter.
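
If the number of replicates is not known in advance, one option is to number the rows within each level of `variable` and cast on that index. The following is a minimal sketch rather than part of the original answer: the column name `id`, the seed, and the object names `wide` and `recovered` are illustrative choices, and it assumes reshape2 is installed.

```r
library(reshape2)

set.seed(1)  # arbitrary seed, only so the example is reproducible
df <- data.frame(A = rnorm(4), B = rnorm(4))

# Stack every column into variable/value pairs (no id variables)
long_form <- melt(df, measure.vars = names(df))

# Number the rows within each level of `variable`: 1, 2, ... per group
long_form$id <- ave(seq_along(long_form$variable), long_form$variable,
                    FUN = seq_along)

# Cast back to wide form, then drop the helper id column
wide <- dcast(long_form, id ~ variable, value.var = "value")
recovered <- wide[, -1]
```

Because the index is computed per group, nothing like `1:4` is hard-coded. The round trip still relies on the rows of each variable appearing in their original order.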

eddi
  • nice. Is there something less than `1:dim(subset(long_form, variable=="A"))[1]` if I don't know the number of replicates? – cboettig May 23 '13 at 19:23
  • You can always do simple stuff like `dcast(long_form, seq_along(df[,1])~variable)[,-1]` or `dcast(long_form, 1:(nrow(long_form)/length(unique(long_form$variable)))~variable)[,-1]` – Dinre May 23 '13 at 19:27
  • @Dinre that supposes we know `df`, in which case we don't need an inverse transform. – cboettig May 23 '13 at 19:29
  • Sorry... accidentally hit enter when typing... wasn't finished producing the whole comment. – Dinre May 23 '13 at 19:30
  • hmm.. a bit shorter: `dcast(long_form, 1:table(long_form$variable)[1] ~ variable)` – cboettig May 23 '13 at 19:31
  • @cboettig see [here](http://stackoverflow.com/a/16502889/1478381) for an alternative answer to your comment question – Simon O'Hanlon May 23 '13 at 19:34
  • @SimonO101 thanks. Though the call to `ave` in that example appears a bit counter-intuitive. I feel like @eddi's approach is a bit more transparent; (though `table` isn't particularly satisfactory in that regard either) – cboettig May 23 '13 at 19:39
  • @cboettig eh? I didn't post the link as an alternative to eddi's approach (which is exactly what I would do). I posted it because after eddi answered your question you asked another question in the comments about creating the ID sequence if you don't know the replicates beforehand, i.e. if you don't know 1:4 – Simon O'Hanlon May 23 '13 at 19:44
  • @cboettig - here's another approach for the enumeration - `seq_len(nrow(long_form)/nlevels(long_form[,1]))` (assuming you have factors) – eddi May 23 '13 at 19:46
  • @SimonO101 Yup, I realize that. I think my use of `table` in my comment above is a more concise answer to deal with not knowing the number of replicates (it also makes it relatively clear we are assuming A and B have the same number of replicates by just using table()[1]). But still not completely transparent. – cboettig May 23 '13 at 19:46
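
The enumeration shortcuts discussed in these comments all assume that every variable has the same number of replicates. The sketch below, in base R only, makes that assumption explicit and compares a count-based index with a per-group index. The data frame is hand-built to mimic the shape of `melt()` output, with illustrative values, and the names `counts`, `n_rep`, `id_by_count`, and `id_by_group` are hypothetical.

```r
# Hand-built long data in the shape melt() produces (values are illustrative)
long_form <- data.frame(
  variable = factor(rep(c("A", "B"), each = 4)),
  value    = c(2.37, 0.98, 2.12, -0.58, -0.93, -0.12, 0.90, 1.28)
)

counts <- table(long_form$variable)

# The count-based shortcuts only make sense when all groups are the same size
stopifnot(length(unique(as.vector(counts))) == 1)

# Count-based index: rows divided by number of levels, repeated once per level
n_rep <- nrow(long_form) / nlevels(long_form$variable)
id_by_count <- rep(seq_len(n_rep), times = nlevels(long_form$variable))

# Per-group index: numbers rows within each level, no balance assumption needed
id_by_group <- ave(seq_along(long_form$variable), long_form$variable,
                   FUN = seq_along)

same_index <- all(id_by_count == id_by_group)
```

For balanced data sorted by `variable`, as here, the two indices agree. The per-group version also works when the rows are interleaved or the groups differ in size.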