4

I want to go from long to wide format. My data kind of looks like this:

day=c(1,2,3,4,5,6,1,2,3,4,5,6)
site=c('a', 'a', 'a', 'a', 'a', 'a', 'b', 'b', 'b', 'b', 'b', 'b')
value.1=c(1,2,5,7,5,3,9,4,2,8,1,8)
value.2=c(5,4,7,6,2,4,6,9,4,2,5,6)
data=data.frame(day,site,value.1,value.2)
> data
   day site value.1 value.2
1    1    a       1       5
2    2    a       2       4
3    3    a       5       7
4    4    a       7       6
5    5    a       5       2
6    6    a       3       4
7    1    b       9       6
8    2    b       4       9
9    3    b       2       4
10   4    b       8       2
11   5    b       1       5
12   6    b       8       6

I want to switch it into wide format based on site. So it looks like this

> data
  day a.value.1 a.value.2 b.value.1 b.value.2
1   1         1         5         9         6
2   2         2         4         4         9
3   3         5         7         2         4
4   4         7         6         8         2
5   5         5         2         1         5
6   6         3         4         8         6

I feel like I should be able to do this with the reshape package, but I cant figure it out

I would love some help with this. Thank you

Chloee Robertson
  • 195
  • 2
  • 12
  • Look at `recast` in "reshape2", and wait for "data.table" version 1.9.8, which will let `dcast.data.table` get the functionality you describe in a much more efficient manner than `recast` does. – A5C1D2H2I1M1N2O1R2T1 Jan 22 '15 at 18:15

4 Answers4

4

You could do this in base R

reshape(data, idvar='day', timevar='site',direction='wide')
#    day value.1.a value.2.a value.1.b value.2.b
#1   1         1         5         9         6
#2   2         2         4         4         9
#3   3         5         7         2         4
#4   4         7         6         8         2
#5   5         5         2         1         5
#6   6         3         4         8         6
akrun
  • 874,273
  • 37
  • 540
  • 662
4

One could use the tidyr & dplyr packages:

library("tidyr")
library("dplyr")
data %>% 
  gather(labs, values, value.1:value.2) %>% 
  unite(site2, site,labs, sep = ".") %>% 
  spread(site2, values) 

Though I must admit there's not much reason to, reshape works quite well. I just like tidyr and dplyr.

Andrew Taylor
  • 3,438
  • 1
  • 26
  • 47
2

You're having trouble with dcast because your data isn't fully long, it's already partly wide. If you melt to a longer format, dcast works just fine:

> data.long = melt(data, id = c("day", "site"))
> head(data.long)
  day site variable value
1   1    a  value.1     1
2   2    a  value.1     2
3   3    a  value.1     5
4   4    a  value.1     7
5   5    a  value.1     5
6   6    a  value.1     3
> dcast(data.long, day ~ site + variable)
  day a_value.1 a_value.2 b_value.1 b_value.2
1   1         1         5         9         6
2   2         2         4         4         9
3   3         5         7         2         4
4   4         7         6         8         2
5   5         5         2         1         5
6   6         3         4         8         6
Gregor Thomas
  • 136,190
  • 20
  • 167
  • 294
  • my real data is longer... this was just a sample – Chloee Robertson Jan 22 '15 at 17:40
  • 1
    I'm not talking about the number of rows, I'm talking about long vs wide format. You have two columns with values, but truly "long" data would have only one value column---and this is the starting place `dcast` assumes. – Gregor Thomas Jan 22 '15 at 17:45
0

Yes, this can be done using the 'reshape' package:

library(reshape)

# Melt the data to obtain the 'value' labels
dM <- melt(data, id.vars=c('day', 'site'))

# Now concatenate the site and 'value' labels
dM$siteVal <- paste(dM$site, dM$variable, sep=".")

# Cast the data using site-value labels
dSite <- cast(dM, day ~ siteVal)
raghu
  • 406
  • 2
  • 3