0

I have some data that look like this:

head(t)
       sub   trialnum   block.x    lat.x      block.y  lat.y  diff
1       1       10          3      1355          5      1337   18
2       1       11          3      1324          5      1470 -146
3       1       12          3      1861          5      1690  171
4       1       13          3      3501          5      1473 2028
5       1       14          3      1566          5      1402  164
6       1       15          3      1380          5      1539 -159

What I would like to do is reformat the data in R such that the values of "trialnum" (there are 20 of them) are the new columns, "sub" is the row values, and each cell has the "diff" value. For example

          trialnum1      trialnum2      trialnum3...
sub
 1      
 2      
 3      
 .
 .
 .

Any help would be much appreciated. Although the answer is probably simple, I've been struggling with this problem for some time.

user2917781
  • 273
  • 2
  • 10
  • "diff" is a variable I've created by taking the difference between "lat.x" and "lat.y" – user2917781 May 17 '16 at 17:53
  • 4
    Probably be good if you stated that in the question. Seems kind of important. – Rich Scriven May 17 '16 at 17:53
  • 2
    What have you tried already? Did you google "reshape r"? If so, what did you try that didn't work? – Tchotchke May 17 '16 at 17:57
  • I've tried (a) just various ways of aggregating the data and (b) reshape by doing dcast(data, sub ~ trialnum). The outcome of this function didn't put the "diff" values I need into the cells. – user2917781 May 17 '16 at 18:00
  • 1
    Use either the `tidyr` or `reshape2` packages. There are plenty of duplicate reshape questions on SO. Plus there are vignettes that you can find if you google them. – shrgm May 17 '16 at 18:03
  • Possible duplicate of [Reshape data from long to wide format R](http://stackoverflow.com/questions/5890584/reshape-data-from-long-to-wide-format-r) – jogo May 24 '16 at 13:04

2 Answers2

2

Base package. We transpose column diff with the function t(x), then create the desired column names.

df <- data.frame(t(t[, 7]))
# Using the trialnum column
colnames(df) <- paste0(colnames(t[2]), t[, 2])
# or just the number of rows
colnames(df) <- paste0(colnames(t[2]), 1:nrow(t))

Output:

   trialnum10 trialnum11 trialnum12 trialnum13 trialnum14 trialnum15
1         18       -146        171       2028        164       -159
  trialnum1 trialnum2 trialnum3 trialnum4 trialnum5 trialnum6
1        18      -146       171      2028       164      -159
mpalanco
  • 12,960
  • 2
  • 59
  • 67
1

With dplyr and tidyr, first get rid of the columns you don't want, then spread trialnum and diff.

library(dplyr)
library(tidyr)

t %>% select(-block.x:-lat.y) %>%    # get rid of extra columns so t will collapse
    mutate(trialnum = paste0('trialnum', trialnum)) %>%    # fix values for column names
    spread(trialnum, diff)    # spread columns

#   sub trialnum10 trialnum11 trialnum12 trialnum13 trialnum14 trialnum15
# 1   1         18       -146        171       2028        164       -159

Data

t <- structure(list(sub = c(1L, 1L, 1L, 1L, 1L, 1L), trialnum = 10:15, 
                    block.x = c(3L, 3L, 3L, 3L, 3L, 3L), lat.x = c(1355L, 1324L, 
                    1861L, 3501L, 1566L, 1380L), block.y = c(5L, 5L, 5L, 5L, 
                    5L, 5L), lat.y = c(1337L, 1470L, 1690L, 1473L, 1402L, 1539L
                    ), diff = c(18L, -146L, 171L, 2028L, 164L, -159L)), .Names = c("sub", 
               "trialnum", "block.x", "lat.x", "block.y", "lat.y", "diff"), row.names = c(NA, 
               -6L), class = "data.frame")
alistaire
  • 42,459
  • 4
  • 77
  • 117