3

I have a dataframe that looks like this

+---------+--------+--------+--------+-------+
|   ID    | week1_t| week1_a| week2_t|week2_a|
+---------+--------+--------+--------+-------+
|    1    | 12     | 22     |  17    |   4   |
|    1    | 15     | 32     |  18    |   5   |
|    1    | 24     | 12     |  29    |   6   |
|    2    | 45     | 11     |  19    |   8   |
|    2    | 23     | 33     |  20    |   10  |
+---------+--------+--------+--------+-------+

There are 48 columns (week 1 - 24) with "t" and "a" suffixes. I want to consolidate all the week columns into a single "week" column like this:

+---------+--------+--------+--------
|   ID    | week   |  t     |  a    |
+---------+--------+--------+--------
|    1    | 1     | 22      |  17   |
|    1    | 2     | 32      |  18   |
|    1    | 3     | 12      |  19   | 
|    1    | 5     | 33      |  20   | 
+---------+--------+--------+-------

How do I go about making this transformation in R? I can't think of a way of doing this outside multiple if statements and for loops.

Data

dd <- read.table(header = TRUE, text = "ID week1_t week1_a  week2_t week2_a
  1      12      22       17       4   
  1      15      32       18       5   
  1      24      12       29       6   
  2      45      11       19       8   
  2      23      33       20      10")
rawr
  • 20,481
  • 4
  • 44
  • 78
Parseltongue
  • 11,157
  • 30
  • 95
  • 160
  • Package dplyr has all the functions you need. Here's a handy cheat sheet: https://www.rstudio.com/wp-content/uploads/2015/02/data-wrangling-cheatsheet.pdf – Mekki MacAulay Jun 10 '16 at 20:09
  • 1
    please show what you tried. and dont use that table format. – rawr Jun 10 '16 at 20:20
  • 4
    `reshape(dd, dir = 'long', varying = lapply(c('t','a'), grep, names(dd)), timevar = 'week')` – rawr Jun 10 '16 at 20:21

2 Answers2

6

You can use data.table melt:

library(data.table)
setDT(dd)
melt(dd, id = 1, measure=patterns("_t$", "_a$"), value.name = c("t", "a"), 
     variable.name = "week")

    ID week  t  a
 1:  1    1 12 22
 2:  1    1 15 32
 3:  1    1 24 12
 4:  2    1 45 11
 5:  2    1 23 33
 6:  1    2 17  4
 7:  1    2 18  5
 8:  1    2 29  6
 9:  2    2 19  8
10:  2    2 20 10

As you can read from ?melt:

measure.vars also now accepts a list of character/integer vectors to melt into multiple columns - i.e., melt into more than one value columns simultaneously. Use the function patterns to provide multiple patterns conveniently. See the examples section

Psidom
  • 209,562
  • 33
  • 339
  • 356
0

With tidyr/dplyr:

library(dplyr)
library(tidyr)

       # add row index so later spreading indexed correctly
dd %>% add_rownames() %>% 
    # melt to long format
    gather(week, value, -ID, -rowname) %>% 
    # separate week number from variable suffix
    separate(week, c('week', 'var')) %>% 
    # reduce week number to actual number
    mutate(week = extract_numeric(week)) %>% 
    # spread a and t values back to wide form
    spread(var, value) %>% 
    # clean up
    select(-rowname)

# Source: local data frame [10 x 4]
# 
#       ID  week     a     t
#    (int) (dbl) (int) (int)
# 1      1     1    22    12
# 2      1     2     4    17
# 3      1     1    32    15
# 4      1     2     5    18
# 5      1     1    12    24
# 6      1     2     6    29
# 7      2     1    11    45
# 8      2     2     8    19
# 9      2     1    33    23
# 10     2     2    10    20
alistaire
  • 42,459
  • 4
  • 77
  • 117