0

Is there a way to reshape wide to long in R based on the first part of column name? I have the following data:

id |  Jan_shoulder | Jan_head | Jan_knee | Feb_shoulder | Feb_head | Feb_knee
1  |     yes       |    no    |    yes   |    no        |   no     |  no
2  |     no        |    no    |    no    |    yes       |   yes    |  no

And I want to convert it so that each row corresponds to a unique id and month, such as:

id |  month | shoulder | head | knee 
1  |  Jan   |    yes   |  no  |  yes
1  |  Feb   |    no    |  no  |  no
2  |  Jan   |    no    |  no  |  no
2  |  Feb   |    yes   |  yes |  no
Agustín Indaco
  • 550
  • 5
  • 17
  • Some values in your expected output is not in sync with the input 'shoulder' 1st should be 'yes' according to input –  Aug 17 '19 at 05:07

2 Answers2

2

Using dplyr and tidyr, we can gather the data to long format, separate column name into different columns and spread them to wide format.

library(dplyr)
library(tidyr)

df %>%
  gather(key, value, -id) %>%
  separate(key, into = c("month", "part"), sep = "_") %>%
  spread(part, value)

#  id month head knee shoulder
#1  1   Feb   no   no       no
#2  1   Jan   no  yes      yes
#3  2   Feb  yes   no      yes
#4  2   Jan   no   no       no
Ronak Shah
  • 377,200
  • 20
  • 156
  • 213
1

it can be changed into 'long' format directly with pivot_longer from the dev version of tidyr

library(dplyr)
library(tidyr) # ‘0.8.3.9000’
library(stringr)
df1 %>%
    rename_at(-1, ~ str_replace(., "(\\w+)_(\\w+)", "\\2_\\1")) %>% 
    pivot_longer(-id, names_to = c(".value", "month"), names_sep='_')
# A tibble: 4 x 5
#     id month shoulder head  knee 
#  <int> <chr> <chr>    <chr> <chr>
#1     1 Jan   yes      no    yes  
#2     1 Feb   no       no    no   
#3     2 Jan   no       no    no   
#4     2 Feb   yes      yes   no   

Or with melt from data.table

library(data.table)
name1 <- unique(sub("_.*", "", names(df1)[-1]))
melt(setDT(df1), measure = patterns("head", "shoulder", "knee"), 
        value.name = c("head", "shoulder", "knee"),
        variable.name = "month")[, month := name1[month]][]
#   id month head shoulder knee
#1:  1   Jan   no      yes  yes
#2:  2   Jan   no       no   no
#3:  1   Feb   no       no   no
#4:  2   Feb  yes      yes   no

Or in base R with reshape

reshape(df1, direction = 'long', idvar = 'id', 
        varying = list(c(2, 5), c(3, 6), c(4, 7)))

data

df1 <- structure(list(id = 1:2, Jan_shoulder = c("yes", "no"), Jan_head = c("no", 
"no"), Jan_knee = c("yes", "no"), Feb_shoulder = c("no", "yes"
), Feb_head = c("no", "yes"), Feb_knee = c("no", "no")),
  class = "data.frame", row.names = c(NA, 
-2L))