2

Similar questions have been asked but they all refer to gathering multiple columns in one key column.

I need multiple columns in two keys.

This is the dataframe I have:

ID  ... measure_A.1 measure_A.2 measure_B.1 measure_B.2
1       8.25        23.5        4       5
2       8.6         22.5        3       4

If I use the following code I get this:

 df %>% 
 gather(key = measure_A, value = "score_A", measure_A.1, measure_A.2) %>%
 gather(key = measure_B, value = "score_B", measure_B.1, measure_B.2)
 
ID  ... measure_A   score_A     measure_B   score_B
1       measure_A.1 8.25        measure_B.1 4   
1       measure_A.1 8.25        measure_B.1 4
1       measure_A.2 23.5        measure_B.2 5
1       measure_A.2 23.5        measure_B.2 5
2       measure_A.1 8.6         measure_B.1 3
2       measure_A.1 8.6         measure_B.1 3
2       measure_A.2 22.5        measure_B.2 4
2       measure_A.2 22.5        measure_B.2 4

what I want is this:


ID  ... measure_A   score_A     measure_B   score_B
1       measure_A.1 8.25        measure_B.1 4   
1       measure_A.2 23.5        measure_B.2 5
2       measure_A.1 8.6         measure_B.1 3
2       measure_A.2 22.5        measure_B.2 4

It seems to me, that i have to reduce something but i don't know how to do this in combination with the gather command. I found a solution with filter, but I don't understand how that works.

Dharman
  • 30,962
  • 25
  • 85
  • 135
  • Making the df reproducible will help with answers. Also, look into tidyr::pivot_longer() – daileyco Aug 25 '21 at 15:37
  • 1
    Possible duplicate of https://stackoverflow.com/questions/61570710/how-to-use-pivot-longer-to-reshape-from-wide-type-data-to-long-type-data-with-mu – akrun Aug 26 '21 at 15:57

1 Answers1

2

This gives the required data in long format with all the values.

library(tidyverse)

df %>%
  pivot_longer(cols = -ID, 
               names_to = c('.value', 'num'),
               names_sep = '\\.') 

#     ID num   measure_A measure_B
#  <int> <chr>     <dbl>     <int>
#1     1 1          8.25         4
#2     1 2         23.5          5
#3     2 1          8.6          3
#4     2 2         22.5          4

After that we need to do some manipulation to get the exact desired output.

df %>%
  pivot_longer(cols = -ID, 
               names_to = c('.value', 'num'),
               names_sep = '\\.') %>%
  rename_with(~sub('measure', 'score', .), starts_with('measure')) %>%
  mutate(measure_A = str_c('MeasureA', num, sep = '.'), 
         measure_B = str_c('MeasureB', num, sep = '.')) %>%
  select(-num)

#     ID score_A score_B measure_A  measure_B 
#  <int>   <dbl>   <int> <chr>      <chr>     
#1     1    8.25       4 MeasureA.1 MeasureB.1
#2     1   23.5        5 MeasureA.2 MeasureB.2
#3     2    8.6        3 MeasureA.1 MeasureB.1
#4     2   22.5        4 MeasureA.2 MeasureB.2
Ronak Shah
  • 377,200
  • 20
  • 156
  • 213
  • Thank you very much! But something is not quite working in the first step. The variable Num is created, but not "filled" with the values 1 and 2. It remains empty. Any idea what this could be? – Mirko Saunders Aug 26 '21 at 08:26
  • It would be easier to help if your provide your data in a reproducible format to work with. – Ronak Shah Aug 26 '21 at 10:16