2

Using R how do I pivot the columns into rows to get the required structure in df3 from df2, while extracting new information from the previous dataframe:

df2<- structure(list(A=c("A_1_01", "A_1_01", "A_1_01"), B=c("A", "A", "A"), C=c("1", "1", "1"), D=c("inside", "eating", "sleeping"), "1"=c("1","1","0"), "2"=c("1","0","0"), "3"=c("0","0","1"), "4"=c("0","1","1"), "1_Location"=c("I","I", "I"), "2_Location"=c("I","I", "I"), "3_Location"=c("O","O", "O"), "4_Location"=c("O","O", "O")), class= "data.frame", row.names = c(NA,-3L))
df3<- structure(list(H=c("1","2","3","4","1","2","3","4","1","2","3","4"),
                     A=c("A_1_01", "A_1_01", "A_1_01","A_1_01", "A_1_01",
                         "A_1_01","A_1_01", "A_1_01", "A_1_01","A_1_01",
                         "A_1_01", "A_1_01"),
                     B=c("A", "A", "A","A", "A", "A","A", "A", "A","A", "A", "A"),
                     C=c("1", "1", "1","1", "1", "1","1", "1", "1","1", "1", "1"),
                     D=c("inside","inside","inside","inside",
                         "eating","eating","eating","eating",
                         "sleeping","sleeping","sleeping","sleeping"),
                     Value=c(1,1,0,0,1,0,0,1,0,0,1,1),
                     Location=c("I","I","O","O","I","I","O","O","I","I","O","O")),
                class= "data.frame", row.names = c(NA,-12L))

Thank you

M--
  • 25,431
  • 8
  • 61
  • 93
Pegi
  • 167
  • 7
  • 2
    @akrun I thought the same thing and was toiling with closing or answering, I see we agree on this. I wonder if it would be better (generically-speaking) to make sure that at least one of the dupe-link answers provides this kind of unique handling? The use of `names_pattern=` (as well as `data.table::patterns`) is becoming more and more useful, I think it should be available on _that_ link (since it is the link to which many questions like this are eventually dupe-closed). I know I'd upvote a question added there that adds context to that question as a whole, it's a useful addition. – r2evans Mar 06 '23 at 19:15
  • @r2evans I think here, some columns have only digits as column names, thus it may be better to standardize before reshaping. Anyway, from the OP's comments, the dupe link couldn't solve their issue – akrun Mar 06 '23 at 19:17
  • 1
    If there's no coding effort shown in the question to even attempt to use `pivot_longer` then I'm suspect that question shows sufficient research. Perhaps the title could be changed to something more descriptive that would help future users with a similar problem. The issue really seems to be irregular column names. – MrFlick Mar 06 '23 at 20:05

1 Answers1

3

Try with pivot_longer after adding a suffix '_Value' to the digit only column names

library(stringr)
library(tidyr)
library(dplyr)
df2 %>% 
 rename_with(~ str_c(.x, "_Value"), matches("^\\d+$")) %>% 
 pivot_longer(cols = contains("_"), names_to  = c("H", ".value"), 
    names_pattern = "(\\d+)_(.*)")

-output

# A tibble: 12 × 7
   A      B     C     D        H     Value Location
   <chr>  <chr> <chr> <chr>    <chr> <chr> <chr>   
 1 A_1_01 A     1     inside   1     1     I       
 2 A_1_01 A     1     inside   2     1     I       
 3 A_1_01 A     1     inside   3     0     O       
 4 A_1_01 A     1     inside   4     0     O       
 5 A_1_01 A     1     eating   1     1     I       
 6 A_1_01 A     1     eating   2     0     I       
 7 A_1_01 A     1     eating   3     0     O       
 8 A_1_01 A     1     eating   4     1     O       
 9 A_1_01 A     1     sleeping 1     0     I       
10 A_1_01 A     1     sleeping 2     0     I       
11 A_1_01 A     1     sleeping 3     1     O       
12 A_1_01 A     1     sleeping 4     1     O           
akrun
  • 874,273
  • 37
  • 540
  • 662
  • Where is H here? as I mentioned in df3. Also is there another version which is more understandable please? – Pegi Mar 06 '23 at 18:59
  • and also what if I have columns with a minus like -1, -2, -3. with this code, they will be removed – Pegi Mar 06 '23 at 19:24
  • 1
    @Pegi that was not showed in the example, so it is not clear. – akrun Mar 06 '23 at 19:29
  • 1
    @pegi change the regex pattern from ```^\\d+$``` to this```^-?\\d+$``` and you'd get your negative and positive columns. – M-- Mar 06 '23 at 23:37
  • 1
    Thank you so much, everyone. This code solved my problem: ```df2 %>% rename_with(~ str_c(.x, "_Value"), matches("^-?\\d+$")) %>% pivot_longer(cols = contains("_"), names_to = c("H", ".value"), names_pattern = "(^-?\\d+)_(.*)")``` – Pegi Mar 07 '23 at 08:41