-2

Hi i have the data like this

date    2020/06/10  2020/06/10  2020/06/10  2020/06/11  2020/06/11  2020/06/11  
id          x           y           z           x           y           z           
10432       0           0           0           0           0           0           
10668       0           0           0           0           0           0           
11088       0           0           0           0           0           0   

And i want my output like this

id          date    x   y   z
10432   2020/06/10  0   0   0
10432   2020/06/11  0   0   0
10668   2020/06/10  0   0   0
10668   2020/06/11  0   0   0
11088   2020/06/10  0   0   0
11088   2020/06/11  0   0   0

I would like to get the output in R

Aditya
  • 13
  • 5
  • 1
    Does this answer your question? [Reshaping data.frame from wide to long format](https://stackoverflow.com/questions/2185252/reshaping-data-frame-from-wide-to-long-format) – NelsonGon Jun 27 '20 at 10:47
  • Could you add what has already failed? – NelsonGon Jun 27 '20 at 10:48

1 Answers1

0

This data is extremely messy, but with some work I have managed to get your desired output.


This is what the sample data you provided looks like after I imported it using read_table2() from the readr package:

library(readr) # 1.3.1

df <- read_table2("date    2020/06/10  2020/06/10  2020/06/10  2020/06/11  2020/06/11  2020/06/11  
id          x           y           z           x           y           z           
10432       0           0           0           0           0           0           
10668       0           0           0           0           0           0           
11088       0           0           0           0           0           0 ")

df[, 8] <- NULL
> df
# A tibble: 4 x 7
  date  `2020/06/10` `2020/06/10_1` `2020/06/10_2` `2020/06/11` `2020/06/11_1` `2020/06/11_2`
  <chr> <chr>        <chr>          <chr>          <chr>        <chr>          <chr>         
1 id    x            y              z              x            y              z             
2 10432 0            0              0              0            0              0             
3 10668 0            0              0              0            0              0             
4 11088 0            0              0              0            0              0   

Note that in R you can't have repeating column names so it changes them slightly. We have to resolve this. (You might not have to do these steps exactly, I cannot know precisely what your columns are called without you providing the data with dput().)

Now I paste together the current column names with the 'id' row, and then tidy the column names so that we can work with them later on:

names(df) <- c("id", paste(df[1, -1], names(df)[-1], sep = "_"))

library(stringr) # 1.4.0

names(df)[-1] <- str_remove(names(df)[-1], "_[1-9]$") # this gets rid of
# the "_1" or "_2" R added automatically when I imported the data

We now no longer need the first row as its contents have become repeated in the column names.

df <- df[-1,]
> df
# A tibble: 3 x 7
  id    `x_2020/06/10` `y_2020/06/10` `z_2020/06/10` `x_2020/06/11` `y_2020/06/11` `z_2020/06/11`
  <chr> <chr>          <chr>          <chr>          <chr>          <chr>          <chr>         
1 10432 0              0              0              0              0              0             
2 10668 0              0              0              0              0              0             
3 11088 0              0              0              0              0              0 

Now we can actually work on this df and go from wide to long format:

library(tidyr) # 1.1.0

df %>% pivot_longer(-id, names_to = c(".value", "date"),
                      names_pattern = "(.)_(..../../..)")
# A tibble: 6 x 5
  id    date       x     y     z    
  <chr> <chr>      <chr> <chr> <chr>
1 10432 2020/06/10 0     0     0    
2 10432 2020/06/11 0     0     0    
3 10668 2020/06/10 0     0     0    
4 10668 2020/06/11 0     0     0    
5 11088 2020/06/10 0     0     0    
6 11088 2020/06/11 0     0     0  

So there we have it - a tidying data solution using the readr, stringr and tidyr packages from tidyverse - a very nice example, I think, of what tidyverse was precisely developed for! :)


DATA

Should you need it, here is the tidy df that we have just before we implement pivot_longer() on it:

structure(list(id = c("10432", "10668", "11088"), `x_2020/06/10` = c("0", 
"0", "0"), `y_2020/06/10` = c("0", "0", "0"), `z_2020/06/10` = c("0", 
"0", "0"), `x_2020/06/11` = c("0", "0", "0"), `y_2020/06/11` = c("0", 
"0", "0"), `z_2020/06/11` = c("0", "0", "0")), row.names = c(NA, 
-3L), class = c("tbl_df", "tbl", "data.frame"))
kmacierzanka
  • 747
  • 4
  • 17
  • Thanks mate!! i could run till dropping the first row and i'm stuck here `df %>% pivot_longer(-id, names_to = c(".value", "date"), names_pattern = "(.)_(..../../..)")` and I have 10 ids , i have xy,xy ad, xyz instead of x,y,z . Do help me out to resolve it. – Aditya Jun 27 '20 at 13:36
  • I am not too good at regex, but try if this works `names_pattern = "([a-z]{1, 6})_(..../../..)"`. Please give some sample data using `dput(head(df))` so that I, and others, can see how exactly you have the column names in R – kmacierzanka Jun 27 '20 at 20:51
  • Some helpful regex info: [CRAN regex vignette](https://cran.r-project.org/web/packages/stringr/vignettes/regular-expressions.html) and [this SO thread](https://stackoverflow.com/questions/15472764/regular-expression-to-allow-spaces-between-words) – kmacierzanka Jun 28 '20 at 07:19
  • thanks alot , the regex worked , i have doubt regarding date thing when imported the file the date got converted to character and when i finally tried to convert to date format using this `as.Date(as.numeric(df$date),origin="1970-01-01") ` I got all the values of the date same ,something like this `2090-06-12` and its repeated entire row, do you have any idea y its displaying repeated wrong date? – Aditya Jun 28 '20 at 07:58
  • You should post that as a new question, but I'm sure there are answers for it already on SO. Consider using the `lubridate` package – kmacierzanka Jun 28 '20 at 08:00
  • Thanks , If i have to reverse the same back to original form how can i do it, can you post a few references!! – Aditya Jun 29 '20 at 07:26
  • [Documentation](https://tidyr.tidyverse.org/reference/pivot_wider.html) for `pivot_wider()` – kmacierzanka Jun 29 '20 at 07:28
  • could you please look into this . https://stackoverflow.com/questions/63967053/complex-transposing-using-reshape-function-from-wide-to-long?noredirect=1#comment113116362_63967053 Thanks in advance – Aditya Sep 20 '20 at 17:15