0

Hello I have the following dataframe:

        Date          S1    Mavg1          STdev1         S2     Mavg2        STdev2
1 02/12/1999 -0.000217052  0.002862195 -0.002390842  2.2025825 -1.17258213 -0.3057015
2 03/12/1999 -0.004882038 -0.015920939 -0.007014382 -0.4040079 -0.69496488  0.1000167
3 06/12/1999 -0.001445954  0.005077610  0.000000000  1.1573779 -1.64268166 -0.3847015
4 07/12/1999 -0.000907952 -0.005403168  0.002027728 -0.3165827 -1.16466940 -1.0113501
5 08/12/1999  0.000881220  0.011375226 -0.013524439  0.5154455 -0.51725208  1.0942177
6 09/12/1999  0.001043752  0.013400502  0.017033342  1.9368608 -0.05587143  0.5203337

I need to reshape it to have something like this:

         Date  S variable        value
1  02/12/1999 S1    value -0.000217052
2  03/12/1999 S1    value -0.004882038
3  06/12/1999 S1    value -0.001445954
4  07/12/1999 S1    value -0.000907952
5  08/12/1999 S1    value  0.000881220
6  09/12/1999 S1    value  0.001043752
7  02/12/1999 S2    value  2.202582500
8  03/12/1999 S2    value -0.404007900
9  06/12/1999 S2    value  1.157377900
10 07/12/1999 S2    value -0.316582700
11 08/12/1999 S2    value  0.515445500
12 09/12/1999 S2    value  1.936860800
13 02/12/1999 S1     mavg  0.002862195
14 03/12/1999 S1     mavg -0.015920939
15 06/12/1999 S1     mavg  0.005077610
16 07/12/1999 S1     mavg -0.005403168
17 08/12/1999 S1     mavg  0.011375226
18 09/12/1999 S1     mavg  0.013400502
19 02/12/1999 S2     mavg -1.172582130
20 03/12/1999 S2     mavg -0.694964880
21 06/12/1999 S2     mavg -1.642681660
22 07/12/1999 S2     mavg -1.164669400
23 08/12/1999 S2     mavg -0.517252080
24 09/12/1999 S2     mavg -0.055871430
25 02/12/1999 S1   stddev -0.002390842
26 03/12/1999 S1   stddev -0.007014382
27 06/12/1999 S1   stddev  0.000000000
28 07/12/1999 S1   stddev  0.002027728
29 08/12/1999 S1   stddev -0.013524439
30 09/12/1999 S1   stddev  0.017033342
31 02/12/1999 S2   stddev -0.305701500
32 03/12/1999 S2   stddev  0.100016700
33 06/12/1999 S2   stddev -0.384701500
34 07/12/1999 S2   stddev -1.011350100
35 08/12/1999 S2   stddev  1.094217700
36 09/12/1999 S2   stddev  0.520333700

I know I should use melt but I have no idea how to do this.

I need to sacrifice some words/characters for the internet gods to let me post this question ;) .................

J.Doe
  • 166
  • 1
  • 9
  • What have you tried so far? Can you provide your example in an [easy-to-paste form](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example)? – Roman Luštrik Dec 30 '17 at 20:13
  • this is the easy to paste form and yes I did try that one it didn't work – J.Doe Dec 30 '17 at 20:20

2 Answers2

4

I think it should be the right way. I have used the tidyverse approach:

mydf %>% 
    gather(key=Variable,value=Value,c(S1, S2, Mavg1, Mavg2,STdev1,STdev2)) %>% 
    mutate(S=case_when(
        Variable=="S1"|Variable=="Mavg1"|Variable=="STdev1" ~ "S1",
        Variable=="S2"|Variable=="Mavg2"|Variable=="STdev2" ~ "S2"
    ),
    Variable=str_replace_all(string=Variable,pattern="S[1|2]",replacement="Value"),
    Variable=str_replace_all(string=Variable,pattern="[[:digit:]]",replacement="")
    ) %>% 
    select(Date,S,Variable,Value)

This is the output:

         Date  S Variable        Value
1  02/12/1999 S1    Value -0.000217052
2  03/12/1999 S1    Value -0.004882038
3  06/12/1999 S1    Value -0.001445954
4  07/12/1999 S1    Value -0.000907952
5  08/12/1999 S1    Value  0.000881220
6  09/12/1999 S1    Value  0.001043752
7  02/12/1999 S2    Value  2.202582500
8  03/12/1999 S2    Value -0.404007900
9  06/12/1999 S2    Value  1.157377900
10 07/12/1999 S2    Value -0.316582700
11 08/12/1999 S2    Value  0.515445500
12 09/12/1999 S2    Value  1.936860800
13 02/12/1999 S1     Mavg  0.002862195
14 03/12/1999 S1     Mavg -0.015920939
15 06/12/1999 S1     Mavg  0.005077610
16 07/12/1999 S1     Mavg -0.005403168
17 08/12/1999 S1     Mavg  0.011375226
18 09/12/1999 S1     Mavg  0.013400502
19 02/12/1999 S2     Mavg -1.172582130
20 03/12/1999 S2     Mavg -0.694964880
21 06/12/1999 S2     Mavg -1.642681660
22 07/12/1999 S2     Mavg -1.164669400
23 08/12/1999 S2     Mavg -0.517252080
24 09/12/1999 S2     Mavg -0.055871430
25 02/12/1999 S1    STdev -0.002390842
26 03/12/1999 S1    STdev -0.007014382
27 06/12/1999 S1    STdev  0.000000000
28 07/12/1999 S1    STdev  0.002027728
29 08/12/1999 S1    STdev -0.013524439
30 09/12/1999 S1    STdev  0.017033342
31 02/12/1999 S2    STdev -0.305701500
32 03/12/1999 S2    STdev  0.100016700
33 06/12/1999 S2    STdev -0.384701500
34 07/12/1999 S2    STdev -1.011350100
35 08/12/1999 S2    STdev  1.094217700
36 09/12/1999 S2    STdev  0.520333700
Scipione Sarlo
  • 1,470
  • 1
  • 17
  • 31
  • does it answer the question? look at the number of rows in the output. It is supposed to be 36 rows instead of 60 rows. See my solution, has 36 rows in the output – Sathish Dec 30 '17 at 22:04
  • 1
    @Sathish I think you are definitely right, I have reconsidered my code. Now the output it's correct. Many thanks – Scipione Sarlo Dec 30 '17 at 22:34
1
library('data.table') 
df2 <- melt(df1, id.vars = c("Date"))
# assign values to column "S" by finding the numbers at the end of values in variable column
# and paste it with "S"
df2[, S := paste0("S", unlist(lapply( regmatches(variable, gregexpr('[0-9]+', variable )), tail, 1 ))) ]
# find and replace S1 and S2 with "value" in variable column
df2[variable %in% c("S1", "S2"), variable := "value"]
df2[, variable := gsub("[0-9]", "", variable)]
df2
#          Date variable        value  S
# 1: 02/12/1999    value -0.000217052 S1
# 2: 03/12/1999    value -0.004882038 S1
# 3: 06/12/1999    value -0.001445954 S1
# 4: 07/12/1999    value -0.000907952 S1
# 5: 08/12/1999    value  0.000881220 S1
# 6: 09/12/1999    value  0.001043752 S1
# 7: 02/12/1999     Mavg  0.002862195 S1
# 8: 03/12/1999     Mavg -0.015920939 S1
# 9: 06/12/1999     Mavg  0.005077610 S1
# 10: 07/12/1999     Mavg -0.005403168 S1
# 11: 08/12/1999     Mavg  0.011375226 S1
# 12: 09/12/1999     Mavg  0.013400502 S1
# 13: 02/12/1999    STdev -0.002390842 S1
# 14: 03/12/1999    STdev -0.007014382 S1
# 15: 06/12/1999    STdev  0.000000000 S1
# 16: 07/12/1999    STdev  0.002027728 S1
# 17: 08/12/1999    STdev -0.013524439 S1
# 18: 09/12/1999    STdev  0.017033342 S1
# 19: 02/12/1999    value  2.202582500 S2
# 20: 03/12/1999    value -0.404007900 S2
# 21: 06/12/1999    value  1.157377900 S2
# 22: 07/12/1999    value -0.316582700 S2
# 23: 08/12/1999    value  0.515445500 S2
# 24: 09/12/1999    value  1.936860800 S2
# 25: 02/12/1999     Mavg -1.172582130 S2
# 26: 03/12/1999     Mavg -0.694964880 S2
# 27: 06/12/1999     Mavg -1.642681660 S2
# 28: 07/12/1999     Mavg -1.164669400 S2
# 29: 08/12/1999     Mavg -0.517252080 S2
# 30: 09/12/1999     Mavg -0.055871430 S2
# 31: 02/12/1999    STdev -0.305701500 S2
# 32: 03/12/1999    STdev  0.100016700 S2
# 33: 06/12/1999    STdev -0.384701500 S2
# 34: 07/12/1999    STdev -1.011350100 S2
# 35: 08/12/1999    STdev  1.094217700 S2
# 36: 09/12/1999    STdev  0.520333700 S2
#           Date variable        value  S

Data:

library('data.table') 
df1 <- fread('id       Date          S1    Mavg1          STdev1         S2     Mavg2        STdev2
1 02/12/1999 -0.000217052  0.002862195 -0.002390842  2.2025825 -1.17258213 -0.3057015
      2 03/12/1999 -0.004882038 -0.015920939 -0.007014382 -0.4040079 -0.69496488  0.1000167
      3 06/12/1999 -0.001445954  0.005077610  0.000000000  1.1573779 -1.64268166 -0.3847015
      4 07/12/1999 -0.000907952 -0.005403168  0.002027728 -0.3165827 -1.16466940 -1.0113501
      5 08/12/1999  0.000881220  0.011375226 -0.013524439  0.5154455 -0.51725208  1.0942177
      6 09/12/1999  0.001043752  0.013400502  0.017033342  1.9368608 -0.05587143  0.5203337', header = TRUE)
df1[, id := NULL ]
Sathish
  • 12,453
  • 3
  • 41
  • 59