-1

Change below data

pos BZ_SP   BZ_SP_m1    BZ_SP_m2    CL_SP   CL_SP_m1    CL_SP_m2
1   -300000 2   3   2540544 1   2
2   0   0   0   -118621 3   4

to look this way

CurveGroup  SpreadId    SpreadMonth1    SpreadMonth2    Position
BZ_SP   1   2   3   -300000
CL_SP   1   1   2   2540544
BZ_SP   2   0   0   0
CL_SP   2   3   4   -118621
jazzurro
  • 23,179
  • 35
  • 66
  • 76
RBeginner
  • 11
  • 3
  • 1
    What have you tried so far? Providing a [minimal reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) will help you work through your challenge. We are willing to help you solve a specific problem, not to do the work for you. – Kevin Arseneau Jan 15 '18 at 01:20
  • I tried the melt option data_test_SP <- melt(readxl::read_excel("C:/Risk_R/Postion_20171110.xlsx", sheet.names[1]), id.vars=1, value.name="Position", variable.name="CurveGroup") but I am not able to slice the data in the format I required – RBeginner Jan 15 '18 at 01:34
  • Please add your attempts to your question. – Kevin Arseneau Jan 15 '18 at 01:35

1 Answers1

1

gather the input into long form and then separate the variable into Curvegroup and suffix. spread it back out to wide form. Rename and rearrange the columns.

library(dplyr)
library(tidyr)

DF %>%
   gather(variable, value, -pos) %>% 
   separate(variable, c("CurveGroup", "suffix"), sep = 5, fill = "right") %>% 
   spread(suffix, value) %>%
   select(CurveGroup, SpreadId = "pos", SpreadMonth1 = "_m1", SpreadMonth2 = "_m2", 
     Position = "V1")

giving:

  CurveGroup SpreadId SpreadMonth1 SpreadMonth2 Position
1      BZ_SP        1            2            3  -300000
2      CL_SP        1            1            2  2540544
3      BZ_SP        2            0            0        0
4      CL_SP        2            3            4  -118621

Note: The input DF in reproducible form is:

DF <- structure(list(pos = 1:2, BZ_SP = c(-300000L, 0L), BZ_SP_m1 = c(2L, 
  0L), BZ_SP_m2 = c(3L, 0L), CL_SP = c(2540544L, -118621L), CL_SP_m1 = c(1L, 
  3L), CL_SP_m2 = c(2L, 4L)), .Names = c("pos", "BZ_SP", "BZ_SP_m1", 
  "BZ_SP_m2", "CL_SP", "CL_SP_m1", "CL_SP_m2"), 
  class = "data.frame", row.names = c(NA, -2L))

Update: Simplified.

G. Grothendieck
  • 254,981
  • 17
  • 203
  • 341