To change tabular data to a different format in R

Question

Change below data

pos BZ_SP   BZ_SP_m1    BZ_SP_m2    CL_SP   CL_SP_m1    CL_SP_m2
1   -300000 2   3   2540544 1   2
2   0   0   0   -118621 3   4

to look this way

CurveGroup  SpreadId    SpreadMonth1    SpreadMonth2    Position
BZ_SP   1   2   3   -300000
CL_SP   1   1   2   2540544
BZ_SP   2   0   0   0
CL_SP   2   3   4   -118621

What have you tried so far? Providing a [minimal reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) will help you work through your challenge. We are willing to help you solve a specific problem, not to do the work for you. — Kevin Arseneau, Jan 15 '18 at 01:20
I tried the melt option data_test_SP <- melt(readxl::read_excel("C:/Risk_R/Postion_20171110.xlsx", sheet.names[1]), id.vars=1, value.name="Position", variable.name="CurveGroup") but I am not able to slice the data in the format I required — RBeginner, Jan 15 '18 at 01:34

G. Grothendieck · Answer 1 · 2018-01-15T04:38:06.297

gather the input into long form and then separate the variable into Curvegroup and suffix. spread it back out to wide form. Rename and rearrange the columns.

library(dplyr)
library(tidyr)

DF %>%
   gather(variable, value, -pos) %>% 
   separate(variable, c("CurveGroup", "suffix"), sep = 5, fill = "right") %>% 
   spread(suffix, value) %>%
   select(CurveGroup, SpreadId = "pos", SpreadMonth1 = "_m1", SpreadMonth2 = "_m2", 
     Position = "V1")

giving:

  CurveGroup SpreadId SpreadMonth1 SpreadMonth2 Position
1      BZ_SP        1            2            3  -300000
2      CL_SP        1            1            2  2540544
3      BZ_SP        2            0            0        0
4      CL_SP        2            3            4  -118621

Note: The input DF in reproducible form is:

DF <- structure(list(pos = 1:2, BZ_SP = c(-300000L, 0L), BZ_SP_m1 = c(2L, 
  0L), BZ_SP_m2 = c(3L, 0L), CL_SP = c(2540544L, -118621L), CL_SP_m1 = c(1L, 
  3L), CL_SP_m2 = c(2L, 4L)), .Names = c("pos", "BZ_SP", "BZ_SP_m1", 
  "BZ_SP_m2", "CL_SP", "CL_SP_m1", "CL_SP_m2"), 
  class = "data.frame", row.names = c(NA, -2L))

Update: Simplified.

This is exactly what I a looking for. Thanks Grothendieck. – RBeginner Jan 15 '18 at 02:45 — RBeginner, Jan 15 '18 at 02:45

To change tabular data to a different format in R

1 Answers1