Overview
I'm trying to tidy a data frame. I have a working solution, but it is very slow on my large dataset. My current code gathers the data frame into long format, separates the ticker from the metric in each column name, and then spreads the metrics back out into columns. See the example below.
Data frame
df <- structure(list(date = c("2009-07-01", "2009-07-02", "2009-07-06",
"2009-07-07", "2009-07-08"), PRED.Open = c(0.5, 0.5, 0.7, 0.7,
0.7), PRED.High = c(0.5, 0.6, 0.7, 0.7, 0.7), PRED.Low = c(0.5,
0.5, 0.5, 0.7, 0.7), PRED.Close = c(0.5, 0.6, 0.5, 0.7, 0.7),
PRED.Volume = c(0L, 300L, 200L, 0L, 0L), PRED.Adjusted = c(0.5,
0.6, 0.5, 0.7, 0.7), GDM.Open = c(1041.02002, 1085.109985,
1052.02002, 1011.429993, 1006.630005), GDM.High = c(1097.790039,
1085.109985, 1052.02002, 1029.290039, 1006.630005), GDM.Low = c(1041.02002,
1038.540039, 995.450012, 1005.280029, 948.73999), GDM.Close = c(1085.109985,
1052.02002, 1011.429993, 1006.630005, 966.22998), GDM.Volume = c(0L,
0L, 0L, 0L, 0L), GDM.Adjusted = c(1085.109985, 1052.02002,
1011.429993, 1006.630005, 966.22998), NBL.Open = c(29.885,
29.325001, 27.370001, 27.485001, 26.815001), NBL.High = c(30.35,
29.325001, 27.545, 27.610001, 27.18), NBL.Low = c(29.83,
28.07, 26.825001, 26.605, 25.745001)), row.names = c(NA,
-5L), class = "data.frame")
Current Solution
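library(dplyr)
library(tidyr)

# Reshape to long format: one row per date and "TICKER.Metric" column
df <- df %>% gather(key = "ticker", value = "val", -date)
# Split "TICKER.Metric" into ticker and metric, then spread the metrics into columns
df <- separate(df, col = "ticker", into = c("ticker", "metric"), sep = "\\.") %>%
  ungroup() %>%
  spread(key = "metric", value = "val") %>%
  arrange(ticker, date)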
Desired Outcome
Question
Is there a more efficient way to accomplish this?
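One direction that might be worth trying, sketched here under the assumption that tidyr 1.0.0 or later is available: `pivot_longer()` can split the column names and keep the metrics as columns in a single step, through the special `".value"` entry in `names_to`. The small `df` below is a cut-down stand-in for the data frame above, and `tidy_df` is just an illustrative name. This has not been benchmarked against the gather/separate/spread version, so any speed gain on a large dataset is an expectation rather than a measured result.

```r
library(dplyr)
library(tidyr)

# Cut-down stand-in for the data frame in the question (two tickers, two metrics)
df <- data.frame(
  date = c("2009-07-01", "2009-07-02"),
  PRED.Open = c(0.5, 0.5),
  PRED.Volume = c(0L, 300L),
  GDM.Open = c(1041.02002, 1085.109985),
  GDM.Volume = c(0L, 0L)
)

# Split each column name at the dot: the first piece goes into a `ticker`
# column, and the ".value" sentinel makes the second piece a column name,
# so no intermediate fully long data frame has to be spread back out
tidy_df <- df %>%
  pivot_longer(
    cols = -date,
    names_to = c("ticker", ".value"),
    names_sep = "\\."
  ) %>%
  arrange(ticker, date)
```

Because each metric keeps its own column throughout, `Volume` should stay an integer column instead of being coerced to double alongside the prices. Ticker/metric combinations that are missing from the input, such as `NBL.Close` in the full data, would come out as `NA`.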