
Context: I want to melt a wide time-series data frame into a long one so that I can plot the data with ggplot2 as a stacked area graph. The time series is irregular (some weekends and holidays are missing).

The current data frame looks like

df
    date        item_1    item_2     item_3 ...
1 1992-03-23      8.63     7.609     1.6546 ...
2 1992-03-24      7.98     7.634     1.6533 ... 
...

How do I convert the above data frame into

    date        variable    value
1 1992-03-23    item_1       8.63
2 1992-03-23    item_2      7.609
3 1992-03-23    item_3     1.6546
4 1992-03-24    item_1       7.98

Using the following code, I get this error:

> melted_df = melt(df)
Using as id variables
Error in as.Date.numeric(value) : 'origin' must be supplied
LascieL
    Possible duplicate of [Reshaping data.frame from wide to long format](http://stackoverflow.com/questions/2185252/reshaping-data-frame-from-wide-to-long-format) – Ronak Shah Mar 16 '17 at 16:56

2 Answers


You have to pass the id column as `id.vars`, the second argument of `melt()`; then it works:

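library(reshape)  # melt() is also provided by reshape2

# Example data in the same wide layout as in the question
df <- data.frame(date = as.Date(c("1992-03-23", "1992-03-24")),
                 item_1 = c(8.63, 7.98),
                 item_2 = c(7.609, 7.634),
                 item_3 = c(1.6546, 1.6533))

# Keep "date" as the id column; all other columns are melted
melt(df, id.vars = "date")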

You will get:

        date variable  value
1 1992-03-23   item_1 8.6300
2 1992-03-24   item_1 7.9800
3 1992-03-23   item_2 7.6090
4 1992-03-24   item_2 7.6340
5 1992-03-23   item_3 1.6546
6 1992-03-24   item_3 1.6533
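
Since the stated goal is a stacked area graph, here is a minimal sketch of how the melted result could be passed to ggplot2. It assumes the reshape2 and ggplot2 packages are installed; the names `long_df` and `p` are only illustrative and do not come from the question.

```r
library(reshape2)
library(ggplot2)

# Same example data as above
df <- data.frame(date = as.Date(c("1992-03-23", "1992-03-24")),
                 item_1 = c(8.63, 7.98),
                 item_2 = c(7.609, 7.634),
                 item_3 = c(1.6546, 1.6533))

# Long format: one row per (date, variable) pair
long_df <- melt(df, id.vars = "date")

# geom_area() stacks the groups by default, so mapping `variable`
# to `fill` is enough to get one stacked band per item
p <- ggplot(long_df, aes(x = date, y = value, fill = variable)) +
  geom_area()
p
```

Because `date` is a real `Date` column, the x axis is continuous in time, so missing weekends and holidays show up as longer gaps between observations rather than being dropped.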

Hope this helps.

Codutie
  • This worked. I needed to specify "date" in the formula. – LascieL Mar 16 '17 at 17:03
  • How do I specify a subset? E.g. new_data_frame (where: variable == item_1 or variable == item_2) – LascieL Mar 16 '17 at 19:01
  • I would do this with `require(dplyr)` and then just: `df_new <- melt(df, "date") %>% filter(variable == "item_1" | variable == "item_2")` – Codutie Mar 16 '17 at 19:37
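
The subset from the last comment can also be taken without dplyr. The following is only a sketch, assuming reshape2 is installed; `keep`, `long_df` and `df_new` are illustrative names. Using `%in%` avoids chaining several `==` comparisons when more items are needed.

```r
library(reshape2)

df <- data.frame(date = as.Date(c("1992-03-23", "1992-03-24")),
                 item_1 = c(8.63, 7.98),
                 item_2 = c(7.609, 7.634),
                 item_3 = c(1.6546, 1.6533))

long_df <- melt(df, id.vars = "date")

# Items to keep (hypothetical selection for illustration)
keep <- c("item_1", "item_2")

# Base R row subsetting; %in% tests membership in `keep`
df_new <- long_df[long_df$variable %in% keep, ]

# `variable` is a factor, so drop the now-unused level "item_3"
df_new$variable <- droplevels(df_new$variable)
```

Dropping the unused level is optional, but it keeps `item_3` out of plot legends built from `df_new`.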

Or with `gather()` from tidyr:

library(tidyverse)

df <- data.frame(date = as.Date(c("1992-03-23", "1992-03-24")),
                 item_1 = c(8.63, 7.98),
                 item_2 = c(7.609, 7.634),
                 item_3 = c(1.6546, 1.6533))

df %>% gather(variable, value, -date)

gives:

        date variable  value
1 1992-03-23   item_1 8.6300
2 1992-03-24   item_1 7.9800
3 1992-03-23   item_2 7.6090
4 1992-03-24   item_2 7.6340
5 1992-03-23   item_3 1.6546
6 1992-03-24   item_3 1.6533
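
In current tidyr releases `gather()` is superseded by `pivot_longer()`. A roughly equivalent call, offered as a sketch that assumes tidyr 1.0.0 or later is installed (`long_df` is an illustrative name):

```r
library(tidyr)

df <- data.frame(date = as.Date(c("1992-03-23", "1992-03-24")),
                 item_1 = c(8.63, 7.98),
                 item_2 = c(7.609, 7.634),
                 item_3 = c(1.6546, 1.6533))

# cols = -date pivots every column except `date`;
# names_to / values_to name the two new columns
long_df <- pivot_longer(df, cols = -date,
                        names_to = "variable", values_to = "value")
long_df
```

Unlike `melt()` and `gather()`, `pivot_longer()` returns a tibble whose rows are ordered by the original row first (all items for 1992-03-23, then all items for 1992-03-24), which is the layout requested in the question, and `variable` is a character column rather than a factor.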
Dan