I want to plot a line graph with time (hh:mm, within a specific time interval) on the x axis and average price on the y axis.

I want the graph to display multiple lines (one line for each day).

Currently my data frame looks like this (it has some other variables as well, but I am not using them in the graph; the relevant ones are below):

AV.PRICE DATE        TIME
180      2014-01-20  13H 0M 0S
179      2014-01-20  13H 1M 0S
175      2014-01-20  13H 2M 0S
179      2014-01-20  13H 3M 0S

...and so on. The dates continue, but the times only take values between 13:00 and 15:00 each day.

DATE is of class Date, AV.PRICE is numeric, and TIME is a Period (created with lubridate).

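If my question isn't clear, this is what I'm looking for: a date-agnostic plot on a time-only axis, except that I'm using ggplot2 in R. See "plotting data for different days on a single HH:MM:SS axis" (http://stackoverflow.com/questions/27603593/plotting-data-for-different-days-on-a-single-hhmmss-axis).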

EDITED:
When I try to plot the original df with ggplot, it does not recognize the time variable. `ggplot(df, aes(x = TIME, y = AV.PRICE, group = DATE)) + geom_line()` gives the error: cannot compare Period to Duration

Output of `dput(head(df))`:

structure(list(AV.PRICE = c(178.841368677043, 178.837478586724, 
178.811640304183, 178.8395125, 178.858236768802, 178.860812464589
), DATE = structure(c(16098, 16098, 16098, 16098, 16098, 16098
), class = "Date"), TIME = structure(c(0, 0, 0, 0, 0, 0), year = c(0, 
0, 0, 0, 0, 0), month = c(0, 0, 0, 0, 0, 0), day = c(0, 0, 0, 
0, 0, 0), hour = c(13, 13, 13, 13, 13, 13), minute = c(0, 1, 
2, 3, 4, 5), class = structure("Period", package = "lubridate"))), .Names = c("AV.PRICE", 
"DATE", "TIME"), row.names = c(NA, 6L), class = "data.frame")
shoestringfries
  • it looks like your data already is in long format, what do you want to do with `melt` – Nate Dec 30 '16 at 18:50
  • when i try to plot the original df with ggplot, it does not recognize the time variable. ggplot(df, aes(x=TIME, y=AV.PRICE, group = DATE)) + geom_line() gives error: cannot compare Period to Duration: – shoestringfries Dec 30 '16 at 19:12
  • Maybe you can post the `dput(df)` output in your question so that somebody can help you with this. – Psidom Dec 30 '16 at 19:34
  • A fully reproducible example would be even nicer. – Roman Luštrik Dec 30 '16 at 19:49
  • if i may ask, what is missing from my example? i have a sample of the data, i specified the classes of the variables, i provided the code i tried and the error message. do you need other information? – shoestringfries Dec 30 '16 at 19:56
  • Using `dput()` to share the data would be very nice, it would make it copy/paste-able. As-is, to get it into R, I'd copy your data as text, then use `read.table` to import it (though the defaults won't work due to the spaces in the duration data), then have to muck about with converting it to the right class. If you `dput(head(your_data))` it's just a simple copy/paste. You should also share enough data to make it interesting, the few rows you show are all on the same day which makes the "multiple dates" crux of your question moot. – Gregor Thomas Dec 30 '16 at 20:02
  • ok i simplified the df a bit and did dput(head(df)) and pasted the output in the section – shoestringfries Dec 30 '16 at 20:14
  • there are hundreds of entries for just one day, hence i was only able to put the data for one day. i don't know if it helps for me to skip over them so you can see the next day.. – shoestringfries Dec 30 '16 at 20:16
  • if my question isnt clear, this is what i'm looking for, plotting a date-agnostic graph on a time only axis, except i'm using ggplot2 in r: http://stackoverflow.com/questions/27603593/plotting-data-for-different-days-on-a-single-hhmmss-axis – shoestringfries Dec 30 '16 at 20:27
  • Your question is clear. Unfortunately the Period class doesn't seem to work with `dput` :( I think the best way still is to spoof it, as in this question: [ggplot time series group by dates](http://stackoverflow.com/q/32976820/903061). There isn't a built in `scale_x_duration` which is what would be needed to make your attempt work. I would create an `x` column that is a standard POSIX datetime with all the same date but your individual durations. Use that on the `x` axis, but still group/color by your actual `Date`. – Gregor Thomas Dec 30 '16 at 20:51
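
The "spoofing" idea from the last comment can be sketched as follows. This is only a minimal illustration, not code from the thread: the sample data frame, the `x` column, and the dummy origin `1970-01-01` are all assumptions made for the example. Every observation is placed on the same dummy date, so only the time of day varies along the x axis, while the real `DATE` is still used for grouping and color.

```r
library(lubridate)
library(ggplot2)

# Hypothetical sample data: two days, TIME stored as a lubridate Period
df <- data.frame(
  AV.PRICE = c(180, 179, 175, 179, 182, 181, 177, 181),
  DATE     = as.Date(rep(c("2014-01-20", "2014-01-21"), each = 4))
)
df$TIME <- hours(13) + minutes(rep(0:3, times = 2))

# Put every row on the same (arbitrary) dummy date so only the time of day differs
dummy_origin <- as.POSIXct("1970-01-01 00:00:00", tz = "UTC")
df$x <- dummy_origin + period_to_seconds(df$TIME)

# Group and color by the real date; show only hours and minutes on the axis
p <- ggplot(df, aes(x = x, y = AV.PRICE, group = DATE, colour = factor(DATE))) +
  geom_line() +
  scale_x_datetime(date_labels = "%H:%M", timezone = "UTC") +
  labs(x = "Time", colour = "Date")
print(p)
```

The dummy date never appears on the plot because `date_labels` prints only the time part, so any fixed date would do.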

1 Answer

I think your problem is that df$TIME has class "Period" instead of a date-time class. All you need to do is coerce it back to POSIXt.

df <- data.table::fread("AV.PRICE DATE        TIME
180      2014-01-20  13H0M0S
179      2014-01-20  13H1M0S
175      2014-01-20  13H2M0S
179      2014-01-20  13H3M0S
182      2014-01-21  13H0M0S
181      2014-01-21  13H1M0S
177      2014-01-21  13H2M0S
181      2014-01-21  13H3M0S")

I added a second DATE with shifted AV.PRICE values for display purposes. Sorry, I couldn't get your dput to load properly or I would have started from there. Also, fread doesn't like spaces inside a field, which is why I removed them from TIME here; on your actual df$TIME column you could use something like as.character and collapse the spaces.

# Parse the "13H0M0S" strings into a date-time (class "POSIXt"); only the time of day matters here
df$TIME <- lubridate::parse_date_time(df$TIME, orders = "HMS")

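library(ggplot2)
# factor(DATE) gives one discrete color, and therefore one line, per day
ggplot(data = df, aes(x = TIME, y = AV.PRICE, color = factor(DATE), group = DATE)) +
    geom_line()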

[Image: line plot of AV.PRICE against TIME, one colored line per DATE]

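You could play around with + scale_x_datetime(date_labels = "%H:%M") if you want to control the axis labels (as far as I know, scale_x_time() is meant for hms/difftime values rather than POSIXct), but the default looks like it is rendering your desired output.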
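
If you would rather keep a pure time-of-day axis, one possible alternative is to convert the Period to an `hms` value, which is the kind of input `scale_x_time()` is designed for. The following is a rough sketch under assumptions: it relies on the `hms` package being installed, and the sample data and the `TIME_HMS` column name are made up for illustration.

```r
library(lubridate)
library(ggplot2)
library(hms)

# Hypothetical sample data: two days, TIME stored as a lubridate Period
df <- data.frame(
  AV.PRICE = c(180, 179, 175, 179, 182, 181, 177, 181),
  DATE     = as.Date(rep(c("2014-01-20", "2014-01-21"), each = 4))
)
df$TIME <- hours(13) + minutes(rep(0:3, times = 2))

# Convert the Period to seconds since midnight, then to an hms time of day
df$TIME_HMS <- hms::as_hms(period_to_seconds(df$TIME))

# No dummy date is involved here: the x axis is a plain time of day
p <- ggplot(df, aes(x = TIME_HMS, y = AV.PRICE, colour = factor(DATE), group = DATE)) +
  geom_line() +
  scale_x_time() +
  labs(x = "Time", colour = "Date")
print(p)
```

Compared with the POSIXct route, this avoids carrying a meaningless date around, at the cost of an extra package dependency.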

Nate
  • Thanks for the tip. I ended up using `parse_date_time2` because some values failed to parse, and I first had to coerce TIME to character. Also, all the days ended up being 01-01-0000, so this kind of worked: `df$TIME <- parse_date_time2(as.character(df$TIME), orders = "HMS")`. I'm not sure why `parse_date_time2` works; it just seems to have fewer args – shoestringfries Dec 31 '16 at 18:12
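
For reference, the workaround described in this last comment can be written out as a small standalone sketch. The Period values below are made-up sample data, and the variable names are hypothetical. That the strings parse cleanly, and that every result lands on the same placeholder date, is taken from the comment's report rather than verified independently.

```r
library(lubridate)

# Hypothetical Period values like those in the question's TIME column
time_period <- hours(13) + minutes(0:3)

# A Period formats as text such as "13H 0M 0S"
time_chr <- as.character(time_period)

# Parse the text as hours, minutes, seconds; the date part is only a placeholder
time_posix <- parse_date_time2(time_chr, orders = "HMS", tz = "UTC")

print(time_chr)
print(time_posix)
```

Because all rows share the same placeholder date, the result can be used directly as the x aesthetic, with DATE still supplying the grouping.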