
I am trying to create multiple graphs (six or more at once) using a single command, or as few commands as possible, in R.

Let's look at the hourly data first:

 str(ZZZ)
'data.frame':   291960 obs. of  9 variables:
 $ TRADE_DT  : POSIXct, format: "2014-11-01" "2014-11-01" "2014-11-01" "2014-11-01" ...
 $ YEAR      : int  2014 2014 2014 2014 2014 2014 2014 2014 2014 2014 ...
 $ MONTH     : int  11 11 11 11 11 11 11 11 11 11 ...
 $ hour_num  : int  1 1 1 1 1 1 1 1 1 1 ...
 $ source    : Factor w/ 5 levels "AB","EF","EI",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ LSE_CD    : int  116 116 116 116 116 116 135 135 135 135 ...
 $ utility_cd: Factor w/ 6 levels "CPL","SHARY",..: 1 2 3 4 5 6 1 4 5 6 ...
 $ load      : num  12.834 0.502 31.436 13.948 31.314 ...
 $ total_load: num  13.929 0.524 35.864 14.77 33.161 ...

dput(head(ZZZ))

structure(list(TRADE_DT = structure(c(1414818000, 1414818000, 
1414818000, 1414818000, 1414818000, 1414818000), class = c("POSIXct", 
"POSIXt"), tzone = ""), YEAR = c(2014L, 2014L, 2014L, 2014L, 
2014L, 2014L), MONTH = c(11L, 11L, 11L, 11L, 11L, 11L), hour_num = c(1L, 
1L, 1L, 1L, 1L, 1L), source = structure(c(1L, 1L, 1L, 1L, 1L, 
1L), .Label = c("AB", "EF", "EI", "IB", "ST"), class = "factor"), 
LSE_CD = c(116L, 116L, 116L, 116L, 116L, 116L), utility_cd = structure(1:6, .Label = c("CPL", 
"SHARY", "TNMP", "TXRL", "TXTU", "WTU"), class = "factor"), 
load = c(12.83423, 0.501589, 31.435567, 13.947688, 31.314148, 
2.237439), total_load = c(13.928702, 0.524432, 35.864181, 
14.770245, 33.161105, 2.417721)), .Names = c("TRADE_DT", 
"YEAR", "MONTH", "hour_num", "source", "LSE_CD", "utility_cd", 
"load", "total_load"), row.names = c(NA, 6L), class = "data.frame")

I would like to overlay my sources (AB, EI, EF, etc.) for each utility. For 6 utilities, this should produce 6 graphs, each with 5 lines (or 2 or 3, as necessary): one graph per utility, with one line per source. It sounds simple, but I haven't been able to make it happen with the data in this format.

I was able to overlay multiple lines in one graph, but only after I had turned my sources (5 factor levels) into separate columns, taken the hour out of the picture, and summed the load by day.

str(YYY)

'data.frame':   102 obs. of  5 variables:
 $ TRADE_DT: POSIXct, format: "2014-01-01" "2014-01-02" "2014-01-03" ...
 $ AB      : num  289 336 356 258 316 ...
 $ EI      : num  306 347 370 282 335 ...
 $ IB      : num  282 325 299 250 307 ...
 $ EF      : num  304 348 367 281 335 ...

ggplot(YYY, aes(TRADE_DT)) + 
  geom_line(aes(y = AB, colour = "AB")) + 
  geom_line(aes(y = EI, colour = "EI")) +
  geom_line(aes(y = IB, colour = "IB")) +
  geom_line(aes(y = EF, colour = "EF")) 

However, the above method didn't separate the graphs by utility_cd or LSE_CD as I wanted, and I also had to get rid of the hour. I have seen people use the "by" command in SAS to create multiple graphs like these at once.

Is there a magical "command" in R for this kind of thing? I will output all my graphs to one big PDF, which I can handle on my own.

If anyone could share the secret of producing multiple graphs that meet those criteria, I would really appreciate it. Also, when I plotted the 24 hourly values, the lines didn't look like lines; the points looked as if they were connected to one another by slanted, nearly horizontal segments.

Thanks again!

Best, Gyve


1 Answer

Please provide dput(head(YOUR DATA SET)) instead of the str output, as str output is not well suited to reproducing your data. See "How to make a great R reproducible example?"

Hope this helps:
1. Plotting your data
For ggplot you need a molten (long-format) data set, in the wording of the reshape2 package.

Taking your second data-set:

YYY <- data.frame(TRADE_DT = seq(as.Date("2014-01-01"),as.Date("2014-01-05"), length.out = 5),
           AB = c(289,336,356,258,316),
           EI = c(306,347,370,282,335),
           IB = c(282,325,299,250,307),
           EF = c(304,348,367,281,335))

Now we use melt to shape it to our needs:

library(reshape2)
# Keep TRADE_DT as the id column; every other column is stacked into variable/value
YYY_molten <- melt(YYY, id.vars = "TRADE_DT")
head(YYY_molten)
    TRADE_DT variable value
1 2014-01-01       AB   289
2 2014-01-02       AB   336
3 2014-01-03       AB   356
4 2014-01-04       AB   258
5 2014-01-05       AB   316
6 2014-01-01       EI   306

Now you can use ggplot

library(ggplot2)
ggplot(YYY_molten, aes(x = TRADE_DT, y = value, col = variable)) + geom_line()
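
The same reshaping can also be sketched with tidyr instead of reshape2. This is an alternative I am swapping in, not part of the approach above, and it assumes the tidyr package is installed. The name YYY_long is made up for illustration; the values are the ones from the question's str output.

```r
library(tidyr)
library(ggplot2)

# Toy data: the first five rows shown in the question's str(YYY) output
YYY <- data.frame(
  TRADE_DT = seq(as.Date("2014-01-01"), as.Date("2014-01-05"), by = "day"),
  AB = c(289, 336, 356, 258, 316),
  EI = c(306, 347, 370, 282, 335),
  IB = c(282, 325, 299, 250, 307),
  EF = c(304, 348, 367, 281, 335)
)

# Every column except TRADE_DT is stacked into a (variable, value) pair
YYY_long <- pivot_longer(YYY, cols = -TRADE_DT,
                         names_to = "variable", values_to = "value")

# One line per source, coloured by the former column name
p <- ggplot(YYY_long, aes(x = TRADE_DT, y = value, colour = variable)) +
  geom_line()
print(p)
```

The row order differs from the melt output (pivot_longer walks through the data row by row), but ggplot does not depend on that order here.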

2. Plotting by utility
Assuming utility_cd is the column that identifies the utility, you can do something like:

ZZZ_split <- split(ZZZ, f = ZZZ$utility_cd)
lapply(ZZZ_split, function(subset){
  # function that melts and plots your subset/utility
})

If I interpret your str output correctly, it should be:

lapply(ZZZ_split, function(subset){
  # y = load (the measured value); LSE_CD is an identifier code, not a quantity
  print(ggplot(subset, aes(x = TRADE_DT, y = load, col = source)) + geom_line())
})
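
Putting the pieces together, here is a minimal sketch of one graph per utility written to a single PDF. It rests on several assumptions that are not in the question: ZZZ_toy is a made-up stand-in for ZZZ with random loads, hour_num is treated as hours to add to the date, loads are summed over LSE_CD, and the output file name is hypothetical. A plausible cause of the slanted connecting segments is that TRADE_DT holds only the date, so all 24 hours share one x value; building a real timestamp avoids that.

```r
library(ggplot2)

# Hypothetical stand-in for ZZZ: 2 utilities, 2 sources, 2 LSE codes,
# 2 days with 24 hourly readings each (random loads, for illustration only)
set.seed(1)
ZZZ_toy <- expand.grid(
  day        = c("2014-11-01", "2014-11-02"),
  hour_num   = 1:24,
  source     = c("AB", "EF"),
  LSE_CD     = c(116L, 135L),
  utility_cd = c("CPL", "TNMP"),
  stringsAsFactors = FALSE
)
ZZZ_toy$TRADE_DT <- as.POSIXct(ZZZ_toy$day, tz = "UTC")
ZZZ_toy$load <- runif(nrow(ZZZ_toy), 0, 30)

# Assumption: hour_num (1..24) can simply be added to the date, in seconds
ZZZ_toy$timestamp <- ZZZ_toy$TRADE_DT + ZZZ_toy$hour_num * 3600

# Sum over LSE_CD so each utility/source/timestamp has exactly one value
hourly <- aggregate(load ~ timestamp + source + utility_cd,
                    data = ZZZ_toy, FUN = sum)

# One plot per utility, one line per source
plots <- lapply(split(hourly, hourly$utility_cd), function(d) {
  ggplot(d, aes(x = timestamp, y = load, colour = source)) +
    geom_line() +
    ggtitle(as.character(d$utility_cd[1]))
})

# Write every plot to its own page of one PDF (hypothetical file name)
out_file <- file.path(tempdir(), "load_by_utility.pdf")
pdf(out_file, width = 10, height = 6)
invisible(lapply(plots, print))
dev.off()

# Single-call alternative: one panel per utility on the same page
p_facets <- ggplot(hourly, aes(x = timestamp, y = load, colour = source)) +
  geom_line() +
  facet_wrap(~ utility_cd, ncol = 1)
```

The loop gives one page per utility, which is probably the closest match to the SAS "by" behaviour; facet_wrap is shorter but puts all panels on one page. If the loads should not be summed over LSE_CD, add LSE_CD to the grouping or split on it as well.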
Rentrop