I have a dataframe with three numeric columns:

T
kWh
Month

The T variable holds values between 1 and 48 and the kWh variable is discrete.

I'm trying to display a line graph that shows the average kWh at each T for every Month by having an individual line for each Month (average kWh on the Y axis and T on the X axis.)

My approach was the following

grouped_by_t = group_by(df, T)
summarised_by_t = summarise(grouped_by_t, Average=mean(kWh, na.rm=TRUE), Month=Month)

And finally plotting it like so

ggplot(data=summarised_by_t, aes(x=T, y=Average, group=Month, color=Month)) + geom_line()

Unfortunately, this just displayed one line with a colour gradient at the edge of the plot which isn't what I want.

Kermitty
  • It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. – MrFlick Sep 14 '20 at 03:55
  • Since you want the average "at each `T` for every `Month`", I think you would want to group by both `T` and `Month` instead of just `T`. Your summary data frame should then have close to 576 rows (48 values of `T` × 12 months, assuming you have a value for each distinct combination of `T` and `Month` and that only 12 months are being used). – statstew Sep 14 '20 at 03:59

1 Answer

In the absence of a sample of your dataset, df, I've generated complete nonsense data that at least should be able to give a plot along the lines of what you are looking to accomplish.

Although it can definitely be done by summarizing your dataset beforehand, you can also utilize stat_summary() to do the averaging for you based on setting the aesthetic for color. In this case, stat_summary() will use that aesthetic as the grouping aesthetic for computing the mean.

Here's the code to generate my horrible dataset, followed by the plot code. Note that I also set the factor levels of df$Month before plotting; otherwise ggplot2 defaults to ordering the months alphabetically.

library(ggplot2)

set.seed(1234)
df <- data.frame(
  Month=rep(month.name, each=200),
  T=sample(1:48, 2400, replace=TRUE),
  kWh=rnorm(2400, 500, 140))

df$Month <- factor(df$Month, levels=month.name)

ggplot(df, aes(x=T, y=kWh, color=Month)) +
  stat_summary(geom='line', fun=mean)

For a bit more context, in case you were unaware: the function passed to fun= is applied to the y values at each x within each group, and the value it returns is used for the y aesthetic. In plain terms, stat_summary() here draws a line through the mean of df$kWh at each df$T, with a separate colored line for each df$Month. (The argument is named fun= from ggplot2 3.3.0 onward; older versions call it fun.y=.)
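If you would rather summarise the data yourself before plotting, as in the question, a minimal sketch could look like the following. It reuses the same made-up data as the code in this answer (not your real data) and assumes dplyr 1.0.0 or later for the .groups= argument. The name summarised is just an illustrative choice. The key change from the original attempt is grouping by both Month and T, so each Month/T pair gets its own mean.

```r
library(dplyr)
library(ggplot2)

# Same nonsense data as in the answer's code (an assumption, not the asker's real data)
set.seed(1234)
df <- data.frame(
  Month = factor(rep(month.name, each = 200), levels = month.name),
  T     = sample(1:48, 2400, replace = TRUE),
  kWh   = rnorm(2400, 500, 140))

# Group by BOTH Month and T so that each Month/T combination gets one mean
summarised <- df %>%
  group_by(Month, T) %>%
  summarise(Average = mean(kWh, na.rm = TRUE), .groups = "drop")

# One line per Month: average kWh on the y axis, T on the x axis
ggplot(summarised, aes(x = T, y = Average, group = Month, color = Month)) +
  geom_line()
```

This should give essentially the same picture as the stat_summary() version. The difference is that the averages now live in their own data frame, which is handy if you want to inspect or reuse them.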

chemdork123