I am a beginner with R and I would like to ask you for help.

TASK: I would like to make a graph showing the hourly water demand over the course of a day. The graph consists of several curves, one per day (for instance, see the link here).

I split the data for each day into sublists:

    > head(aaa)
    [[1]]
                   by60min  consumption
    1  2018-07-01 00:05:00            0
    2  2018-07-01 01:05:00            0
    3  2018-07-01 02:05:00            0
    4  2018-07-01 03:05:00            0
    ....
    [[2]]
                   by60min  consumption
    25 2018-07-02 00:05:00            0
    26 2018-07-02 01:05:00            0
    27 2018-07-02 02:05:00            0
    28 2018-07-02 03:05:00            0

On some days there was no water consumption at all, and I would like to leave those days out of the graph. This is where I am stuck. My idea is to delete all days where the consumption is zero and then plot the remaining days, but I have not been able to do it. Is there a way to plot only the non-zero days and/or to delete sublists from the list?

Thank you very much in advance.

Luboš

addition:

# 1st step - tibble:
    aaa <- as.tibble(aaa)
    aaa
# A tibble: 1,487 x 2
    by60min             consumption
    <fct>                     <dbl>
    1 2018-07-01 00:05:00         0
    2 2018-07-01 01:05:00         0
    3 2018-07-01 02:05:00         0
    4 2018-07-01 03:05:00         0
    5 2018-07-01 04:05:00         0
    6 2018-07-01 05:05:00         0
    7 2018-07-01 06:05:00         0
    8 2018-07-01 07:05:00     0.101
    9 2018-07-01 08:05:00     0.167
   10 2018-07-01 09:05:00     0.267
   # ... with 1,477 more rows

# 2nd step - plot:
    aaa %>%
      mutate(day = factor(day(ymd_hms(by60min))),
             hour = factor(hour(ymd_hms(by60min)))) %>%
      group_by(day) %>%
      filter(sum(consumption) > 0) %>%
      ggplot(mapping = aes(x = hour, y = consumption, 
                           col = day, 
                           show.legend = FALSE)) +
      geom_line(show.legend = FALSE)

# OUTPUT (the picture below) - bar graph instead of line chart - why?
# please NOTE that akt_spotreba == consumption 

[Image: the resulting plot, drawn as bars instead of lines]

dput(aaa) # I inserted only first three rows
structure(list(by60min = structure(c(1L, 2L, 3L, 4L, 5L, 6L, 
7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 
20L, 21L, 22L, 23L, 24L, 25L, 26L, 27L, 28L, 29L, 30L, 31L, 32L, 
haraslub
  • Hello Lubos and welcome to SO. Can you give [a minimal, reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example)? – markus Sep 23 '18 at 17:09
  • I do not understand the exact type of graph you want. In the link you have a graph with hours from `0` to `24` and one line per month. Are the lines values hourly averages? – Rui Barradas Sep 23 '18 at 17:17

1 Answer

Here's a `tidyverse` approach, using a simple example dataset based on what you have provided.

l1 = data.frame(by60min = c("2018-07-01 00:05:00","2018-07-01 01:05:00","2018-07-01 02:05:00"),
                consumption = 0)

l2 = data.frame(by60min = c("2018-07-02 00:05:00","2018-07-02 01:05:00","2018-07-02 02:05:00"),
                consumption = c(0,2,30))

l3 = data.frame(by60min = c("2018-07-03 00:05:00","2018-07-03 01:05:00","2018-07-03 02:05:00"),
                consumption = c(10,8,2))

l = list(l1,l2,l3)

Your original data look like:

[[1]]
              by60min consumption
1 2018-07-01 00:05:00           0
2 2018-07-01 01:05:00           0
3 2018-07-01 02:05:00           0

[[2]]
              by60min consumption
1 2018-07-02 00:05:00           0
2 2018-07-02 01:05:00           2
3 2018-07-02 02:05:00          30

[[3]]
              by60min consumption
1 2018-07-03 00:05:00          10
2 2018-07-03 01:05:00           8
3 2018-07-03 02:05:00           2

library(tidyverse)
library(lubridate)

map_df(l, data.frame) %>%                         # combine the list elements into one data frame
  mutate(day = factor(date(ymd_hms(by60min))),    # get day from date
         hr = hour(ymd_hms(by60min))) %>%         # get hour from date
  group_by(day) %>%                               # for each day
  filter(sum(consumption) > 0) %>%                # calculate sum of consumption and remove days where this is 0
  ungroup() %>%
  ggplot(aes(hr, consumption, col=day))+          # plot lines
  geom_line()

The output plot:

[Image: line plot of consumption by hour, one coloured line per day]
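The question also asks how to delete sublists from the list itself. Here is a minimal base R sketch, assuming consumption is never negative (so a zero sum means an all-zero day). The names `keep` and `l_nonzero` are made up for illustration, and the toy list mirrors the example data above in shortened form.

```r
# Toy list of per-day data frames; the first day has no consumption at all
l <- list(
  data.frame(by60min = c("2018-07-01 00:05:00", "2018-07-01 01:05:00"),
             consumption = 0),
  data.frame(by60min = c("2018-07-02 00:05:00", "2018-07-02 01:05:00"),
             consumption = c(0, 2)),
  data.frame(by60min = c("2018-07-03 00:05:00", "2018-07-03 01:05:00"),
             consumption = c(10, 8))
)

# One TRUE/FALSE per list element: does this day have any consumption?
# (assumes consumption >= 0, so sum > 0 means at least one non-zero value)
keep <- vapply(l, function(d) sum(d$consumption) > 0, logical(1))

# Keep only the non-zero days; the list structure is otherwise unchanged
l_nonzero <- l[keep]
length(l_nonzero)
```

The filtered list can then be passed to the plotting pipeline in place of `l`.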

AntoniosK
  • Wow, I am impressed! It works and the result is precisely what I desired. Thank you so much, AntoniosK! – haraslub Sep 25 '18 at 15:45
  • Antonio, I would like to ask you for help again. What would your code look like if you were working with data in a tibble rather than in lists? Thank you very much in advance! – haraslub Oct 04 '18 at 07:45
  • The solution should be similar, as the first command creates a data frame out of those lists. – AntoniosK Oct 04 '18 at 07:58
  • Unfortunately it does not work. When I use the same code on data that I converted to a tibble, it plots bars instead of lines. My code is as follows: `mist_data2 %>% mutate(day = factor(day(ymd_hms(by60min))), hour = factor(hour(ymd_hms(by60min)))) %>% group_by(day) %>% filter(sum(consumption) > 0) %>% ggplot(mapping = aes(x = hour, y = consumption, col = day, show.legend = FALSE)) + geom_line(show.legend = FALSE)` – haraslub Oct 08 '18 at 16:52
  • I think it will be better if you update your question with the data format you mentioned (tibble) and I'll see how to change the code. It's important to post an example of data that represent your actual dataset. – AntoniosK Oct 08 '18 at 16:55
  • I did it. I hope it will help. – haraslub Oct 08 '18 at 18:31
  • What is `aaa` and what is `mist_data2`? I can't run your code without data. – AntoniosK Oct 08 '18 at 18:44
  • `aaa` = `mist_data2`, sorry for the inconvenience. – haraslub Oct 08 '18 at 18:52
  • Can you use `dput()` on the first 20 rows of your data and post the output? I need to be able to use exactly the same data format as you. – AntoniosK Oct 08 '18 at 19:04
  • Now it should be completed. Thank you for your time, I appreciate it a lot. – haraslub Oct 09 '18 at 14:57
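
Regarding the follow-up with the tibble, a likely cause of the bars (an assumption, since the full data is not shown) is twofold. First, `hour` is converted to a factor, so ggplot2 groups by every day/hour combination and `geom_line()` cannot connect points across hours. Second, `day()` returns only the day of the month, so the same day number from different months lands in one group and is joined by a vertical segment. Below is a sketch under those assumptions, with a small made-up tibble standing in for `aaa`; `ts` and `plot_data` are illustrative names.

```r
library(dplyr)
library(lubridate)
library(ggplot2)

# Made-up stand-in for the tibble in the question: timestamps stored as a factor
aaa <- tibble(
  by60min = factor(c("2018-07-01 00:05:00", "2018-07-01 01:05:00",
                     "2018-07-02 00:05:00", "2018-07-02 01:05:00",
                     "2018-08-02 00:05:00", "2018-08-02 01:05:00")),
  consumption = c(0, 0, 0.1, 0.3, 0.2, 0.4)
)

plot_data <- aaa %>%
  mutate(ts   = ymd_hms(as.character(by60min)),  # factor -> character -> date-time
         day  = factor(as_date(ts)),             # full date, so 2 July and 2 August stay separate
         hour = hour(ts)) %>%                    # numeric hour, so points on a day can be connected
  group_by(day) %>%
  filter(sum(consumption) > 0) %>%               # drop days with no consumption at all
  ungroup()

# One line per day; colour = day also defines the grouping because hour is numeric
p <- ggplot(plot_data, aes(x = hour, y = consumption, colour = day)) +
  geom_line(show.legend = FALSE)
p
```

Note that `show.legend` belongs in `geom_line()`, not inside `aes()`, where it would be treated as an aesthetic mapping.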