I have a large data frame that contains a High (_H), a Low (_L), and a High-Low (_HL) column for every prefix, plus a Base column. I want to create one line graph per prefix, so that the first graph shows A_H, A_L, A_HL, and Base, and the same for each of the other prefixes.

date     A_H B_H C_H D_H A_L B_L C_L D_L A_HL B_HL C_HL D_HL Base
2/1/18    6   4   6   4   2   3   5   8   9    2    3    5    3
2/2/18    2   4   7   6   5   8   3   9   11   12   5    9    5
2/3/18    8   6   8   9   6   9   7   9   13   13   6    7    5

I have tried several approaches, but none of them worked.

GraphList <- c("A", "B", "C", "D")
for (i in seq_along(GraphList)){
    plot <- ggplot(df, aes(date)) +
        geom_line(aes(y=Base, colour='Base')) +
        geom_line(aes(y=paste0(i,"High"), colour='High')) +
        geom_line(aes(y=paste0(i,"Low"), colour='Low')) +
        geom_line(aes(y=paste0(i,"LS"), colour='LS')) 
    print(plot)

But when I do this, the graphs do not use the name prefixes from the list. Instead I get 1H and 1L, 2H and 2L, etc., drawn as flat lines in their respective graphs.

I also tried

plot <- ggplot(df, aes(date)) +
        geom_line(aes(y=Base, colour='Base')) +
        geom_line(aes(y=df[, grepl("_H", colnames(df))], colour='High')) +
        geom_line(aes(y=df[, grepl("_L", colnames(df))], colour='Low')) +
        geom_line(aes(y=df[, grepl("_LS", colnames(df))], colour='LS')) 
    print(plot)

Using this method I got the error

Don't know how to automatically pick the scale for object of type tbl_df/tbl/data.frame. Defaulting to continuous

Error: aesthetics must be either length 1 or the same as the data (63): y, colour, x

Thank you in advance.

Kskiaskd
  • Please make your problem [reproducible](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) by including a sample dataset that adequately represents your actual data (since you say it's large). Offhand I'd guess you should pre-process your data by converting it from wide to long format, before passing it into `ggplot()`. – Z.Lin Mar 17 '19 at 15:35
  • This sample is a representation of my actual data. The actual data merely has more dates included and more letters (i.e. A-CA) going across. What would pre-processing from wide to long format do? – Kskiaskd Mar 17 '19 at 15:46

1 Answer

First, we can have ggplot do a lot of the work for us if the data are reshaped into "long" format:

df <- read.table(text = 'date     A_H B_H C_H D_H A_L B_L C_L D_L A_HL B_HL C_HL D_HL Base
2/1/18    6   4   6   4   2   3   5   8   9    2    3    5    3
2/2/18    2   4   7   6   5   8   3   9   11   12   5    9    5
2/3/18    8   6   8   9   6   9   7   9   13   13   6    7    5', header = TRUE, stringsAsFactors = FALSE)

library(tidyverse)
library(lubridate)

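# Reshape to long format: one row per date, prefix (variable), and measure
df.long <- df %>% 
  tidyr::gather(variable, value, -date, -Base) %>%                       # stack every column except date and Base
  separate(variable, into = c('variable', 'measure'), sep = '_') %>%     # split "A_H" into "A" and "H"
  mutate(date = mdy(date))                                               # parse month/day/year strings as Dates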

         date Base variable measure value
1  2018-02-01    3        A       H     6
2  2018-02-02    5        A       H     2
3  2018-02-03    5        A       H     8
4  2018-02-01    3        B       H     4
5  2018-02-02    5        B       H     4
6  2018-02-03    5        B       H     6
7  2018-02-01    3        C       H     6
8  2018-02-02    5        C       H     7
9  2018-02-03    5        C       H     8
10 2018-02-01    3        D       H     4

In df.long, "Base" stays in its own column, with its values repeated for each combination of "variable" (A, B, C, D) and "measure" (H, L, HL). I've also converted the "date" column to a proper Date type, which again lets ggplot do more of the work for us.
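As a side note, tidyr 1.0.0 and later offer `pivot_longer()`, which can do the `gather()` and `separate()` steps in one call. The following is only a sketch of that alternative, not part of the original answer: the name `df.long2` is made up for illustration, and the date is parsed with base R's `as.Date()` instead of lubridate. The rows come out in a different order than with `gather()`, but the content should be the same.

```r
library(tidyr)
library(dplyr)

# Same sample data as in the question, repeated so this block runs on its own.
df <- read.table(text = "date A_H B_H C_H D_H A_L B_L C_L D_L A_HL B_HL C_HL D_HL Base
2/1/18 6 4 6 4 2 3 5 8 9 2 3 5 3
2/2/18 2 4 7 6 5 8 3 9 11 12 5 9 5
2/3/18 8 6 8 9 6 9 7 9 13 13 6 7 5",
  header = TRUE, stringsAsFactors = FALSE)

# df.long2 is a hypothetical name, chosen to avoid clobbering df.long.
df.long2 <- df %>%
  pivot_longer(
    cols = -c(date, Base),                # reshape everything except date and Base
    names_to = c("variable", "measure"),  # "A_H" becomes variable = "A", measure = "H"
    names_sep = "_",
    values_to = "value"
  ) %>%
  mutate(date = as.Date(date, format = "%m/%d/%y"))  # month/day/two-digit year
```

Either version can be passed to the ggplot calls in this answer, since the column names are the same.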

For a start, we could have all of these in one faceted plot:

g <- ggplot(data = df.long, aes(x = date, y = value, color = measure)) +
  geom_line() +
  geom_line(aes(y = Base), color = 'black') +
  facet_grid(. ~ variable)
print(g)

Or we could use a loop to create several separate plot objects:

plots <- list()
for (i in unique(df.long$variable)) {
  plots[[i]] <- ggplot(data = filter(df.long, variable == i), aes(x = date, y = value, color = measure)) +
    geom_line() +
    geom_line(aes(y = Base), color = 'black')
}

plots[[1]]
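For comparison, the loop in the question fails for two reasons: `seq_along(GraphList)` yields the indices 1 to 4 instead of the letters, and a string inside `aes()` is treated as a constant value, not as a column name. If you would rather keep the data in wide format, one possible sketch uses ggplot2's `.data` pronoun to look columns up by name. The names `plot_prefix` and `plots_wide` are made up for this example, and the column suffixes `_H`, `_L`, and `_HL` are assumed from the sample data.

```r
library(ggplot2)

# Same sample data as in the question, repeated so this block runs on its own.
df <- read.table(text = "date A_H B_H C_H D_H A_L B_L C_L D_L A_HL B_HL C_HL D_HL Base
2/1/18 6 4 6 4 2 3 5 8 9 2 3 5 3
2/2/18 2 4 7 6 5 8 3 9 11 12 5 9 5
2/3/18 8 6 8 9 6 9 7 9 13 13 6 7 5",
  header = TRUE, stringsAsFactors = FALSE)
df$date <- as.Date(df$date, format = "%m/%d/%y")

# Hypothetical helper: builds the plot for one prefix from the wide data.
# Wrapping this in a function gives each plot its own copy of `p`;
# aes() is evaluated lazily, so a bare for loop could end up using
# the last prefix for every plot.
plot_prefix <- function(p, data) {
  ggplot(data, aes(x = date)) +
    geom_line(aes(y = Base, colour = "Base")) +
    geom_line(aes(y = .data[[paste0(p, "_H")]], colour = "H")) +
    geom_line(aes(y = .data[[paste0(p, "_L")]], colour = "L")) +
    geom_line(aes(y = .data[[paste0(p, "_HL")]], colour = "HL")) +
    labs(y = "value", colour = "measure", title = p)
}

prefixes <- c("A", "B", "C", "D")
# Iterate over the prefix values themselves; the result is a list named by prefix.
plots_wide <- lapply(setNames(prefixes, prefixes), plot_prefix, data = df)
```

`plots_wide[["A"]]` then prints the plot for prefix A. The long-format approach above is still less code once there are many prefixes, which is why I would reach for it first.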

jdobres