How can I create a line graph with ggplot2 where the x variable is a character or a factor, the y variable is numeric, and the group variable is categorical? I have tried + geom_point() with the variables as stated above and it works, but + geom_line() does not.

I have already reviewed posts such as: Creating line graph using categorical data, ggplot2 bar plot with two categorical variables, and No line in plot chart despite + geom_line(), but none of them answer my question.

Before I go into code and examples: (1) yes, I absolutely must have the x variable and the group variable as a character or factor; (2) no, I do not want a bar graph or just geom_point().

The example below provides the coefficients of multiple independent variables from three example regressions, each run with a different variation of the dependent variable. The code below shows a workaround that I figured out (i.e. creating an int variable named 'test' to use in place of the chr variable containing the names of the independent variables from the regression), but I need to preserve the chr names of the independent variables instead.

Here is what I have:

library(dplyr)
library(ggplot2)
library(plotly)
library(tidyr)

var_names <- c("ST1", "ST2", "ST3", 
               "EFI1", "EFI2", "EFI3", "EFI4", 
               "EFI5", "EFI6")

####Dataset1####
reg <- c(26441.84, 20516.03, 12936.79, 17793.22, 18837.48, 15704.31, 17611.14, 17360.59, 14836.34)
r_adj <- c(30473.17, 35221.43, 29875.98, 30267.31, 29765.9, 30322.86, 31535.66, 30955.29, 29828.3)
a_adj <- c(19588.63, 31163.79, 22498.53, 27713.72, 25703.89, 28565.34, 29853.22, 29088.25, 25213.02)

df1 <- data.frame(var_names, reg, r_adj, a_adj, stringsAsFactors = FALSE)
df1$test <- c(1:9)

df2 <- gather(df1, key = "series_type", value = "value", c(2:4))

fig7 <- ggplot(df2, aes(x = test, y = value, color = series_type)) + geom_line() + geom_point()
fig7

Ultimately I want something that looks like the plot below, but with the independent variable names in place of the 'test' variable.

Example Plot

jazzurro
pseudorandom
  • For x in aes(), you want to have `x = factor(test)`, I think. – jazzurro Mar 05 '20 at 01:43
  • @jazzurro As it turns out, it doesn't matter if the x-variable is a factor or a character (I had tried both ways with no success). What does matter is making sure to include the 'group = series_type' argument within aes(). I'd thought that color = series_type would cover that, but you have to include group = series_type in order for geom_line() to work. – pseudorandom Mar 05 '20 at 17:55

1 Answer

You can convert var_names into a factor and set the levels in order of appearance (otherwise the levels are sorted alphabetically and the x axis will be out of order). Then just map series_type to the group aesthetic in the plot.

df2 <- gather(df1, key = "series_type", value = "value", c(2:4)) %>%
  mutate(var_names = factor(var_names, levels = unique(var_names)))

ggplot(df2, aes(x = var_names, y = value, color = series_type, group = series_type)) + geom_line() + geom_point()
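
As a rough illustration of why the group aesthetic matters here, the sketch below uses a small made-up data frame (`toy` and its values are hypothetical, not the data from the question). As far as I understand ggplot2's default grouping, when x is discrete and no group is given, the groups are formed from the combination of all discrete variables in the plot, so every point ends up in its own group and geom_line() has nothing to connect. `layer_data()` is used only to inspect the groups that ggplot2 computed.

```r
library(ggplot2)

# Hypothetical toy data: two series measured at three discrete x values.
toy <- data.frame(
  var_names   = rep(c("ST1", "ST2", "EFI1"), times = 2),
  series_type = rep(c("reg", "r_adj"), each = 3),
  value       = c(3, 1, 2, 5, 4, 6),
  stringsAsFactors = FALSE
)

# A plain factor() would sort the levels alphabetically (EFI1, ST1, ST2);
# levels = unique(...) keeps the order of appearance instead.
toy$var_names <- factor(toy$var_names, levels = unique(toy$var_names))

# No explicit group: grouping falls back to var_names x series_type.
p_default <- ggplot(toy, aes(x = var_names, y = value, color = series_type)) +
  geom_line()

# Explicit group: one line per series_type.
p_grouped <- ggplot(toy, aes(x = var_names, y = value, color = series_type,
                             group = series_type)) +
  geom_line()

# Count the groups ggplot2 computed for the line layer in each plot.
n_groups_default <- length(unique(layer_data(p_default, 1)$group))
n_groups_grouped <- length(unique(layer_data(p_grouped, 1)$group))
```

With this toy data, the default plot should end up with one group per point, while the explicitly grouped plot has one group per series, which is what lets geom_line() draw the two lines.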

Ritchie Sacramento
  • So as it turns out the piece that I was missing was adding 'group = series_type' to the aes() of ggplot(). Adding the group argument works regardless of whether the x-variable is a chr or a factor. – pseudorandom Mar 05 '20 at 17:51