I am trying to plot a line graph with multiple lines in different colors, but am not having much luck. My data set consists of 10 states and the voter turnout rate for each state in 9 elections: the states are listed in the left column, and each subsequent column is an election year from 1980 to 2012 holding the turnout rate for each of the 10 states. I would like a graph with the year on the X axis and the turnout rate on the Y axis, with one line per state.

I found this previous answer (Plotting multiple lines from a data frame in R) to a similar question but cannot seem to replicate it using my data. Any ideas/suggestions would be immensely appreciated!

klynn

2 Answers

Use tidyr::gather or reshape::melt to transform the data to a long form.

## Simulate data
d <- data.frame(state=letters[1:10],
                '1980'=runif(10,0,100),
                '1981'=runif(10,0,100),
                '1982'=runif(10,0,100))

library(dplyr)
library(tidyr)
library(ggplot2)

## Transform to a long df
e <- d %>% gather(., key, value, -state) %>% 
  mutate(year = as.numeric(substr(as.character(key), 2, 5))) %>%
  select(-key) 

## Plot
ggplot(data=e,aes(x=year,y=value,color=state)) +
  geom_point() +
  geom_line()
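
In current tidyr releases, gather() is superseded by pivot_longer(). As a rough sketch of the same reshaping with that function (the four-year spacing of the columns and the set.seed() call are assumptions added for illustration, not part of the answer above), the "X" prefix that data.frame() adds to numeric column names can be stripped during the pivot itself:

```r
library(tidyr)
library(ggplot2)

## Simulated data; the seed only makes the example reproducible
set.seed(1)
d <- data.frame(state = letters[1:10],
                '1980' = runif(10, 0, 100),
                '1984' = runif(10, 0, 100),
                '1988' = runif(10, 0, 100))
## data.frame() has renamed the year columns to X1980, X1984, X1988

## Long form: one row per state/year pair
e <- pivot_longer(d,
                  cols = -state,
                  names_to = "year",
                  names_prefix = "X",                        # drop the leading X
                  names_transform = list(year = as.integer), # "1980" -> 1980L
                  values_to = "value")

## One colored line per state
p <- ggplot(e, aes(x = year, y = value, color = state)) +
  geom_point() +
  geom_line()
p
```

The names_transform argument needs tidyr 1.1.0 or later; with an older version, the year column can be converted afterwards with as.integer().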

scoa
  • Thanks! Everything was working fine until I got to the `e$key` command: `d <- data.frame(state=letters[1:10], '1980'=runif(10,0,100), '1984'=runif(10,0,100), '1988'=runif(10,0,100), '1992'=runif(10,0,100), '1996'=runif(10,0,100), '2000'=runif(10,0,100), '2004'=runif(10,0,100), '2008'=runif(10,0,100), '2012'=runif(10,0,100))` `library(tidyr)` `library(ggplot2)` `e <- gather(d,key,value,-state)` `e$key <- as.numeric(sub("^X","",as.character(e$key)))` Error in `$<-.data.frame`(`*tmp*`, "key", value = numeric(0)) : replacement has 0 rows, data has 90 – klynn Jul 19 '15 at 16:32
  • Having numbers as variable names is tricky. Can you post the output of `names(d)`, where `d` is the name of your actual data frame (not the one I made up for the answer)? – scoa Jul 19 '15 at 16:44
  • Sure - this is what happens: `d <- GenElecTurnoutCaucusStates(state=letters[1:10], '1980'=runif(10,0,100), '1984'=runif(10,0,100), '1988'=runif(10,0,100), '1992'=runif(10,0,100), '1996'=runif(10,0,100), '2000'=runif(10,0,100), '2004'=runif(10,0,100), '2008'=runif(10,0,100), '2012'=runif(10,0,100))` Error: could not find function "GenElecTurnoutCaucusStates" – klynn Jul 19 '15 at 16:52
  • @klynn type `names(GenElecTurnoutCaucusStates)` – scoa Jul 19 '15 at 16:54
  • Here is a piped version that builds on @scoa's: `d %>% gather(., key, value, -state) %>% mutate(., year = as.numeric(substr(key, 2, 5))) %>% select(., -key) %>% qplot(data = ., x=year, y=value, color=state, geom="line") %>% print`. – ulfelder Jul 19 '15 at 17:10
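
To illustrate the point in the comments about numeric variable names, here is a small sketch of what data.frame() does to them. The two-row data frame and its values are made up for the example; names(), sub() and the check.names argument are base R:

```r
## By default, data.frame() makes column names syntactically valid,
## so a name that starts with a digit gets an "X" prefix
d <- data.frame(state = c("a", "b"),
                '1980' = c(50, 60),
                '1984' = c(55, 65))
names(d)
# "state" "X1980" "X1984"

## check.names = FALSE keeps the names exactly as given
raw <- data.frame(state = c("a", "b"),
                  '1980' = c(50, 60),
                  check.names = FALSE)
names(raw)
# "state" "1980"

## Recover numeric years from the prefixed names
years <- as.numeric(sub("^X", "", names(d)[-1]))
years
# 1980 1984
```

Checking names() on the real data frame first shows which of the two forms the year columns have, and so whether the prefix needs to be stripped at all.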

Please include your data, or sample data, in your question so that we can answer your question directly and help you get to the root of the problem. Pasting your data is simplified by using dput().
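
As a minimal sketch of how dput() helps here (the three-row data frame and its values are invented for the example), its output is plain R code that anyone can paste into a session to recreate the object:

```r
## A small stand-in for the real data; the values are made up
d <- data.frame(state = c("a", "b", "c"),
                X1980 = c(41.2, 55.0, 63.7),
                X1984 = c(44.9, 52.3, 60.1))

## Prints a structure(list(...)) expression that rebuilds d;
## paste that text into the question
dput(d)

## For a large data frame, share only the first few rows
dput(head(d, 2))
```

Readers can then assign the pasted expression to a name (for example `d <- structure(...)`) and work with the same columns and types as the asker.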

Here's another solution to your problem, using scoa's sample data and the reshape2 package instead of the tidyr package:

# Sample data
d <- data.frame(state = letters[1:10],
                '1980' = runif(10,0,100),
                '1981' = runif(10,0,100),
                '1982' = runif(10,0,100))

library(reshape2)
library(ggplot2)

# Melt data and remove X introduced into year name
melt.d <- melt(d, id = "state")
melt.d[["variable"]] <- gsub("X", "", melt.d[["variable"]])

# Plot melted data
ggplot(data = melt.d,
       aes(x = variable, 
           y = value, 
           group = state, 
           color = state)) +
  geom_point() +
  geom_line()                         

Produces: [image: sample plot]

Note that I left out the as.numeric() conversion for year from scoa's example, and this is why the graph above does not include the extra x-axis ticks that scoa's does.
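
For comparison, a sketch of the same reshape2 approach with the year converted to a number, so the x axis is continuous and unevenly spaced elections would be placed correctly. The new `year` column, the four-year column names and the set.seed() call are assumptions added for illustration; scale_x_continuous(breaks = ...) limits the ticks to the years actually present:

```r
library(reshape2)
library(ggplot2)

# Sample data; the seed only makes the example reproducible
set.seed(1)
d <- data.frame(state = letters[1:10],
                '1980' = runif(10, 0, 100),
                '1984' = runif(10, 0, 100),
                '1988' = runif(10, 0, 100))

# Melt, then store the year as a number in a new column
melt.d <- melt(d, id.vars = "state")
melt.d[["year"]] <- as.numeric(gsub("X", "", melt.d[["variable"]]))

# Continuous x axis with ticks only at the election years
p <- ggplot(data = melt.d,
            aes(x = year,
                y = value,
                group = state,
                color = state)) +
  geom_point() +
  geom_line() +
  scale_x_continuous(breaks = unique(melt.d[["year"]]))
p
```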

rer