24

I'm trying to use ggplot to plot multiple lines in one graph, but I'm not sure how to do so with my dataset, or whether I need to change the data structure first (transpose it?).

Data looks like this:

Company   2011   2013
Company1  300    350
Company2  320    430
Company3  310    420

I also tried it transposed:

Year   Company1  Company2  Company3
2011   300       320       310 
2013   350       430       420

With this layout I can plot one of the companies using:

ggplot(data=df, aes(x=Year, y=Company1)) + geom_line(colour="red") + geom_point(colour="red", size=4, shape=21, fill="white")

But I don't know how to combine all the companies as I don't have an object 'Company' anymore to group on. Any suggestions?

Jaap
Chrisvdberge

3 Answers

56

You should bring your data into long (i.e. molten) format to use it with ggplot2:

library("reshape2")
# df is the wide data frame from the question (one row per company)
mdf <- melt(df, id.vars="Company", value.name="value", variable.name="Year")

And then you have to use aes( ... , group = Company ) to group them:

ggplot(data=mdf, aes(x=Year, y=value, group = Company, colour = Company)) +
    geom_line() +
    geom_point( size=4, shape=21, fill="white")
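
To make the long format concrete, here is a minimal self-contained sketch that rebuilds the question's table and reshapes it with base R only. It is a stand-in for melt, not the code used above. The names long and year_cols are made up for the example, and storing Year as a number instead of a factor is my own assumption. The plotting step is skipped if ggplot2 is not installed.

```r
# Rebuild the wide table from the question.
# check.names = FALSE keeps "2011" and "2013" as column names (otherwise R renames them X2011, X2013).
df <- data.frame(
  Company = c("Company1", "Company2", "Company3"),
  `2011`  = c(300, 320, 310),
  `2013`  = c(350, 430, 420),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Every column except the id column holds one year of values.
year_cols <- setdiff(names(df), "Company")

# Stack one block of rows per year: this is the "long" layout ggplot2 expects.
# Assumption: Year is stored as a number here, so the x axis is continuous.
long <- do.call(rbind, lapply(year_cols, function(yr) {
  data.frame(Company = df$Company, Year = as.numeric(yr), value = df[[yr]],
             stringsAsFactors = FALSE)
}))
print(long)

# Plot only if ggplot2 is available.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(long, aes(x = Year, y = value, group = Company, colour = Company)) +
    geom_line() +
    geom_point(size = 4, shape = 21, fill = "white") +
    scale_x_continuous(breaks = unique(long$Year))  # tick marks only at the observed years
  print(p)
}
```

The result has one row per company and year (six rows for this table), which is why Company is available again as a grouping variable.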


Beasterfield
  • `df` won't overwrite `stats::df`, `R` knows which one you're referring to by context. Try it yourself: `df <- data.frame(A=1:10) ; df(df$A, 1, 5)` – Señor O Jun 17 '13 at 14:54
  • 1
    @SeñorO That's what I read already multiple times but you are right, it's not overwritten. So as note to myself: Never pass information to someone else without having checked them by myself :-) I edited my question. – Beasterfield Jun 17 '13 at 15:04
  • 1
    It can still be a good idea to avoid `df` as a variable name - if you use it often (like I do, against my own advice) then sometimes when you forget to define it, you get the cryptic error "Error in df$foo : object of type 'closure' is not subsettable" instead of something better like "Error: object 'df' not found". – Ken Williams Jan 25 '16 at 04:45
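
The two points made in these comments can be checked with a short sketch. It assumes a session in which nothing else named df is defined; dens and msg are names chosen for the example.

```r
# A data frame called df does not hide the function stats::df when df is *called*,
# because R skips non-function objects when looking up a function name.
df <- data.frame(A = 1:10)
dens <- df(df$A, 1, 5)   # outer df is stats::df (F density), inner df is the data frame
print(dens)

# Once the data frame is gone, the bare name df resolves to the function again,
# so subsetting it fails with an error about the function instead of "object 'df' not found".
rm(df)
msg <- tryCatch(df$foo, error = function(e) conditionMessage(e))
print(msg)
```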
18

Instead of using the outrageously convoluted data structures required by ggplot2, you can use the native R functions:

tab<-read.delim(text="
Company 2011 2013
Company1 300 350
Company2 320 430
Company3 310 420
",as.is=TRUE,sep=" ",row.names=1)

tab<-t(tab)

plot(tab[,1],type="b",ylim=c(min(tab),max(tab)),col="red",lty=1,ylab="Value",lwd=2,xlab="Year",xaxt="n")
lines(tab[,2],type="b",col="black",lty=2,lwd=2)
lines(tab[,3],type="b",col="blue",lty=3,lwd=2)
grid()
legend("topleft",legend=colnames(tab),lty=c(1,2,3),col=c("red","black","blue"),bg="white",lwd=2)
axis(1,at=c(1:nrow(tab)),labels=rownames(tab))
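
As a more compact variant of the same idea, base R's matplot() draws one line per matrix column in a single call. This is only a sketch under my own assumptions: the matrix is typed in directly instead of being read with read.delim, so its row names are "2011" and "2013", and cols is a name made up for the example.

```r
# Rows are years, columns are companies (matrix() fills column by column).
tab <- matrix(c(300, 350,
                320, 430,
                310, 420),
              nrow = 2,
              dimnames = list(c("2011", "2013"),
                              c("Company1", "Company2", "Company3")))

cols <- c("red", "black", "blue")

# One call plots every column against the row index 1, 2, ...
matplot(tab, type = "b", pch = 1, lty = 1:3, lwd = 2, col = cols,
        xlab = "Year", ylab = "Value", xaxt = "n")
axis(1, at = seq_len(nrow(tab)), labels = rownames(tab))  # label the ticks with the years
grid()
legend("topleft", legend = colnames(tab), lty = 1:3, col = cols, lwd = 2, bg = "white")
```

matplot() works out the y range from the whole matrix, so no explicit ylim is needed.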


Federico Giorgi
1

The answer by @Federico Giorgi was very good and helped me. Building on it, I used a for loop to draw multiple lines in the same plot from a single dataset. A legend can be added as well.

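# Draw the first series; xaxt="n" suppresses the default x axis so the years can be labelled below
plot(tab[,1],type="b",col="red",lty=1,lwd=2,xaxt="n",ylim=c(min(tab,na.rm=TRUE),max(tab,na.rm=TRUE)))
# Add one line per column of tab; col=i takes the i-th colour of the current palette
for (i in seq_len(ncol(tab))) {
  lines(tab[,i],type="b",col=i,lty=1,lwd=2)
}
axis(1,at=seq_len(nrow(tab)),labels=rownames(tab))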
Estatistics