I have a set of data frames and I want to plot two columns from each of them on the same ggplot. I already have a plot from another function, coloured in blue and red, and I want the new points to be added to it. My approach works when I type it at the console, but when I save it as a function and call it, it fails. The error I get is:

Discrete value supplied to continuous scale.

So, the dataframes are in my environment and named BEFMORN1 to BEFMORN9. The initial plot is test_plot.

The first part that gives me the test_plot works.

test_plot<-ggplot()+geom_point(data=yy4, aes(x=Time, y=Dist), colour="red")+geom_point(data=zz4, aes(x=Time, y=Dist), colour="blue")
test_plot<-test_plot+scale_x_continuous(name="Time (Seconds from the beginning)")
test_plot<-test_plot+scale_y_continuous(name="Distance (Metres from the beginning)")

The second part will be the new function

plot_all_runs<-function(r,test_plot) {
for (i in 1:(length(r[[1]]))) {
z<-as.data.frame(mget(ls(pattern=paste0("BEFMORN",i))))
test_plot2<-test_plot+geom_point(data=z, aes_string(x=names(z)[12], y=names(z)[17]))
}
print(test_plot2)
}

r is a list of 6 lists of different dataframes, so BEFMORN came from r[[1]]. BEFNOON will come from r[[2]] etc. So my plan is to have 6 identical functions with different arguments in paste0.

I'm using aes_string(x=names(z)[12], y=names(z)[17]) because the data frames z have different column names in each iteration.

Does anyone understand why I'm getting this error? I have played around with the scales (removing them from the initial plot, or adding them again in the next one) but saw no improvement.

EDIT: All columns to be plotted have been transformed to numeric. Others are factors and integers.

EXAMPLE

BEFMORN1<-data.frame(BEFMORN1.Time=seq(0,10, 0.5), BEFMORN1.Dist=1:21)
BEFMORN2<-data.frame(BEFMORN2.Time=seq(0,13, 0.5), BEFMORN2.Dist=c(1:8,8,8,9,10,13,13,13,13.5,14,14,14,14:21))
yy4<-data.frame(Time=seq(0,10, 0.5), Dist=c(1:8,8,8,9,10,13,14:21))
zz4<-data.frame(Time=seq(0,12, 0.5), Dist=c(1:8,8,8,9,9.5,10,10.5,12,12.5,13,14:21))

test_plot<-ggplot()+geom_point(data=yy4, aes(x=Time, y=Dist), colour="red")+geom_point(data=zz4, aes(x=Time, y=Dist), colour="blue")

plot_all_runs<-function(test_plot) {
for (i in 1:9) {
z<-as.data.frame(mget(ls(pattern=paste0("BEFMORN",i))))
test_plot2<-test_plot+geom_point(data=z, aes_string(x=names(z)[12], y=names(z)[17]))
}
print(test_plot2)
}
Nikos
  • The reason for the error is usually because the data are in wide rather than long format and ggplot misinterprets them as discrete variables. Other than that it is pretty difficult to help without a reproducible example (http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). Do you want to add multiple curves to a single plot or print out multiple plots? – biomiha Jul 11 '16 at 21:06
  • multiple curves to a single plot – Nikos Jul 11 '16 at 21:13
  • In that case merge the data frames into one (no need for loops - there are plenty of better options), transform the data into long format and colour by whatever variable defines your curves. – biomiha Jul 11 '16 at 21:16
  • Thanks for the answers! I need a different curve for each dataframe, though. Imagine the test_plot has plotted their average curve and now I need to plot the initial ones as well. If I merge the data frames how will I differentiate between the different dataframe? – Nikos Jul 11 '16 at 21:31
  • @Nikos The same way you differentiate between anything in a data set: with a variable in another column. – joran Jul 11 '16 at 21:36
  • I see. I'll try this out! Thanks both! – Nikos Jul 11 '16 at 21:41

1 Answer

An example of generating the long format @biomiha and @joran suggested:

library(ggplot2)

BEFMORN1<-data.frame(Time=seq(0,10, 0.5)
                     , Dist=1:21, Group = "BEFMORN1")
BEFMORN2<-data.frame(Time=seq(0,13, 0.5)
                     , Dist=c(1:8,8,8,9,10,13,13,13,13.5,14,14,14,14:21)
                     , Group = "BEFMORN2")
yy4<-data.frame(Time=seq(0,10, 0.5)
                , Dist=c(1:8,8,8,9,10,13,14:21)
                , Group = "yy4")
zz4<-data.frame(Time=seq(0,12, 0.5)
                , Dist=c(1:8,8,8,9,9.5,10,10.5,12,12.5,13,14:21)
                , Group = "zz4")


allData <-
  rbind(BEFMORN1, BEFMORN2, yy4, zz4)


ggplot(allData
       , aes(x = Time
             , y = Dist
             , col = Group)) +
  geom_point()

Note that if your data are already in place, adding a "Group" column may need to be done with a bit more care. However, the general principle is the same. If you want, you can use any of the scale_color_* functions to change the default colors, including scale_color_manual if you want to set them yourself.
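If the data frames already exist with different column names, as in the question, one way to add the "Group" column is sketched below. This is only an illustration under some assumptions: the list `runs`, the helper `tag_run` and its `time_col`/`dist_col` arguments are hypothetical names, the two columns are picked by position (1 and 2 here, where the question used 12 and 17), and the colours are arbitrary.

```r
# Hypothetical stand-ins for the real data: each frame has its own
# column names, as in the question (BEFMORN1.Time, BEFMORN1.Dist, ...).
runs <- list(
  BEFMORN1 = data.frame(BEFMORN1.Time = seq(0, 10, 0.5),
                        BEFMORN1.Dist = 1:21),
  BEFMORN2 = data.frame(BEFMORN2.Time = seq(0, 13, 0.5),
                        BEFMORN2.Dist = c(1:8, 8, 8, 9, 10, 13, 13, 13, 13.5,
                                          14, 14, 14, 14:21))
)

# Hypothetical helper: pick the two columns by position, give them
# common names, and tag every row with the name of its data frame.
tag_run <- function(df, name, time_col = 1, dist_col = 2) {
  data.frame(Time = df[[time_col]],
             Dist = df[[dist_col]],
             Group = name,
             stringsAsFactors = FALSE)
}

# Apply the helper to every frame and stack the results (long format).
allRuns <- do.call(rbind, unname(Map(tag_run, runs, names(runs))))

# Plot only if ggplot2 is installed; the colours are arbitrary choices.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(allRuns, aes(x = Time, y = Dist, colour = Group)) +
    geom_point() +
    scale_colour_manual(values = c(BEFMORN1 = "darkgreen",
                                   BEFMORN2 = "orange"))
  print(p)
}
```

For a list of lists, the same helper could be applied to each inner list in turn and the results stacked with one more rbind.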

Mark Peterson
  • Yep, this worked. It needed some extra attention in all the subsetting (I had a list of 6 lists of different numbers of dataframes) and two for-loops to make the new variable automatically for every dataframe in every list of the list! Thanks all for helping! (My example is too complicated and specific to be valuable in this thread, so I won't upload it. If anyone wants to see it, I will make the effort to explain.) – Nikos Jul 13 '16 at 21:00