Here is the data set, df.test:

   MLSpredictions BPLPredictions
1        1.392213      0.8326201
2        1.392213      0.8662049
3        1.448370      0.9011444
4        1.448370      1.0146486
5        1.448370      0.9374932
6        1.448370      0.9374932
7        1.448370      0.9011444
8        1.448370      1.0981538
9        1.448370      1.0555757
10       1.506792      1.0555757
11       1.506792      1.1424492
12       1.506792      1.0555757
13       1.567570      1.0981538
14       1.567570      1.0981538
15       1.567570      1.1424492
16       1.567570      1.1424492
17       1.567570      1.1885314
18       1.567570      1.1424492
19       1.567570      1.1885314
20       1.630800      1.2364723

I know that ggplot2 requires all of the information to be in the same data frame, which I believe I have done above.

Here is my starting point:

ggplot(df.test, aes(x = 1:20, y = , color = ))

Since my column names are different, I'm not sure what to put for "y". I've been looking all over for sample data frames that would be used in this instance but I'm coming up empty.

Please advise.

[EDIT] I would like to come up with a plot that has two lines with two different colors in the same plot.

madsthaks
  • Possible [duplicate](http://stackoverflow.com/questions/17150183/r-plot-multiple-lines-in-one-graph). And [this post](http://stackoverflow.com/questions/35586520/when-creating-a-multiple-line-plot-in-ggplot2-how-do-you-make-one-line-thicker) gives information about `aes(x)` – cuttlefish44 Aug 15 '16 at 03:42
  • The question lacks information on what you are trying to come up with. Can you please provide on what you are trying to plot within your graph? – Pj_ Aug 15 '16 at 04:53
  • `ggplot(df.test, aes(x=MLSpredictions, y=BPLPredictions)) + geom_line()` will give you one line for all of the data you posted. To get, say, two lines with different colors, you'd need a categorical column with two different values. For example, `df.test$group = rep(c("A","B"), 10)`, then, `ggplot(df.test, aes(x=MLSpredictions, y=BPLPredictions, color=group)) + geom_line()` – eipi10 Aug 15 '16 at 04:53
  • @eipi10 OP can also do rbind() on the two columns, create a new data frame and plot the graph. Will that be a wrong approach? – Pj_ Aug 15 '16 at 05:01
  • @Pj_, I'm not sure what you mean. How would you use `rbind` here? – eipi10 Aug 15 '16 at 05:08
  • I guess what OP wants is `library(reshape2); library(ggplot2); cbind(df.test, id=1:20) %>% melt(id.vars="id") %>% ggplot(aes(x = id, y = value, colour = variable)) + geom_line()` – cuttlefish44 Aug 15 '16 at 05:18
  • As @eipi10 stated, I'm looking to have two lines with different colors. In your example, I wouldn't exactly want `x = MLSpredictions` unless that is how the syntax is written. The x values would be 1:20 for both prediction sets. Also, when you write `df.test$group = rep(c("A","B"), 10)`, is that essentially creating two columns inside the _group_ column in the data frame? – madsthaks Aug 15 '16 at 19:24
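
On the last question in the thread: as a small sketch (the vector name `group` follows the comment above; nothing else is assumed), `rep(c("A","B"), 10)` returns a single character vector of length 20 that alternates between the two labels. Assigning it to `df.test$group` would therefore add one column, not two.

```r
# rep() repeats the whole vector c("A", "B") ten times: "A" "B" "A" "B" ...
group <- rep(c("A", "B"), 10)

length(group)   # one value per row of the 20-row data frame
table(group)    # each label appears the same number of times
head(group, 4)
```

Because the labels alternate row by row, this grouping splits the rows arbitrarily rather than separating the two prediction columns.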

1 Answer

ggplot expects the input data to be in so-called "long" format. In a long data set, one column contains the actual data values (whatever they may be), and all other columns describe characteristics of those data points, such as what type of measure the values are, what group they belong to, etc. A long version of your data might look like:

index         variable      value
    1   MLSpredictions   1.392213
    2   MLSpredictions   1.392213
  ...              ...        ...
    1   BPLPredictions  0.8326201
    2   BPLPredictions  0.8662049
  ...              ...        ...

And you could then get your intended plot with:

my.plot <- ggplot(data = long.data, aes(x = index, y = value, color = variable)) +
           geom_line()

There are a few ways to convert your "wide" data into long format, one of which would be:

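library(dplyr)
library(tidyr)  # provides gather()

# df.test is the wide data frame from the question
df.test$index <- 1:20
# Stack every column except index into variable/value pairs
long.data <- gather(df.test, variable, value, -index)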
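
For reference, here is a minimal end-to-end sketch that rebuilds `df.test` from the values posted in the question, reshapes it, and draws the two lines. It swaps in `tidyr::pivot_longer()` for the reshaping step, which is one of the other ways to go from wide to long; it assumes tidyr 1.0.0 or later and ggplot2 are installed.

```r
library(ggplot2)
library(tidyr)

# The 20 rows from the question, typed out by hand
df.test <- data.frame(
  MLSpredictions = rep(c(1.392213, 1.448370, 1.506792, 1.567570, 1.630800),
                       times = c(2, 7, 3, 7, 1)),
  BPLPredictions = c(0.8326201, 0.8662049, 0.9011444, 1.0146486, 0.9374932,
                     0.9374932, 0.9011444, 1.0981538, 1.0555757, 1.0555757,
                     1.1424492, 1.0555757, 1.0981538, 1.0981538, 1.1424492,
                     1.1424492, 1.1885314, 1.1424492, 1.1885314, 1.2364723)
)

# Explicit x values shared by both prediction sets
df.test$index <- seq_len(nrow(df.test))

# Wide to long: one row per (index, variable) pair
long.data <- pivot_longer(df.test, cols = -index,
                          names_to = "variable", values_to = "value")

# One line per prediction column, colored by column name
my.plot <- ggplot(long.data, aes(x = index, y = value, color = variable)) +
  geom_line()
print(my.plot)
```

The long data frame has 40 rows (20 per prediction column), and the legend labels come from the original column names.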
jdobres
  • Thanks, that definitely helps. Now, to make things a little more complex: say I have 4 more columns that are the upr and lwr bounds of the confidence interval for both sets of predictions. How would that work? I know how to create the confidence interval ribbon using ggplot, just not sure how to plot them all on the same plot. – madsthaks Aug 16 '16 at 00:37
  • @user3552144 consider asking as a separate question – Cyrus Mohammadian Aug 16 '16 at 03:00
  • I agree that it's probably best asked separately. Briefly, for every ggplot aesthetic you want to use, you need a column. So in this case, your data would be "longish", and contain another column for the error term. Assuming this column was called "error", your `aes` would look something like `aes(x = index, y = value, ymin = value - error, ymax = value + error)`. – jdobres Aug 16 '16 at 14:59
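
As a rough sketch of that last comment, the "longish" layout could be plotted with `geom_ribbon()` as shown here. The `error` column and its constant value of 0.05 are made up purely for illustration and are not from the question; only the first three rows of each prediction set are used to keep the example short.

```r
library(ggplot2)

# Hypothetical "longish" data: value columns come from the question,
# the error column (0.05 everywhere) is invented for this sketch
long.data <- data.frame(
  index    = rep(1:3, times = 2),
  variable = rep(c("MLSpredictions", "BPLPredictions"), each = 3),
  value    = c(1.392213, 1.392213, 1.448370,
               0.8326201, 0.8662049, 0.9011444),
  error    = 0.05
)

# Ribbon first so the lines are drawn on top of it
my.plot <- ggplot(long.data,
                  aes(x = index, y = value,
                      ymin = value - error, ymax = value + error)) +
  geom_ribbon(aes(fill = variable), alpha = 0.2) +
  geom_line(aes(color = variable))
print(my.plot)
```

If the bounds are already stored in their own columns (say `lwr` and `upr`, one pair per row of the long data), the same idea applies with `ymin = lwr, ymax = upr`.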