It seems you're looking for a basic understanding of how ggplot works and how to format your data to produce (in this case) a line plot with multiple series of y values sharing the same x values. A reasonable place to start would be one of the many tutorials available online. Here's a crash course on how to get what I believe you are looking for.
Prepare your data
The first step toward creating a plot is to prepare your data. From your description, it seems the same x values apply to several sets of y values. It's not clear to me whether those sets live in different datasets or in the same data frame as different columns; either way, the approach is similar, in that the goal is to arrive at a data frame that respects Tidy Data principles. Take the following example dataset:
# using these libraries for this example
library(dplyr)
library(tidyr)
library(ggplot2)
library(circular)  # provides rad() for degrees-to-radians conversion
# example dataset. Each y# column has values for the same x column (locus)
df <- data.frame(
  locus = 1:500,
  y1 = rnorm(500, 1, 0.05),
  y2 = cos(rad(1:500)),
  y3 = sin(rad(1:500)),
  y4 = rnorm(500, -1.3, 0.1)
)
In our example dataset, df, we have a problem in that each of the y columns contains two pieces of information:
- The type of y: y1 contains one series, y2 another, etc...
- The value of y: each number represents the value of that type of y at a particular x value (locus)
The goal is to have a column for x values, a column for y values, and a column for y type. To get there, we need to gather the y1, y2, y3... columns together and separate those two pieces of information. Here I'm using dplyr (for the %>% pipe) and the gather() function from tidyr.
# combine all y1, y2, y3 columns to create a type and value column. Ignore the locus col.
df <- df %>% gather(key='type_of_y', value='value_of_y', -locus)
Now you have a dataset that is "long". This can also be done using melt() from the reshape2 package or pivot_longer() from tidyr (part of the tidyverse).
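For reference, here is a minimal sketch of the same reshaping with pivot_longer(), which tidyr now recommends over gather(). The data frame names wide_df and long_df are made up for this illustration, and the values are a tiny stand-in for the example dataset rather than the real thing.

```r
library(tidyr)

# small wide data frame in the same shape as the example above
# (wide_df and its values are placeholders for illustration only)
wide_df <- data.frame(
  locus = 1:3,
  y1 = c(1.0, 1.1, 0.9),
  y2 = c(0.5, 0.4, 0.3)
)

# move every column except locus into a name column and a value column
long_df <- pivot_longer(
  wide_df,
  cols = -locus,
  names_to = "type_of_y",
  values_to = "value_of_y"
)
```

The rows come out in a different order than with gather() (grouped by locus rather than by series), which makes no difference to ggplot.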
Plot your data
The plot is pretty easy to create now. You specify the data frame, df, and then within the aesthetics assign x=locus and y=value_of_y. To create a separate line for each original y column, set color=type_of_y. That draws one line per type, differentiated by color. You can also use linetype=type_of_y to give each type of y a different line type. If you set both, each line varies in both color and line type. The simple case for different color lines is here:
ggplot(df, aes(x=locus, y=value_of_y, color=type_of_y)) +
  geom_line() + geom_point(size=0.3)
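If you want to try the combined color and linetype case mentioned above, a sketch along these lines should work. It uses a small made-up long-format data frame (long_df is a hypothetical name) so that it runs on its own; with the real data you would pass df instead.

```r
library(ggplot2)

# small long-format stand-in for df (names and values are illustrative only)
long_df <- data.frame(
  locus = rep(1:10, times = 2),
  type_of_y = rep(c("y1", "y2"), each = 10),
  value_of_y = c(seq(0, 1, length.out = 10), seq(1, 0, length.out = 10))
)

# map both color and linetype to the series name
p <- ggplot(long_df, aes(x = locus, y = value_of_y,
                         color = type_of_y, linetype = type_of_y)) +
  geom_line()

print(p)
```

Because both aesthetics map to the same variable, ggplot2 should combine them into a single legend.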
