
I am trying to create a line graph with multiple variables.

To begin with, this is what my data looks like: https://figshare.com/s/d42f7f6d348aecac3f00

I have called this "coverage_data".

First, I reshape it into a long-format data.frame:

library(tidyr)  # provides gather() and one_of()
data_long <- gather(coverage_data, key = "variable", value = "value", -one_of("SLOV_position", "Segment"))

The order of the variables in data_long is S1, S2, S3, S4, S5 and C100.

I then plot this:

library(ggplot2)
library(scales)  # provides comma and pretty_breaks()

ggplot(data = data_long, aes(x = SLOV_position, y = value, colour = variable)) +
  xlab("UMAV genome position") +
  ylab("Read depth (log scale)") +
  scale_y_continuous(trans = "log10", labels = comma) +
  ggtitle("Segment") +
  theme_classic(base_size = 12) +
  geom_line(size = 1) +
  scale_x_continuous(breaks = scales::pretty_breaks(n = 3), labels = comma) +
  theme(
    plot.title = element_text(size = 12, hjust = 0.5),
    axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5, size = 8),
    axis.title.x = element_text(margin = margin(t = 10))
  ) +
  facet_grid(~ Segment, scales = "free")

Which works great, and plots my data like so:

[image: data_long_plot]

However, ggplot orders the variables as C100, S1, S2, S3, S4 and S5. Why does it put C100 first? And how can I get it to display the variables in the original order?

I have looked at a lot of answers to similar questions, but none of them seem to work for what I am doing. For example, many suggest manually re-ordering the variables, but when I look up the variables in data_long, they are in the right order. Also, there are so many of them because data_long has hundreds of lines, so I am unsure how to do something like that manually anyway.

I'm sorry if this is a very obvious question that has been answered before, but I just can't seem to figure it out no matter how many answers I look at. Thank you very much for your help.

Axeman
  • Welcome to Stack Overflow! Could you make your problem reproducible by sharing a sample of your data so others can help (please do not use `str()`, `head()` or screenshot)? You can use the [`reprex`](https://reprex.tidyverse.org/articles/articles/magic-reprex.html) and [`datapasta`](https://cran.r-project.org/web/packages/datapasta/vignettes/how-to-datapasta.html) packages to assist you with that. See also [Help me Help you](https://speakerdeck.com/jennybc/reprex-help-me-help-you?slide=5) & [How to make a great R reproducible example?](https://stackoverflow.com/q/5963269) – Tung Oct 14 '19 at 03:21

1 Answer


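ggplot2 sorts the values of a character column alphabetically, which is why C100 comes first. To control the order, convert the variable column from character to factor, with the levels in the order you want: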

data_long$variable <- factor(data_long$variable, levels = c("S1", "S2", "S3", "S4", "S5", "C100"))

Then, run your ggplot.
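
If you would rather not type the levels by hand, a minimal sketch of an alternative is to take them from the order in which the values first appear. The vector below is a made-up stand-in for data_long$variable, not the real data, and this assumes the column is a character vector whose rows are already in the order you want:

```r
# Hypothetical stand-in for data_long$variable (not the real data)
variable <- rep(c("S1", "S2", "S3", "S4", "S5", "C100"), each = 2)

# Default conversion: levels are the sorted unique values, so C100 comes first
default_levels <- levels(factor(variable))

# Levels taken from order of first appearance: S1 ... S5, then C100
ordered_variable <- factor(variable, levels = unique(variable))
appearance_levels <- levels(ordered_variable)

print(default_levels)
print(appearance_levels)
```

On the real data this would be data_long$variable <- factor(data_long$variable, levels = unique(data_long$variable)). It does the same thing as the explicit levels above, but keeps working if you add more columns later.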

Zhiqiang Wang