This is my dataset (originally shown as an image). There are 500 cell lines.

I would like to get something like this: [image of the target plot]

This is the code I tried:

library("ggplot2")

ggplot(prismData,
       aes(x = Cell_line,
           y = Aneuploidy.Aneuploidy.score,
           fill = Aneuploidy.Ploidy)) +  # refer to columns by bare name inside aes()
  geom_col(position = "dodge")

The result: [image of the resulting plot]

Jack
  • Hi Jack! Welcome to StackOverflow – Mark Jul 11 '23 at 08:25
  • Welcome to Stack Overflow. A few notes on asking a question that's likely to get good responses: please don't include data or code as images if possible, since that makes it impossible for people to copy and paste it to replicate your issue. What you have shown is a line chart, not a column chart, so try `geom_line()` to start with. – Paul Stafford Allen Jul 11 '23 at 08:25
  • One thing you can do which will get your question answered 10x faster is posting the data and/or code you've used within your question (enough that we can produce something). Obviously, it's impossible to load a screenshot of a dataframe into R – Mark Jul 11 '23 at 08:26
  • run `dput(prismData)` and paste the results into your question – Mark Jul 11 '23 at 08:27
  • Please do not use an image to describe your question. But I think what you really missed here is that the values for your `fill` aesthetic are continuous, so there could be something wrong. – Liang Zhang Jul 11 '23 at 08:30
  • Hi Mark, thank you for your quick answer. I tried `dput` but it gives me multiple plots where the cell lines are not shown on the x axis. – Jack Jul 11 '23 at 08:31
  • wait, is prismData the input dataframe? – Mark Jul 11 '23 at 08:32
  • Maybe reading [this](https://stackoverflow.com/a/5963610/5996475) will help you write a better question. – Liang Zhang Jul 11 '23 at 08:33
  • @Mark Yes That is the input data – Jack Jul 11 '23 at 08:36
  • @LiangZhang Thank you for your advice! I will look into it. :) – Jack Jul 11 '23 at 08:36
  • thank you for doing that, but as it stands, the table you added to your question doesn't cover even the `Aneuploidy.Aneuploidy.score` columns, so I couldn't reproduce your thing even if I really really wanted to! – Mark Jul 11 '23 at 08:37
  • ah wait, A is Aneuploidy.Aneuploidy.score? – Mark Jul 11 '23 at 08:37
  • @Mark Yes, I changed it to A and B for readability, but that makes it a bit confusing, sorry! – Jack Jul 11 '23 at 08:38
  • General approach for a line chart with secondary y axis is addressed here: https://stackoverflow.com/a/63540829/16730940 – Paul Stafford Allen Jul 11 '23 at 08:42
  • @Jack it's all good! – Mark Jul 11 '23 at 08:56

1 Answer

Here is my attempt at it:

# I re-added the old column titles.
# Note: read.table() converts the header `Cell-Line` into the syntactic name `Cell.Line`.
prism_data <- read.table(text="
Cell-Line   Aneuploidy.Aneuploidy.score   Aneuploidy.Ploidy
ACH-000001  26  3.52
ACH-000007  8   2.28
ACH-000008  22  2.63
ACH-000012  21  2.74
ACH-000013  21  3.13
ACH-000014  8   3.07", header=TRUE)

library(tidyverse)

prism_data %>%
  # Get the number part of the cell line string. You could probably get the
  # same effect by converting the cell line column into a factor column.
  mutate(Cell_num = as.numeric(str_extract(Cell.Line, "\\d+"))) %>%
  ggplot(aes(x = Cell_num)) +
  geom_line(aes(y = Aneuploidy.Aneuploidy.score), colour = "blue") +
  geom_point(aes(y = Aneuploidy.Aneuploidy.score), shape = 17, size = 3, colour = "blue") +
  # The ploidy data needs to be scaled up so that it is roughly of the same
  # magnitude as the score data.
  geom_line(aes(y = Aneuploidy.Ploidy * 10), colour = "red") +
  geom_point(aes(y = Aneuploidy.Ploidy * 10), shape = 15, size = 3, colour = "red") +
  # Add a secondary y axis that undoes the scaling for the ploidy values.
  scale_y_continuous(sec.axis = sec_axis(~ . / 10, name = "Aneuploidy.Ploidy")) +
  theme_bw()

[plot of data]

I wasn't able to get the x-axis names the same (are they the names of the cell lines?)
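
If the x axis should show the cell-line names themselves, one option is to map the `Cell.Line` column directly to `x` and add `group = 1` so that `geom_line()` connects the points across the discrete axis. The following is a minimal sketch under that assumption, reusing the six example rows. `ploidy_scale` is a name introduced here for illustration, and the rotated labels are a styling choice rather than part of the original plot.

```r
library(ggplot2)

# Same six example rows as above, rebuilt here so the block is self-contained.
prism_data <- data.frame(
  Cell.Line = c("ACH-000001", "ACH-000007", "ACH-000008",
                "ACH-000012", "ACH-000013", "ACH-000014"),
  Aneuploidy.Aneuploidy.score = c(26, 8, 22, 21, 21, 8),
  Aneuploidy.Ploidy = c(3.52, 2.28, 2.63, 2.74, 3.13, 3.07)
)

# Assumption: a fixed factor of 10 keeps ploidy on roughly the same scale as the score.
ploidy_scale <- 10

# group = 1 tells geom_line() to treat all rows as one series on a discrete x axis.
p <- ggplot(prism_data, aes(x = Cell.Line, group = 1)) +
  geom_line(aes(y = Aneuploidy.Aneuploidy.score), colour = "blue") +
  geom_point(aes(y = Aneuploidy.Aneuploidy.score), shape = 17, size = 3, colour = "blue") +
  geom_line(aes(y = Aneuploidy.Ploidy * ploidy_scale), colour = "red") +
  geom_point(aes(y = Aneuploidy.Ploidy * ploidy_scale), shape = 15, size = 3, colour = "red") +
  scale_y_continuous(
    name = "Aneuploidy.Aneuploidy.score",
    sec.axis = sec_axis(~ . / ploidy_scale, name = "Aneuploidy.Ploidy")
  ) +
  theme_bw() +
  # Rotate the cell-line labels so they do not run into each other.
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1))

print(p)
```

With all 500 cell lines the labels will likely overlap heavily even when rotated, so plotting a subset of rows or thinning the labels is probably needed.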

Mark
  • Thanks a lot! That looks perfect. The first column is the names of the cells, but that is not a big issue. Thank you very much. – Jack Jul 11 '23 at 09:02