I'm very new to R and have written the following code for a histogram using ggplot2:

library(ggplot2)
graph1 <- ggplot(data = data1, aes(data1$Chr.position));

graph1 + geom_histogram()

(Chr.position == Chromosome position and contains roughly 273 mutations on Chromosome 2 linked with heart disease, ranging from position 179395822 to position 179658211.)

This code gives the following histogram:

enter image description here

This is great! (I actually did a thing using R!!) But when I try to change the `binwidth` using the following code:

graph1 + geom_histogram(binwidth = 0.04)

RStudio gets stuck on this command. It doesn't freeze, but it takes more than half an hour to render the histogram (if it renders at all), and when it finally does, the chart is blank, with no bars, and the following warning appears:

In loop_apply(n, do.ply) : position_stack requires constant width: output may be incorrect

structure(list(Chr.position = c(179604264L, 179591957L, 179558736L, 179498055L, 179506963L, 179506963L, 179497076L, 179478864L, 179472127L, 179458075L, 179456704L, 179455162L, 179454957L, 179444661L, 179442324L, 179433758L, 179433213L, 179428871L, 179425091L, 179424036L, 179412902L, 179412245L, 179410544L, 179406990L, 179406990L, 179410799L, 179485012L, 179477004L, 179471841L, 179457392L, 179457005L, 179444429L, 179441649L, 179441015L, 179440067L, 179424398L, 179422457L, 179417723L, 179413187L, 179408239L, 179404491L, 179404286L, 179401029L, 179456704L, 179456704L, 179456704L, 179456704L, 179456704L, 179456704L, 179456704L, 179456704L, 179456704L, 179456704L, 179456704L, 179456704L, 179456704L, 179456704L, 179456704L, 179456704L, 179456704L, 179456704L, 179456704L, 179456704L, 179456704L, 179452435L, 179477885L, 179477885L, 179477885L, 179454576L, 179454576L, 179438874L, 179438874L, 179438874L, 179438874L, 179438874L, 179438874L, 179438874L, 179438874L, 179438874L, 179438874L, 179438874L, 179438874L, 179438874L, 179438874L, 179403522L, 179403522L, 179403522L, 179428871L, 179428871L, 179428871L, 179428871L, 179424496L, 179424496L, 179424496L, 179424496L, 179424496L, 179424496L, 179424496L, 179424496L, 179443339L, 179443339L, 179477885L, 179477885L, 179477885L, 179477885L, 179477885L, 179477885L, 179434009L, 179434009L, 179419765L, 179419765L, 179419765L, 179658211L, 179433665L, 179433665L, 179433665L, 179455112L, 179455112L, 179455112L, 179455112L, 179455112L, 179455112L, 179413187L, 179413187L, 179453427L, 179453427L, 179463684L, 179429849L, 179430371L, 179429468L, 179442793L, 179497039L, 179497039L, 179424782L, 179424782L, 179422725L, 179422725L, 179422231L, 179422231L, 179658189L, 179658189L, 179422725L, 179422725L, 179414153L, 179472209L, 179472209L, 179440319L, 179432420L, 179469738L, 179469738L, 179632576L, 179632576L, 179632576L, 179458085L, 179458085L, 179458085L, 179458085L, 179458085L, 179403566L, 179403566L, 179403566L, 179403566L, 179470359L, 179470359L, 
179470359L, 179470359L, 179466263L, 179428086L, 179462634L, 179462634L, 179400405L, 179433407L, 179433407L, 179433407L, 179433407L, 179478861L, 179478861L, 179478861L, 179478861L, 179456704L, 179456704L, 179456704L, 179456704L, 179477169L, 179477169L, 179477169L, 179422249L, 179422249L, 179481600L, 179481600L, 179452411L, 179452411L, 179442238L, 179442238L, 179442238L, 179427963L, 179427963L, 179427963L, 179427963L, 179427963L, 179416530L, 179416531L, 179456704L, 179456704L, 179456704L, 179418418L, 179418418L, 179418418L, 179418418L, 179456704L, 179456704L, 179469477L, 179469477L, 179469477L, 179469477L, 179469477L, 179426073L, 179426074L, 179452242L, 179430544L, 179456704L, 179456704L, 179435468L, 179435468L, 179485829L, 179605063L, 179441870L, 179423314L, 179423314L, 179416474L, 179416474L, 179395822L, 179605941L, 179605941L, 179634455L, 179442238L, 179442238L, 179411339L, 179414506L, 179456704L, 179605063L, 179487411L, 179487411L, 179487411L, 179487411L, 179487411L, 179487411L, 179487411L, 179487411L, 179644174L, 179644174L, 179472155L, 179472155L, 179472155L)), .Names = "Chr.position", row.names = c(NA, 254L), class = "data.frame")

pogibas
Ningman
  • It would be wise to `dput` your dataset; otherwise see [here](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) for how to create a reproducible example. – mnm Mar 15 '18 at 13:27
  • I can't help without a reproducible example, but you don't need the `$` notation in `ggplot2`, i.e. just say `graph1 <- ggplot(data1, aes(x = Chr.position))`. – Jack Brookes Mar 15 '18 at 14:03
  • @Ashish, hope that helps – Ningman Mar 15 '18 at 15:13

1 Answer

It doesn't work for a couple of reasons:

  1. The scale is far too large for that bin width: the positions span roughly 260 kb (179395822 to 179658211), and you are asking for bins of width 0.04. That is millions of bins to compute, and even when your machine finishes you can't see the result, because on this scale each bar is a line only 0.04 wide.
  2. It doesn't make biological sense: if you work in genomic units (i.e., base pairs), how can you have 0.04 of a base pair?
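To put numbers on the first point, here is a rough back-of-the-envelope sketch. It only uses the two end positions quoted in the question; the variable names and the 5000 bp comparison width are my own illustrative choices, not a recommendation from the original post.

```r
# End positions quoted in the question
start_bp <- 179395822
end_bp   <- 179658211

# Width of the region covered by the data, in base pairs
span_bp <- end_bp - start_bp

# Number of bins implied by binwidth = 0.04 (several million)
bins_requested <- span_bp / 0.04

# For comparison: number of bins for a hypothetical 5000 bp bin width
bins_at_5kb <- ceiling(span_bp / 5000)

print(c(span_bp = span_bp,
        bins_requested = bins_requested,
        bins_at_5kb = bins_at_5kb))
```

A bin width expressed in whole base pairs, for example `geom_histogram(binwidth = 5000)`, keeps the bin count in the tens rather than the millions; the right value depends on how much detail you want to show.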

Here is how I would present data like this:

geom_density

library(ggplot2)
ggplot(data1, aes(Chr.position)) +
    geom_density() +
    labs(x = "Position in chromosome2",
         y = "Mutation density")

geom_point

# Count frequency of hits
data2 <- data.frame(table(data1))
data2$position <- as.numeric(as.character(data2$data1))
# Plot result
ggplot(data2, aes(position, Freq)) +
    geom_point()  +
    labs(x = "Position in chromosome2",
         y = "Number of mutations")

pogibas
  • Great, thank you! Yeah, to be honest I wasn't really sure what bins were when I attempted this; I was just trying to adapt an example from a book I am using. Thanks again! – Ningman Mar 15 '18 at 15:40
  • Just quickly, why is the following line necessary: `data2$position <- as.numeric(as.character(data2$data1))`? – Ningman Mar 15 '18 at 16:13
  • @Ningman check `table(data1)`: it turns the numeric chromosome coordinates into names, which `data.frame()` then stores as a factor. We use `as.numeric(as.character(...))` to turn them back into numbers. Try plotting without it and you'll see the difference. – pogibas Mar 15 '18 at 16:15
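
The conversion discussed in the last two comments can be seen on a tiny example. This is only a sketch: `pos` is a three-value subset of the question's data, and naming the argument in `table(position = ...)` is my own addition so that the column name does not depend on how `table()` labels its input.

```r
# A few positions copied from the question's data (illustrative subset)
pos <- c(179456704L, 179456704L, 179438874L)

# Count hits per position; the named argument fixes the column name
counts <- as.data.frame(table(position = pos))

# The positions are now stored as a factor, not as numbers
print(is.factor(counts$position))

# Converting the factor directly returns the level codes, not the coordinates
codes <- as.numeric(counts$position)
print(codes)

# Going through character first recovers the real coordinates
counts$position <- as.numeric(as.character(counts$position))
print(counts)
```

Plotting the level codes instead of the real coordinates would space the points evenly along the x axis, which is why the extra `as.character()` step matters.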