Currently I have a dataset like this:

X           observation.ID range.ID Center_Point range.low range.high falls.in.range   V4
       1:              1        1    242601532  11323785   11617177          FALSE KLF4
       2:              1        2    242601532  12645605   13926923          FALSE KLF4
       3:              1        3    242601532  14750216   15119039          FALSE KLF4
       4:              1        4    242601532  18102157   19080189          FALSE KLF4
       5:              1        5    242601532  29491029   30934636          FALSE KLF4
      ---                                                                              
13558714:             83        1      7974990   2940166    7172793          FALSE OCT4
13558715:             83        2      7974990   7880008   13098461           TRUE OCT4
13558716:             83        3      7974990  13556427   13843364          FALSE OCT4
13558717:             83        4      7974990  14113371   15137286          FALSE OCT4
13558718:             83        5      7974990  15475619   19472504          FALSE OCT4

Column V4 is a nominal variable with four levels, each a transcription factor (TF). I did a cross join to see whether these TFs fall in a particular series of data ranges. Whether a TF's Center_Point (median) falls in a given range is recorded as a boolean in the falls.in.range column. I want to generate a histogram where the x-axis is the four transcription factors (V4) and the y-axis is how often each one falls in the ranges I am checking.

How would I take into account the true vs. false values in the falls.in.range column when generating a histogram?

zx8754
Satchmo
  • So the plot will be based on last 2 columns, `falls.in.range` and `V4`? Is this not a barplot? Also, [this post](http://stackoverflow.com/questions/24480031/roll-join-with-start-end-window) might be relevant instead of cross-join. – zx8754 Mar 09 '16 at 19:48
  • Hello, try making your dataset easily available, for example with the function `dput`. Also see this [post](http://stackoverflow.com/questions/2546016/how-to-define-fill-colours-in-ggplot-histogram). – DJJ Mar 09 '16 at 20:01

1 Answer

hist works on a numeric vector, so

hist(df$V4[df$falls.in.range == TRUE])

won't work, because df$V4 isn't numeric. (Note that R's logical constant is TRUE, not True.) What you want is a barplot rather than a histogram:

barplot(table(df$V4[df$falls.in.range == TRUE]))
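As a rough sketch of the approach in this answer, the snippet below builds a small made-up data frame with the same two columns (V4 and falls.in.range) and plots the counts. The values are invented for illustration, and SOX2 and MYC are placeholder names, since the question only shows KLF4 and OCT4. It assumes V4 is stored as a factor, so that a transcription factor with no hits still appears as a zero-height bar.

```r
# Toy data standing in for the real table; values are made up for illustration.
# SOX2 and MYC are placeholder names (the question only shows KLF4 and OCT4).
df <- data.frame(
  V4 = factor(c("KLF4", "KLF4", "OCT4", "OCT4", "OCT4", "SOX2", "MYC")),
  falls.in.range = c(FALSE, TRUE, TRUE, TRUE, FALSE, FALSE, TRUE)
)

# Count only the rows where the centre point falls in a range.
# A logical column can be used directly as the row filter.
hits <- table(df$V4[df$falls.in.range])
barplot(hits,
        xlab = "Transcription factor",
        ylab = "Number of ranges hit")

# Alternative: cross-tabulate both outcomes and draw them side by side.
both <- table(df$falls.in.range, df$V4)
barplot(both,
        beside = TRUE,
        legend.text = rownames(both),
        xlab = "Transcription factor",
        ylab = "Count")
```

The first plot answers the question as asked (TRUE counts only). The second shows FALSE next to TRUE for each factor, which may help if the hit rate matters more than the raw count.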
user5219763