Making a histogram of boolean values separated by nominal groups

Question

Currently I have a dataset like this:

X           observation.ID range.ID Center_Point range.low range.high falls.in.range   V4
       1:              1        1    242601532  11323785   11617177          FALSE KLF4
       2:              1        2    242601532  12645605   13926923          FALSE KLF4
       3:              1        3    242601532  14750216   15119039          FALSE KLF4
       4:              1        4    242601532  18102157   19080189          FALSE KLF4
       5:              1        5    242601532  29491029   30934636          FALSE KLF4
      ---                                                                              
13558714:             83        1      7974990   2940166    7172793          FALSE OCT4
13558715:             83        2      7974990   7880008   13098461           TRUE OCT4
13558716:             83        3      7974990  13556427   13843364          FALSE OCT4
13558717:             83        4      7974990  14113371   15137286          FALSE OCT4
13558718:             83        5      7974990  15475619   19472504          FALSE OCT4

There are four nominal values in column V4, each a transcription factor (TF). I did a cross join to see whether these TFs fall within a particular series of ranges. Whether each Center_Point (median) falls in a given range is recorded as a boolean in the falls.in.range column. I want to generate a histogram where the x-axis is the four transcription factors (V4) and the y-axis is the frequency with which they fall in the ranges I am checking.

How would I take into account the true vs. false values in the falls.in.range column when generating a histogram?

So the plot will be based on last 2 columns, `falls.in.range` and `V4`? Is this not a barplot? Also, [this post](http://stackoverflow.com/questions/24480031/roll-join-with-start-end-window) might be relevant instead of cross-join. — zx8754, Mar 09 '16 at 19:48
Hello, try making your dataset easily available, for example by using the function `dput`. Also try this [post](http://stackoverflow.com/questions/2546016/how-to-define-fill-colours-in-ggplot-histogram) — DJJ, Mar 09 '16 at 20:01

user5219763 · Accepted Answer · 2016-03-09T21:24:12.557

1

`hist` works on a numeric vector, so

hist(df$V4[df$falls.in.range == TRUE])

won't work, because df$V4 isn't numeric. What you want is a barplot rather than a histogram:

barplot(table(df$V4[df$falls.in.range == TRUE]))
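As a rough, self-contained sketch of the barplot approach: the toy data frame below is made up for illustration (the values and the `TF3` label are assumptions, not taken from the real dataset), and V4 is stored as a factor so that a transcription factor with no TRUE rows still gets a zero-height bar.

```r
# Toy stand-in for the real table; values and the "TF3" label are hypothetical
df <- data.frame(
  V4 = factor(c("KLF4", "KLF4", "OCT4", "OCT4", "OCT4", "TF3")),
  falls.in.range = c(FALSE, TRUE, TRUE, TRUE, FALSE, FALSE)
)

# Keep only rows where the center point fell in the range, then count per factor level.
# Subsetting a factor keeps all of its levels, so unseen groups are counted as 0.
true_counts <- table(df$V4[df$falls.in.range])

# One bar per transcription factor, height = number of TRUE rows
barplot(true_counts,
        xlab = "Transcription factor (V4)",
        ylab = "Count of falls.in.range == TRUE")
```

If the FALSE counts are also of interest, `table(df$falls.in.range, df$V4)` gives a two-row table that `barplot(..., beside = TRUE)` can draw as side-by-side bars.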

edited Mar 09 '16 at 21:24

answered Mar 09 '16 at 20:20


assuming dataframe is called df – user5219763 Mar 09 '16 at 20:20
This should work, but the histogram function requires numeric values. Would I then just need to convert the booleans to numeric? – Satchmo Mar 09 '16 at 20:47
Missed that v4 wasn't numeric. What you want is a barplot. Answer edited to work (probably). – user5219763 Mar 09 '16 at 21:26
