2

I want to create a histogram from my data set of the frequency of students who have had broken bones. The values are either 0 or 1. I.E:

[1] 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0
[38] 1 1 0 0 0 0 1 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
[75] 0 0 1 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 1 0 1 0 0 0 0 0 0 1
[112] 1 1 0 0 1 1 1 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 1 0 0 1 0 0 0
[149] 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 1 0 1 1 0 0 0 0 1 0 0 0
[186] 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 1 1 0 1 0 1 0 0 0 0 0
[223] 0 0 0 0 0 1 0 1 0 0 0 0 0 0 1 0 0 0 0 1 1 0 1 0 1 0 1 1 0 0 0 0 0 0 0 0 1
[260] 1 0 0 0 0 0 0 1 1 1 1 0 0 0 1 0 0 1 0 0 0 1 0 0 0 1 0 0 0 0 0 1 0 1 0 0 0
[297] 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 1 1 0 0 0 0 0 1 1 1 0 0 0 0 1
[334] 0 0 0 0 1 0 0 0 0 0 0 1 1 0 0 0 1 0 0 0 0 0 0 1 1 0 0 1 0 0 0 0 1 0 0 0 0
[371] 0 0 0 0 0 0 0 0 0 0 0 0

However the scale on the axis axis of the graph has increments of 0.2. I just want either 0 or 1 as the data is categorical. Would anyone please kindly tell me how to rectify this?

user3069564
  • 159
  • 1
  • 3
  • 11

1 Answers1

1

What you need is a combination of assigning the appropriate values to the breaks argument and the xaxp argument in ?hist. Consider:

  # this just gives me your data:
my.data <- "
0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0
 1 1 0 0 0 0 1 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 1 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 1 0 1 0 0 0 0 0 0 1
 1 1 0 0 1 1 1 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 1 0 0 1 0 0 0
 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 1 0 1 1 0 0 0 0 1 0 0 0
 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 1 1 0 1 0 1 0 0 0 0 0
 0 0 0 0 0 1 0 1 0 0 0 0 0 0 1 0 0 0 0 1 1 0 1 0 1 0 1 1 0 0 0 0 0 0 0 0 1
 1 0 0 0 0 0 0 1 1 1 1 0 0 0 1 0 0 1 0 0 0 1 0 0 0 1 0 0 0 0 0 1 0 1 0 0 0
 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 1 1 0 0 0 0 0 1 1 1 0 0 0 0 1
 0 0 0 0 1 0 0 0 0 0 0 1 1 0 0 0 1 0 0 0 0 0 0 1 1 0 0 1 0 0 0 0 1 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0"
my.data <- unlist(strsplit(my.data, " "))
my.data <- gsub("\\n", "", my.data)
my.data <- as.numeric(my.data)

hist(my.data, breaks=c(-.5, .5, 1.5), xaxp=c(0,1,1))

breaks is used to define exactly 2 bins, and xaxp is used to change the number and placement of the tick marks on the x axis (for more on how xaxp works, see this excellent answer: R, change the spacing of tick marks on the axis of a plot?) Here is the resulting figure:

enter image description here

On a different note, it is not clear how informative a histogram is for data like this (or perhaps even ever, see: assessing-approximate-distribution-of-data-based-on-a-histogram on stats.SE). You might just was well try:

> table(my.data)
my.data
  0   1 
296  86 
Community
  • 1
  • 1
gung - Reinstate Monica
  • 11,583
  • 7
  • 60
  • 79