optimize histogram in R

Question

Here is my code and the resulting plot. The problem is that almost all of the data falls into the first bin of the histogram. Is there a way to make the histogram more granular, for example with a bin width of 500 or 1000? Thanks.

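library(ggplot2)  # loaded, but the plot below uses base graphics only

# Read a single-column, tab-separated file that has no header row
df <- read.csv('~/Downloads/foo.tsv', sep = '\t', header = FALSE, stringsAsFactors = FALSE)
names(df) <- c('foo')
df$foo <- as.numeric(df$foo)  # non-numeric entries become NA
goodValue <- df$foo

# Default binning (Sturges' rule): most values land in the first bin
hist(goodValue, main = "Some distribution", xlab = "Spending")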

regards, Lin

Take a look at the help of `hist`. There is a `breaks` option which controls the width of the bins. — JACKY88, Jul 30 '16 at 03:11
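The `breaks` suggestion above can be sketched as follows. This is a minimal illustration, not the asker's actual data: `foo.tsv` is not available, so `goodValue` is a synthetic right-skewed sample, and the name `binWidth` is introduced here for illustration only. Passing a vector of cut points gives exact bin widths, whereas a single number passed to `breaks` is treated only as a suggestion.

```r
# Synthetic, right-skewed stand-in for the spending column (an assumption;
# the original foo.tsv is not available).
set.seed(42)
goodValue <- rlnorm(10000, meanlog = 7, sdlog = 1.5)

# Fixed-width bins of 1000 units, covering the full range of the data.
binWidth <- 1000
breaks <- seq(0, ceiling(max(goodValue) / binWidth) * binWidth, by = binWidth)

# An explicit vector of cut points is used as-is by hist().
h <- hist(goodValue, breaks = breaks,
          main = "Some distribution", xlab = "Spending")
```

If the long right tail still dominates the plot, adding `xlim = c(0, 20000)` (an arbitrary cutoff chosen for this synthetic sample) zooms in on the dense region without changing the bins.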
Thanks @PatrickLi, it works better now. A further question: how do I display an appropriate value beside each bucket (or every few buckets)? Currently the axis only shows 500,000, 1,000,000 and 1,500,000, which is not very intuitive. — Lin Ma, Jul 30 '16 at 04:15
It is woeful that R's default bin width is determined by Sturges' rule (1926), which has been shown to oversmooth larger samples. Scott's rule (1979), the second method listed, is more advisable. That said, for exploratory purposes it is helpful to see the data at a higher granularity. — shayaa, Jul 30 '16 at 06:28
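To make the comparison between the two rules concrete, here is a small sketch. It assumes a synthetic skewed sample in place of the real data. `hist` accepts the rule name as a string in `breaks`, and `nclass.Sturges` and `nclass.scott` (from grDevices) report the number of classes each rule suggests.

```r
# Synthetic, right-skewed sample (an assumption standing in for the real data).
set.seed(42)
goodValue <- rlnorm(10000, meanlog = 7, sdlog = 1.5)

# Number of classes suggested by each rule.
nSturges <- nclass.Sturges(goodValue)
nScott   <- nclass.scott(goodValue)

# The same rules selected by name; plot = FALSE returns the bins without drawing.
hSturges <- hist(goodValue, breaks = "Sturges", plot = FALSE)
hScott   <- hist(goodValue, breaks = "Scott", plot = FALSE)

cat("Sturges:", nSturges, "suggested,", length(hSturges$counts), "used\n")
cat("Scott:  ", nScott, "suggested,", length(hScott$counts), "used\n")
```

The suggested and used counts can differ because `hist` rounds the cut points to "pretty" values. For a large, heavy-tailed sample like this one, Scott's rule should give noticeably more bins than Sturges' rule.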
I think the question is somewhat ill-posed. 1. The binning question is answered by the second argument of `hist`; reading the help would have answered it. 2. The example is not reproducible. — Daniel, Jul 30 '16 at 06:28
@Daniel, thanks, and upvoted. I can post a new question. BTW, what do you mean by the binning question? — Lin Ma, Jul 30 '16 at 22:51
@shayaa, thanks. A further question: how do I display an appropriate value beside each bucket (or every few buckets)? Currently the axis only shows 500,000, 1,000,000 and 1,500,000, which is not very intuitive. — Lin Ma, Aug 03 '16 at 19:29
Take a look at `geom_text`. You can use the other answer I provided on your similar question. — shayaa, Aug 03 '16 at 19:30
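One way the `geom_text` idea could look in ggplot2 is sketched below. This is not the answer shayaa refers to. It assumes ggplot2 3.3.0 or later (for `after_stat` and `expansion`) and a synthetic data frame in place of `foo.tsv`. The bin width of 1000, the 2000-unit tick spacing and the 20000 zoom limit are arbitrary choices for the illustration.

```r
library(ggplot2)

# Synthetic, right-skewed stand-in for df$foo (an assumption).
set.seed(42)
df <- data.frame(foo = rlnorm(10000, meanlog = 7, sdlog = 1.5))

p <- ggplot(df, aes(x = foo)) +
  # Bars: fixed bin width of 1000, with bin edges aligned to 0.
  geom_histogram(binwidth = 1000, boundary = 0) +
  # Text layer: the same binning, drawn as count labels above each bar.
  stat_bin(binwidth = 1000, boundary = 0, geom = "text",
           aes(label = after_stat(count)), vjust = -0.5, size = 3) +
  # Denser, comma-formatted tick labels on the x axis.
  scale_x_continuous(breaks = seq(0, 20000, by = 2000),
                     labels = scales::comma) +
  # Leave headroom so the label on the tallest bar is not clipped.
  scale_y_continuous(expand = expansion(mult = c(0, 0.1))) +
  # Zoom in on the dense region without dropping data from the bins.
  coord_cartesian(xlim = c(0, 20000)) +
  labs(title = "Some distribution", x = "Spending", y = "Count")

print(p)
```

`coord_cartesian` is used rather than `xlim()` so that values beyond 20000 are still counted in their bins and only hidden from view.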
Thanks @shayaa, upvoted. Do you mean your sample here => http://stackoverflow.com/questions/38679412/show-only-0-90-or-0-95-percentile? I did not find any text label features there. Please feel free to correct me if I am wrong. — Lin Ma, Aug 04 '16 at 22:40
@shayaa, I posted my confusion here => http://stackoverflow.com/questions/38778813/geom-text-works-for-histogram-in-r; your guidance is appreciated. :) — Lin Ma, Aug 04 '16 at 23:20
