I'm working on my Master's dissertation and need help with the programming part.
I want to generate a histogram that plots the density of the number of shares bought/sold by corporate insiders.
The problem is that the variable "Amount" has a very wide range, with extreme values up to 2,589,704. These values are far higher than the mean of 38,000 and the median of 900. The minimum is 1.
Therefore I want to generate a histogram with variable-width breaks.
My code looks like this:
hist(myInside$Amount,
breaks=c(min(myInside$Amount), seq(1000, 10000, 1000), max(myInside$Amount)),
xlab="Amounts of shares bought/ sold",
xlim=c(1,2589704),
col="blue",
freq=FALSE
)
The result looks like this:
There is only a tiny line close to zero in the left corner. The rest of the plot is empty, and I simply do not know why.
Does anybody have a suggestion so that the classes of the histogram match the data properly? I wanted something like 11 classes from 1 to 10,000, because most of the data is in this range, with everything higher than 10,000 aggregated into the last class.
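To make the target concrete, here is a minimal sketch of the binning described above, read as ten classes from 1 to 10,000 plus one overflow class (11 in total). It runs on simulated data: the `amount` vector is a made-up stand-in for `myInside$Amount`, not the real data. Everything above 10,000 is capped into a single overflow class, and the class proportions are drawn with `barplot` rather than `hist(..., freq=FALSE)`, on the assumption that equal-width bars are acceptable. The overflow width of 1,000 and the label texts are arbitrary choices for illustration.

```r
# Hypothetical stand-in for myInside$Amount: heavy right tail, minimum 1
set.seed(1)
amount <- pmin(pmax(round(rlnorm(5000, meanlog = 7, sdlog = 2)), 1), 2589704)

# Breaks 1, 1000, 2000, ..., 10000, plus one overflow class ending at 11000
cap    <- 11000
breaks <- c(1, seq(1000, 10000, by = 1000), cap)

# Pull every value above the cap down to the cap so it lands in the last class
capped <- pmin(amount, cap)

# Count observations per class without plotting
h <- hist(capped, breaks = breaks, include.lowest = TRUE, plot = FALSE)

# Ten regular classes labelled by their upper bound, plus the overflow class
labels <- c(paste("up to", breaks[2:11]), "> 10000")

# Equal-width bars showing the share of observations in each class
barplot(h$counts / sum(h$counts),
        names.arg = labels,
        las = 2,
        col = "blue",
        ylab = "Share of transactions")
```

Because the bars are drawn from the counts, the last class is not stretched along the x-axis to the maximum value, so the classes below 10,000 stay readable.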
Thanks a lot for your help, everybody.