Note: I found a similar question whose answer explains why this is a problem. However, I'm looking for a solution or workaround, not an explanation of why it's difficult (which I fully understand).
I have data that I want to plot as a histogram. The bin [0, 200) has a count of 10000, while several bins such as [30000, 30200) have a count of 1. Both kinds of bin are important and need to be visible. To achieve this, I can plot the histogram with a log1p-transformed y axis:
library(ggplot2)
# data_file is the path to a comma-separated file without a header row
contig_len <- read.table(data_file, header = FALSE, sep = ",", col.names = c("Length"))
ggplot(contig_len, aes(x = Length)) +
  geom_histogram(binwidth = 200) +
  scale_y_continuous(trans = "log1p")
This works perfectly! But now, I want to categorise the items in the histogram, as follows:
ggplot(contig_len, aes(x = Length, fill = Prevalence)) +
  geom_histogram(binwidth = 200, alpha = 0.5, position = "stack") +
  scale_y_continuous(trans = "log1p")
This doesn't work, however, because the stacking is performed without taking the log1p transformation into account. Has anyone found a way around this problem? My data looks like this:
head(contig_len)
  Length      Prevalence
1    606 Repetitive (<5)
2    888  Non-Repetitive
3    192 Repetitive (<9)
4   9830  Non-Repetitive
5    506  Non-Repetitive
6    850  Non-Repetitive
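For reference, here is a minimal sketch of the kind of workaround I have in mind, not a definitive solution. It assumes the bins can be computed by hand and drawn with geom_rect, so that ggplot2 does no stacking itself and only transforms the segment boundaries. The small data frame is a made-up stand-in for my real file, and the columns bin, n, ymin and ymax are names I introduced for illustration. Newer ggplot2 releases may report that trans has been superseded by transform.

```r
library(ggplot2)

# Hypothetical stand-in for the real data (which is read from data_file).
contig_len <- data.frame(
  Length = c(rep(100, 10000), 606, 888, 30050),
  Prevalence = c(
    rep(c("Non-Repetitive", "Repetitive (<5)"), times = c(9000, 1000)),
    "Repetitive (<5)", "Non-Repetitive", "Non-Repetitive"
  )
)

binwidth <- 200

# Left edge of the bin [k * binwidth, (k + 1) * binwidth) for each length.
contig_len$bin <- floor(contig_len$Length / binwidth) * binwidth

# Count the rows in each (bin, Prevalence) combination that actually occurs.
contig_len$n <- 1
counts <- aggregate(n ~ bin + Prevalence, data = contig_len, FUN = sum)

# Stack on the raw count scale: each segment spans (ymin, ymax] within its bin.
counts <- counts[order(counts$bin, counts$Prevalence), ]
counts$ymax <- ave(counts$n, counts$bin, FUN = cumsum)
counts$ymin <- counts$ymax - counts$n

# geom_rect applies no stacking, so the log1p scale only transforms the
# precomputed boundaries; the top of each bar sits at log1p(total count).
p <- ggplot(counts) +
  geom_rect(
    aes(xmin = bin, xmax = bin + binwidth,
        ymin = ymin, ymax = ymax, fill = Prevalence),
    alpha = 0.5
  ) +
  scale_y_continuous(trans = "log1p") +
  labs(x = "Length", y = "count")

print(p)
```

With this approach the top of every bar is at the correct total, but the drawn height of each segment is not proportional to its count: on a log axis the bottom segment always looks the largest. Whether that is acceptable is exactly what I'm unsure about.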