I'm plotting a facet-wrapped chart of histograms, and a few very tall bars stretch the y-axis so much that the rest of each panel is hard to read. I would like to show more of that remaining detail. I know that I can manually trim the data behind the tall bars and add an up-arrow to indicate how far they actually reach, but I was wondering whether R has a feature that does this with minimal (or no) change to the data. I have some code, but the data I'm plotting has over 400,000 records, and I'm not sure how to reproduce it with sample data generated by rnorm.

Code:

library(ggplot2)
library(grid)       # provides grid.rect() and gpar(), used for the outer border
library(gridExtra)
library(scales)

# Read the tab-delimited data; read.table() already returns a data frame
df <- read.table("C:/Projects/....T2S4.txt",
                 sep = "\t", header = TRUE)

# Fix the panel order of the facets
df$dist_f <- factor(df$dist,
                    levels = c('Unused', 'Deducted', 'Carryover', 'Used'),
                    ordered = TRUE)

ggplot(df) +
  geom_histogram(aes(x = points, fill = type), bins = 50, position = "dodge") +
  facet_wrap(. ~ dist_f, scales = "free") +
  labs(x = "Points", y = "Number of Members") +
  scale_fill_manual(values = c("gray", "indianred4")) +
  theme(axis.title.y = element_text(size = 14, margin = margin(t = 0, r = 10, b = 0, l = 0)),
        axis.title.x = element_text(size = 14),
        axis.text.x = element_text(size = 10),
        axis.text.y = element_text(size = 12),
        legend.title = element_blank(),
        legend.position = c(0.85, 0.90),
        legend.box.background = element_rect(),
        legend.box.margin = margin(2, 2, 2, 2),
        legend.text = element_text(size = 12))

# Draw a black border around the whole figure
grid.rect(width = 1.0, height = 1.0, gp = gpar(lwd = 2.5, col = "black", fill = NA))

Chart:

[Image: the faceted histogram produced by the code above, in which a few very tall bars compress the rest of the chart]

Angus
  • I don't think there's anything built in to handle this the way you're describing. But it's probably possible to write a small function that extracts the "exceptional" bins (the appropriate method will depend on your specific use case), gives them a "topped off" value, and adds an arrow/notation as you describe. Can you articulate what rule would work best for your data? For example: "any largest bin that is >50% higher than the 2nd biggest bin should be shown 20% higher, with a note about the true value". – Jon Spring Feb 23 '20 at 19:47
  • You can check out [the answer here](https://stackoverflow.com/questions/10504804/put-a-break-in-the-y-axis-of-a-histogram) or [here](https://stackoverflow.com/questions/7194688/using-ggplot2-can-i-insert-a-break-in-the-axis/7195107#7195107)... – Wimpel Feb 23 '20 at 19:49
  • Thanks Wimpel, but I read that plotrix doesn't work with ggplot. Maybe ggforce will work, but I like the plotrix feature better. – Angus Feb 23 '20 at 19:57
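
For what it's worth, the rule suggested in the first comment can be roughed out in a few lines of R. This is only a sketch, not a built-in ggplot2 feature, and it rests on several assumptions: the data are simulated as a stand-in for the real file, the bins are computed by hand with cut() so the counts can be edited before plotting, the helper cap_tallest() and the columns n_shown and is_capped are hypothetical names made up here, and the rule is applied separately to each facet/type series because dodged bars of both types can spike in the same bin. The 1.5 and 1.2 factors come straight from the example rule in the comment and would need tuning for real data.

```r
# Simulated stand-in for the real data (hypothetical values, not the poster's file):
# a large spike at 0 plus a long right tail.
set.seed(1)
n_obs <- 8000
df <- data.frame(
  points = c(rep(0, 5000), round(rexp(3000, rate = 1 / 200))),
  type   = sample(c("A", "B"), n_obs, replace = TRUE),
  dist_f = factor(sample(c("Unused", "Deducted", "Carryover", "Used"),
                         n_obs, replace = TRUE),
                  levels = c("Unused", "Deducted", "Carryover", "Used"))
)

# Bin by hand (50 equal-width bins) so the counts can be adjusted before plotting.
breaks    <- seq(min(df$points), max(df$points), length.out = 51)
bin_width <- diff(breaks)[1]
df$bin    <- cut(df$points, breaks = breaks, include.lowest = TRUE, labels = FALSE)
counts    <- aggregate(n ~ bin + type + dist_f, data = transform(df, n = 1), FUN = sum)
counts$mid <- (breaks[counts$bin] + breaks[counts$bin + 1]) / 2

# Hypothetical helper: if the tallest bar is more than `ratio` times the
# second tallest, draw it at `shown` times the second tallest and flag it.
cap_tallest <- function(d, ratio = 1.5, shown = 1.2) {
  sorted <- sort(d$n, decreasing = TRUE)
  d$n_shown   <- d$n
  d$is_capped <- FALSE
  if (length(sorted) >= 2 && sorted[1] > ratio * sorted[2]) {
    top <- which.max(d$n)
    d$n_shown[top]   <- shown * sorted[2]
    d$is_capped[top] <- TRUE
  }
  d
}

# Apply the rule to every facet/type series separately.
series <- split(counts, list(counts$dist_f, counts$type), drop = TRUE)
bars   <- do.call(rbind, lapply(series, cap_tallest))

# The plotting step only runs when ggplot2 is installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(bars, aes(x = mid, y = n_shown, fill = type)) +
    geom_col(position = "dodge") +
    # Label capped bars with their true count; "^" stands in for an up-arrow.
    geom_text(aes(label = ifelse(is_capped, paste0("^ ", n), ""), group = type),
              position = position_dodge(width = 0.9 * bin_width),
              vjust = -0.3, size = 3) +
    facet_wrap(~ dist_f, scales = "free") +
    scale_fill_manual(values = c("gray", "indianred4")) +
    labs(x = "Points", y = "Number of Members")
  print(p)
}
```

The underlying records are never modified: only the drawn height (n_shown) of a flagged bar changes, and the label keeps the true count visible. Whether a capped bar is acceptable, or an axis break as in the linked answers reads better, is a judgment call for the specific chart.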
