I'm new to R and ggplot2. I have a data frame and would like to plot a histogram of one of its variables together with a histogram of a subset of that same variable. Basically, what I want to do is the following:
ggplot(df, aes(x = w, fill = area)) +
  geom_histogram(binwidth = 1, position = "dodge")
where the fill would distinguish all the data points in my df from the points with area > 0. I cannot find the correct way to format my data frame to make this happen. At the moment I only get the distribution for area > 0 vs. the distribution for area = 0.
Thanks.
EDIT: This is what I have now:
library(ggplot2)
w <- runif(50, min = 1, max = 5)
area <- c(rep(0, 25), runif(25))
df <- data.frame(w, area)
### Wrong: this labels area > 0 as "big" and area = 0 as "small",
### so the plot compares big vs. small instead of all vs. big
df$size <- ifelse(df$area > 0, "big", "small")
ggplot(df, aes(x = w, fill = size)) +
  geom_histogram(binwidth = 1, position = "dodge")
How can I partition the data frame in a way that lets me plot the distribution of all data points vs the big ones?
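To make the target concrete, here is a minimal sketch of one possible layout, not necessarily the best one. It assumes that stacking the full data frame on top of its area > 0 subset is acceptable, so the "big" rows appear twice. The column name `group` and the helper names `df_all`, `df_big` and `stacked` are hypothetical, introduced only for this illustration.

```r
library(ggplot2)

# Same example data as above.
w    <- runif(50, min = 1, max = 5)
area <- c(rep(0, 25), runif(25))
df   <- data.frame(w, area)

# Label every row as "all", then label a copy of the area > 0 rows as "big".
# `group` is a hypothetical column name used only in this sketch.
df_all  <- transform(df, group = "all")
df_big  <- transform(df[df$area > 0, ], group = "big")

# Stack the two pieces; rows with area > 0 are intentionally duplicated,
# once under "all" and once under "big".
stacked <- rbind(df_all, df_big)

p <- ggplot(stacked, aes(x = w, fill = group)) +
  geom_histogram(binwidth = 1, position = "dodge")
print(p)
```

With this layout the "all" bars count every row of df and the "big" bars count only the rows with area > 0, which is the comparison I am after.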