I'd like to make a density plot in rpy2 (using ggplot2) that has y-values representing fractional counts, such that the y axis can be interpreted as "fraction of data points" that have a particular value. My code is:
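# imports assumed from the names used below
import numpy as np
import pandas
from rpy2.robjects import r, Formula
from rpy2.robjects.lib import ggplot2
from rpy2.robjects.lib.ggplot2 import aes_string

# x has 1000 points; y has 20 points padded with NaN to the same length
df = pandas.melt(pandas.DataFrame({"x": np.random.rand(1000),
                                   "y": list(np.random.rand(20)) + [np.nan] * 980}))
# pandas dataframe to R (make_r_df is a conversion helper not shown here)
r_df = make_r_df(df)
r.pdf("plot.pdf")
p = ggplot2.ggplot(r_df) + \
    ggplot2.geom_density(aes_string(x="value",
                                    y="..count../..sum..(..count..)")) + \
    ggplot2.facet_wrap(Formula("~ variable"))
p.plot()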
x has more points than y, and the resulting plot shows that the density for y is uniformly lower. This does not make sense if the y axis is normalized to the number of points. It seems like y=..count../..sum..(..count..) is somehow not interpreted. How can I get this to work? Thanks.
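
To make the normalization I expect concrete, here is a minimal NumPy-only sketch (no rpy2 or ggplot2 involved). It uses a plain histogram rather than a kernel density estimate, and the helper name fraction_per_bin, the choice of 10 bins, and the fixed seed are assumptions made just for illustration. The point is that after dividing by the per-group total, each group's fractions sum to 1 regardless of how many points it has, so the 20-point group should not come out uniformly lower:

```python
import numpy as np


def fraction_per_bin(values, bins=10, value_range=(0.0, 1.0)):
    """Return the fraction of non-NaN points in each bin, plus the bin edges.

    Hypothetical helper for illustration; it is not part of rpy2 or ggplot2.
    """
    values = np.asarray(values, dtype=float)
    values = values[~np.isnan(values)]  # drop the NaN padding
    counts, edges = np.histogram(values, bins=bins, range=value_range)
    return counts / counts.sum(), edges


rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily
x = rng.random(1000)
y = np.concatenate([rng.random(20), np.full(980, np.nan)])

frac_x, _ = fraction_per_bin(x)
frac_y, _ = fraction_per_bin(y)

# Both groups are normalized by their own number of points,
# so each set of fractions sums to 1 (up to floating-point rounding).
print(frac_x.sum(), frac_y.sum())
```

This is the behaviour I would like the y aesthetic to reproduce per facet, only with a smoothed density curve instead of bins.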