ECDF plot in ggplot2 without expanding count variable

Asked Sep 17 '19 at 09:03

Active Sep 17 '19 at 09:03

I have a dataframe which looks like

Height Count
173      2
184      3
193      1

Usually, to plot an empirical cumulative distribution function, one:

1) expands the dataframe, e.g. with splitstackshape's expandRows function, so that each Height value is repeated Count times in an output variable (here 173, 173, 184, 184, 184, 193);

2) plots the ECDF using the output variable.

However, my dataframe is very large, and expandRows fails with an error of the 'cannot allocate vector of size X' type. What I actually need is a plot of the ECDF (even an approximate one), not the expanded dataframe.

Is there a workaround to achieve this?

Jackk

I'm not exactly sure what your expected output is; can you give an example with the data? You might want to look at `geom_step` from `ggplot2`. – kath Sep 17 '19 at 09:28

Possible duplicate of . In particular see the given [link](https://github.com/NicolasWoloszko/stat_ecdf_weighted). – Stéphane Laurent Sep 17 '19 at 09:29
To get the output variable, you could do: `rep(df$Height, df$Count)`; and if you need it in a dataframe: `data.frame(output = rep(df$Height, df$Count))` – Jaap Sep 17 '19 at 10:46
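
Note that `rep(df$Height, df$Count)` still builds the fully expanded vector, so it will likely hit the same memory limit. A minimal sketch of the `geom_step` idea that avoids expansion, assuming `Count` is a non-negative frequency: the ECDF at each distinct height is the cumulative sum of the counts divided by the total count, so only one row per distinct height is needed. The names `agg` and `ecdf` are illustrative, not from the question.

```r
# Toy data from the question: one row per distinct height, with its frequency.
df <- data.frame(Height = c(173, 184, 193), Count = c(2, 3, 1))

# Sum counts per height in case a height appears on several rows,
# then make sure the rows are sorted by height.
agg <- aggregate(Count ~ Height, data = df, FUN = sum)
agg <- agg[order(agg$Height), ]

# Weighted ECDF: cumulative share of the total count at each height.
# as.numeric() avoids integer overflow when the counts are very large.
cnt <- as.numeric(agg$Count)
agg$ecdf <- cumsum(cnt) / sum(cnt)

# Draw the ECDF as a step function (horizontal first, then vertical).
# The plotting part is skipped if ggplot2 is not installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(agg, aes(x = Height, y = ecdf)) +
    geom_step(direction = "hv") +
    labs(y = "ECDF")
  print(p)
}
```

The plot starts at the first height's ECDF value rather than at 0; if the initial jump matters, a row with `ecdf = 0` at the smallest height can be prepended to `agg`.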
