I'd like to divide my data into 100 frequency bins, and then select a random observation from each frequency bin.
I have a data frame containing words and their frequencies in a corpus, like so:
word | frequency
---- | ---------
a | 72387
and | 112091
that | 87164
to | 71474
the | 98422
etc.
I know that I can bin the data using the cut function, but I'm not sure how to then select one word randomly from each frequency bin.
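
For reference, here is a minimal sketch of one possible approach in base R, not a definitive solution. It assumes the data frame is called `df` and has columns `word` and `frequency`; the sample data, the `bin` column, and the helper names `rows_by_bin` and `picked` are all made up for illustration. It also assumes that "100 frequency bins" means 100 equal-width intervals, which is what `cut` produces when `breaks` is a single number.

```r
# Hypothetical sample data standing in for the real corpus table
set.seed(1)
df <- data.frame(
  word = paste0("w", seq_len(1000)),
  frequency = sample(1:120000, 1000, replace = TRUE),
  stringsAsFactors = FALSE
)

# Assign each row to one of 100 equal-width frequency bins (integer codes 1..100)
df$bin <- cut(df$frequency, breaks = 100, labels = FALSE)

# Group row indices by bin; bins that contain no rows simply do not appear
rows_by_bin <- split(seq_len(nrow(df)), df$bin)

# Draw one row index at random from each non-empty bin.
# Indexing with sample.int() avoids the sample(x, 1) pitfall when x has length 1.
picked <- vapply(
  rows_by_bin,
  function(idx) idx[sample.int(length(idx), 1)],
  integer(1)
)

# One randomly chosen word per non-empty bin
result <- df[picked, ]
```

Because word frequencies are usually heavily skewed, equal-width bins may leave many bins empty, so `result` can have fewer than 100 rows. If equally populated bins are what is wanted, passing quantile-based break points to `cut` might be a better fit, though that is a different binning than the one sketched here.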