I have a dataframe of 70,000 rows that I want to reduce to 10,000. I know this means losing a lot of data, but I have my reasons. I want the removed rows to be spread evenly across the data set, rather than just dropping the first or last 60,000 rows. Is there a way to do this? In case it helps, my dataframe looks like this:
ID  username  text            date
1   @calr     lorem ipsum...  2012-05-05
2   @mart     lorem ipsum...  2012-05-05
3   @falk     lorem ipsum...  2012-05-05
4   @grif     lorem ipsum...  2012-05-05
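
To make concrete what I mean by "evenly distributed", here is a minimal sketch in plain Python of the row positions I would want to keep. The language is an assumption, since the question is not tied to a specific one, and the function names `evenly_spaced_positions` and `random_positions` are made up for illustration. The first picks rows at a regular stride (every 7th row for 70,000 down to 10,000); the second draws a uniform random sample, which is also spread across the whole data set on average but not at fixed intervals.

```python
import random


def evenly_spaced_positions(n_rows, n_keep):
    """Return n_keep zero-based row positions spread evenly over range(n_rows)."""
    if not 0 < n_keep <= n_rows:
        raise ValueError("n_keep must be between 1 and n_rows")
    # Integer arithmetic avoids float rounding; because n_rows >= n_keep,
    # consecutive results always differ by at least 1, so no position repeats.
    return [i * n_rows // n_keep for i in range(n_keep)]


def random_positions(n_rows, n_keep, seed=0):
    """Return n_keep distinct zero-based row positions chosen uniformly at random."""
    rng = random.Random(seed)  # fixed seed (an assumption) so the result is repeatable
    # Sorting keeps the selected rows in their original order.
    return sorted(rng.sample(range(n_rows), n_keep))


positions = evenly_spaced_positions(70_000, 10_000)
print(positions[:4], len(positions))  # [0, 7, 14, 21] 10000
```

These are zero-based positions, so position 0 corresponds to the row with ID 1 in the sample above. If the dataframe happened to be a pandas one (again an assumption), a list like this could be passed to `df.iloc[positions]` to select the rows.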