In Python, I have a DataFrame that looks like the following, all the way down to about 5000 samples:

[image: the DataFrame]

I was wondering, in pandas, how do I remove 3 out of every 4 data points in my DataFrame?

Gary

2 Answers


To obtain a random sample of a quarter of your DataFrame, you could use

test4.sample(frac=0.25)

or, to specify the exact number of rows:

test4.sample(n=1250)
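As a minimal sketch of both calls, assuming a stand-in `test4` with 5000 rows and a single `Accel` column (the real DataFrame's contents are not shown here, so the data below is hypothetical): `sample` returns rows in random order, so `sort_index()` is used to restore the original row order, and `random_state` is passed only to make the draw repeatable.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for test4: 5000 rows, one 'Accel' column
test4 = pd.DataFrame({"Accel": np.arange(5000, dtype=float)})

# Random quarter of the rows; random_state makes the draw repeatable
quarter = test4.sample(frac=0.25, random_state=0)

# Exactly 1250 random rows, put back into their original row order
exact = test4.sample(n=1250, random_state=0).sort_index()

print(len(quarter), len(exact))
```

Note that this keeps a random quarter of the rows, not every 4th row, so the spacing between the kept samples is irregular.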

If your purpose is to build training, validation, and testing data sets, then see this question.

unutbu
  • It's not related to machine learning, but rather, related to this question: https://stackoverflow.com/questions/45337886/valueerror-cannot-copy-sequence-to-array-axis-in-python-for-matplotlib-animatio/45342706#45342706 – Gary Jul 27 '17 at 20:49

If you want to keep every 4th point, slice by position with a step of 4. This selects rows 0, 4, 8, ... and then takes the 'Accel' column from them:

test4.iloc[::4, :]['Accel']
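A small sketch of the same slicing, assuming a stand-in `test4` with an `Accel` column plus a hypothetical `Time` column (neither is taken from the real data). It shows that dropping the column selection keeps the whole DataFrame, which may be closer to what the question asks for, and that `reset_index(drop=True)` can renumber the surviving rows if that is wanted.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for test4; the 'Time' column is an assumption
test4 = pd.DataFrame({
    "Time": np.arange(20) * 0.01,
    "Accel": np.arange(20, dtype=float),
})

# Keep rows 0, 4, 8, ... with all columns
every_4th = test4.iloc[::4]

# Same rows, 'Accel' column only (a Series)
accel_only = test4["Accel"].iloc[::4]

# Optional: renumber the kept rows 0, 1, 2, ...
renumbered = every_4th.reset_index(drop=True)

print(list(every_4th.index))
```

Because `iloc` is positional, this works regardless of what the index labels are.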
nanojohn