Pandas: starting from the last row and ascending, drop all the rows that fulfill a multicolumn condition

Question

Here's df.head():

n        Ideal Time      Obs Time        dt  current_x  current_y  current_z      xdot  ...  est_stop_pt_y  est_stop_pt_z  est_stopping_dist_x  est_stopping_dist_y  est_stopping_dist_z  overshoot_dist  prev_safe_speed
                                                                                     ...                                                                                                                              
13000      52.000  1.634575e+09  0.004199   0.450249  -0.779854   1.007519  0.000004  ...      -0.779856       1.007518         8.248364e-07         2.123009e-06         1.098734e-06       -0.265144              1.0
13001      52.004  1.634575e+09  0.003862   0.450249  -0.779854   1.007519 -0.000021  ...      -0.779855       1.007518         3.638152e-06         1.185327e-06         6.202958e-07       -0.265145              1.0
13002      52.008  1.634575e+09  0.004137   0.450249  -0.779854   1.007519  0.000008  ...      -0.779857       1.007518         1.575013e-06         2.736756e-06         7.076864e-07       -0.265143              1.0
13003      52.012  1.634575e+09  0.004046   0.450249  -0.779854   1.007519 -0.000002  ...      -0.779855       1.007518         4.503393e-07         1.165414e-06         1.197243e-06       -0.265145              1.0
13004      52.016  1.634575e+09  0.003942   0.450249  -0.779854   1.007519  0.000013  ...      -0.779854       1.007518         2.393685e-06         1.689638e-07         7.067140e-07       -0.265146              1.0

I want to remove all the lines starting from the end such that xdot, ydot, AND zdot (not sure why these col names aren't showing in head()) are less than 1E-4. I think I have an idea of how to do this: I think I need to have a new bool column with a true or false for this condition for each row, and then somehow create groups and keep everything except the last group. I'm new to pandas though and I'm stumbling on the actual implementation. Can anyone advise? I'm starting off looking at case_when based on this post, although this doesn't seem great because I don't actually want to save the conditional column.

Corralien · Answer 1 · 2021-10-21T06:26:09.507


Use a boolean mask as you supposed:

df = df[~df.filter(regex='[xyz]dot').lt(1e-4).all(axis=1)]

Example:

# Minimal setup
>>> df
       xdot      ydot      zdot
0  0.000961  0.000091  0.000060
1  0.000087  0.000079  0.000727
2  0.000046  0.000936  0.000078
3  0.000053  0.000028  0.000082  # drop row because all values are less than 1e-4

>>> df[~df.filter(regex='[xyz]dot').lt(1e-4).all(axis=1)]
       xdot      ydot      zdot
0  0.000961  0.000091  0.000060
1  0.000087  0.000079  0.000727
2  0.000046  0.000936  0.000078
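
Note that the mask above removes every row that meets the condition, wherever it sits in the frame. If only the trailing run of such rows should go (reading the question's "starting from the end" that way), something like the following sketch may be closer. It assumes the columns are named xdot, ydot and zdot, and it compares absolute values because the sample data contains small negative velocities; the sample frame and the names `still`, `moving_seen`, `trailing` and `trimmed` are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data; column names follow the question.
df = pd.DataFrame({
    "xdot": [0.000961, 0.000053, 0.000087, -0.000021, 0.000046],
    "ydot": [0.000091, 0.000028, 0.000079, 0.000012, 0.000036],
    "zdot": [0.000060, 0.000082, 0.000727, 0.000005, 0.000078],
})

# True where |xdot|, |ydot| and |zdot| are all below the threshold
# (taking the absolute value is an assumption, not part of the answer above).
still = df[["xdot", "ydot", "zdot"]].abs().lt(1e-4).all(axis=1)

# Walk the mask from the last row upward and count the "moving" rows seen so far.
moving_seen = (~still)[::-1].cumsum()

# A row belongs to the trailing block while that count is still zero;
# reverse again to restore the original row order.
trailing = moving_seen.eq(0)[::-1]

# Keep everything except the trailing block.
trimmed = df[~trailing]
print(trimmed)
```

Here row 1 also meets the condition, but it is kept because a moving row (row 2) comes after it; only rows 3 and 4 are dropped. No helper column is stored on the frame, which was one of the concerns in the question.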

edited Oct 21 '21 at 06:26

answered Oct 20 '21 at 19:38


Got called away for a different task at work; will try this out shortly – adamcircle Oct 21 '21 at 13:40
