I have a large data set, possibly more than 9,000,000 rows. I would like to delete every row whose value in a specific column is below a given threshold.

Because of the amount of data, doing this with a for loop and if statements takes a long time.

Suppose my dataframe looks like this:

           x         y         z         c         a         T
0      0.000  -252.396     0.000    40.676    51.159    84.641
1   1383.800     1.000  -252.396     0.000    40.676    61.947
2     84.641  1404.800     2.000  -252.396     0.000    40.676
3     74.532    84.641  1394.800     3.000  -252.396     0.000
4     40.676    85.319    84.641  1367.700     4.000  -252.396
5      0.000    40.676    97.904    84.641  1363.800     5.000
6   -252.396     0.000    40.676   108.691    84.641  1348.500
7      6.000  -252.396     0.000    40.676   121.276    84.641
8   1421.600     7.000  -252.396     0.000    40.676   135.659
9     84.641  1455.300     8.000  -252.396     0.000    40.676
10   148.244    84.641  1529.700     9.000  -252.396     0.000

I want to delete every row where column 'T' is less than 800.

piRSquared
J_SH

1 Answer

Setup

import pandas as pd

df = pd.DataFrame(
    [[0.0, -252.396, 0.0, 40.676, 51.159, 84.641],
    [1383.8, 1.0, -252.396, 0.0, 40.676, 61.947],
    [84.641, 1404.8, 2.0, -252.396, 0.0, 40.676],
    [74.532, 84.641, 1394.8, 3.0, -252.396, 0.0],
    [40.676, 85.319, 84.641, 1367.7, 4.0, -252.396],
    [0.0, 40.676, 97.904, 84.641, 1363.8, 5.0],
    [-252.396, 0.0, 40.676, 108.691, 84.641, 1348.5],
    [6.0, -252.396, 0.0, 40.676, 121.276, 84.641],
    [1421.6, 7.0, -252.396, 0.0, 40.676, 135.659],
    [84.641, 1455.3, 8.0, -252.396, 0.0, 40.676],
    [148.244, 84.641, 1529.7, 9.0, -252.396, 0.0]],
    columns=['x', 'y', 'z', 'c', 'a', 'T']
)

query

df.query('T >= 800')

         x    y       z        c       a       T
6 -252.396  0.0  40.676  108.691  84.641  1348.5

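The comparison is evaluated on the whole column at once, so this should be much faster than looping over the rows with for and if.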
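If the cutoff is not a fixed literal, the query string can refer to a local Python variable with the `@` prefix. A minimal sketch, assuming a small stand-in frame and a variable named `threshold` (both hypothetical, not from the question):

```python
import pandas as pd

# Small stand-in frame; only column 'T' matters for the filter.
df = pd.DataFrame({
    "x": [0.0, -252.396, 6.0],
    "T": [84.641, 1348.5, 800.0],
})

threshold = 800  # hypothetical name for the cutoff

# '@threshold' is resolved from the surrounding Python scope.
kept = df.query("T >= @threshold")
print(kept)
```

Rows equal to the cutoff are kept here because the comparison is `>=`, which matches "delete rows where T is less than 800".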


boolean mask

df[df['T'] >= 800]

         x    y       z        c       a       T
6 -252.396  0.0  40.676  108.691  84.641  1348.5
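Neither form modifies `df` itself; both return a new frame, so the result has to be assigned to actually "delete" the rows. A minimal sketch on a stand-in frame (the names `filtered` and `dropped` are only illustrative), also showing `drop` as a more literal way to state the deletion:

```python
import pandas as pd

# Stand-in frame with one row above the cutoff.
df = pd.DataFrame({
    "x": [0.0, -252.396, 6.0, 84.641],
    "T": [84.641, 1348.5, 84.641, 40.676],
})

# Keep rows with T >= 800 and renumber the index from 0.
filtered = df[df["T"] >= 800].reset_index(drop=True)

# Drop the labels of rows with T < 800; the original index is preserved.
# Note: this matches the mask above only when 'T' has no NaN values,
# since NaN fails both comparisons.
dropped = df.drop(df.index[df["T"] < 800])

print(filtered)
print(dropped)
```

Assigning back with `df = df[df['T'] >= 800]` is usually enough; `reset_index(drop=True)` is only needed if later code relies on a contiguous index.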
piRSquared