How to select rows in pandas dataframe based on condition

Question

I have a large dataset, and my pandas DataFrame looks like this:

HR	SBP	DBP	SepsisLabel	PatientID
92	120	80	0	0
98	115	85	0	0
93	125	75	0	0
95	130	90	0	1
102	120	80	1	1
109	115	75	1	1
94	135	100	0	2
97	100	70	0	2
85	120	80	0	2
88	115	75	0	3
93	125	85	1	3
78	130	90	1	3
115	140	110	0	4
102	120	80	0	4
98	140	110	0	4

I want to keep only the rows of patients (identified by PatientID) that have at least one row with SepsisLabel = 1. For example, PatientID 0, 2, and 4 never have SepsisLabel = 1, so I don't want them in the new dataframe. I want PatientID 1 and 3, which do have rows with SepsisLabel = 1.

I hope it is clear what I mean. If so, please help me with the Python code. I think it needs some condition along with iloc (I might be wrong).

Regards.

Does this answer your question? [How to select rows from a DataFrame based on column values](https://stackoverflow.com/questions/17071871/how-to-select-rows-from-a-dataframe-based-on-column-values) — George Sotiropoulos, May 04 '21 at 07:53

jezrael · Accepted Answer · 2021-05-04T07:52:17.603

Use GroupBy.transform with GroupBy.any to test whether each group contains at least one True, then filter by boolean indexing:

df1 = df[df['SepsisLabel'].eq(1).groupby(df['PatientID']).transform('any')]
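The one-liner above can be read as three steps. The following is a minimal sketch that rebuilds the question's sample data by hand; the intermediate names `is_sepsis_row` and `patient_has_sepsis` are only illustrative and are not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "HR": [92, 98, 93, 95, 102, 109, 94, 97, 85, 88, 93, 78, 115, 102, 98],
    "SBP": [120, 115, 125, 130, 120, 115, 135, 100, 120, 115, 125, 130, 140, 120, 140],
    "DBP": [80, 85, 75, 90, 80, 75, 100, 70, 80, 75, 85, 90, 110, 80, 110],
    "SepsisLabel": [0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0],
    "PatientID": [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4],
})

# Step 1: row-level test, True only where this row has SepsisLabel == 1.
is_sepsis_row = df["SepsisLabel"].eq(1)

# Step 2: for each patient, compute "any True in the group" and
# broadcast that result back onto every row of the group.
patient_has_sepsis = is_sepsis_row.groupby(df["PatientID"]).transform("any")

# Step 3: boolean indexing keeps all rows of the flagged patients,
# including their rows where SepsisLabel is 0.
df1 = df[patient_has_sepsis]
```

Because transform returns a result aligned to the original index, the mask can be used directly for boolean indexing without any merge or join.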

Or select the PatientID values that have at least one row with SepsisLabel equal to 1 and filter on them with Series.isin:

df1 = df[df['PatientID'].isin(df.loc[df['SepsisLabel'].eq(1), 'PatientID'])]
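To make the isin variant easier to follow, here is a small sketch that splits it into two steps. It uses only the two relevant columns from the question's data; the name `sepsis_ids` and the call to `unique()` are additions for illustration, not part of the original answer.

```python
import pandas as pd

# Only the two columns that matter for the selection (values from the question).
df = pd.DataFrame({
    "SepsisLabel": [0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0],
    "PatientID": [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4],
})

# Step 1: collect the IDs of patients with at least one SepsisLabel == 1 row.
# unique() is optional here; it just removes repeated IDs.
sepsis_ids = df.loc[df["SepsisLabel"].eq(1), "PatientID"].unique()

# Step 2: keep every row whose PatientID is among those IDs.
df1 = df[df["PatientID"].isin(sepsis_ids)]
```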

If the data is small or performance is not important, it is possible to use DataFrameGroupBy.filter:

df1 = df.groupby('PatientID').filter(lambda x: x['SepsisLabel'].eq(1).any())

print(df1)
     HR  SBP  DBP  SepsisLabel  PatientID
3    95  130   90            0          1
4   102  120   80            1          1
5   109  115   75            1          1
9    88  115   75            0          3
10   93  125   85            1          3
11   78  130   90            1          3
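As a quick sanity check, the three approaches can be run side by side on the sample data to confirm they select the same rows. This is a sketch under the assumption that the data matches the table in the question; the variable names `by_transform`, `by_isin`, `by_filter`, and `all_same` are hypothetical.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "HR": [92, 98, 93, 95, 102, 109, 94, 97, 85, 88, 93, 78, 115, 102, 98],
    "SBP": [120, 115, 125, 130, 120, 115, 135, 100, 120, 115, 125, 130, 140, 120, 140],
    "DBP": [80, 85, 75, 90, 80, 75, 100, 70, 80, 75, 85, 90, 110, 80, 110],
    "SepsisLabel": [0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0],
    "PatientID": [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4],
})

# Approach 1: group-wise "any" broadcast back to rows.
by_transform = df[df["SepsisLabel"].eq(1).groupby(df["PatientID"]).transform("any")]

# Approach 2: membership test against the IDs that have a sepsis row.
by_isin = df[df["PatientID"].isin(df.loc[df["SepsisLabel"].eq(1), "PatientID"])]

# Approach 3: call a Python function once per group (simple, but usually slower).
by_filter = df.groupby("PatientID").filter(lambda g: g["SepsisLabel"].eq(1).any())

# Compare the selected row labels across the three results.
all_same = (
    by_transform.index.equals(by_isin.index)
    and by_isin.index.equals(by_filter.index)
)
print(all_same)
```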
