To get the True row if I have both True & False rows for the same Item using pandas

Question

I have a dataframe with this data.

import pandas as pd

data = {'Item':['2', '1', '2'],
    'IsAvailable':['True', 'False', 'False']}
df = pd.DataFrame(data)
================================

Item  |  IsAvailable
---------------------
  2   |     True
  1   |     False
  2   |     False

As shown above, the dataframe has both a True and a False row for Item 2. In that case I want to keep a single record with just True.

Expected output:

Item  |  IsAvailable
---------------------
  2   |     True
  1   |     False

Please help me write the condition for this using Python pandas.

Thanks

Oleg O · Answer 1 · 2020-03-05T10:39:18.443

1

Since bool is also a kind of int, False sorts before True:

df = df.sort_values('IsAvailable').drop_duplicates(subset=['Item'], keep='last')

This will reorder your items though, of course. Funny thing: it works even when you have True/False strings.
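A minimal sketch of the same approach on the question's data, with one addition that is not part of the answer as posted: a final sort_index() to restore the original row order. The string case works because 'False' comes before 'True' in lexicographic order.

```python
import pandas as pd

df = pd.DataFrame({'Item': ['2', '1', '2'],
                   'IsAvailable': ['True', 'False', 'False']})

# After sorting, a True row (if any) is the last row of its Item,
# so keep='last' retains it.
result = (df.sort_values('IsAvailable')
            .drop_duplicates(subset=['Item'], keep='last')
            .sort_index())  # assumption: the original row order should be restored

print(result)
```

Without the sort_index() call the rows come back in sorted order, as the answer notes.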

edited Mar 05 '20 at 10:39

answered Mar 05 '20 at 10:01


jezrael · Answer 2 · 2020-03-05T09:44:24.110

0

I think you first need to replace the strings True and False with booleans if necessary, and then get the first row with True per group, using DataFrameGroupBy.idxmax for the indices and selecting with DataFrame.loc:

df['IsAvailable'] = df['IsAvailable'].map({'True':True, 'False':False})

df = df.loc[df.groupby('Item', sort=False)['IsAvailable'].idxmax()]
print(df)
  Item  IsAvailable
0    2         True
1    1        False
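To see that idxmax picks the True row regardless of where it sits in the group, here is a sketch on a hypothetical, slightly extended frame (the extra rows are not from the question) in which the False row for item 2 comes first.

```python
import pandas as pd

# Hypothetical data: for Item '2' the False row precedes the True row.
df = pd.DataFrame({'Item': ['2', '1', '2', '3'],
                   'IsAvailable': ['False', 'False', 'True', 'True']})

df['IsAvailable'] = df['IsAvailable'].map({'True': True, 'False': False})

# idxmax gives the index label of the first maximum in each group.
# True > False, so that is the first True row, or the first row
# of the group when it contains no True at all.
picked = df.loc[df.groupby('Item', sort=False)['IsAvailable'].idxmax()]

print(picked)
```

Because of sort=False, the groups are processed in order of first appearance, so the selected rows follow that order rather than the sorted order of Item.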

edited Mar 05 '20 at 09:44

answered Mar 05 '20 at 09:38


braml1 · Answer 3 · 2020-03-05T09:51:12.967

0

If you just want the first occurrence (edit: as per @jezrael, you may want to map your strings to booleans first):

df['IsAvailable'] = df['IsAvailable'].replace({'True':True, 'False':False})
dfOut = df.drop_duplicates(subset="Item", keep='first')
print(dfOut)

  Item IsAvailable
0    2        True
1    1       False

edited Mar 05 '20 at 09:51

answered Mar 05 '20 at 09:40


This does not test for `True` and `False`; it works only because of the sample data. – jezrael Mar 05 '20 at 09:43
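The comment's point can be checked on a hypothetical reordering of the sample data in which the False row for item 2 comes first. With keep='first', that False row is the one retained:

```python
import pandas as pd

# Hypothetical reordering of the question's data (already as booleans):
# the False row for Item '2' now precedes the True row.
df = pd.DataFrame({'Item': ['2', '1', '2'],
                   'IsAvailable': [False, False, True]})

# keep='first' looks only at row position, never at the IsAvailable value.
first_rows = df.drop_duplicates(subset='Item', keep='first')

print(first_rows)
```

Here item 2 ends up with IsAvailable False, which is not the requested result.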

Nanna · Answer 4 · 2020-03-05T11:09:41.383

0

Here is a solution where we check if the value True is one of the values assigned to each item. If so, the outcome is also True.

>>> df.groupby(['Item'])['IsAvailable'].apply(lambda x: 'True' in set(x))
Item
1    False
2     True
Name: IsAvailable, dtype: bool

If you want to keep the column name, use

>>> df.groupby(['Item'])['IsAvailable'].apply(lambda x: 'True' in set(x)).reset_index()
  Item  IsAvailable
0    1        False
1    2         True
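If the column is first converted to real booleans, the same per-item check can also be written with the groupby any() aggregation. This is a sketch of an alternative, not the answer's own code; sort=False is an addition here to keep the items in order of first appearance.

```python
import pandas as pd

df = pd.DataFrame({'Item': ['2', '1', '2'],
                   'IsAvailable': ['True', 'False', 'False']})

# Convert the strings to booleans first: every non-empty string,
# including 'False', is truthy.
df['IsAvailable'] = df['IsAvailable'].map({'True': True, 'False': False})

# any() is True for an Item as soon as one of its rows is True.
result = (df.groupby('Item', sort=False)['IsAvailable']
            .any()
            .reset_index())

print(result)
```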

edited Mar 05 '20 at 11:09

answered Mar 05 '20 at 09:50


Your previous code works, i.e. the one with any(x). `True in x` is giving me the wrong output. – Suhas_mudam Mar 05 '20 at 10:22
Yes, I noticed that as well. any(x) gave True as long as x contained any value at all. Perhaps it is interpreting 'False' as a string, and bool('False') is True in Python. But the current version works, (lambda x: True in x). – Nanna Mar 05 '20 at 10:28
The output is not as per my request. Please check the expected output once more. – Suhas_mudam Mar 05 '20 at 10:49
I'm sorry, you're right. My mistake is explained here: https://stackoverflow.com/a/21320011/8446061. See updated answer. – Nanna Mar 05 '20 at 11:10
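The two pitfalls discussed in these comments can be reproduced in a few lines. This is an illustrative sketch; the Series s is a stand-in for one group's values and is not taken from the thread.

```python
import pandas as pd

# Stand-in for the values of one group, stored as strings.
s = pd.Series(['False', 'False'])

print(bool('False'))      # True: any non-empty string is truthy
print(any(s))             # True, although every entry is the string 'False'
print('False' in s)       # False: `in` on a Series tests the index labels (0 and 1)
print(0 in s)             # True: 0 is an index label
print('False' in set(s))  # True: the set holds the values, so values are tested
```

This is why the updated answer tests membership in set(x) rather than in x itself.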
