These are my dataframes.

df contains no values, only column names:

P1  |P2 |P3



df4:

    Names   Std
0   Kumar   10
1   Ravi    5



mask=df4["Names"].str.contains(('|').join(df["P1"].values.tolist()),na=False)

Out[30]:
 0    True
 1    True
Name: Names, dtype: bool

Why does it give True when the "P1" column does not have any values in it?

Pyd

1 Answer

EDIT: If you need all False values when the column is empty, add a condition that checks the column is not empty:

df = pd.DataFrame(columns=['P1','P2','P3'])
print (df)
Empty DataFrame
Columns: [P1, P2, P3]
Index: []

df4 = pd.DataFrame({'Names':['Kumar','Ravi']})

mask=df4["Names"].str.contains(('|').join(df["P1"].values.tolist()),na=False)
mask = mask & (not df['P1'].empty)
print (mask)
0    False
1    False
Name: Names, dtype: bool

With a non-empty column the extra condition changes nothing:

df = pd.DataFrame({'P1':['Kumar']}, columns=['P1','P2','P3'])
print (df)
      P1   P2   P3
0  Kumar  NaN  NaN

df4 = pd.DataFrame({'Names':['Kumar','Ravi']})

mask=df4["Names"].str.contains(('|').join(df["P1"].values.tolist()),na=False)
mask = mask & (not df['P1'].empty)
print (mask)
0     True
1    False
Name: Names, dtype: bool
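
As for why the check on `df['P1'].empty` is needed at all: joining an empty list gives an empty string, and an empty regular expression matches every string. A minimal sketch reproducing this with the question's data (the variable name `pattern` is only for illustration):

```python
import pandas as pd

df = pd.DataFrame(columns=["P1", "P2", "P3"])  # column names only, no rows
df4 = pd.DataFrame({"Names": ["Kumar", "Ravi"], "Std": [10, 5]})

# Joining an empty list yields an empty string, not "no pattern".
pattern = "|".join(df["P1"].values.tolist())
print(repr(pattern))  # ''

# An empty regex matches at the start of any string, so every row is True.
mask = df4["Names"].str.contains(pattern, na=False)
print(mask.tolist())  # [True, True]
```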
jezrael
  • Yes, my question is why we get "True" when there is no match in the column – Pyd Jul 31 '17 at 05:13
  • OK, what does `print (df['P1'].values)` return? – jezrael Jul 31 '17 at 05:14
  • Nothing, as it does not have any values. It is an empty column; df has only the column names P1 |P2 |P3 – Pyd Jul 31 '17 at 05:18
  • df["P1"].values Out[34]: array([], dtype=object) – Pyd Jul 31 '17 at 05:19
  • I think the problem is that you are comparing with an empty string, so you get all `True`s – jezrael Jul 31 '17 at 05:24
  • What if I want "False" for the empty column? – Pyd Jul 31 '17 at 05:26
  • So the first dataframe is like `df = pd.DataFrame(columns=['P1','P2','P3', ''])`? Is there a column name that is an empty string? – jezrael Jul 31 '17 at 05:34
  • No, only 3 columns: df.columns Index(['P1', 'P2', 'P3'], dtype='object') – Pyd Jul 31 '17 at 05:48
  • So `mask = df4['Names'].str.contains('|'.join(df.columns),na=False)` does not work? – jezrael Jul 31 '17 at 05:49
  • That gives "False", but I want to compare against only one particular column – Pyd Jul 31 '17 at 05:52
  • https://stackoverflow.com/questions/45413825/dropping-cell-if-it-is-nan-in-a-dataframe-in-python can you check this one – Pyd Jul 31 '17 at 10:55
  • pls check this one https://stackoverflow.com/questions/46907847/analyzing-a-dataframe-based-on-multiple-conditions?noredirect=1#comment80762432_46907847 – Pyd Oct 24 '17 at 10:20
  • hi, @Jezrael, can you check this one https://stackoverflow.com/questions/47012242/how-to-create-a-dataframe-to-generate-json-in-the-given-format?noredirect=1#comment80972199_47012242 – Pyd Oct 30 '17 at 10:29
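
For matching against one particular column that may be empty, the guard can be wrapped in a small function. This is only a sketch: `contains_any` is a hypothetical helper name, not a pandas function, and escaping the terms with `re.escape` is an assumption that they should be matched literally rather than as regular expressions.

```python
import re
import pandas as pd

def contains_any(names: pd.Series, terms: pd.Series) -> pd.Series:
    """Hypothetical helper: True where a name contains any non-empty term."""
    # Drop NaN and empty strings, and escape regex metacharacters.
    words = [re.escape(str(t)) for t in terms.dropna() if str(t)]
    if not words:
        # Nothing to search for: return all False instead of matching everything.
        return pd.Series(False, index=names.index, name=names.name)
    return names.str.contains("|".join(words), na=False)

df4 = pd.DataFrame({"Names": ["Kumar", "Ravi"]})
empty = pd.DataFrame(columns=["P1", "P2", "P3"])
filled = pd.DataFrame({"P1": ["Kumar"]}, columns=["P1", "P2", "P3"])

print(contains_any(df4["Names"], empty["P1"]).tolist())   # [False, False]
print(contains_any(df4["Names"], filled["P1"]).tolist())  # [True, False]
```

Returning early for an empty term list avoids building the empty pattern in the first place, so no separate `.empty` check is needed at the call site.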