Given a toy dataset as follows:
   id room_type company_name
0   1    office      ABC ltd
1   2    office       retail
2   3    office      xyz ltd
3   4    retail       retail
4   5   parking    toy store
5   6      hall          NaN
If the room_type or company_name column contains retail, parking or hall, compare the two columns; if they are not the same, return a new column check with the string Invalid company name or room type.
I would like to use code as follows since there are many other columns to check:
a = np.where(df['room_type'].str.contains('retail|parking|hall', na = False), 'Invalid company name or room type', None)
# b = np.where(df.area.str.contains('^\d+$', na = True), None,
# 'area is not a numbers')
f = (lambda x: ';'.join(y for y in x if pd.notna(y))
if any(pd.notna(np.array(x))) else np.nan )
df['check'] = [f(x) for x in zip(a)]
The expected result looks like this:
   id room_type company_name                             check
0   1    office      ABC ltd                               NaN
1   2    office       retail Invalid company name or room type
2   3    office      xyz ltd                               NaN
3   4    retail       retail                               NaN
4   5   parking    toy store Invalid company name or room type
5   6      hall          NaN Invalid company name or room type
How could I modify the code for condition a? Thanks in advance for your help.
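One possible way to write condition a, as a minimal sketch rather than a definitive solution: build one boolean mask for "either column contains a keyword" and another for "the two columns differ", then combine them inside np.where. The names pattern, has_keyword and differs are introduced here for illustration only, and the sketch assumes that "company" refers to the company_name column and that a NaN company name counts as different from any room type.

```python
import numpy as np
import pandas as pd

# Toy dataset from the question.
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6],
    "room_type": ["office", "office", "office", "retail", "parking", "hall"],
    "company_name": ["ABC ltd", "retail", "xyz ltd", "retail", "toy store", np.nan],
})

pattern = "retail|parking|hall"  # hypothetical name for the keyword regex

# True where either column contains a keyword; NaN is treated as no match.
has_keyword = (
    df["room_type"].str.contains(pattern, na=False)
    | df["company_name"].str.contains(pattern, na=False)
)

# True where the two columns differ; comparing against NaN yields True.
differs = df["room_type"].ne(df["company_name"])

a = np.where(has_keyword & differs, "Invalid company name or room type", None)

# Same joining helper as in the question, so further conditions can be zipped in.
f = (lambda x: ";".join(y for y in x if pd.notna(y))
     if any(pd.notna(np.array(x))) else np.nan)
df["check"] = [f(x) for x in zip(a)]
print(df)
```

Because a stays an array of message-or-None values, other checks such as the commented-out b can still be added to zip(a, b, ...) without changing f.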