0

I would like a function that returns 'A' in the target 'wanted' variable whenever the area column has a missing value (like NULL in SQL).

I'm confused about the use of None, isnull(), and np.nan in Python.


import numpy as np
import pandas as pd

raw_data = {'area': ['S','W',np.nan,np.nan], 'wanted': [np.nan,np.nan,'A','A']}
df = pd.DataFrame(raw_data, columns = ['area','wanted'])
df


def my_func(x):
    if (x) is None:
        return 'A'
    else:
        return np.nan


df['wanted2'] = df['area'].apply(my_func)

df
progster

2 Answers

3

np.nan is not None, and in fact NaN is not even equal to itself (check np.nan == None and np.nan == np.nan: both are False). Hence the `is None` test never matches, and you can use pd.isna() in your if condition instead:

def my_func(x):
    if pd.isna(x):
        return 'A'
    else:
        return np.nan


df['wanted2'] = df['area'].apply(my_func)
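
As a quick illustration of the comparisons mentioned above, here is a minimal sketch (the variable names are introduced here only for the example) showing why the `is None` check fails on NaN while pd.isna() catches both kinds of missing value:

```python
import numpy as np
import pandas as pd

# NaN is a float value, not the None singleton, so identity and equality
# checks against None both fail.
nan_is_none = np.nan is None        # False
nan_eq_none = (np.nan == None)      # False

# NaN also compares unequal to itself (IEEE 754 behaviour).
nan_eq_nan = (np.nan == np.nan)     # False

# pd.isna() treats both NaN and None as missing, and ordinary values as present.
isna_nan = pd.isna(np.nan)          # True
isna_none = pd.isna(None)           # True
isna_str = pd.isna('S')             # False

print(nan_is_none, nan_eq_none, nan_eq_nan)
print(isna_nan, isna_none, isna_str)
```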

You can also vectorize this using np.where and Series.isna() instead of apply:

df['wanted2'] = np.where(df['area'].isna(),'A',np.nan)
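
One caveat worth checking: np.where builds a single NumPy array from both branches, so mixing the string 'A' with the float np.nan may coerce the NaN into text (the string 'nan') rather than leaving a real missing value; the exact behaviour can depend on the NumPy version. As a hedged alternative, here is a minimal sketch that stays in pandas and keeps a true NaN by mapping the boolean mask from Series.isna(). The frame is the sample from the question, rebuilt here so the snippet is self-contained:

```python
import numpy as np
import pandas as pd

# Sample data from the question (only the 'area' column is needed here).
df = pd.DataFrame({'area': ['S', 'W', np.nan, np.nan]})

# Map the boolean mask directly: True (area missing) -> 'A', False -> NaN.
df['wanted2'] = df['area'].isna().map({True: 'A', False: np.nan})

print(df)
# The rows with an area value should still be genuinely missing in wanted2.
print(df['wanted2'].isna().tolist())
```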
anky
0

You can use fillna:

df['wanted2'] = df.area.fillna('A')

Your code returns np.nan when a value exists in area, and 'A' otherwise.
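
For comparison, a small sketch of what fillna produces on the question's sample data (rebuilt here so the snippet runs on its own). It replaces only the missing entries and leaves the existing values 'S' and 'W' in place, which differs from the 'wanted' column in the question, where those rows are NaN:

```python
import numpy as np
import pandas as pd

# Sample data from the question (only the 'area' column is needed here).
df = pd.DataFrame({'area': ['S', 'W', np.nan, np.nan]})

# fillna substitutes 'A' for missing values and keeps everything else as-is.
df['wanted2'] = df['area'].fillna('A')

print(df['wanted2'].tolist())  # ['S', 'W', 'A', 'A']
```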

Tom Ron