I am handling missing data for a credit risk analysis project. Many columns of the DataFrame contain missing values. The DataFrame loan_data looks like this:
[IN]: loan_data
[OUT]:
Emp_ID  Emp_Name  City_Name  Salary     Designation  Emp_years  Age
1       A         Delhi      30,00,000  GM           15         45
2       B         Mumbai     NAN        Clerk        2          22
3       c         NAN        NAN        Peon         4          18
4       D         Chennai    7,000      NAN          5          20
5       E         NAN        NAN        NAN          4          50
and so on....
Now I want to display only the columns that contain NaN values, along with a count for each one (that is, how many rows have NaN in that column).
This is what I tried:
[IN]:
def return_loan_data_missing(x):
    if (x.isnull().sum()>0):
        return x.isnull().sum()

return_loan_data_missing(loan_data)
[OUT]:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(),
a.item(), a.any() or a.all().
Output desired:
[OUT]:
City_Name 2
Salary 3
Designation 2
The output I am currently getting:
[IN]:
loan_data.isnull().sum()
[OUT]:
Emp_ID 0
Emp_Name 0
City_Name 2
Salary 3
Designation 2
Emp_years 0
Age 0
Please help
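
A possible approach, sketched under assumptions: the ValueError appears because x.isnull().sum() on a DataFrame returns a Series with one count per column, and `if` cannot reduce a whole Series to a single True/False. Filtering that Series with a boolean mask avoids the `if` entirely. The loan_data frame built below is a hypothetical reconstruction of the five sample rows only, and the rewritten function body is one option rather than the only way to do it.

```python
import numpy as np
import pandas as pd

# Hypothetical reconstruction of the five sample rows from the question.
loan_data = pd.DataFrame({
    "Emp_ID": [1, 2, 3, 4, 5],
    "Emp_Name": ["A", "B", "c", "D", "E"],
    "City_Name": ["Delhi", "Mumbai", np.nan, "Chennai", np.nan],
    "Salary": ["30,00,000", np.nan, np.nan, "7,000", np.nan],
    "Designation": ["GM", "Clerk", "Peon", np.nan, np.nan],
    "Emp_years": [15, 2, 4, 5, 4],
    "Age": [45, 22, 18, 20, 50],
})


def return_loan_data_missing(df):
    """Return NaN counts per column, keeping only columns with at least one NaN."""
    missing_counts = df.isnull().sum()          # Series: one count per column
    return missing_counts[missing_counts > 0]   # boolean mask replaces the `if`


missing = return_loan_data_missing(loan_data)
print(missing)
```

For the sample rows this should list only City_Name, Salary and Designation with counts 2, 3 and 2, matching the desired output above. If the real data stores missing values as the literal text "NAN" rather than an actual NaN, isnull() will not detect them until they are converted, for example with loan_data.replace("NAN", np.nan).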