0

First time posting a Python related question here, please bear with me.

I have a panda dataframe with a lot of rows and columns and I would like to create a new Excel spreadsheet (using xlsxwriter) with the Nan values.

E.g. in the rows I have Name, Age, Job, Address, Country. I would like to create a new excel document only with the rows and columns where Job == Nan/blank.

I don't want to fill the nan using .fillna, just want to see every person in the dataset where Job is not listed.

I hope you can help me, and I'm sorry if my question is confusing, english is not my first language, and I'm really new at Python and Pandas.

Konrad Rudolph
  • 530,221
  • 131
  • 937
  • 1,214

2 Answers2

0

Subset the dataframe based on the condition and then write that dataframe to excel-

df_nan_job = df[pd.isnull(df.Job)]
df_nan_job.to_excel("filename.xlsx")
Shradha
  • 2,232
  • 1
  • 14
  • 26
0

If I understood your question, you may want to filter the dataframe to retains row with column Jobis Nan/blank.

Here is an example.

df = pd.DataFrame(np.random.rand(5, 5), columns = ['Name', 'Age', 'Job', 'Address', 'Country'])
print(df)



      Name       Age       Job   Address   Country
0  0.544757  0.874998  0.592347  0.752700  0.949128
1  0.661037  0.378120  0.235350  0.420469  0.737738
2  0.944081  0.864771  0.084817  0.029996  0.262594
3  0.047762  0.903625  0.940079  0.215649  0.631589
4  0.160029  0.838065  0.560037  0.978875  0.445476

You can now filter:

df['Job'] = ['a', 'b', np.nan, 'c', 'd']
df_sol = df[df['Job'].isnull()]
print(df_sol)

which returns:

       Name       Age  Job   Address   Country
2  0.944081  0.864771  NaN  0.029996  0.262594

Now you can save your file:

df_sol.to_excel("fpath.xlsx")
antoine
  • 662
  • 4
  • 10