How should I remove nan values from a dataframe in python?

Question

I've got an excel file and I created lists from its columns. The problem is the rows of the columns is not equal. Therefore, I have multiple 'nan' values at ends of the lists. I tried to delete them with dropna() method but there are still the 'nan' values. Here is my code:

import pandas as pd

excel_name = r'file_name.xlsx'
df = pd.read_excel(excel_name, engine='openpyxl')
df.dropna()

clomun_1 = list(df['clomun1'])
clomun_2 = list(df['clomun2'])
clomun_3 = list(df['clomun3'])
print(clomun_1)
print(clomun_2)
print(clomun_3)

output:

clomun_1 = ['value1', 'value2', 'value3', 'value4', 'nan', 'nan', 'nan', 'nan']
clomun_2 = ['value1', 'value2', 'value3', 'value4', 'value5', 'value6', 'nan', 'nan']
clomun_3 = ['value1', 'value2', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan']

I want to keep only values. I must delete "nan" elements.

Does this answer your question? [How to convert 'NaN' strings in a pandas Series to null values for dropna?](https://stackoverflow.com/questions/53716944/how-to-convert-nan-strings-in-a-pandas-series-to-null-values-for-dropna) — Amit Singh, Jun 02 '21 at 19:13
This was already answered on here: [Remove Nan](https://stackoverflow.com/questions/21011777/how-can-i-remove-nan-from-list-python-numpy) — stefan_aus_hannover, Jun 02 '21 at 19:15
I think you're reading literal "nan" strings from the spreadsheet. A numerical nan wouldn't have quotes around them. That's probably why dropna didn't drop them. — Ben Y, Jun 02 '21 at 19:17

Gun · Accepted Answer · 2021-06-02T19:48:10.777

3

Try this:

df = pd.read_excel(excel_name, engine='openpyxl', na_values=['nan']) #add na_values

clomun_1 = df['clomun1'].dropna().tolist()

print(clomun_1)

['value1', 'value2', 'value3', 'value4']

edited Jun 02 '21 at 19:48

answered Jun 02 '21 at 19:23

Gun

556
6
21

returns : [True, True, True, True] – Nirevezs Jun 02 '21 at 19:30
That's actually works. Thank you. You're the man. – Nirevezs Jun 02 '21 at 19:55

score 0 · Answer 2 · answered Jun 02 '21 at 19:22

0

You can use a lambda function to achieve this.

clomun_1_new= [x for x in clomun_1 if x!='nan']

repeat the same for other two lists.

answered Jun 02 '21 at 19:22

Abishek VK

506
3
12

Not working. 'nan ' values stille there after print(clomun_1_new) – Nirevezs Jun 02 '21 at 19:33
It is working for me. Try changing the line to: clomun_1_new= [x for x in clomun_1 if str(x) != 'nan'] – Abishek VK Jun 02 '21 at 19:54

How should I remove nan values from a dataframe in python?

2 Answers2