1

I have a DataFrame that has NaN or an empty cell in a specific column, for example the column at index 2. Unfortunately I don't have a column name to pass as subset; I only have the index. I want to delete the rows that have this feature. On Stack Overflow there are too many solutions which are using subset.

This is the dataframe for example:

12 125 36 45 665

15 212 12 65 62

65 9 nan 98 84

21 54 78 5 654

211 65 58 26 65

...

Expected output:

12 125 36 45 665

15 212 12 65 62

21 54 78 5 654

211 65 58 26 65

Saeid Vaygani

3 Answers

1

If you need to test the third column (position index 2), use boolean indexing. This covers the case where nan is a real missing value (np.nan) as well as the case where it is the string 'nan':

idx = 2

# keep rows where the column is neither a missing value nor the string 'nan'
df1 = df[df.iloc[:, idx].notna() & df.iloc[:, idx].ne('nan')]

# if a missing entry may be an empty string, the string 'nan', or NaN/None:
#df1 = df[df.iloc[:, idx].notna() & ~df.iloc[:, idx].isin(['nan',''])]
print(df1)
     0    1     2   3    4
0   12  125  36.0  45  665
1   15  212  12.0  65   62
3   21   54  78.0   5  654
4  211   65  58.0  26   65
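
To try the mask above end to end, here is a minimal sketch. The sample frame is an assumption: it is loosely based on the question's data, with an empty-string row and a 'nan'-string row added so that every case is exercised.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: column 2 mixes numbers, a real NaN,
# an empty string and the literal string 'nan'.
df = pd.DataFrame([
    [12, 125, 36, 45, 665],
    [15, 212, 12, 65, 62],
    [65, 9, np.nan, 98, 84],
    [21, 54, "", 5, 654],
    [211, 65, "nan", 26, 65],
    [7, 8, 58, 9, 10],
])

idx = 2                  # position of the column, not its label
col = df.iloc[:, idx]    # select the column by position

# Keep rows that are not missing and not one of the placeholder strings.
mask = col.notna() & ~col.isin(["nan", ""])
df1 = df[mask]
print(df1)
```

Only the rows whose third column holds an actual value remain; the original row labels are kept, so call reset_index(drop=True) afterwards if a fresh 0..n-1 index is wanted.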

If the NaNs are real missing values, dropna works once the position is converted to a column label:

df1 = df.dropna(subset=df.columns[[idx]])
print(df1)
     0    1     2   3    4
0   12  125  36.0  45  665
1   15  212  12.0  65   62
3   21   54  78.0   5  654
4  211   65  58.0  26   65
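
The subset argument of dropna expects column labels, and df.columns[[idx]] is what turns a position into a label. A minimal sketch, assuming named columns (the names a to e are made up for illustration):

```python
import numpy as np
import pandas as pd

# Question's sample data, with hypothetical column names.
df = pd.DataFrame(
    [[12, 125, 36, 45, 665],
     [15, 212, 12, 65, 62],
     [65, 9, np.nan, 98, 84],
     [21, 54, 78, 5, 654],
     [211, 65, 58, 26, 65]],
    columns=list("abcde"),
)

idx = 2
subset = df.columns[[idx]]        # one-element Index holding the label at position 2
df1 = df.dropna(subset=subset)    # drop rows where that column is NaN
print(df1)
```

The double brackets matter here: df.columns[[idx]] yields a one-element Index, which is list-like, while df.columns[idx] would yield a single label.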
jezrael
0

Not sure what you mean by

there are too many solutions which are using subset

but the way to do this would be

df[~df.isna().any(axis=1)]
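
Note that this expression checks every column, not just one. A minimal sketch of the difference from a single-column check, assuming a made-up sample frame with one NaN in column 0 and another in column 2:

```python
import numpy as np
import pandas as pd

# Hypothetical sample: NaN in column 0 (row 1) and in column 2 (row 2).
df = pd.DataFrame([
    [12, 125, 36, 45, 665],
    [np.nan, 212, 12, 65, 62],
    [65, 9, np.nan, 98, 84],
    [21, 54, 78, 5, 654],
])

# Drops every row that has a NaN in any column.
any_col = df[~df.isna().any(axis=1)]

# Drops only rows with a NaN in the column at position idx.
idx = 2
one_col = df[df.iloc[:, idx].notna()]

print(any_col.index.tolist())
print(one_col.index.tolist())
```

In this sample the first filter also removes row 1, whose NaN is in column 0, while the positional check keeps it.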
ignoring_gravity
  • I want to search a specific column: for example, column 2 in row 35 has NaN or an empty cell. After finding it, I want to delete row 35. – Saeid Vaygani Sep 07 '22 at 10:22
0

You can use notnull() on the column selected by position:

df = df.loc[df[df.columns[idx]].notnull()]