Drop only cell values in Pandas where value is NAN

Question

I have a DataFrame with multiple columns. Many of the cells contain NaN values, and I want to drop only those cells, not the entire row or column. The DataFrame looks something like this:

Column1 | Column2 | Column3 | ... | ColumnX
1       | NaN     | NaN     | ... | NaN
2       | NaN     | NaN     | ... | NaN
.
.
.
12      | 1       | NaN     | ... | NaN
13      | 2       | NaN     | ... | NaN
.
.
.
21      | 11      | 1       | ... | NaN
22      | 12      | 2       | ... | NaN

and so on.

The final output should look like

Column1 | Column2 | Column3 | ... | ColumnX
1       | 1       | 1       | ... | 1
2       | 2       | 2       | ... | 2
.
.
.
12      | 12      | 11      | ... | 11
13      | 13      | 12      | ... | 12
.
.
.
21      | 21      | 21      | ... | 21
22      | 22      | 22      | ... | 22

Any idea if this can be achieved?

Please share the expected output based on your input dataframe also. — Mayank Porwal, Nov 14 '21 at 07:40
you cant "drop a cell" ... you could change it to something else... — Joran Beasley, Nov 14 '21 at 07:43
@JoranBeasley yes that's what I thought and am doing but I was wondering if there was a way that I wasn't aware of — sanster9292, Nov 14 '21 at 07:44
Do you mean like [How to move Nan values to end in all columns](https://stackoverflow.com/questions/52621834/how-to-move-nan-values-to-end-in-all-columns)? Then dropna the nan rows? Or like [Remove NaN 'Cells' without dropping the entire ROW (Pandas,Python3)](https://stackoverflow.com/q/25941979/15497888)? — Henry Ecker, Nov 14 '21 at 07:45
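A minimal sketch of the first idea raised in the comment above (pushing the NaNs to the end of each column), assuming the goal is to shift the remaining values up. The frame `df` and the name `compacted` are made up for illustration and are not from the question:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the column names only mimic the question's layout
df = pd.DataFrame({
    "Column1": [np.nan, 1, 2, 3],
    "Column2": [np.nan, np.nan, 1, 2],
    "Column3": [np.nan, np.nan, np.nan, 1],
})

# Drop the NaNs column by column and renumber what is left from 0.
# Building a DataFrame from Series of different lengths aligns them on the
# index, so shorter columns are padded with NaN at the bottom.
compacted = pd.DataFrame(
    {name: col.dropna().reset_index(drop=True) for name, col in df.items()}
)

print(compacted)
```

A rectangular DataFrame cannot hold columns of different lengths, so trailing NaNs remain wherever a column has fewer values than the longest one.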

score 1 · Answer 1 · answered Nov 14 '21 at 08:16

From your expected output, it looks like you want to "count" the NaNs in each column and replace each one with its occurrence number.

A quick way to achieve this is the following.

First, define a function that performs the substitution you need:

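import pandas as pd
import numpy as np

def sub(col):
    # Work on a copy so the original column is not modified in place
    col = col.copy()
    # Boolean mask marking the NaN cells of this column
    mask = col.isna()
    # Number the NaNs 1, 2, 3, ...; apply ANY transformation you need here
    col.loc[mask] = np.arange(1, mask.sum() + 1)
    return col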

Then apply it to each column (let's define a simple DataFrame first):

dt = pd.DataFrame({"col1":[1,2,3], "col2":[np.nan, np.nan, 1], "col3":[1,np.nan,2]})

dt.apply(sub)
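The same numbering can also be done without assigning into the column. The following is a sketch of one possible variant, not the only way to do it; the helper name `number_nans` is made up for illustration:

```python
import numpy as np
import pandas as pd

def number_nans(col: pd.Series) -> pd.Series:
    """Replace each NaN in `col` with its 1-based occurrence count."""
    is_nan = col.isna()
    # The running sum of the boolean mask is 1, 2, 3, ... at successive NaNs
    return col.mask(is_nan, is_nan.cumsum())

dt = pd.DataFrame({"col1": [1, 2, 3], "col2": [np.nan, np.nan, 1], "col3": [1, np.nan, 2]})
result = dt.apply(number_nans)

# col2 becomes 1, 2, 1 and col3 becomes 1, 1, 2; col1 has no NaN and is unchanged
print(result)
```

`Series.mask` returns a new Series, so `dt` itself is left untouched.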
