Pandas get row number where value error occurs

Question

For example, I have a DataFrame like this:

name | point | age|
-------------------
Jon  | 20    | 26 |
Mike | abc   | 20 |

I need to check whether each value in point is an int. If it is not, I need to report the row and column where the invalid data type occurs.

I need output like: Column 2, row 2

Do you really want "Column 2, row 2" or is `(1, 'point')` OK? — mozway, Aug 30 '23 at 13:12
My approach gives you the column name and the 0-based (Python) row number if you have a range index (if not, first run `.reset_index(drop=True)`). I added an extra approach in my answer. — mozway, Aug 30 '23 at 13:43

mozway · Answer 1 · 2023-08-30T13:46:30.143

Assuming you want to identify all non-numeric cells:

m = df.apply(pd.to_numeric, errors='coerce').isna()

out = m.where(m).stack().index.tolist()

Output:

[(0, 'name'), (1, 'name'), (1, 'point')]
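
For reference, here is a self-contained sketch of the same idea that rebuilds the question's frame. The construction of `df` (with `point` stored as strings) is an assumption about the original data. It also filters the stacked mask explicitly instead of relying on `stack()` dropping NaN, since newer pandas versions appear to keep NaN rows by default when stacking.

```python
import pandas as pd

# Assumed reconstruction of the question's data; 'point' holds strings
df = pd.DataFrame({
    "name": ["Jon", "Mike"],
    "point": ["20", "abc"],
    "age": [26, 20],
})

# True wherever a value cannot be parsed as a number
m = df.apply(pd.to_numeric, errors="coerce").isna()

# Stack the boolean mask to a (row, column)-indexed Series, keep the True entries
s = m.stack()
out = s[s].index.tolist()
print(out)
```

Each tuple is (row label, column name), so the whole `name` column is reported because none of its values are numeric.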

If you only want to identify invalid data in columns in which some values are numeric but not all:

m = df.apply(pd.to_numeric, errors='coerce').isna()
out = df.loc[:, ~m.all()].where(m).stack().index.tolist()

Output:

[(1, 'point')]

Finally, if you already have NaNs in the input and want to ignore those, use:

m = df.apply(pd.to_numeric, errors='coerce').isna() & df.notna()
out = df.loc[:, ~m.all()].where(m).stack().index.tolist()

Output:

[(1, 'point')]

Input used:

   name point   age
0   Jon    20   NaN
1  Mike   abc  20.0
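
A small sketch of why the extra `& df.notna()` matters with this input. The variable names `flag_all` and `flag_new` are illustrative only and not part of the answer above.

```python
import numpy as np
import pandas as pd

# Frame matching the input shown above (age has a missing value in row 0)
df = pd.DataFrame({
    "name": ["Jon", "Mike"],
    "point": ["20", "abc"],
    "age": [np.nan, 20.0],
})

coerced = df.apply(pd.to_numeric, errors="coerce")
flag_all = coerced.isna()               # also flags the pre-existing NaN in 'age'
flag_new = coerced.isna() & df.notna()  # only values that failed to parse

# Skip columns that are entirely non-numeric (here: 'name')
keep = ~flag_new.all()
s = flag_new.loc[:, keep].stack()
out = s[s].index.tolist()
print(out)
```

Without the `notna()` term, the cell at row 0 of `age` would be marked invalid even though it is simply missing.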

If you need row/column numbers starting from 1:

tmp = df.set_axis(range(1, df.shape[0]+1)).set_axis(range(1, df.shape[1]+1), axis=1)

m = tmp.apply(pd.to_numeric, errors='coerce').isna()
out = tmp.loc[:, ~m.all()].where(m).stack().index.tolist()

print(out)
# [(2, 2)]
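
If the literal "Column 2, row 2" text from the question is wanted, one possible sketch is to work on positions with `np.argwhere` and format them 1-based. The `messages` name and the reconstruction of `df` are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the question's data
df = pd.DataFrame({
    "name": ["Jon", "Mike"],
    "point": ["20", "abc"],
    "age": [26, 20],
})

# Cells that fail numeric parsing, ignoring cells that were already missing
m = df.apply(pd.to_numeric, errors="coerce").isna() & df.notna()

# Blank out columns that are entirely non-numeric, keeping column positions intact
mask = m.to_numpy() & ~m.all().to_numpy()

# np.argwhere yields 0-based (row, column) positions; shift both to 1-based
messages = [f"Column {c + 1}, row {r + 1}" for r, c in np.argwhere(mask)]
print(messages)
```

Because this uses positions rather than labels, it does not depend on the frame having a range index.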
