Iterate pandas data frames by name of columns with pd.isnull

Asked Nov 03 '18 at 06:30

Active Nov 03 '18 at 06:43

Viewed 47 times

I can use a for loop to print every row of the data frame.

import pandas as pd

data = {"col1": [1.1, 2.2, 3.3],
        "col2": [4.4, None, 6.6]}
csv = pd.DataFrame(data)

for i in range(len(csv)):
    col1 = csv["col1"][i]
    col2 = csv["col2"][i]
    print(col1, col2)

But when I use pd.notnull to filter the data frame, the same for loop raises an error.

csv = csv[pd.notnull(csv["col2"])]

print("csv:{}".format(csv))
for i in range(len(csv)):
    col1 = csv["col1"][i]
    col2 = csv["col2"][i]
    print(col1, col2)

The error is

KeyError: 1

Does anyone know how to iterate over a data frame by column name after filtering it with pd.notnull?

edited Nov 03 '18 at 06:43

asked Nov 03 '18 at 06:30

CYC

The problem is that after removing the NaN rows, looping with `for` no longer works, because it loops over the range `0,1,2` while the indices are `0,2`; `1` is missing because the NaN row was removed. – jezrael Nov 03 '18 at 06:34
The len(csv) is 2, so it loops over the range 0, 1. I thought it would be fine to iterate after removing the NaN rows, but the error still occurs. – CYC Nov 03 '18 at 06:50
If you use `csv["col1"][i]`, it selects by index label, so by `0,2`. Your solution should work if you create a default index with `csv = csv[pd.notnull(csv["col2"])].reset_index(drop=True)`. – jezrael Nov 03 '18 at 06:53
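To make the mismatch discussed in these comments concrete, here is a minimal sketch using the question's data. The names `filtered`, `positions`, `labels`, and `raised` are introduced only for illustration. It compares the positions the loop generates with the index labels that survive the filter, then shows that looking up label `1` fails.

```python
import pandas as pd

data = {"col1": [1.1, 2.2, 3.3], "col2": [4.4, None, 6.6]}
csv = pd.DataFrame(data)

# Filtering keeps the original index labels of the surviving rows.
filtered = csv[pd.notnull(csv["col2"])]

positions = list(range(len(filtered)))  # what the for loop iterates over
labels = list(filtered.index)           # what the filtered frame is indexed by
print("positions:", positions)
print("labels:", labels)

# filtered["col1"][i] looks up the index *label* i, not the i-th row,
# so i == 1 fails because the row labelled 1 was dropped.
try:
    filtered["col1"][1]
    raised = False
except KeyError:
    raised = True
print("KeyError raised:", raised)
```

So the length of the filtered frame is indeed 2, but the second loop value (`1`) is not one of the remaining labels (`0` and `2`).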
Thanks for your explanation, I understand the problem now. – CYC Nov 03 '18 at 06:59
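As a hedged summary of the thread, the sketch below applies the `reset_index(drop=True)` fix suggested in the comments to the question's data. It also shows two alternatives that were not mentioned in the thread and are included here only as assumptions about what may suit the asker: iterating with `itertuples`, and positional access with `.iloc`. The variable names (`reset`, `filtered`, `rows_reset`, `rows_tuples`, `rows_iloc`) are illustrative.

```python
import pandas as pd

data = {"col1": [1.1, 2.2, 3.3], "col2": [4.4, None, 6.6]}
csv = pd.DataFrame(data)

# Option 1 (from the comments): renumber the index after filtering,
# so the labels are 0, 1 again and match range(len(...)).
reset = csv[pd.notnull(csv["col2"])].reset_index(drop=True)
rows_reset = []
for i in range(len(reset)):
    rows_reset.append((reset["col1"][i], reset["col2"][i]))

# Option 2 (not from the thread): keep the original labels and
# iterate over the rows directly, still addressing columns by name.
filtered = csv[pd.notnull(csv["col2"])]
rows_tuples = [(row.col1, row.col2) for row in filtered.itertuples(index=False)]

# Option 3 (not from the thread): keep the loop over positions,
# but use .iloc so that i means "i-th row" rather than "label i".
rows_iloc = [
    (filtered["col1"].iloc[i], filtered["col2"].iloc[i])
    for i in range(len(filtered))
]

for col1, col2 in rows_reset:
    print(col1, col2)
```

All three variants visit the same two rows. Option 1 discards the original row labels, while options 2 and 3 preserve them, which may matter if the labels are needed later.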
