
I can use a for loop to print every row of a data frame.

import pandas as pd

data = {"col1": [1.1, 2.2, 3.3],
        "col2": [4.4, None, 6.6]}
csv = pd.DataFrame(data)

for i in range(len(csv)):
    col1 = csv["col1"][i]
    col2 = csv["col2"][i]
    print(col1, col2)

But when I use pd.notnull to filter the data frame, the same for loop raises an error.

csv = csv[pd.notnull(csv["col2"])]

print("csv:{}".format(csv))
for i in range(len(csv)):
    col1 = csv["col1"][i]
    col2 = csv["col2"][i]
    print(col1, col2)

The error is

KeyError: 1

Does anyone know how to iterate over a data frame by column name after filtering it with pd.notnull?

CYC
  • The problem is that after the NaN rows are removed, looping with `for` no longer works: it loops over the range `0,1,2` while the indices are `0,2`. `1` is missing because the NaN row was removed. – jezrael Nov 03 '18 at 06:34
  • len(csv) is 2, so it loops over the range 0, 1. I thought iterating would be fine after removing the NaN rows, but the error still happens. – CYC Nov 03 '18 at 06:50
  • `csv["col1"][i]` selects by index label, so only `0` and `2` are valid. Your solution should work if you create a default index with `csv = csv[pd.notnull(csv["col2"])].reset_index(drop=True)`. – jezrael Nov 03 '18 at 06:53
  • Thanks for your explanation, I understand the problem now. – CYC Nov 03 '18 at 06:59
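
The fix from the comments can be sketched roughly as below. This is only an illustration built on the question's data: the names `filtered`, `reset`, `rows_reset`, `rows_iloc` and `rows_tuples` are made up for the example, and the `.iloc` and `itertuples` variants are alternatives that were not mentioned in the thread.

```python
import pandas as pd

data = {"col1": [1.1, 2.2, 3.3],
        "col2": [4.4, None, 6.6]}
csv = pd.DataFrame(data)

# Filtering keeps the original index labels, so label 1 disappears.
filtered = csv[pd.notnull(csv["col2"])]
print(list(filtered.index))  # [0, 2]

# Option 1 (from the comments): rebuild a default 0..n-1 index,
# after which label-based access with range(len(...)) works again.
reset = filtered.reset_index(drop=True)
rows_reset = [(reset["col1"][i], reset["col2"][i]) for i in range(len(reset))]

# Option 2 (assumed alternative): access by position with .iloc,
# which ignores the index labels entirely.
rows_iloc = [(filtered["col1"].iloc[i], filtered["col2"].iloc[i])
             for i in range(len(filtered))]

# Option 3 (assumed alternative): iterate over rows directly,
# reading each column by name from the row tuple.
rows_tuples = [(row.col1, row.col2) for row in filtered.itertuples(index=False)]

for col1, col2 in rows_tuples:
    print(col1, col2)
```

All three variants should visit the same two remaining rows; which one fits best probably depends on whether the original index labels are still needed later.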

0 Answers