I have the e.g. [100,30] data frame, I would like to find which rows have more than 20 columns with values? Well all rows have 30 columns, but some of them have NaN values, therefore I set a limit of 20 columns and I wanna find which rows have values for the column after col.iloc = 20
For example, even though at row number 05 we have more Nan value, due to distribution I would like to find which rows have more than 3 columns either with the value of X or Nan (in the table below I want to find the index of rows number 1, 3, 4 and 7. because they have value for the columns after index.column=3, and my question is among the large data set how should I find them).
ID | Col.01 | Col.02 | Col.03 | Col.04 | Col.05 |
---|---|---|---|---|---|
01 | X | X | X | X | X |
02 | X | X | X | Nan | Nan |
03 | X | X | X | X | X |
04 | X | X | X | X | Nan |
05 | X | Nan | Nan | Nan | Nan |
06 | X | X | X | Nan | Nan |
07 | X | X | X | X | Nan |
My expected result:
I find out that the ID of the row: 1,3,4,7 are the rows have values in column>3, so I can delete them after that
ID | Col.01 | Col.02 | Col.03 | Col.04 | Col.05 |
---|---|---|---|---|---|
02 | X | X | X | Nan | Nan |
05 | X | Nan | Nan | Nan | Nan |
06 | X | X | X | Nan | Nan |
but in large dataframe It would not that easy