
I have a DataFrame `data` that has some null values in different columns. I want to create a list from `data` showing only the columns that have no null values. I have also created `missing_row_counts`, a Series containing the number of null values in each column. Here is the code that I have:

def non_zeros(series):
    """Returns a list of the index values in series for which
    the value is greater than 0.
    """ 
    for i in missing_row_counts:
      nonzero_row = i > 0 #need to fix 
    return nonzero_row

The code above runs, but when I call it with `missing_cols = non_zeros(missing_row_counts)`, `missing_cols` returns `True`, where I am expecting a list of column names.

Tenoch
  • Check this out, they have discussed the same issue as you: https://stackoverflow.com/questions/47414848/pandas-select-all-columns-without-nan – Sachin Rajput Dec 11 '20 at 18:15

3 Answers

non_zeros = list(series[series > 0].index.values)
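Applied to the question's setup, a minimal sketch (the column names and counts in `missing_row_counts` are illustrative, not from the question):

```python
import pandas as pd

# Hypothetical per-column null counts, mirroring the question's missing_row_counts
missing_row_counts = pd.Series({"a": 0, "b": 2, "c": 1})

def non_zeros(series):
    """Return a list of index labels whose value is greater than 0."""
    return list(series[series > 0].index.values)

missing_cols = non_zeros(missing_row_counts)
print(missing_cols)  # ['b', 'c']
```

The key difference from the question's loop is that the comparison is done on the whole Series at once, and the boolean mask is used to select index labels rather than overwriting a single `True`/`False` on each iteration.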
Laggs

You can do it with vectorization like this:

import pandas as pd
import numpy as np

data = pd.DataFrame({"a": [1, 2], "b": [3, np.nan]})
# Count nulls per column (axis=0, the default), then keep the columns with zero
non_nan_columns = data.columns[data.isnull().sum() == 0]

Please supply a working example in your next question ;)
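An equivalent vectorized form, assuming the same toy frame, uses `notna().all()` per column instead of counting nulls:

```python
import pandas as pd
import numpy as np

data = pd.DataFrame({"a": [1, 2], "b": [3, np.nan]})
# A column is "complete" when every one of its values is non-null
non_nan_columns = data.columns[data.notna().all()]
print(list(non_nan_columns))  # ['a']
```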


`nonzero_row = i > 0` will always return `True` or `False`, because of the comparison you are making: each loop iteration overwrites `nonzero_row` with a single boolean, so only the last comparison survives.

However, an easier way to do this would be `df.isna().any()`; an illustrative example is below:

import pandas as pd
import numpy as np

df = pd.DataFrame({'a': [1, 2, 3], 'b': [np.nan, 2, 3], 'c': [1, 2, 3]})
col_has_na = df.isna().any()  # checks, for each column of df, whether it has any NaN
print(col_has_na)
a    False
b     True
c    False
dtype: bool
# In the output above, `a` and `c` do not have NaN, hence False, whereas it is True for `b`

# Filter out and get the index of the columns which have a False value
print(col_has_na[~col_has_na].index.tolist())
['a', 'c']
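To go one step further and view only those complete columns of the frame itself, the resulting list can be used directly as a column selector (a sketch using the same toy `df` as above):

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({'a': [1, 2, 3], 'b': [np.nan, 2, 3], 'c': [1, 2, 3]})
col_has_na = df.isna().any()
# Select only the columns whose mask value is False (no NaN anywhere)
complete = df[col_has_na[~col_has_na].index.tolist()]
print(list(complete.columns))  # ['a', 'c']
```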
Pawan