I have a pandas DataFrame with 83 columns and 4000 rows. I intend to use the data for a logistic regression and therefore want to narrow down my columns to those that have the fewest missing values.
To do this, I was thinking of ranking the columns by their number of NaN observations. I tried a few things, such as:
econ_balance["BG.GSR.NFSV.GD.ZS"].describe()
econ_balance["BG.GSR.NFSV.GD.ZS"].value_counts
econ_balance["BG.GSR.NFSV.GD.ZS"]["NaN"]
econ_balance["BG.GSR.NFSV.GD.ZS"][NaN]
None of these seems to work. I also tried googling to see whether this question has been answered before, but had no luck.
Thanks in advance for the help.
Josh
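
One possible way to get the ranking described above, as a minimal sketch rather than a definitive answer: count the missing values per column with isna().sum() and sort the result. The small frame below is a hypothetical stand-in for econ_balance; the columns "col_b" and "col_c" and all of the values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for econ_balance: only the first column name comes
# from the question; "col_b", "col_c" and every value are invented.
econ_balance = pd.DataFrame({
    "BG.GSR.NFSV.GD.ZS": [1.0, np.nan, np.nan, 4.0],
    "col_b": [np.nan, 2.0, 3.0, 4.0],
    "col_c": [1.0, 2.0, 3.0, 4.0],
})

# isna() marks each missing cell as True; summing the booleans down each
# column gives the NaN count per column. Sorting ascending puts the most
# complete columns first.
nan_counts = econ_balance.isna().sum().sort_values()

# The same ranking expressed as a fraction of rows that are missing.
nan_share = econ_balance.isna().mean().sort_values()

# NaN count for a single column.
single_count = econ_balance["BG.GSR.NFSV.GD.ZS"].isna().sum()

print(nan_counts)
print(nan_share)
print(single_count)
```

As for the attempts in the question: describe() reports the count of non-missing values rather than the number of NaNs, value_counts() leaves NaN out by default (passing dropna=False includes it), and indexing with "NaN" or NaN looks for a row label instead of testing for missing values.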