I am currently working with a data set of more than 100 columns. The first four columns give me essential information (label, description, target, department); the remaining columns hold data values. Some rows have that essential information but all of their data values are null, and I want to delete every row whose data values are all null.
So here is what I did, which was a very long way. First, I split the whole table into two: df1 held the essential information (label, description, target, department) and df2 held the data values. For df2, I used the isnull() approach to find which indices gave all-null rows. I noted down those indices, concatenated the two tables back together, and then dropped the rows at the indices I had noted.
df1 = pd.read_excel('***.xlsx', skiprows=5)
df2 = df1.iloc[:, 4:]
df2[df2.isnull().all(axis=1)]  # used this to note down the indices of the all-null rows
df1.drop(columns=df1.columns[4:], inplace=True)  # drop the data-value columns, leaving only the essential-information columns
new_df = pd.concat([df1, df2], axis=1)
new_df.drop(new_df.index[[430, 431, 432]], inplace=True)
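For reference, here is a minimal runnable sketch of the steps above on a small made-up frame (the column names and values are invented; the real file has ~100 columns and the all-null rows sit at indices 430-432):

```python
import pandas as pd
import numpy as np

# Hypothetical stand-in for the real spreadsheet: four essential
# columns followed by the data-value columns.
df1 = pd.DataFrame({
    'label': ['a', 'b', 'c'],
    'description': ['x', 'y', 'z'],
    'target': [1, 2, 3],
    'department': ['d1', 'd2', 'd3'],
    'v1': [10.0, np.nan, 30.0],
    'v2': [11.0, np.nan, 31.0],
})

# Split off the data-value columns.
df2 = df1.iloc[:, 4:]

# Indices of rows whose data values are all null (here: index 1).
null_rows = df2[df2.isnull().all(axis=1)].index

# Keep only the essential columns, re-join, and drop the noted rows.
df1 = df1.drop(columns=df1.columns[4:])
new_df = pd.concat([df1, df2], axis=1)
new_df = new_df.drop(null_rows)
```

After this, new_df contains only the rows that had at least one non-null data value.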
This approach did the job. However, I have a feeling it is a very long way around, so I was wondering if there is a shorter approach. I'd really appreciate your help.