I have a very wide DataFrame with a large number of columns, and I need to get the count of non-null values per row in PySpark.
Example DF -
+-----+----------+-----+-----+-----+-----+-----+-----+
| name|      date|col01|col02|col03|col04|col05|col06|
+-----+----------+-----+-----+-----+-----+-----+-----+
|name1|2017-12-01|100.0|255.5|333.3| null|125.2|132.7|
|name2|2017-12-01|101.1|105.5| null| null|127.5| null|
+-----+----------+-----+-----+-----+-----+-----+-----+
I want to add a column with a count of non-null values in col01-col06 -
+-----+----------+-----+-----+-----+-----+-----+-----+-----+
| name|      date|col01|col02|col03|col04|col05|col06|count|
+-----+----------+-----+-----+-----+-----+-----+-----+-----+
|name1|2017-12-01|100.0|255.5|333.3| null|125.2|132.7|    5|
|name2|2017-12-01|101.1|105.5| null| null|127.5| null|    3|
+-----+----------+-----+-----+-----+-----+-----+-----+-----+
I was able to do this with a pandas DataFrame like this -

df['count'] = df.loc[:, 'col01':'col06'].notnull().sum(axis=1)
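For completeness, here is a self-contained version of that pandas approach (the DataFrame is just rebuilt from the example rows above):

import numpy as np
import pandas as pd

# Rebuild the example DataFrame from the table above (null -> NaN).
df = pd.DataFrame({
    'name':  ['name1', 'name2'],
    'date':  ['2017-12-01', '2017-12-01'],
    'col01': [100.0, 101.1],
    'col02': [255.5, 105.5],
    'col03': [333.3, np.nan],
    'col04': [np.nan, np.nan],
    'col05': [125.2, 127.5],
    'col06': [132.7, np.nan],
})

# Count non-null values across col01..col06 for each row.
df['count'] = df.loc[:, 'col01':'col06'].notnull().sum(axis=1)
print(df)

This gives 5 for name1 and 3 for name2, as expected.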
But I've had no luck doing the same with a Spark DataFrame so far :( Any ideas?
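Is something along these lines the right direction? This is just an untested sketch to clarify what I'm after; df and the column names are taken from the example above:

from pyspark.sql import functions as F

# Columns whose non-null values should be counted, per the example above.
value_cols = ['col01', 'col02', 'col03', 'col04', 'col05', 'col06']

# Turn each column into a 1/0 flag and add the flags up row-wise.
non_null_count = sum(
    F.when(F.col(c).isNotNull(), 1).otherwise(0) for c in value_cols
)

df = df.withColumn('count', non_null_count)

The idea is to build one when/otherwise expression per column and sum them, rather than looping row by row, but I don't know whether this is idiomatic or whether it will behave well with hundreds of columns.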