Just passing some time. This is a non-pandas scenario. In pyspark I can generate a column value that is the original value concatenated with the relevant column name, e.g. a solution I provided: Appending column name to column value using Spark.
Now consider the following Scala code:
import org.apache.spark.sql.functions._
import spark.implicits._
// Sample data: an ID column followed by four numeric columns
val df = sc.parallelize(Seq(
  ("r1", 0.0, 0.0, 0.0, 0.0),
  ("r2", 6.4, 4.9, 6.3, 7.1),
  ("r3", 4.2, 0.0, 7.2, 8.4),
  ("r4", 1.0, 2.0, 0.0, 0.0)
)).toDF("ID", "aa1a", "bb3", "ccc4", "d1ddd")
// For every column except ID, emit 1 when the value is 0.0, then sum across columns
val count_zero = df.columns.tail
  .map(x => when(col(x) === 0.0, 1).otherwise(0))
  .reduce(_ + _)
df.withColumn("zero_count", count_zero).show(false)
So what if, for argument's sake only, I also wanted to check that the column name itself contains a '1' somewhere, as an extra condition for adding the 1?
And I want this check inside the when used to build the val count_zero.
I am not interested in generating column lists or sequences to process beforehand.
As I stated, this is for argument's sake. I cannot find a way to check the column name in Scala within a when for a dataframe.
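To make the ask concrete, here is a minimal sketch of one possible shape for such an expression, not a confirmed answer. It relies on the column name being an ordinary Scala String while the expression is built, so x.contains("1") is evaluated on the driver and wrapped with lit to become part of the when condition. The local SparkSession setup and the names countZeroWithNameCheck and result are assumptions added for illustration; in spark-shell the spark value already exists.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, lit, when}

// Assumption: a local session for a standalone script; spark-shell provides `spark` already.
val spark = SparkSession.builder()
  .master("local[1]")
  .appName("zero-count-sketch")
  .getOrCreate()
import spark.implicits._

// Same sample data as in the question
val df = Seq(
  ("r1", 0.0, 0.0, 0.0, 0.0),
  ("r2", 6.4, 4.9, 6.3, 7.1),
  ("r3", 4.2, 0.0, 7.2, 8.4),
  ("r4", 1.0, 2.0, 0.0, 0.0)
).toDF("ID", "aa1a", "bb3", "ccc4", "d1ddd")

// Hypothetical name: adds 1 only when the value is 0.0 AND the column name contains "1".
// x.contains("1") is a plain Boolean computed per column name, turned into a literal Column.
val countZeroWithNameCheck = df.columns.tail
  .map(x => when(col(x) === 0.0 && lit(x.contains("1")), 1).otherwise(0))
  .reduce(_ + _)

val result = df.withColumn("zero_count", countZeroWithNameCheck)
result.show(false)
```

With the sample data, only aa1a and d1ddd have a '1' in their names, so under these assumptions zero_count should come out as 2 for r1, 0 for r2, 0 for r3 and 1 for r4.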