Is there a way to calculate the kernel density estimate (KDE) of every column of a DataFrame?
I have a DataFrame where each column holds the values of one feature. The KDE function of Spark MLlib needs an RDD[Double]
of the sample values. The problem is that I need a way to do this without collecting the values of each column to the driver, because that would slow the program down too much.
Does anyone have an idea how I could solve this? Sadly, all my attempts have failed so far.
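
To make the setup concrete, here is a minimal sketch of the kind of per-column call I mean. It assumes the columns are numeric and that one Spark job per column is acceptable; `ColumnKde`, `kdePerColumn`, the column names, the bandwidth, and the evaluation points are all made up for illustration. The sample stays distributed as an RDD[Double], and only the densities at the evaluation points come back to the driver.

```scala
import org.apache.spark.mllib.stat.KernelDensity
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object ColumnKde {
  // Hypothetical helper: estimates the density of every column of `df`
  // at the given evaluation points. Assumes all columns are numeric.
  def kdePerColumn(df: DataFrame,
                   points: Array[Double],
                   bandwidth: Double): Map[String, Array[Double]] = {
    df.columns.map { name =>
      // Build an RDD[Double] from the column without collecting it.
      val sample: RDD[Double] = df
        .select(col(name).cast("double"))
        .na.drop()                 // skip nulls so getDouble does not fail
        .rdd
        .map(_.getDouble(0))

      // estimate() aggregates over the distributed sample and returns
      // one density value per evaluation point.
      val densities = new KernelDensity()
        .setSample(sample)
        .setBandwidth(bandwidth)
        .estimate(points)

      name -> densities
    }.toMap
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("column-kde")
      .getOrCreate()
    import spark.implicits._

    // Toy data standing in for the real feature columns.
    val df = Seq((1.0, 10.0), (2.0, 20.0), (3.0, 30.0)).toDF("f1", "f2")
    df.cache() // each column triggers its own job, so avoid recomputing the input

    val result = kdePerColumn(df, Array(0.0, 2.0, 4.0), bandwidth = 1.0)
    result.foreach { case (name, d) => println(s"$name: ${d.mkString(", ")}") }

    spark.stop()
  }
}
```

This still launches one job per column, so with many features it may not be fast enough; what I am really looking for is a way to get all columns in a single pass.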