
I have a DataFrame with 5 columns and I want to calculate the median and interquartile range of every column.

How do I write a UDF for this and call it on each column?

HaveNoDisplayName
nareshbabral
  • I think you have to sort every column and take the middle value if the count is odd, or the average of the two middle values if it is even. I don't know any other way to achieve this. – Alberto Bonsanto Feb 06 '16 at 15:14
  • Well, maybe I'm wrong since I'm not an expert, but I think that `UDFs` are better suited to per-record transformations or operations than to this. – Alberto Bonsanto Feb 06 '16 at 15:26
  • Maybe this would help: [How to calculate median in Spark SQLContext](http://stackoverflow.com/questions/34519549/how-to-claculate-median-in-spark-sqlcontext-for-column-of-data-type-double) – Alberto Bonsanto Feb 06 '16 at 15:34
  • This question has an answer at https://stackoverflow.com/questions/31432843/how-to-find-median-and-quantiles-using-spark. – Glenn Oct 30 '17 at 20:27
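
The sort-based approach suggested in the comments can be sketched in plain Python, outside Spark, to make the arithmetic concrete. This is only an illustration under stated assumptions: the helper names `median`, `iqr`, and `summarize` and the sample data are hypothetical, and the quartiles are taken as the medians of the lower and upper halves (middle value excluded when the count is odd), which is one of several common conventions.

```python
from typing import Dict, List, Sequence


def median(values: Sequence[float]) -> float:
    """Sort, then take the middle value (odd count) or the mean of the two middle values (even count)."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty sequence")
    mid = n // 2
    if n % 2 == 1:
        return float(ordered[mid])
    return (ordered[mid - 1] + ordered[mid]) / 2.0


def iqr(values: Sequence[float]) -> float:
    """Interquartile range Q3 - Q1.

    Assumption: Q1 and Q3 are the medians of the lower and upper halves,
    and the middle value is left out of both halves when the count is odd.
    Other quartile conventions give slightly different numbers.
    """
    ordered = sorted(values)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two values")
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half + (n % 2):]
    return median(upper) - median(lower)


def summarize(columns: Dict[str, List[float]]) -> Dict[str, Dict[str, float]]:
    """Apply both statistics to every column (hypothetical column-name -> values mapping)."""
    return {
        name: {"median": median(vals), "iqr": iqr(vals)}
        for name, vals in columns.items()
    }


# Hypothetical sample data standing in for two of the DataFrame's columns.
data = {"a": [7, 1, 3, 5, 9], "b": [4, 2, 8, 6]}
summary = summarize(data)
```

This collects each column into local memory, so it only suits small data. In Spark 2.0 and later, `DataFrame.approxQuantile(col, probabilities, relativeError)` returns the requested quantiles for a column directly, which is probably a better fit than a UDF here.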

0 Answers