I'm working on a problem where I have imported a DB table into Apache Spark.
I have converted it into a DataFrame. Then I called registerTempTable on it so that I can run Hive queries against it.
I'm able to perform other mathematical operations, such as:
sqlContext.sql("select avg(Amount) from Table1001").show
However, I'm unable to find the median for a field called Amount. Is there any way to find the median on this DataFrame?
Kindly provide a suitable solution.