I have been trying to move a particular row of a Spark DataFrame to the end of the DataFrame. This is what I have tried so far.
Input DataFrame:
+-------------+-------+------------+
|expected_date|count  |Downstream  |
+-------------+-------+------------+
|2018-08-26   |1      |abc         |
|2018-08-26   |6      |Grand Total |
|2018-08-26   |3      |xyy         |
|2018-08-26   |2      |xxx         |
+-------------+-------+------------+
Code:
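import org.apache.spark.sql.functions.{col, when}

// Give "Grand Total" a higher sort key so it sorts last, then drop the helper column.
df.withColumn("Downstream_Hierarchy",
    when(col("Downstream") === "Grand Total", 2).otherwise(1))
  .orderBy(col("Downstream_Hierarchy").asc)
  .drop("Downstream_Hierarchy")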
Output DataFrame:
+-------------+-------+------------+
|expected_date|count  |Downstream  |
+-------------+-------+------------+
|2018-08-26   |1      |abc         |
|2018-08-26   |3      |xyy         |
|2018-08-26   |2      |xxx         |
|2018-08-26   |6      |Grand Total |
+-------------+-------+------------+
Is there a simpler way to do this?
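One possible simplification, offered as a sketch and not as the definitive approach: sort directly on the boolean expression instead of adding and dropping a helper column. In ascending order false sorts before true, so the row where Downstream equals "Grand Total" should end up last. The local SparkSession and the sample rows below are assumptions made only to keep the example self-contained; in spark-shell the session already exists.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

// Hypothetical local session, for illustration only (spark-shell already provides `spark`).
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("grand-total-last")
  .getOrCreate()
import spark.implicits._

// Sample rows copied from the input DataFrame in the question.
val df = Seq(
  ("2018-08-26", 1, "abc"),
  ("2018-08-26", 6, "Grand Total"),
  ("2018-08-26", 3, "xyy"),
  ("2018-08-26", 2, "xxx")
).toDF("expected_date", "count", "Downstream")

// The comparison yields a boolean column; ascending order puts false before true,
// so the "Grand Total" row is sorted to the end without a helper column.
val result = df.orderBy((col("Downstream") === "Grand Total").asc)
result.show(false)
```

Note that sorting on this single key does not guarantee the relative order of the remaining rows. If that order matters, add a second sort key, for example orderBy((col("Downstream") === "Grand Total").asc, col("Downstream").asc).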