In PySpark, how do I take multiple columns and make the names one column and the values another column?

Asked Aug 23 '23 at 16:26

Active Aug 23 '23 at 16:26

Viewed 15 times

I have a PySpark DataFrame that looks like this:

Date	Bought	Delivered	Returned
07/05/2021	1054	1036	13
07/06/2021	2036	2015	21

I need it to ultimately look like this:

Date	Step	Total
07/05/2021	Bought	1054
07/05/2021	Delivered	1036
07/05/2021	Returned	13
07/06/2021	Bought	2036
07/06/2021	Delivered	2015
07/06/2021	Returned	21

So far I have built a separate table for each column and then unioned them at the end, but this takes too much time and I was hoping to find a better way. Since the data is spread across multiple columns, explode isn't what I am looking for, but that's all I can find here on Stack Overflow. Any help would be appreciated! :)

asked Aug 23 '23 at 16:26

rch_frnds

unpivot: https://stackoverflow.com/questions/60211169/how-to-unpivot-a-large-spark-dataframe/60212279#60212279 – Vitaliy Aug 23 '23 at 17:06
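A minimal sketch of the unpivot approach that comment points to, using Spark SQL's `stack` function through `selectExpr`. The helper name `stack_expr`, the `local[1]` session, and the hard-coded sample rows are assumptions for illustration; the column names `Step` and `Total` come from the desired output in the question. `stack` needs all unpivoted columns to share a compatible type, which holds here because the three counts are all integers.

```python
def stack_expr(value_cols, name_col="Step", value_col="Total"):
    """Build a Spark SQL stack() expression that turns each column in
    value_cols into one (column name, column value) row."""
    # Each pair is: a string literal with the column name, then the column itself.
    pairs = ", ".join(f"'{c}', `{c}`" for c in value_cols)
    return f"stack({len(value_cols)}, {pairs}) as ({name_col}, {value_col})"


def main():
    # Imported here so the helper above can be used without a Spark session.
    from pyspark.sql import SparkSession

    # Hypothetical local session; in a real job the session usually already exists.
    spark = (
        SparkSession.builder.master("local[1]")
        .appName("unpivot-sketch")
        .getOrCreate()
    )

    # Sample rows copied from the question.
    df = spark.createDataFrame(
        [("07/05/2021", 1054, 1036, 13), ("07/06/2021", 2036, 2015, 21)],
        ["Date", "Bought", "Delivered", "Returned"],
    )

    # Keep Date as-is and unpivot the remaining three columns in one pass,
    # with no per-column tables and no union.
    long_df = df.selectExpr("Date", stack_expr(["Bought", "Delivered", "Returned"]))
    long_df.show()

    spark.stop()


if __name__ == "__main__":
    main()
```

On Spark 3.4 or later, `DataFrame.unpivot` (alias `melt`) should give the same result without building a SQL string, for example `df.unpivot("Date", ["Bought", "Delivered", "Returned"], "Step", "Total")`.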


0 Answers