DataFrame: transform one column per day into rows

Asked Nov 08 '22 at 15:21

Active Nov 08 '22 at 15:24

Viewed 14 times

I have a csv that is formatted similar to the following:

Attr1	Attr2	10/1/22	10/2/22	10/3/22	etc.
Red	Square	5	10	12	0
Blue	Square	11	8	2	1
Red	Circle	1	12	3	4
Blue	Circle	3	5	7	6

I can load this into a dataframe, but I want to get it into this format:

Attr1	Attr2	Date	Qty
Red	Square	10/1/22	5
Red	Square	10/2/22	10
Red	Square	10/3/22	12
etc.	.	.	.
etc.	.	.	.
Blue	Circle	10/1/22	3
Blue	Circle	10/2/22	5
Blue	Circle	10/3/22	7

Issues:

The number of columns is variable (one per day) and increases each day.
I want to "explode" the date columns into one row per day while keeping the "attribute" columns.

This is a reformatting issue. No aggregation or calculation is needed.

Any ideas how to proceed? Thank you.
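To make the intended reshape concrete, here is a minimal sketch in plain Python, with no dataframe library. It assumes the first two columns are always the attribute columns and that every remaining column is a date, so it keeps working as new day columns are added. The tab delimiter, the `wide_to_long` name, and the sample data are illustrative assumptions, not part of the real file.

```python
import csv
import io

# Sample data mirroring the layout in the question (assumed tab-separated):
# two attribute columns followed by one column per day.
WIDE = (
    "Attr1\tAttr2\t10/1/22\t10/2/22\t10/3/22\n"
    "Red\tSquare\t5\t10\t12\n"
    "Blue\tSquare\t11\t8\t2\n"
    "Red\tCircle\t1\t12\t3\n"
    "Blue\tCircle\t3\t5\t7\n"
)


def wide_to_long(lines, id_count=2, delimiter="\t"):
    """Yield a header row, then one [attrs..., date, qty] row per date cell."""
    reader = csv.reader(lines, delimiter=delimiter)
    header = next(reader)
    dates = header[id_count:]  # every column after the attributes is a date
    yield header[:id_count] + ["Date", "Qty"]
    for row in reader:
        ids = row[:id_count]
        # Pair each date label with the quantity in the same position.
        for date, qty in zip(dates, row[id_count:]):
            yield ids + [date, int(qty)]


long_rows = list(wide_to_long(io.StringIO(WIDE)))
for row in long_rows[:4]:
    print(row)
```

Because the date columns are taken from the header at read time, nothing has to change when tomorrow's column appears.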

edited Nov 08 '22 at 15:24

Patrick Flynn

2

Does this answer your question? [How to melt Spark DataFrame?](https://stackoverflow.com/questions/41670103/how-to-melt-spark-dataframe) Or, if you are using PySpark 3.2+ (not sure of the exact version), you can check the pandas API on Spark: https://spark.apache.org/docs/3.2.0/api/python/reference/pyspark.pandas/api/pyspark.pandas.DataFrame.melt.html – Emma Nov 08 '22 at 15:42
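As a rough illustration of the `melt` approach suggested in this comment, the sketch below uses plain pandas rather than Spark. The comma-separated sample data and the variable names are assumptions made for the example. Leaving `value_vars` unset makes `melt` treat every non-attribute column as a date column, so the number of day columns can vary.

```python
import io

import pandas as pd

# Assumed sample data in the same shape as the question's file.
csv_text = """Attr1,Attr2,10/1/22,10/2/22,10/3/22
Red,Square,5,10,12
Blue,Square,11,8,2
Red,Circle,1,12,3
Blue,Circle,3,5,7
"""
wide = pd.read_csv(io.StringIO(csv_text))

id_cols = ["Attr1", "Attr2"]  # columns to keep on every output row

long = (
    # With value_vars omitted, all columns other than id_cols are unpivoted.
    wide.melt(id_vars=id_cols, var_name="Date", value_name="Qty", ignore_index=False)
    # melt stacks one date column at a time; a stable sort on the original
    # row index regroups the output by input row, keeping the dates in order.
    .sort_index(kind="mergesort")
    .reset_index(drop=True)
)
print(long)
```

The linked `pyspark.pandas.DataFrame.melt` page documents the same basic `id_vars`, `var_name`, and `value_name` arguments, so a similar call should carry over to the pandas API on Spark, though this sketch only exercises plain pandas.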

0 Answers