
Let's say I have a DataFrame like this:

+---+----+------+
|id |name|salary|
+---+----+------+
|10 |abc |100   |
+---+----+------+

I would like to pivot/transpose the data so that the output looks like this:

+--------+----+
|col_name|data|
+--------+----+
|id      |10  |
|name    |abc |
|salary  |100 |
+--------+----+

How would I do this using PySpark?

2 Answers


You can use stack as follows:

from pyspark.sql.functions import col, expr

# Build one ('column name', `column`) pair per column for stack
s = ','.join([f"'{i}', `{i}`" for i in df.columns])
# Cast every column to string so all values fit in a single output column
df = df.select([col(i).cast('string') for i in df.columns])
df.select(expr(f'''stack({len(df.columns)},{s})''')).show()

+------+----+
|  col0|col1|
+------+----+
|    id|  10|
|  name| abc|
|salary| 100|
+------+----+
Shubham Jain

I'm not aware of a built-in Spark function that transposes a DataFrame directly. You could use expr with stack, as in the other answer, or do something similar manually.

jayrythium