
How can I transpose a Spark DataFrame like this?

From:

Key Value
Key1 Value1
Key2 Value2
Key3 Value3

To:

Key1 Key2 Key3
Value1 Value2 Value3

Thanks!

Robert Purtuc

2 Answers


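You can apply a pivot operation to transpose rows into columns. Calling groupBy() with no columns puts every row into a single group, so the result is one row with one column per distinct key.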


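from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Reuse the active session (or create one) so the snippet runs on its own.
spark = SparkSession.builder.getOrCreate()

data = [("Key1", "Value1"),
        ("Key2", "Value2"),
        ("Key3", "Value3")]

df = spark.createDataFrame(data, ("Key", "Value"))

# pivot("Key") creates one column per distinct key;
# F.first("Value") fills it with that key's value.
df.groupBy().pivot("Key").agg(F.first("Value")).show()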

"""
+------+------+------+
|  Key1|  Key2|  Key3|
+------+------+------+
|Value1|Value2|Value3|
+------+------+------+
"""
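
If it helps to see what the pivot is doing, here is a minimal plain-Python sketch of the same idea, with no Spark involved. The helper name pivot_first is hypothetical and only for illustration. It assumes all rows belong to one group (as with an empty groupBy()) and keeps the first value seen per key; as far as I know Spark sorts the pivoted columns by key when no explicit values list is passed, so the sketch sorts too. Note that in real Spark, first() without an ordering is not guaranteed to be deterministic when a key appears more than once.

```python
def pivot_first(rows):
    """Turn (key, value) rows into one row with a column per key.

    Rough model of groupBy().pivot(key).agg(first(value)): every row is in
    the same group, and the first value encountered for each key is kept.
    """
    result = {}
    for key, value in rows:
        # setdefault only stores the value if the key is not present yet,
        # which gives "first value wins" behaviour.
        result.setdefault(key, value)
    # Assumption: columns come out sorted by key, as Spark appears to do
    # when pivot() is called without an explicit list of values.
    return dict(sorted(result.items()))


rows = [("Key1", "Value1"), ("Key2", "Value2"), ("Key3", "Value3")]
transposed = pivot_first(rows)

print(list(transposed.keys()))    # ['Key1', 'Key2', 'Key3']
print(list(transposed.values()))  # ['Value1', 'Value2', 'Value3']
```

The keys of the returned dict play the role of the new column names and the values are the single output row.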
Nithish

Use pyspark.sql.GroupedData.pivot:

df = spark.createDataFrame([('key1','value1'),('key2','value2'),('key3','value3')], ['key', 'value'])

import pyspark.sql.functions as F

df.groupBy().pivot('key').agg(F.first('value')).show()

or, using the dict form of agg:

df.groupBy().pivot('key').agg({"value":"first"}).show()

+------+------+------+
|  key1|  key2|  key3|
+------+------+------+
|value1|value2|value3|
+------+------+------+
David דודו Markovitz