How can I transpose a Spark DataFrame like this?
From:
Key | Value |
---|---|
Key1 | Value1 |
Key2 | Value2 |
Key3 | Value3 |
To:
Key1 | Key2 | Key3 |
---|---|---|
Value1 | Value2 | Value3 |
Thanks!
You can apply a pivot operation to transpose rows into columns.
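from pyspark.sql import functions as F
# Assumes an active SparkSession named `spark`
data = [("Key1", "Value1"),
        ("Key2", "Value2"),
        ("Key3", "Value3")]
df = spark.createDataFrame(data, ("Key", "Value"))
# Group everything into a single row, then turn each distinct Key into a column
df.groupBy().pivot("Key").agg(F.first("Value")).show()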
"""
+------+------+------+
| Key1| Key2| Key3|
+------+------+------+
|Value1|Value2|Value3|
+------+------+------+
"""
import pyspark.sql.functions as F
df = spark.createDataFrame([('key1','value1'),('key2','value2'),('key3','value3')], ['key', 'value'])
df.groupBy().pivot('key').agg(F.first('value')).show()
# or, using a dict-style aggregation:
df.groupBy().pivot('key').agg({"value":"first"}).show()
+------+------+------+
| key1| key2| key3|
+------+------+------+
|value1|value2|value3|
+------+------+------+