Transpose operation on pyspark data frame using python

Question

I am new to Python and PySpark. In pandas I have done a transpose with df.T, but I have found no direct equivalent for a PySpark DataFrame (PySpark version 2.2.0, Python version 3.6.2).

I am loading a CSV file for the above operation using the following code:

    from pyspark.sql import SQLContext

    sql = SQLContext(spark_context)
    path = 'sample.csv'
    df = (sql.read.format("com.databricks.spark.csv")
          .option("header", "true")
          .option("inferSchema", "true")
          .load(path))

Possible duplicate of [Transpose column to row with Spark](https://stackoverflow.com/questions/37864222/transpose-column-to-row-with-spark) — Abdou, Aug 10 '17 at 12:00

score 0 · Answer 1 · answered Aug 10 '17 at 19:15

What is your data schema?

If it is some sort of sparse matrix, you can load it as a regular RDD and use map to swap the row and column coordinates of each entry.
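As a rough sketch of that idea, assuming the matrix is stored as (row, col, value) tuples (the question does not give the schema, so this layout is an assumption), the swap could look something like the code below. The names swap_coords, transpose_entries and demo are hypothetical helpers made up for illustration; only parallelize, map and collect are standard RDD calls.

```python
def swap_coords(entry):
    """Swap the row and column index of one (row, col, value) entry."""
    row, col, value = entry
    return (col, row, value)


def transpose_entries(entries_rdd):
    """Transpose a sparse matrix held as an RDD of (row, col, value) tuples.

    Assumes `entries_rdd` is a PySpark RDD; only its `map` method is used.
    """
    return entries_rdd.map(swap_coords)


def demo(spark_context):
    """Hypothetical usage; `spark_context` is an existing SparkContext."""
    # Two non-zero entries of a small sparse matrix (made-up sample data).
    entries = spark_context.parallelize([(0, 1, 5.0), (2, 0, 3.0)])
    return transpose_entries(entries).collect()
```

This only covers the coordinate-list case. A dense DataFrame with named columns would need a different approach, such as the one in the linked duplicate question.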

answered Aug 10 '17 at 19:15

Michel Lemay
