
I am new to Python and PySpark. I have done a transpose operation in pandas using df.T, but I have found that there is no direct equivalent for a PySpark DataFrame (PySpark version 2.2.0, Python version 3.6.2).

I am loading the CSV file for this operation with the following code:

    from pyspark.sql import SQLContext

    sql = SQLContext(spark_context)
    path = 'sample.csv'
    df = (sql.read.format("com.databricks.spark.csv")
          .option("header", "true")
          .option("inferSchema", "true")
          .load(path))

Sunil Rao

1 Answer


What is your data schema?

If it is some sort of sparse matrix, you can load it as a regular RDD, then map over the entries and swap the row and column coordinates of each one.
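As a rough sketch of the map-and-swap idea, assuming the file holds one matrix entry per line in the form `row,col,value` with no header (that layout is a guess, since the schema has not been posted). The helper names `parse_entry`, `swap_coords`, and `transpose_entries` are made up for illustration and are not part of PySpark:

```python
def parse_entry(line):
    """Parse one 'row,col,value' text line into a (row, col, value) tuple.

    Assumed layout: integer row index, integer column index, numeric value.
    """
    row, col, value = line.strip().split(",")
    return (int(row), int(col), float(value))


def swap_coords(entry):
    """Transpose a single sparse entry by swapping its row and column."""
    row, col, value = entry
    return (col, row, value)


def transpose_entries(entries):
    """Apply swap_coords to every entry.

    `entries` only needs a .map() method, so a PySpark RDD of
    (row, col, value) tuples works here.
    """
    return entries.map(swap_coords)
```

With the `spark_context` from the question, this could be used roughly as `transpose_entries(spark_context.textFile(path).map(parse_entry))`. Each entry is handled independently, so the swap runs as a plain `map` with no shuffle. For a dense table with named columns, this coordinate form would first have to be built from the DataFrame.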

Michel Lemay