How to display a pivoted dataframe with PySpark?

Question

I cannot display/show/print a pivoted dataframe with PySpark. Although the dataframe seems to have been pivoted, when I try to use show() on it, it says AttributeError: 'GroupedData' object has no attribute 'show'.

Here's the code

meterdata = sqlContext.read.format("com.databricks.spark.csv").option("delimiter", ",").option("header", "false").load("/CBIES/meters/")

metercols = meterdata.groupBy("C0").pivot("C1")
metercols.show()  


Output:

Traceback (most recent call last):
  File "/tmp/zeppelin_pyspark-8003809301447367155.py", line 239, in
    eval(compiledCode)
  File " ", line 1, in
AttributeError: 'GroupedData' object has no attribute 'show'

score 4 · Answer 1 · answered Jan 27 '17 at 13:21

The pivot() method returns a GroupedData object, just like groupBy(). You cannot call show() on a GroupedData object; you first have to apply an aggregate function (such as sum() or even count()) to it, which returns a DataFrame that can be shown.

See this article or the PySpark documentation for more info.
