
I cannot display a pivoted DataFrame in PySpark. The DataFrame appears to have been pivoted, but calling show() on the result raises AttributeError: 'GroupedData' object has no attribute 'show'.

Here's the code:

meterdata = sqlContext.read.format("com.databricks.spark.csv").option("delimiter", ",").option("header", "false").load("/CBIES/meters/")

metercols = meterdata.groupBy("C0").pivot("C1")
metercols.show()  


Output:

Traceback (most recent call last):
  File "/tmp/zeppelin_pyspark-8003809301447367155.py", line 239, in
    eval(compiledCode)
  File " ", line 1, in
AttributeError: 'GroupedData' object has no attribute 'show'
zero323
Aidan Condron

1 Answer


The pivot() method returns a GroupedData object, just like groupBy(). GroupedData has no show() method; you must first apply an aggregate function (such as sum() or even count()), which turns it back into a DataFrame that can be shown.

See this article or the PySpark documentation for more info.

devict