Using df.rdd.getNumPartitions(), we can get the number of partitions. But how do we get the partitions themselves?
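For context, here is a minimal sketch of what I mean (the local master setting and the range DataFrame are just for illustration):

```python
from pyspark.sql import SparkSession

# Illustrative setup; any DataFrame behaves the same way.
spark = SparkSession.builder.master("local[4]").appName("partitions").getOrCreate()
df = spark.range(0, 100)  # a trivial DataFrame with a single 'id' column

# This tells me *how many* partitions there are...
print(df.rdd.getNumPartitions())  # e.g. 4 under local[4]

# ...but I can't find a corresponding API that tells me *what* they are.
```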
I also tried to pick something up from the documentation and from all the attributes of a DataFrame (using dir(df)). However, I could not find any API that would give the partitions; repartition, coalesce, and getNumPartitions were all I could find.
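This is roughly how I searched those attributes (a sketch; I simply filtered the names):

```python
# List every DataFrame attribute whose name mentions partitioning.
print([name for name in dir(df) if "partition" in name.lower()])

# The RDD side has the counter, but nothing that lists the partitions either.
print([name for name in dir(df.rdd) if "partition" in name.lower()])
```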
I read this and deduced that Spark does not know the partitioning key(s). What confuses me is: if it does not know the partitioning key(s), and therefore does not know the partitions, how can it know their count? And if it can, how do I determine the partitions?
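The closest I have come is inspecting where each row lands at the RDD level, which shows the contents of the partitions but still nothing about any key (a sketch; glom() collects everything to the driver, so it is only sensible on small data):

```python
from pyspark.sql.functions import spark_partition_id

# Tag each row with the ID of the partition it currently lives in.
df.withColumn("pid", spark_partition_id()).show()

# Or pull back each partition's rows and count them (small data only!).
for i, rows in enumerate(df.rdd.glom().collect()):
    print(f"partition {i}: {len(rows)} rows")
```

This tells me which rows are in which partition, but not how Spark decided that, which is the gap I am asking about.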