
What is the difference between the two types of partitioning in Spark?

For example, I load a text file, toto.csv, from disk into a Spark cluster:

val text = sc.textFile("toto.csv", 100)

=> It splits my file into about 100 fragments without any "rules" (the second argument is only a suggested minimum number of partitions)
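
One way to check this could be to look at the loaded RDD directly. The sketch below is only an illustration under assumptions: the object name `TextFilePartitions`, the helper `describe`, the `local[2]` master, and the application name are all made up for a standalone run. It prints how many partitions `textFile` produced and whether the RDD carries a partitioner.

```scala
import org.apache.spark.{Partitioner, SparkConf, SparkContext}

object TextFilePartitions {
  // Hypothetical helper: returns the partition count and the partitioner (if any)
  // of the RDD created by textFile.
  def describe(sc: SparkContext, path: String, minPartitions: Int): (Int, Option[Partitioner]) = {
    // minPartitions is a suggested minimum; the real count depends on the input splits.
    val text = sc.textFile(path, minPartitions)
    (text.partitions.length, text.partitioner)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local master, so the sketch runs without a cluster.
    val conf = new SparkConf().setMaster("local[2]").setAppName("textfile-partitions")
    val sc = new SparkContext(conf)
    try {
      val (count, partitioner) = describe(sc, "toto.csv", 100)
      // A freshly loaded text file has no partitioner: lines are grouped by
      // their position in the file, not by any key.
      println(s"partitions = $count, partitioner = $partitioner")
    } finally {
      sc.stop()
    }
  }
}
```

If the partitioner comes back empty, that would match the idea of a split "without rules": the fragments follow the file layout rather than the content of the lines.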

After that, if I do

val partitioned = text.partitionBy(new HashPartitioner(100))

=> It "splits" my file into 100 partitions by key

Thanks for any confirmation or suggestions.
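
As far as I can tell, `partitionBy` is only defined on RDDs of key-value pairs, so the lines would first need a key. The sketch below is a guess at what that could look like, not a confirmed answer: the object name `HashPartitionSketch`, the helpers `keyOf` and `partitionByKey`, the choice of the first comma-separated field as the key, and the `local[2]` master are all assumptions.

```scala
import org.apache.spark.{HashPartitioner, SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object HashPartitionSketch {
  // Assumption: the key is the first comma-separated field of each line.
  def keyOf(line: String): String = line.split(",", -1)(0)

  // Turns each line into a (key, line) pair, then redistributes the pairs so that
  // records with equal keys end up in the same partition.
  def partitionByKey(text: RDD[String], numPartitions: Int): RDD[(String, String)] =
    text.map(line => (keyOf(line), line))
      .partitionBy(new HashPartitioner(numPartitions))

  def main(args: Array[String]): Unit = {
    // Assumption: a local master, so the sketch runs without a cluster.
    val conf = new SparkConf().setMaster("local[2]").setAppName("hash-partition-sketch")
    val sc = new SparkContext(conf)
    try {
      val text = sc.textFile("toto.csv", 100)
      val partitioned = partitionByKey(text, 100)
      // The result has exactly the requested number of partitions and
      // remembers the partitioner that produced it.
      println(s"partitions = ${partitioned.partitions.length}")
      println(s"partitioner = ${partitioned.partitioner}")
      // The partitioner maps a key to a partition index derived from its hash code.
      println(new HashPartitioner(100).getPartition("some-key"))
    } finally {
      sc.stop()
    }
  }
}
```

The difference from the first step would then be that the target partition is computed from each record's key, which requires a shuffle, instead of from the record's position in the file.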

minh-hieu.pham
See chapter 4 for a detailed explanation: https://www.safaribooksonline.com/library/view/learning-spark/9781449359034/ch04.html – GameOfThrows Feb 22 '16 at 12:07
