I have a small data set, shown below:
panda,0
pink,3
pirate,3
panda,1
pink,4
panda = sc.textFile("/FileStore/tables/ehg1wksx1496214578178/Panda.txt")
new_pand = panda.map(lambda x: tuple(x.split(",")))
new = panda.sortByKey(ascending=True, numPartitions=None, keyfunc=lambda x: str(x))
I build an RDD from this file and then try to sort it, but calling sortByKey() gives me the error below:
File "/databricks/spark/python/pyspark/rdd.py", line 1751, in add_shuffle_key
for k, v in iterator:
ValueError: too many values to unpack
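
One possible reading of the traceback, offered as an assumption rather than a confirmed diagnosis: sortByKey() is called on panda (the raw lines) rather than on new_pand (the tuples), so the `for k, v in iterator` loop tries to unpack a whole string such as "panda,0" into two names. The sketch below reproduces that behaviour in plain Python, without Spark. The names `lines`, `pairs`, and `unpack_pairs` are hypothetical and exist only for this illustration.

```python
# Plain-Python sketch (no Spark); the rows are copied from the data set in the question.
lines = ["panda,0", "pink,3", "pirate,3", "panda,1", "pink,4"]


def unpack_pairs(records):
    # Hypothetical stand-in for the `for k, v in iterator` loop named in the traceback.
    return [(k, v) for k, v in records]


# Raw strings: each record is a multi-character string, so unpacking into (k, v) fails.
try:
    unpack_pairs(lines)
    raw_ok = True
except ValueError:
    raw_ok = False

# Split records: each record is a 2-tuple, so unpacking works and sorting by key is possible.
pairs = [tuple(line.split(",")) for line in lines]
sorted_pairs = sorted(unpack_pairs(pairs), key=lambda kv: str(kv[0]))

print(raw_ok)
print(sorted_pairs)
```

If this reading is right, calling new_pand.sortByKey(...) instead of panda.sortByKey(...) would be the first thing to try.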