Hello, could someone explain to me why mapPartitions
reacts differently to these two functions? (I have looked at this thread, and I don't think my problem comes from the fact that my iterable is TraversableOnce,
as I create it myself.)
L=range(10)
J=range(5,15)
K=range(8,18)
data=J+K+L
def function_1(iter_listoflist):
    final_iterator=[]
    for sublist in iter_listoflist:
        final_iterator.append([x for x in sublist if x%9!=0])
    return iter(final_iterator)
def function_2(iter_listoflist):
    final_iterator=[]
    listoflist=list(iter_listoflist)
    for i in range(len(listoflist)):
        for j in range(i+1,len(listoflist)):
            sublist=listoflist[i]+listoflist[j]
            final_iterator.append([x for x in sublist if x%9!=0])
    return iter(final_iterator)
sc.parallelize(data,3).glom().mapPartitions(function_1).collect()
returns what it should, while
sc.parallelize(data,3).glom().mapPartitions(function_2).collect()
returns an empty array. I have checked the code by returning a list at the end, and it does what I want it to.
Thanks for your help,
Philippe C
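
One possible explanation, sketched locally without Spark: glom() coalesces each partition into a single list, so the iterator that mapPartitions hands to the function would yield exactly one sublist per partition. With only one sublist, the pair loop in function_2 has no (i, j) pair to visit and returns nothing. The sketch below assumes that behaviour; the `partitions` variable is a hypothetical stand-in for the three glommed partitions (an even split of the 30 elements), and Spark's actual slicing may differ.

```python
# Local simulation without Spark. Assumption: after glom(), each partition
# holds a single element (the list of that partition's items), so the
# function receives an iterator that yields exactly one sublist.

def function_1(iter_listoflist):
    # Filter each sublist independently.
    return iter([[x for x in sublist if x % 9 != 0]
                 for sublist in iter_listoflist])

def function_2(iter_listoflist):
    # Filter the concatenation of every pair (i, j) of sublists, i < j.
    final = []
    listoflist = list(iter_listoflist)
    for i in range(len(listoflist)):
        for j in range(i + 1, len(listoflist)):
            sublist = listoflist[i] + listoflist[j]
            final.append([x for x in sublist if x % 9 != 0])
    return iter(final)

data = list(range(5, 15)) + list(range(8, 18)) + list(range(10))

# Hypothetical stand-in for the three glommed partitions.
partitions = [data[0:10], data[10:20], data[20:30]]

# Per-partition calls: each call sees an iterator over ONE list.
result_1 = [out for part in partitions for out in function_1(iter([part]))]
result_2 = [out for part in partitions for out in function_2(iter([part]))]

# Plain local test: all three sublists are visible in a single call.
result_local = list(function_2(iter(partitions)))

print(len(result_1), len(result_2), len(result_local))
```

Under these assumptions, function_1 yields one filtered list per partition, function_2 yields nothing per partition, and function_2 only produces pairs when it sees all the sublists at once, as in a local test on a list of lists. If the intent is to pair sublists across partitions, the pairing would have to happen where all of them are visible together, not inside a per-partition function.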