Let's say I have a Spark RDD and need to process it:
rdd.mapPartitionsWithIndex { (index, iter) =>
  // element type assumed to be Int for illustration; bodies are placeholders
  def someFunc(xs: Seq[Int]): Seq[Int] = ???
  def anotherFunc(i: Int, xs: Seq[Int], ys: Seq[Int]): Seq[Int] = ???
  val elems = iter.toSeq  // an Iterator is single-pass, so materialize it before using it twice
  val x = someFunc(elems)
  val y = anotherFunc(index, elems, x)
  (x zip y).iterator      // mapPartitionsWithIndex must return an Iterator
}
I define someFunc and anotherFunc inside the mapPartitionsWithIndex closure because I don't want to define them in the driver and then serialize them to the workers. It works, but I can't test them because they're nested functions. How do I test this? I need to write test cases for those functions. Right now the functions happen to be serializable, but what if a function is not serializable and can't be sent from the driver to a worker?
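One pattern that addresses both concerns is to move the helpers into a top-level object. This is only a sketch: PartitionFuncs, the Int element type, and the method bodies are placeholders I made up, not anything from the question. A method on a top-level Scala object is reached through a static reference, so the closure does not capture any driver-side instance; each executor loads the object's class from the application jar instead of receiving a serialized copy of it.

import org.apache.spark.rdd.RDD

// hypothetical helper object with placeholder logic
object PartitionFuncs {
  def someFunc(xs: Seq[Int]): Seq[Int] =
    xs.map(_ + 1)  // placeholder: increment each element

  def anotherFunc(index: Int, xs: Seq[Int], ys: Seq[Int]): Seq[Int] =
    xs.zip(ys).map { case (a, b) => a + b * index }  // placeholder: combine using the partition index
}

def process(rdd: RDD[Int]): RDD[(Int, Int)] =
  rdd.mapPartitionsWithIndex { (index, iter) =>
    val elems = iter.toSeq  // materialize: the iterator is needed by both helpers
    val x = PartitionFuncs.someFunc(elems)
    val y = PartitionFuncs.anotherFunc(index, elems, x)
    (x zip y).iterator
  }

This also covers the serialization worry: because the helpers are reached via a static reference rather than a captured object, nothing non-serializable has to travel from driver to worker. If a helper needs a genuinely non-serializable resource (a database connection, a parser, ...), construct it inside the mapPartitionsWithIndex block so it is created on the executor.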
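The helpers are now plain functions over plain collections, so they can be unit tested without a SparkContext at all. A minimal ScalaTest sketch, with expected values matching the placeholder bodies above:

import org.scalatest.funsuite.AnyFunSuite

class PartitionFuncsTest extends AnyFunSuite {
  test("someFunc increments every element") {
    assert(PartitionFuncs.someFunc(Seq(1, 2, 3)) == Seq(2, 3, 4))
  }
  test("anotherFunc combines using the partition index") {
    // with index = 2: Seq(1 + 3*2, 2 + 4*2) == Seq(7, 10)
    assert(PartitionFuncs.anotherFunc(2, Seq(1, 2), Seq(3, 4)) == Seq(7, 10))
  }
}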