My platform is Spark 2.1.0 on an 8-node cluster, and I am using Python.
I have about 100 random forest multiclass classification models saved in HDFS. There are also 100 datasets saved in HDFS, one per model. I want to run the predictions for all datasets in parallel, each dataset using its corresponding model.
I use a loop to iterate over the 100 datasets. In each iteration, I load the corresponding model and use it to predict on that dataset. However, the total run time shows that the iterations are executed one after another, not in parallel.
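For reference, the loop described above looks roughly like the sketch below. The HDFS paths, the Parquet format, the helper names (`model_path`, `data_path`, `predict_all`), and the use of `pyspark.ml` (rather than `pyspark.mllib`) are all assumptions made for illustration, not the exact code.

```python
# Sketch of the sequential loop; every path and name here is hypothetical.

def model_path(i, base="hdfs:///models"):
    # Hypothetical layout: one saved model directory per index.
    return "{}/rf_{}".format(base, i)


def data_path(i, base="hdfs:///datasets"):
    # Hypothetical layout: one dataset directory per index.
    return "{}/data_{}".format(base, i)


def predict_all(spark, n=100):
    # Imported here so the path helpers above can be used without Spark.
    from pyspark.ml.classification import RandomForestClassificationModel

    for i in range(n):
        # Load the model that belongs to dataset i.
        model = RandomForestClassificationModel.load(model_path(i))
        # Assumes the datasets are stored as Parquet with a features column.
        df = spark.read.parquet(data_path(i))
        predictions = model.transform(df)
        # The write is an action and blocks the driver until it finishes,
        # so iteration i + 1 does not start before iteration i is done.
        predictions.write.mode("overwrite").parquet(data_path(i) + "_pred")
```

Each iteration submits its Spark job from the same driver thread, which would explain why the datasets are processed one at a time.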
I do not know how to make these predictions run in parallel. How can I do this?
Thanks!