I have created a class for analysing a specific type of data that I produce. I normally use this class on a local computer, but occasionally there is too much data to process locally, so I wanted to add an option to one of its methods so that it can submit the job to a computer cluster. It mostly works, except that I am struggling to transfer a method of the class to the cluster.
My class looks like this:

    class Analysis():
        def __init__(self, INPUT_PARAMETERS, ETC):
            self.data = ...
            # OTHER_STUFF...

        @staticmethod
        def staticMethod1(input1, input2):
            # PERFORM SOME KIND OF CALCULATION ON INPUT1 AND INPUT2 AND RETURN THE RESULT
            return output

        @staticmethod
        def staticMethod2(input1, input2):
            # PERFORM SOME KIND OF CALCULATION ON INPUT1 AND INPUT2 AND RETURN THE RESULT
            return output

        # MORE STATIC METHODS

        @staticmethod
        def staticMethodN(input1, input2):
            # PERFORM SOME KIND OF CALCULATION ON INPUT1 AND INPUT2 AND RETURN THE RESULT
            return output

        def createArray(self, function):
            # CREATE AN ARRAY BY APPLYING FUNCTION TO SELF.DATA
            return array
So `createArray` gets called and the user passes in the static method that should be used to calculate the array. When I want the array in `createArray` to be created on the cluster, I save the static method that was passed to this method (e.g. `staticMethod1`) into a pickle file using `dill.dump`. The pickle file is then passed to the cluster, but when I try to load the method from the pickle file it fails with `ModuleNotFoundError: No module named 'analysis'`; `analysis` is the module that the `Analysis` class is defined in.
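To make the failure easier to reproduce, here is a minimal sketch of what I think is happening. It uses the standard-library `pickle` rather than `dill`, on the assumption that `dill` also stores a function by reference (module name plus qualified name) when that function lives in an importable module. The module name `analysis_demo` and the body of `staticMethod1` are hypothetical stand-ins for my real code, and deleting the module from `sys.modules` is only meant to imitate a cluster node that does not have `analysis.py`.

```python
import pickle
import sys
import types

# Build a throwaway module in memory; "analysis_demo" is a hypothetical
# stand-in for my real analysis.py.
module = types.ModuleType("analysis_demo")
exec(
    "class Analysis:\n"
    "    @staticmethod\n"
    "    def staticMethod1(input1, input2):\n"
    "        return input1 + input2\n",
    module.__dict__,
)
sys.modules["analysis_demo"] = module

# "Local" side: pickle the static method that would be passed to createArray.
payload = pickle.dumps(module.Analysis.staticMethod1, protocol=4)

# The payload holds only the module name and qualified name, not the code.
print(b"analysis_demo" in payload)
print(b"Analysis.staticMethod1" in payload)

# "Cluster" side: imitate a machine where the module cannot be imported.
del sys.modules["analysis_demo"]
try:
    pickle.loads(payload)
    error = None
except ModuleNotFoundError as exc:
    error = str(exc)

print(error)
```

If that assumption about `dill` holds, the pickle file never contained the function body in the first place, which would explain why loading it tries to import the module.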
Do I really need to recreate the whole class on the cluster just to use a static method? Can anyone suggest an elegant fix for this problem, or a better way of implementing this functionality? It needs to work with any static method. FYI, one of the static methods uses `from sklearn.metrics.cluster import adjusted_rand_score`, just in case that affects a solution using `dill`.