I have a list in which I collect ML models as dictionaries, each built from a dataclass as follows:
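from dataclasses import asdict, dataclass

import pandas as pd


@dataclass(frozen=True, order=True)
class Model:
    data_sample: str
    predictive_model: object
    predictions: pd.DataFrame
    binary: object
    type: str
    inputs: list
    output: str
    explain: bool

    def to_dict(self):
        # Convert the dataclass instance into a plain dictionary.
        return asdict(self)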
I produce multiple models and use the dataclass to validate the inputs for each trained model. I then convert each instance to a dictionary and append it to a list named ML:
ML.append(model.to_dict())
The objects stored in binary and predictive_model are models (Python classes) that come from libraries such as scikit-learn, TPOT, SciPy and so on. One should assume that these objects involve a lot of inheritance.
I am struggling to make this list portable to another environment. My core idea is to use a library such as joblib, dill or pickle to .dump the list in the runtime that trains the models, and then use the matching .load method to load it in the other environment. When I do this, I get a ModuleNotFoundError: No module named ... error. I found that this is a common problem and that there are answers about this error here: Python pickling after changing a module's directory
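To illustrate what I think is happening, here is a minimal sketch that reproduces the error with only the standard library. The module name fake_ml_lib and the class Estimator are hypothetical stand-ins for a real ML library and its model class; removing the module from sys.modules is my way of simulating a target environment where that library is not installed.

```python
import pickle
import sys
import types

# Hypothetical stand-in for a third-party ML library (not a real package).
fake_lib = types.ModuleType("fake_ml_lib")
Estimator = type("Estimator", (), {"__module__": "fake_ml_lib"})
fake_lib.Estimator = Estimator
sys.modules["fake_ml_lib"] = fake_lib

# A record shaped like one entry of the ML list, reduced to two keys.
model = Estimator()
model.coef = 1.5
record = {"type": "demo", "predictive_model": model}

# Training runtime: the library is importable, so dumping works.
payload = pickle.dumps(record)

# The payload stores the class as a module name plus a class name,
# not the class's source code.
assert b"fake_ml_lib" in payload and b"Estimator" in payload

# Simulated target runtime: the library is missing.
del sys.modules["fake_ml_lib"]
try:
    pickle.loads(payload)
    error = None
except ModuleNotFoundError as exc:
    error = exc

print(type(error).__name__, error)
```

As far as I understand, this means the pickle file only carries the instance state plus a reference to where the class lives, so the loading side has to be able to import the same module under the same name.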
My question is: is there a better way to "export" my list of dictionaries? Ideally, the export would bundle everything it needs, so that I can run it elsewhere without having to manage any imports.
I get the feeling that pickling might not be what I need.