I'm trying to save my optimized Gaussian process model for use in a different script. My current plan is to store the model information in a JSON file using GPy's built-in to_dict and from_dict functions, something along the lines of:

import GPy
import numpy as np
import json

X = np.random.uniform(-3.,3.,(20,1))
Y = np.sin(X) + np.random.randn(20,1)*0.05
kernel = GPy.kern.RBF(input_dim=1, variance=1., lengthscale=1.)

m = GPy.models.GPRegression(X, Y, kernel)

m.optimize(messages=True)
m.optimize_restarts(num_restarts = 10)

jt = json.dumps(m.to_dict(save_data=False), indent=4)
with open("j-test.json", 'w') as file:
    file.write(jt)

This step works with no issues, but I run into problems when I try to load the model information using:

with open("j-test.json", 'r') as file:
    d = json.load(file)  # d is a dictionary

m2 = GPy.models.GPClassification.from_dict(d, data=None)

which gives me an assertion error because "data is not None", which it is, or at least I think so.

I'm really new to GPy and to working with JSON, so I'm not sure where I've gone astray. I looked through the documentation, but it is a bit vague and I couldn't find an example of from_dict in use. Is there a step or concept that I missed? Also, is this the best way to store and reload my model? Any help with this would be greatly appreciated! Thanks!

gehbiszumeis
Bocephus85

2 Answers

The module pickle is your friend here!

import pickle
with open('save.pkl', 'wb') as file:
    pickle.dump(m, file)

You can load it back in a later script with:

with open('save.pkl', 'rb') as file:
    loaded_model = pickle.load(file)
houseofleft
  • Do I need to worry about cross-platform use of the pickled data since they are binary files? Some people in my group use Macs and some use Windows. I'd like the files to be usable by both parties without having to natively create the data for each since it might take up to several hours in the end to do so. – Bocephus85 Oct 27 '20 at 16:05
  • I haven't used pickle files across Mac and Windows before, but they should be cross-compatible as long as you open the file in binary mode both times: https://stackoverflow.com/questions/1849523/is-pickle-file-of-python-cross-platform – houseofleft Oct 28 '20 at 17:57
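
On the cross-platform point raised in the comments, here is a minimal sketch using only the standard library: open the file in binary mode for both writing and reading, and pin an explicit pickle protocol so that older Python 3 versions can still read the file. The `state` dict and the file name are hypothetical stand-ins for the GPy model `m`; the same `pickle.dump`/`pickle.load` calls should apply to the model itself, assuming the same GPy version is installed on both machines.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for the trained model (in practice this would be `m`).
state = {"lengthscale": 1.3, "variance": 0.8}

path = os.path.join(tempfile.gettempdir(), "save_demo.pkl")

# Binary mode ('wb') avoids newline translation on Windows.
# Protocol 4 can be read by Python 3.4 and later.
with open(path, "wb") as f:
    pickle.dump(state, f, protocol=4)

# Read it back, again in binary mode ('rb').
with open(path, "rb") as f:
    loaded_state = pickle.load(f)

print(loaded_state)
```

The file itself is just bytes, so copying it between a Mac and a Windows machine should not change it; the more likely source of trouble is a mismatch in Python or library versions between the two.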

Pickle is not the recommended method for this. See here, in the section towards the end. The example given there is:

import GPy
import numpy as np

# Let X, Y be the data loaded above
# Model creation:
m = GPy.models.GPRegression(X, Y)
m.optimize()
# 1: Saving a model:
np.save('model_save.npy', m.param_array)
# 2: loading a model
# Model creation, without initialization:
m_load = GPy.models.GPRegression(X, Y, initialize=False)
m_load.update_model(False) # do not call the underlying expensive algebra on load
m_load.initialize_parameter() # Initialize the parameters (connect the parameters up)
m_load[:] = np.load('model_save.npy') # Load the parameters
m_load.update_model(True) # Call the algebra only once
print(m_load)
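
Note that the loading step above still needs the original X and Y to rebuild the model, so a different script must have access to them as well. One option, sketched here with plain NumPy and stand-in arrays, is to bundle the data and the parameters in a single `.npz` file. The `params` values and the file name are made up for illustration; in practice you would store `m.param_array` there.

```python
import os
import tempfile

import numpy as np

# Stand-ins: in the real script these would be X, Y and m.param_array.
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, (20, 1))
Y = np.sin(X) + rng.normal(0.0, 0.05, (20, 1))
params = np.array([1.0, 1.0, 0.0025])  # hypothetical parameter values

# Save the data and the parameters together in one file.
path = os.path.join(tempfile.gettempdir(), "model_bundle.npz")
np.savez(path, X=X, Y=Y, params=params)

# In the second script: recover everything needed to rebuild the model.
with np.load(path) as bundle:
    X_loaded = bundle["X"]
    Y_loaded = bundle["Y"]
    params_loaded = bundle["params"]

print(X_loaded.shape, Y_loaded.shape, params_loaded)
```

The loaded arrays could then be passed to `GPy.models.GPRegression(X_loaded, Y_loaded, initialize=False)`, with `m_load[:] = params_loaded` in place of the `np.load('model_save.npy')` line, following the same steps as above.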