I'm trying to write an integration test that uses the descriptive statistics (.describe().to_list()) of the results of a model prediction (model.predict(X)). However, even though I've set np.random.seed(###), the descriptive statistics differ between running the test in the Python console and running it in the test environment created by PyCharm.
Here's an MRE for the console:
from sklearn.linear_model import ElasticNet
from sklearn.datasets import make_regression
import numpy as np
import pandas as pd
np.random.seed(42)
X, y = make_regression(n_features=2, random_state=42)
regr = ElasticNet(random_state=42)
regr.fit(X, y)
pred = regr.predict(X)
# Theory: this result should be the same as the result produced inside a test class
pd.Series(pred).describe().to_list()
And an example test-file:
from unittest import TestCase
from sklearn.linear_model import ElasticNet
from sklearn.datasets import make_regression
import numpy as np
import pandas as pd
np.random.seed(42)
class TestPD(TestCase):
    def testExpectedPrediction(self):
        np.random.seed(42)
        X, y = make_regression(n_features=2, random_state=42)
        regr = ElasticNet(random_state=42)
        regr.fit(X, y)
        pred = pd.Series(regr.predict(X))
        for i in pred.describe().to_list():
            print(i)
            # here we would have a self.assertTrue/assertEqual for each element
What appears to happen is that when I run this test in the Python Console, I get one result, but when I run it using PyCharm's unittest runner for the folder, I get another. Importantly, in PyCharm the project interpreter is used to create the console environment, so it ought to be the same as the test environment. This leads me to believe that I'm missing something about the way random_state is passed along.
My expectation is that, given that I have set a seed, the results would be reproducible. That doesn't appear to be the case, and I would like to understand:
- Why aren't they equal?
- What can I do to make them equal?
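To check my assumption that the console and the test runner really share one environment, a small script like the one below could be run in both places and the output compared. This is only a minimal sketch: environment_fingerprint is a hypothetical helper name I made up, the package list is my guess at what matters here, and it uses only the standard library.

```python
import platform
import sys
from importlib import metadata


def environment_fingerprint(packages=("numpy", "scipy", "scikit-learn", "pandas")):
    """Collect interpreter and package versions so two runs can be compared.

    Hypothetical helper for illustration; the default package list is an assumption.
    """
    info = {
        "executable": sys.executable,          # which interpreter is running
        "python": platform.python_version(),   # interpreter version string
    }
    for name in packages:
        try:
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            info[name] = "not installed"
    return info


if __name__ == "__main__":
    for key, value in environment_fingerprint().items():
        print(f"{key}: {value}")
```

If the two printouts differ (for example a different interpreter path or a different scikit-learn version), the environments are not actually the same, which would be one possible explanation to rule out first.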
I haven't been able to find a lot of best practices with respect to testing against expected model results. So commentary in that regard would also be helpful.
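For instance, one option I'm considering is to compare each statistic against a stored expected value within a tolerance rather than exactly, in case the differences are only floating-point noise. A minimal sketch of that idea, using only the standard library; all_close is a hypothetical helper, and the tolerances are arbitrary assumptions, not recommended values:

```python
import math


def all_close(actual, expected, rel_tol=1e-6, abs_tol=1e-9):
    """Return True if both sequences have equal length and match element-wise
    within the given tolerances. Hypothetical helper; tolerances are assumptions."""
    if len(actual) != len(expected):
        return False
    return all(
        math.isclose(a, e, rel_tol=rel_tol, abs_tol=abs_tol)
        for a, e in zip(actual, expected)
    )
```

Inside the test this might be used as self.assertTrue(all_close(pred.describe().to_list(), expected)), where expected is a list of values recorded from a trusted run. I don't know whether this is considered good practice, or whether it would merely hide a real difference between the two runs.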