I tried to run a numerical code on a cluster, but the results from my computer and from the cluster were different, and the computation on the cluster tended to fail. I thought the problem was with the software, so I installed Miniconda on the cluster with the same version as on my computer. However, the same issue showed up.
The results of the more complicated computation were verified across different computers: they were numerically consistent and matched the analytic calculation. But the cluster, using the same Python packages, gave different results and tended to fail.
Here is a minimal example:
from scipy import integrate
import numpy as np

# Note: a and b are read as module-level globals inside integrand.
def integrand(theta, t, theta_1x, theta_2x):
    return np.sin(t/2)*np.sin(theta)*np.cos(t/2)**2/np.sqrt(t-b+a)/np.sqrt(b+a-t)*np.sin(theta_1x)/theta_1x/theta_2x

# Inner integral over theta, from bx-ax up to t.
def theta_integral(t, ax, bx):
    return integrate.quad(integrand, bx-ax, t, args=(t, ax, bx))[0]

b = 1.5
a = 0.1
# Outer integral over t, from b-a to b+a.
integral_result_temp = integrate.quad(theta_integral, b-a, b+a, args=(a, b))
print(integral_result_temp)
The computer
Python 3.9.12 (main, Apr 4 2022, 05:22:27) [MSC v.1916 64 bit (AMD64)] :: Anaconda, Inc. on win32
Type "help", "copyright", "credits" or "license" for more information.
The cluster
Python 3.9.12 (main, Jun 1 2022, 11:38:51)
[GCC 7.5.0] :: Anaconda, Inc. on linux
Both machines use the same versions of the packages:
import numpy as np
print(np.__version__)
import scipy
print(scipy.__version__)
This prints, on both machines:
1.21.5
1.7.3
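Matching version numbers do not show how the packages were built, so it may help to record a bit more about each environment and compare the output from the two machines. The following is only a sketch: the helper name `describe_environment` is made up for illustration, and it assumes nothing beyond the standard library, numpy, and scipy.

```python
import platform
import sys

import numpy as np
import scipy


def describe_environment():
    """Collect interpreter, platform, and package details in a dict.

    Hypothetical helper for illustration; not part of numpy or scipy.
    """
    return {
        "python": sys.version,
        "compiler": platform.python_compiler(),
        "platform": platform.platform(),
        "machine": platform.machine(),
        "numpy": np.__version__,
        "scipy": scipy.__version__,
        "float_eps": float(np.finfo(float).eps),
    }


# Run this on both machines and compare the printed lines.
for key, value in describe_environment().items():
    print(f"{key}: {value}")

# np.show_config() additionally prints numpy's build configuration.
```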
The computer provided a result of
(0.07410654702360654, 1.34140350493972e-08)
whereas the cluster provided a result of
(0.07410654255620497, 1.1830009194468971e-11)
As you can see, the two results already differ, and the difference is larger than the error estimate reported on the cluster.
In a more complicated version of the integration, the computer was able to provide the correct result, whereas the cluster generally failed.
How is this possible, and how can I fix it?
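The size of the discrepancy can be checked against the two reported error estimates with a few lines of arithmetic. This sketch uses only the numbers printed above; the variable names are chosen here for illustration.

```python
# Values copied from the two runs above: (result, estimated absolute error).
computer_value, computer_err = 0.07410654702360654, 1.34140350493972e-08
cluster_value, cluster_err = 0.07410654255620497, 1.1830009194468971e-11

# Absolute difference between the two results.
difference = abs(computer_value - cluster_value)

print(f"difference          = {difference:.3e}")
print(f"computer error est. = {computer_err:.3e}")
print(f"cluster error est.  = {cluster_err:.3e}")
print("within computer estimate:", difference <= computer_err)
print("within cluster estimate: ", difference <= cluster_err)
```

The difference is about 4.5e-9. That is inside the computer's error estimate but well outside the cluster's.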
Related: What's the difference between Python built by MSC and Python built by GCC?