I am looking for a way to compute the correlation of two gridded time series. Both have the same shape, (432, 55, 144), i.e. (time steps, latitude, longitude). As the following picture shows, I already managed to get a two-dimensional array of all the correlation coefficients with:
import numpy as np

if data1.shape == data2.shape:
    # One correlation coefficient per (latitude, longitude) grid cell
    corrcoefMatrix = [[0 for i in range(len(longitudes))] for j in range(len(latitudes))]
    for x in range(len(latitudes)):
        for y in range(len(longitudes)):
            # 2x2 correlation matrix of the two time series at this cell
            corrvalue = np.corrcoef(data1[:, x, y], data2[:, x, y])
            corrcoefMatrix[x][y] = corrvalue[0, 1]
    corrcoefMatrix = np.squeeze(np.asarray(corrcoefMatrix))
However, some NaNs cause the white missing-value spots. Even if only one value is missing in the 432-step time series, the correlation coefficient for that grid cell is NaN. According to this post, pandas seems to be the best choice. However, it only accepts two-dimensional arrays, so I transformed my data using Jarad's answer from this post:
df1 = pd.DataFrame([list(l) for l in data1]).stack().apply(pd.Series).reset_index(0,drop=True)
df2 = pd.DataFrame([list(l) for l in data2]).stack().apply(pd.Series).reset_index(0,drop=True)
and then called df1.corrwith(df2). This gave me only a one-dimensional array of length 144, not the 55x144 array I want. There must be a fairly simple way, since correlations with missing values are needed quite often, but either it is not well documented or I just cannot find it.
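For reference, here is a minimal sketch of the kind of result I am after, written with NumPy only. It assumes that "ignoring missing values" means pairwise deletion, i.e. a time step is dropped for a grid cell whenever either series is NaN there. The function name `nan_corr_grid` and the random test data are hypothetical and only serve as illustration; this is not a library routine.

```python
import warnings

import numpy as np


def nan_corr_grid(a, b):
    """Pearson correlation along axis 0 for every grid cell.

    Hypothetical helper: time steps where either input is NaN are
    ignored for that cell (pairwise deletion).
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("inputs must have the same shape")

    # Mask a time step in both series if it is missing in either one
    valid = np.isfinite(a) & np.isfinite(b)
    a = np.where(valid, a, np.nan)
    b = np.where(valid, b, np.nan)

    with warnings.catch_warnings():
        # Cells without any valid pair trigger "Mean of empty slice"
        warnings.simplefilter("ignore", category=RuntimeWarning)
        # Anomalies relative to the mean over the valid time steps
        da = a - np.nanmean(a, axis=0)
        db = b - np.nanmean(b, axis=0)

    cov = np.nansum(da * db, axis=0)
    norm = np.sqrt(np.nansum(da ** 2, axis=0) * np.nansum(db ** 2, axis=0))

    # Cells with no valid pairs (or zero variance) end up as NaN
    with np.errstate(invalid="ignore", divide="ignore"):
        return cov / norm


# Illustrative data with the same shape as in the question
rng = np.random.default_rng(0)
data1 = rng.normal(size=(432, 55, 144))
data2 = data1 + rng.normal(size=(432, 55, 144))
data1[10, 3, 7] = np.nan  # a single missing value in one time series

corr = nan_corr_grid(data1, data2)
print(corr.shape)  # (55, 144)
```

With this, a cell containing a single NaN gets a coefficient computed from the remaining 431 time steps instead of NaN, and the result keeps the (latitude, longitude) layout without any reshaping.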