Converting a MATLAB running average to Python gives unexpected results

Question

I have this MATLAB code, which computes a running average:

as = movmean(std_new1,PTA);

Here is `as` as computed in MATLAB:

as = [NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 0.0311573, 0.03135, 0.0315315, 0.0317018, 0.0318609, 0.0320087, 0.0321454, 0.0322708, 0.0323851, 0.0324881, 0.0325799, 0.0326605, 0.0329592, 0.0334758, 0.0342104, 0.0351631, 0.0363338, 0.0377224, 0.0393291, 0.0411538, 0.0431965, 0.0454572, 0.0473395, 0.0488433, 0.0499687, 0.0507156, 0.051084, 0.051074, 0.0506856, 0.0499187, 0.0487733, 0.0472495, 0.0456993, 0.0441228, 0.0425198, 0.0408905, 0.0392348, 0.0375527, 0.0358443, 0.0341094, 0.0323482, 0.0305606, 0.0290992, 0.0279639, 0.0271548, 0.0266719, 0.0265151, 0.0266844, 0.02718, 0.0280016, 0.0291495, 0.0306235, 0.0319449, 0.0331137, 0.03413, 0.0349937, 0.0357048, 0.0362634, 0.0366693, 0.0369227, 0.0370235, 0.0369717, 0.0369048, 0.0368227, 0.0367255, 0.0366131, 0.0364856, 0.036343, 0.0361852, 0.0360122, 0.0358241, 0.0356209, 0.03539, 0.0351316, 0.0348455, 0.0345318, 0.0341905, 0.0338216, 0.0334251, 0.033001, 0.0325493, 0.0320699, 0.0315601, 0.0310198, 0.030449, 0.0298477, 0.029216, 0.0285537, 0.027861, 0.0271378, 0.0263841, 0.0255999, 0.0248585, 0.02416, 0.0235044, 0.0228916, 0.0223217, 0.0217946, 0.0213104, 0.020869, 0.0204704, 0.0201148, 0.0198367, 0.0196361, 0.0195132, 0.0194679, 0.0195001, 0.0196099, 0.0197973, 0.0200623, 0.0204049, 0.0208251, 0.0211917, 0.0215047, 0.0217641, 0.02197, 0.0221224, 0.0222211, 0.022429, 0.0226666, 0.0229393, 0.0232537, 0.0235459, 0.0238112, 0.0240434, 0.0242341, 0.0243722]

I need to do the same operation in Python. I have tried essentially every solution proposed in "Moving average or running mean", but none of them gives the correct result on my data.

This is std_new1:

std_new1 = np.array([np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, 0.0223287, 0.023921, 0.0255133, 0.0271056, 0.0286979, 0.0302902, 0.0318825, 0.0334747, 0.035067, 0.0366593, 0.0382516, 0.0370447, 0.0358378, 0.0346309, 0.0334241, 0.0322172, 0.0310103, 0.0298034, 0.0285965, 0.0273896, 0.0261827, 0.0275508, 0.028919, 0.0302871, 0.0316552, 0.0330233, 0.0343914, 0.0357595, 0.0371276, 0.0384957, 0.0398638, 0.0430172, 0.0461705, 0.0493238, 0.0524771, 0.0556305, 0.0587838, 0.0619371, 0.0650904, 0.0682438, 0.0713971, 0.0651962, 0.0589952, 0.0527943, 0.0465933, 0.0403924, 0.0341914, 0.0279905, 0.0217896, 0.0155886, 0.00938767, 0.0120134, 0.0146392, 0.017265, 0.0198907, 0.0225165, 0.0251422, 0.027768, 0.0303938, 0.0330195, 0.0356453, 0.0359675, 0.0362898, 0.036612, 0.0369342, 0.0372565, 0.0375787, 0.037901, 0.0382232, 0.0385454, 0.0388677, 0.0384419, 0.0380161, 0.0375903, 0.0371645, 0.0367387, 0.0363129, 0.0358871, 0.0354613, 0.0350355, 0.0346097, 0.034629, 0.0346483, 0.0346676, 0.034687, 0.0347063, 0.0347256, 0.034745, 0.0347643, 0.0347836, 0.034803, 0.033825, 0.032847, 0.0318689, 0.0308909, 0.0299129, 0.0289349, 0.0279569, 0.0269789, 0.0260009, 0.0250229, 0.0244325, 0.0238421, 0.0232518, 0.0226614, 0.022071, 0.0214806, 0.0208903, 0.0202999, 0.0197095, 0.0191191, 0.0189982, 0.0188772, 0.0187562, 0.0186352, 0.0185142, 0.0183932, 0.0182722, 0.0181512, 0.0180302, 0.0179092, 0.0188705, 0.0198318, 0.0207932, 0.0217545, 0.0227158, 0.0236771, 0.0246384, 0.0255997, 0.026561, 0.0275224, 0.02633, 0.0251377, 0.0239454, 0.022753, 0.0215607, 0.0203684])

This is PTA (a 1x1 matrix):

PTA = np.array([20])

The following one, for example,

AS = [np.mean(std_new1[x:x + PTA[0]]) for x in range(len(std_new1) - PTA[0] + 1)]

gives me almost the same result, but there are fewer NaN values at the beginning and numeric values are missing at the end.

This is AS as computed in Python:

[       nan        nan        nan        nan        nan        nan
        nan        nan        nan 0.03115731 0.03135002 0.03153151
 0.03170179 0.03186087 0.03200873 0.03214538 0.03227083 0.03238507
 0.0324881  0.03257992 0.03266053 0.03295915 0.03347579 0.03421044
 0.03516309 0.03633375 0.03772243 0.03932911 0.04115381 0.04319652
 0.04545724 0.04733951 0.04884332 0.04996868 0.05071558 0.05108404
 0.05107404 0.05068559 0.04991868 0.04877333 0.04724952 0.04569933
 0.04412277 0.04251983 0.04089051 0.03923481 0.03755273 0.03584427
 0.03410944 0.03234823 0.03056064 0.0290992  0.02796393 0.02715482
 0.02667186 0.02651507 0.02668443 0.02717996 0.02800164 0.02914948
 0.03062348 0.0319449  0.03311375 0.03413001 0.0349937  0.03570481
 0.03626335 0.0366693  0.03692268 0.03702348 0.0369717  0.03690477
 0.0368227  0.03672548 0.03661312 0.03648561 0.03634296 0.03618516
 0.03601221 0.03582412 0.03562089 0.03539004 0.03513158 0.03484552
 0.03453184 0.03419055 0.03382164 0.03342514 0.03300102 0.03254929
 0.03206995 0.03156012 0.03101981 0.03044902 0.02984774 0.02921597
 0.02855372 0.02786099 0.02713777 0.02638407 0.02559987 0.02485853
 0.02416004 0.02350441 0.02289162 0.02232169 0.0217946  0.02131037
 0.02086898 0.02047045 0.02011476 0.01983666 0.01963615 0.01951322
 0.01946787 0.01950011 0.01960994 0.01979734 0.02006233 0.02040491
 0.02082507 0.02119166 0.02150469 0.02176414 0.02197004 0.02212236
 0.02222112]

This most likely has to do with the following, see [`movmean`](https://mathworks.com/help/matlab/ref/movmean.html): "`'shrink'`: Shrink the window size near the endpoints of the input to include only existing elements." In other words: you need to shrink the window size appropriately at the edges of your vector. Also note that you didn't pass a `nanflag` to MATLAB, so the output for any window containing a `NaN` value (so the first 19 points in your case) will be NaN. With this question, as with your previous ones: read the MATLAB documentation carefully. A lot of defaults apply under the hood. – Adriaan, Jun 11 '20 at 11:38
So are you saying that the issue is in how `movmean` was used in the MATLAB code? Would that mean my Python code is correct? – claw91, Jun 11 '20 at 12:09
What "correct" is, is something only you can determine. What I'm saying is that MATLAB built-in functions have a whole lot of options one can set, and each and every one of those has a default (check the docs on each function). Naively implementing something in Python and expecting it to do the same is then obviously not going to work. Hence my comment: read the MATLAB docs on each function you translate, and be sure to translate all the desired functionality and defaults. To be honest: I'd never translate code, but rather start fresh from scratch to avoid this kind of hassle. – Adriaan, Jun 11 '20 at 13:17
I didn't mean to be rude, apologies. What I meant is: since you're the one using the application, you're the one to determine what to do with NaN values and whether to "shrink" the window size at the edges of the array. Either way can be correct, it just depends on the application. Given that you say the MATLAB code worked for you, I presume that you deem it correct and thus need to copy the endpoint shrinking and NaN handling that MATLAB does. – Adriaan, Jun 11 '20 at 13:48
Yes, we're assuming the MATLAB code shown and its result are OK. Given that, I want to reproduce exactly what it does using Python. However, I cannot seem to reproduce the same values, regardless of how many implementations of a moving mean I've tried so far. I guess, as you said, what MATLAB does (even if its code shows only the two arguments) is much more complicated than it seems. There must be some hidden default parameter that I'm missing in Python, hence the different results. – claw91, Jun 12 '20 at 09:30
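
The defaults discussed in the comments can be sketched in NumPy as follows. This is only a minimal sketch, not MATLAB's actual implementation. It assumes the documented `movmean` behaviour: a centered window which, for even `k`, covers `k/2` elements before the current one and `k/2 - 1` after it, the `'shrink'` endpoint rule, and the default NaN handling where any NaN in a window makes that output NaN. The function name `movmean_like` is made up for illustration.

```python
import numpy as np


def movmean_like(a, k):
    """Centered moving mean that shrinks the window at the edges.

    Hypothetical helper, assuming MATLAB's documented movmean defaults:
    'shrink' endpoints and NaN propagation (no NaN is skipped).
    """
    a = np.asarray(a, dtype=float)
    n = a.size
    back = k // 2        # elements before the current one (k/2 for even k)
    fwd = (k - 1) // 2   # elements after the current one (k/2 - 1 for even k)
    out = np.empty(n)
    for i in range(n):
        lo = max(0, i - back)      # clip the window at the start
        hi = min(n, i + fwd + 1)   # clip the window at the end
        out[i] = a[lo:hi].mean()   # plain mean, so a NaN in the window gives NaN
    return out


# Small demo with one leading NaN and an even window length.
demo = np.array([np.nan, 1.0, 2.0, 3.0, 4.0, 5.0])
print(movmean_like(demo, 4))  # -> [nan nan nan 2.5 3.5 4.0]
```

With the arrays from the question, the call would be `movmean_like(std_new1, int(PTA[0]))`. Under the assumptions above, the output has the same length as the input. The 9 leading NaNs plus a window reaching 10 elements back give 19 leading NaNs, and the last values are means over shortened windows. All of this is consistent with the MATLAB output shown. The list comprehension in the question uses only full-length, forward-looking windows, which would explain both the shift and the missing values at the end.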
