The simplest NumPy-only approach, which does much less work than convolution and is therefore likely to be faster than filter-based methods, is to reshape your original array into one with extra dimensions, then reduce it back to two dimensions by summing over the new axes:
>>> arr = np.arange(108).reshape(9, 12)
>>> rows, cols = arr.shape
>>> arr.reshape(rows//3, 3, cols//3, 3).sum(axis=(1, 3))
array([[117, 144, 171, 198],
       [441, 468, 495, 522],
       [765, 792, 819, 846]])
If you wanted the mean, you would simply divide the resulting array by the number of elements in each block (here 3 * 3 = 9):
>>> arr.reshape(rows//3, 3, cols//3, 3).sum(axis=(1, 3)) / 9
array([[ 13.,  16.,  19.,  22.],
       [ 49.,  52.,  55.,  58.],
       [ 85.,  88.,  91.,  94.]])
This method only works if each dimension of your array is a multiple of the block size (3 here); otherwise the reshape raises a ValueError.
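If the shape is not an exact multiple, one option is to trim the trailing rows and columns before reshaping. The following is a minimal sketch of that idea, not part of the original approach: the helper name `block_sum` and the decision to drop the leftover edge (rather than pad it) are assumptions made for illustration.

```python
import numpy as np

def block_sum(arr, block=3):
    """Sum non-overlapping block x block tiles of a 2-D array.

    Hypothetical helper: rows/columns that do not fill a complete
    tile are dropped (an assumption; padding is another option).
    """
    rows, cols = arr.shape
    # Trim to the largest shape that is a multiple of the block size.
    trimmed = arr[:rows - rows % block, :cols - cols % block]
    r, c = trimmed.shape
    # Split each axis into (number of tiles, block) and sum within tiles.
    return trimmed.reshape(r // block, block, c // block, block).sum(axis=(1, 3))

arr = np.arange(108).reshape(9, 12)
sums = block_sum(arr)          # same result as the reshape above
means = sums / 3**2            # divide by elements per tile for the mean
```

With a 10 x 13 input, this sketch would silently ignore the last row and column, so check whether that is acceptable for your data before using it.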