NumPy broadcasting from 4D using 3D

Question

I am trying to broadcast values from 4D matrix to 3D matrix using index values/locations from another 3D matrix. Unfortunately, the results are not what I expect -- either not working or getting wrong dimensions.

Namely, I have a 4D matrix with dimensions (time,level,lat,lon). Based on maximum values along the level (axis 1), I am trying to reduce my 4D matrix to 3D matrix with dimensions (time,lat,lon).

I tried to reduce the dimensions by selecting only the first time-moment, getting 3D and 2D matrices respectively, but the broadcasting gives me wrong dimensions -- answer will be 4D matrix again.

Here is a simplified example:

#!/usr/bin/env ipython
import numpy as np
# ----------------------
nx = 10
ny = 10
ntime = 20
nlevs = 30
# =============================================
np.random.seed(10);
datain_a = np.random.random((ntime,nlevs,ny,nx)); # generate data A
datain_b = np.random.random((ntime,nlevs,ny,nx)); # generate data B
# ---------------------------------------------
calc_smt = np.abs(np.diff(datain_a,axis=1)/np.diff(datain_b,axis=1)) # calculate some ratio between two matrices
# ---------------------------------------------
calc_a = np.nanmax(calc_smt,axis=1) # find the maximum ratio at every gridcell -- answer has dimensions (time,lat,lon)

ind_out = np.argmax(calc_smt,axis=1) # location of maximum ratio 
# ---------------------------------------------------------------------------------
# Broadcasting attempts: 
calc_b = datain_b[ind_out] # Get an error with axis 26 is out of bounds... # NOT WORKING

calc_b = datain_b[ind_out[:,np.newaxis,:,:]] # still an error: IndexError: index 26 is out of bounds for axis 0 with size 20  # NOT WORKING
# ---------------------------------------------------------------------------
# let us try 1st time moment:
dd_a = ind_out[0,:,:]
dd_b = datain_b[0,:,:,:]

smt = dd_b[dd_a] # getting something with dimensions (10,10,10,10)  # NOT WORKING?
# ---------------------------------------------------------------------------
# This is the output I expect:
correct_output = np.zeros((ntime,ny,nx));
for itime in range(ntime):
    for jj in range(ny):
        for ii in range(nx):
            correct_output[itime,jj,ii] = datain_b[itime,ind_out[itime,jj,ii],jj,ii]
# ------------------------------------------------------------------------------
# How to get the same without 3 loops?

I would like to get an answer with dimensions (time,lat,lon) instead of not getting one at all.

Does this work `datain_b[np.arange(ntime)[:,None],ind_out]`. — hpaulj, Jun 16 '23 at 15:17
correction, because `ind_out` has shape (20,10,10), the indexing has to include nx,ny ranges, `M=datain_b[np.arange(ntime)[:,None,None],ind_out,np.arange(nx)[:,None],np.arange(ny)]`. That's what `ogrid` produces. Those 4 indexing arrays broadcast together. — hpaulj, Jun 16 '23 at 17:54

Thomas Wagenaar · Accepted Answer · 2023-06-16T17:00:08.677

1

Use np.ogrid:

I,J,K = np.ogrid[:ntime,:ny,:nx]
correct_output = datain_b[I,ind_out,J,K]

A good explanation is given in this answer.

edited Jun 16 '23 at 17:00

answered Jun 16 '23 at 16:39

Thomas Wagenaar

6,489
5
30
73

Thanks a lot for the help! Solution works! – msi_gerva Jun 17 '23 at 23:10

NumPy broadcasting from 4D using 3D

1 Answers1