I find it really hard to visualize reshaping 4D and 5D arrays in numpy/pytorch. (I assume both reshape in a similar pattern; I am currently using pytorch.)
For example, suppose I have videos with dimensions [N x C x D x H x W]
(num videos x channels video x frames video x height video x width video)
Suppose I want to reshape the videos into individual frames as [N*D x C x H x W]. How should I go about the reshape?
Simply applying x = x.reshape(N*D, C, H, W)
doesn't actually do it; it gives the wrong order of elements.
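To make the wrong ordering concrete, here is a minimal NumPy sketch. The tiny sizes and the index-encoding trick (each value stores its own video, channel and frame index as n*100 + c*10 + d) are assumptions chosen only for illustration; they are not part of my real data.

```python
import numpy as np

N, C, D, H, W = 2, 3, 4, 1, 1  # tiny illustrative sizes (assumed)

# Encode each element's (video, channel, frame) index as n*100 + c*10 + d.
n, c, d = np.meshgrid(np.arange(N), np.arange(C), np.arange(D), indexing="ij")
x = (n * 100 + c * 10 + d).reshape(N, C, D, H, W)

# Direct reshape: memory order is untouched, only the shape changes.
naive = x.reshape(N * D, C, H, W)

# Row 0 is supposed to be frame 0 of video 0 across channels: [0, 10, 20].
first_row = naive[0, :, 0, 0].tolist()
print(first_row)  # [0, 1, 2]: three frames of channel 0 instead
```

The reshape just walks the existing memory in order, where D varies faster than C, so the "channel" axis of the result ends up holding consecutive frames of one channel.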
Can you help me with how to do this, and share any intuition for the pattern you used?
On a side note, if I have one video (i.e. suppose 1x3x100x256x256), I use the following approach:
x = x.squeeze(0).T.reshape((100,3,256,256))[:,:,None,:,:]
and it works great. I couldn't figure it out for more than one video.
Thanks!
As requested, here is for-loop based code showing what I want:
    import numpy as np
    N, C, D, H, W = 2, 3, 4, 5, 6  # example sizes
    input = np.random.randn(N,C,D,H,W)
    output = np.zeros((N*D,C,H,W))
    # frame i of video h should land at row h*D + i of the output
    for h in range(N):
        for i in range(D):
            for j in range(C):
                for k in range(H):
                    for l in range(W):
                        output[h*D + i,j,k,l] = input[h,j,i,k,l]
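The loop above can be sketched in vectorized form: first move the D axis next to N, then merge the two with a reshape. This is a minimal NumPy sketch with assumed example sizes; the names `video`, `expected` and `frames` are illustrative. In PyTorch, I believe the equivalent is `x.permute(0, 2, 1, 3, 4).reshape(N*D, C, H, W)`.

```python
import numpy as np

N, C, D, H, W = 2, 3, 4, 5, 6  # example sizes (assumed)
rng = np.random.default_rng(0)
video = rng.standard_normal((N, C, D, H, W))

# Reference result: same mapping as the nested loops, written per frame.
expected = np.zeros((N * D, C, H, W))
for h in range(N):
    for i in range(D):
        expected[h * D + i] = video[h, :, i]  # frame i of video h, shape (C, H, W)

# Vectorized: reorder axes to (N, D, C, H, W), then merge N and D.
frames = video.transpose(0, 2, 1, 3, 4).reshape(N * D, C, H, W)

matches = np.array_equal(frames, expected)
print(frames.shape, matches)
```

The intuition is that a reshape can only merge axes that are already adjacent and in the desired order, so any reordering (here, swapping C and D) has to happen before the reshape.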