I have two lists of shape (130, 64, 2048), call the dimensions (s, f, b), and one vector of length 64, call it v. I need to combine these two lists into a single list of shape (130, 2, 64, 2048) and multiply all 2048 values at index i along f by the i-th value of v.

The output array also needs to have shape (130, 2, 64, 2048).

Obviously these two steps can be done in either order. I want to know the most Pythonic way of doing something like this.

My main issue is that my code takes forever to turn the list into a numpy array, which is necessary for some of my calculations. I have:

    new_prof = np.asarray(new_prof)

but this seems to take too long for the size and shape of my list. Any thoughts as to how I could initialise this better?

The problem outlined above is shown by my attempt:

    # Converted data should have shape (130, 2, 64, 2048)
    converted_data = IQUV_to_AABB( data, basis = "cartesian" )

    new_converted = np.array((130, 2, 64, 2048))

    # I think s.shape is (2, 64, 2048) and cal_fa has length 64
    for i, s in enumerate( converted_data ):
        aa = np.dot( s[0], cal_fa )
        bb = np.dot( s[1], cal_fb )
        new_converted[i].append( (aa, bb) )

However, this code doesn't work, and I think it has something to do with the dot product.

I would also love to know why the process of changing my list to a numpy array is taking so long.

zhn11tau

1 Answer

Try to start small and look at the results in the console:

import numpy as np

x = np.arange(36)
print(x)

y = np.reshape(x, (3, 4, 3))
print(y)

# this is a vector of the same size as dimension 1
a = np.arange(4)
print(a)

# expand and let numpy's broadcasting do the rest
# https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html
# https://scipy.github.io/old-wiki/pages/EricsBroadcastingDoc
b = a[np.newaxis, :, np.newaxis]
print(b)

c = y * b
print(c)

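np.newaxis inserts a new axis of length 1 at the position where it appears in the index. Here that turns a from shape (4,) into shape (1, 4, 1), so it broadcasts against y along dimension 1. The NumPy indexing documentation covers it in more detail.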
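Applied to the shapes in the question, the same idea might look like the sketch below. The names first, second and v are placeholders I made up, the data is random, and the shapes are shrunk so the result is easy to inspect; with the real data you would use (130, 64, 2048).

```python
import numpy as np

# Small stand-in shapes (s, f, b); the question uses (130, 64, 2048).
s, f, b = 5, 4, 6
rng = np.random.default_rng(0)
first = rng.random((s, f, b))    # hypothetical name for the first input
second = rng.random((s, f, b))   # hypothetical name for the second input
v = np.arange(1, f + 1, dtype=float)  # one factor per index along f

# Stack along a new axis 1, giving shape (s, 2, f, b).
stacked = np.stack((first, second), axis=1)

# Reshape v to (1, 1, f, 1) so broadcasting lines it up with the f axis.
scaled = stacked * v[np.newaxis, np.newaxis, :, np.newaxis]

print(scaled.shape)
```

Because the multiplication is elementwise, it does not matter whether you stack first and scale afterwards or the other way round.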

Using numpy.append is rather slow, as it has to allocate new memory and copy the whole array each time. A numpy array is a contiguous block of memory.
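A way around that is to allocate the output once and fill it by index. The sketch below reworks the loop from the question under my own assumptions: converted stands in for the output of IQUV_to_AABB, the calibration vectors are random, and the shapes are shrunk. Note that np.array((130, 2, 64, 2048)) builds a 1-D array holding those four numbers, so np.empty with a shape tuple is used instead, and that np.dot of a (64, 2048) array with a length-64 vector raises a shape error, so the scaling is done by broadcasting.

```python
import numpy as np

# Stand-in data; the question's shapes would be n, f, b = 130, 64, 2048.
n, f, b = 5, 4, 6
rng = np.random.default_rng(1)
converted = rng.random((n, 2, f, b))  # placeholder for the converted data
cal_fa = rng.random(f)                # placeholder calibration factors
cal_fb = rng.random(f)

# Allocate the output once; np.empty takes the shape as a tuple.
out = np.empty((n, 2, f, b))

for i, sub in enumerate(converted):
    # sub has shape (2, f, b); scale row j of each half by factor j.
    out[i, 0] = sub[0] * cal_fa[:, np.newaxis]
    out[i, 1] = sub[1] * cal_fb[:, np.newaxis]

# Loop-free equivalent: stack the two vectors to shape (2, f) and broadcast.
cal = np.stack((cal_fa, cal_fb))
out_vec = converted * cal[np.newaxis, :, :, np.newaxis]

print(np.allclose(out, out_vec))
```

The loop version is only there to mirror the original attempt; the broadcast version does the same work in a single expression.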

You might have to use it if you run out of memory, but in that case try to iterate over chunks that are as large as your computer can still handle. Rearranging the dimensions is sometimes a way to speed up calculations.
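Chunked processing could be sketched roughly as follows. The chunk size of 4 and the array names are arbitrary choices of mine; in practice you would pick the chunk size to fit your available memory.

```python
import numpy as np

# Stand-in data; shapes are shrunk from the question's (130, 2, 64, 2048).
n, f, b = 10, 4, 6
rng = np.random.default_rng(2)
data = rng.random((n, 2, f, b))
v = rng.random(f)
factors = v[np.newaxis, np.newaxis, :, np.newaxis]  # shape (1, 1, f, 1)

chunk = 4  # hypothetical chunk size along the first axis
result = np.empty_like(data)

for start in range(0, n, chunk):
    stop = min(start + chunk, n)
    # Only this slice is scaled per iteration, so the temporary stays small.
    result[start:stop] = data[start:stop] * factors

print(result.shape)
```

Each iteration creates a temporary no larger than one chunk, which is the point of splitting the work along the first axis.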

Joe