Let's say that I have an array M and I want to get rid of some blocks of rows. Specifically, I want to do something similar to the following code (where bot_indices, top_indices and M are defined elsewhere in the code):

for i_bot, i_top in zip(bot_indices, top_indices):
    M_new = np.vstack((M_new, M[i_bot:i_top]))

The problem is that, in the first iteration of the for statement, M_new is not defined, so I get a NameError. To work around this, I thought of adding a try/except statement:

for i_bot, i_top in zip(bot_indices, top_indices):
    try:
        M_new = np.vstack((M_new, M[i_bot:i_top]))
    except NameError:
        M_new = M[i_bot:i_top]

Now the problem is that, if this block of code is nested inside another for or while statement, M_new already exists the second time the block runs, so it ends up containing a concatenation of all the arrays generated every time the block is run. In other words, I would need to "reset" M_new at the end of the block shown (maybe with a del(M_new)?).

What is the best way (in terms of readability, run time and code length) to tackle this problem?

desertnaut
xpius
  • Can you give some example data? – Nils Werner Nov 07 '21 at 15:54
  • Welcome to Stack Overflow. "The problem is that, in the first iteration of the for statement, M_new is not defined, so I will get a NameError" Maybe you can instead think of a value that you could assign for `M_new` before the loop, that makes the right thing happen the first time through the loop? Hint: if you were building an ordinary `list` by `.append`ing elements, what initial value would you use? An *empty* list (`[]`), right? So, maybe something similar exists here. Hint: what are the dimensions of `M[i_bot:i_top]`? What would the dimensions be if `i_bot == i_top`? – Karl Knechtel Nov 07 '21 at 15:54
  • @Karl, modeling an array iteration on lists is not a good idea. list append is in-place. Array 'append' makes a whole new array each time. – hpaulj Nov 07 '21 at 16:25
  • Collect the arrays in a list, and do just one `vstack` at the end. – hpaulj Nov 07 '21 at 16:27
  • @hpaulj True. I am trying to teach problem-solving techniques, though, not simply the appropriate in-context solution. – Karl Knechtel Nov 08 '21 at 11:39
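
The two suggestions in the comments above can be sketched as follows. This is only a minimal sketch on small made-up data: the function names `stack_blocks` and `stack_blocks_empty_start` are hypothetical and not part of the original post. The first collects the slices in a list and stacks once; the second starts from an empty array with zero rows, which works but copies the growing array on every iteration.

```python
import numpy as np

def stack_blocks(M, bot_indices, top_indices):
    """Collect the row blocks in a list and call vstack once at the end."""
    blocks = [M[i_bot:i_top] for i_bot, i_top in zip(bot_indices, top_indices)]
    # Note: np.vstack raises ValueError if `blocks` is empty.
    return np.vstack(blocks)

def stack_blocks_empty_start(M, bot_indices, top_indices):
    """Start from a (0, n_columns) array so the first vstack already works."""
    M_new = np.empty((0, M.shape[1]), dtype=M.dtype)
    for i_bot, i_top in zip(bot_indices, top_indices):
        # Each call allocates a new array and copies everything gathered so far.
        M_new = np.vstack((M_new, M[i_bot:i_top]))
    return M_new

# Made-up example data (assumption, not from the question).
M = np.arange(40).reshape(10, 4)
bot_indices = [0, 5]
top_indices = [2, 8]

a = stack_blocks(M, bot_indices, top_indices)
b = stack_blocks_empty_start(M, bot_indices, top_indices)
```

Because the result is built from scratch inside each function call, nothing carries over when the code is run again from an enclosing loop, so no `del` is needed.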

1 Answer

Instead of filtering M and appending the result to a new array, you want to do the filtering in one step:

filter_indices = np.array([0, 1, 2, 10, 11, 12, 13])
M[filter_indices]  # rows 0-2 and 10-13 of M, selected in a single indexing operation

You can use this method to build a single flat array of indices from your (bottom, top) pairs, and then filter M in one operation:

import numpy as np

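def create_ranges(a):
    # a: (n, 2) array of [start, stop) pairs; assumes every range is non-empty
    a = np.asarray(a)
    l = a[:,1] - a[:,0]                     # length of each range
    clens = l.cumsum()                      # output position just past the end of each range
    ids = np.ones(clens[-1],dtype=int)      # default step of +1 between consecutive indices
    ids[0] = a[0,0]                         # the output starts at the first range's start
    ids[clens[:-1]] = a[1:,0] - a[:-1,1]+1  # at each boundary, jump from the last index of one range to the next start
    out = ids.cumsum()                      # the cumulative sum turns the steps into indices
    return out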

M = np.random.rand(1024, 1024)
bot_indices = [10, 20, 60]
top_indices = [15, 30, 100]

limits = np.asarray([bot_indices, top_indices]).T

filter_indices = create_ranges(limits)
filter_indices
# array([10, 11, 12, 13, 14, 20, 21, 22, 23, 24, 25, ...])
M[filter_indices]
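
As a cross-check, the same index array can also be built less cleverly by concatenating one `np.arange` per range. This is only a sketch under the same example data: the helper name `ranges_to_indices` is hypothetical, and this variant still loops in Python over the ranges (not over the rows), so it may be slower than `create_ranges` when there are very many ranges.

```python
import numpy as np

def ranges_to_indices(bot_indices, top_indices):
    """Concatenate arange(bot, top) for every (bot, top) pair."""
    return np.concatenate(
        [np.arange(bot, top) for bot, top in zip(bot_indices, top_indices)]
    )

M = np.random.rand(1024, 1024)
bot_indices = [10, 20, 60]
top_indices = [15, 30, 100]

idx = ranges_to_indices(bot_indices, top_indices)

# Keep only the listed rows, in a single indexing operation.
selected = M[idx]

# Reference result built the slow way, for comparison.
looped = np.vstack([M[bot:top] for bot, top in zip(bot_indices, top_indices)])

# If the goal is instead to drop those rows, np.delete accepts the same indices.
remaining = np.delete(M, idx, axis=0)
```

Here `selected` matches the array that the loop in the question builds, while `remaining` holds every other row of `M`.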
Nils Werner
  • Thank you for the detailed answer, @Nils (I apologize for answering so late, but this code belongs to a personal project to which I could not dedicate much time this month). So, as far as I see, the idea is to avoid using `for` instances with `arrays`, as this slows down the code. Thank you! – xpius Nov 29 '21 at 13:15