I have a 2D array of 0s, 1s and 2s with a very large number of columns. I am trying to select only those rows whose runs of consecutive zeros do not exceed a certain length. My approach is to convert the array to characters, join the columns of each row into a single string, and then apply a regular-expression filter to each string. This is very slow, especially the conversion and the per-row join. Is there a way to make it faster by an order of magnitude? Maybe using another tactic altogether?
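import re
import numpy as np

n = 100    # number of rows
k = 1000   # number of columns
x = np.random.choice([0, 1, 2], replace=True, size=(n, k))

# Slow step: convert every element to str, then join each row into one string
s = np.apply_along_axis(lambda t: ''.join(t), 1, x.astype(str))

N_ramp = 3
# True for rows with no run of 1 to N_ramp zeros enclosed by non-zero values
mask = [re.search(r'[12]0{1,' + str(N_ramp) + r'}[12]', i) is None for i in s]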
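For comparison, here is a rough sketch of one non-regex tactic that skips the string conversion entirely: find where runs of zeros start and end with `np.diff`, then flag rows containing a run of 1 to `N_ramp` zeros that has a non-zero value on both sides. The function name `rows_without_short_zero_runs` is made up for illustration, and the sketch assumes a NumPy version in which `np.diff` accepts `prepend` and `append` (1.16 or later). It is meant to reproduce the regex mask above, but it has not been benchmarked here.

```python
import numpy as np

def rows_without_short_zero_runs(x, n_ramp):
    """Hypothetical helper: True for rows that contain no run of 1..n_ramp
    zeros enclosed by non-zero values (same rule as the regex mask)."""
    x = np.asarray(x)
    n_rows, n_cols = x.shape
    is_zero = (x == 0).astype(np.int8)
    # Pad each row with a non-zero marker on both sides and difference it:
    # +1 marks the first column of a zero run, -1 the column just past its end.
    d = np.diff(is_zero, axis=1, prepend=0, append=0)
    # np.nonzero scans row by row, so the i-th start pairs with the i-th end.
    rows, starts = np.nonzero(d == 1)
    _, ends = np.nonzero(d == -1)
    lengths = ends - starts
    # Runs touching the first or last column are not enclosed by [12] on both sides.
    interior = (starts > 0) & (ends < n_cols)
    short = interior & (lengths <= n_ramp)
    mask = np.ones(n_rows, dtype=bool)
    mask[rows[short]] = False
    return mask

demo = np.array([
    [1, 0, 0, 1, 2, 2],   # enclosed run of 2 zeros
    [0, 0, 1, 2, 2, 1],   # run touches the left edge only
    [1, 0, 0, 0, 1, 2],   # enclosed run of 3 zeros
    [1, 2, 0, 0, 0, 0],   # run touches the right edge only
    [1, 0, 2, 0, 0, 0],   # enclosed run of 1 zero
])
print(rows_without_short_zero_runs(demo, 2))
# [False  True  True  True False]
```

On the original data this would be called as `mask = rows_without_short_zero_runs(x, N_ramp)`, which returns a boolean array rather than a list.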