How to concatenate the outputs of numpy.ones and numpy.zeros into one array in Python?

Question

I want to create a 1D array that consists of alternating sets of ones and zeros defined by two input arrays. For example:

import numpy as np

In1 = np.array([2, 1, 3])
In2 = np.array([1, 1, 2])

Out1 = np.array([])

for idx in range(In1.size):
    Ones = np.ones(In1[idx])
    Zeros = np.zeros(In2[idx])

    Out1 = np.concatenate((Out1, Ones, Zeros))

print(Out1)
# [1. 1. 0. 1. 0. 1. 1. 1. 0. 0.]

Is there a more efficient way to do this that doesn't use a for loop?

score 3 · Answer 1 · answered Jul 02 '20 at 12:44


Using np.repeat:

(np.arange(1,1+In1.size+In2.size)&1).repeat(np.array([In1,In2]).reshape(-1,order="F"))
# array([1, 1, 0, 1, 0, 1, 1, 1, 0, 0])
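
The one-liner can be unpacked into three steps. This is only a sketch of the same idea on the question's example; the intermediate names `values` and `counts` are introduced here for illustration and do not appear in the answer.

```python
import numpy as np

In1 = np.array([2, 1, 3])
In2 = np.array([1, 1, 2])

# Alternating 1, 0, 1, 0, ... with one entry per run (odd numbers & 1 == 1).
values = np.arange(1, 1 + In1.size + In2.size) & 1    # [1 0 1 0 1 0]

# Interleave the run lengths column by column:
# In1[0], In2[0], In1[1], In2[1], ...
counts = np.array([In1, In2]).reshape(-1, order="F")  # [2 1 1 1 3 2]

# Repeat each value by its matching run length.
out = values.repeat(counts)
print(out)  # [1 1 0 1 0 1 1 1 0 0]
```

Because `repeat` accepts a count of 0, this form should also tolerate zero-length runs in either input.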


Paul Panzer


Not pretty but seems good on performance if the island lengths are not huge. Guess we can optimize further by repeating on a boolean array. – Divakar Jul 02 '20 at 13:15
@Divakar I'm done with pretty ;-) I think this one is good for small arrays; for larger ones yours seems faster. – Paul Panzer Jul 02 '20 at 13:26

Divakar · Accepted Answer · 2020-07-02T11:15:28.970

2

Here's a vectorized one using cumsum -

L = In1.sum() + In2.sum()
idar = np.zeros(L, dtype=int)

s = In1+In2
starts = np.r_[0,s[:-1].cumsum()]
stops = In1+starts
idar[starts] = 1
idar[stops] = -1
out = idar.cumsum()
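
One caveat, as far as I can tell: the plain assignments `idar[starts] = 1` and `idar[stops] = -1` assume every run has length at least 1. If an entry of `In2` is 0, a stop index coincides with the next start and overwrites it (or equals `L` for the last run and falls out of bounds). The sketch below is a variant, not part of the original answer, that accumulates the steps with `np.add.at` and pads the step array by one slot. The function name `runs_via_cumsum` and its parameter names are hypothetical.

```python
import numpy as np

def runs_via_cumsum(ones_len, zeros_len):
    """Hypothetical helper: alternating runs of 1s and 0s built with cumsum."""
    ones_len = np.asarray(ones_len)
    zeros_len = np.asarray(zeros_len)
    total = int(ones_len.sum() + zeros_len.sum())

    # One extra slot so a stop index equal to `total` stays in bounds.
    steps = np.zeros(total + 1, dtype=int)

    block = ones_len + zeros_len               # length of each (ones, zeros) pair
    starts = np.r_[0, block[:-1].cumsum()]     # where each run of ones begins
    stops = starts + ones_len                  # where each run of ones ends

    # Accumulate instead of assigning, so coinciding indices cancel correctly.
    np.add.at(steps, starts, 1)
    np.add.at(steps, stops, -1)
    return steps[:-1].cumsum()

print(runs_via_cumsum([2, 1, 3], [1, 1, 2]))  # [1 1 0 1 0 1 1 1 0 0]
print(runs_via_cumsum([2, 1], [0, 1]))        # [1 1 1 0]
```

`np.add.at` is unbuffered and generally slower than fancy-index assignment, so this trades some speed for handling zero-length runs; when all lengths are known to be positive, the original assignments are sufficient.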

Alternatively, if the runs are long, or if memory efficiency matters, we can loop over the runs and assign the 1s by slicing into a boolean array -

# Re-using L, starts, stops from earlier approach
out = np.zeros(L, dtype=bool)
for (i,j) in zip(starts,stops):
    out[i:j] = 1
out = out.view('i1')

edited Jul 02 '20 at 11:15

answered Jul 02 '20 at 10:56

Divakar


That cumsum approach is amazing in its elegance. Which approach do you recommend? My assumption is that cumsum is implemented in C, so it will be the fastest to run. My bottleneck is CPU speed and not RAM. – Al-Baraa El-Hag Jul 02 '20 at 12:54
@Al-BaraaEl-Hag Yeah, for a large number of entries in `In1` and `In2`, you would see the vectorized one doing better. – Divakar Jul 02 '20 at 13:14
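
Which approach is faster depends on the number of runs and their lengths, so it is probably worth measuring on your own data. The following is a minimal timing sketch using `timeit`. The input sizes (2,000 runs with lengths 1 to 9) and the function names are assumptions made for illustration, and no particular ranking is claimed here.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
# Assumed test data: 2,000 runs each, lengths 1..9 (no zero-length runs).
In1 = rng.integers(1, 10, size=2_000)
In2 = rng.integers(1, 10, size=2_000)

def with_loop():
    # The loop from the question: repeated concatenation.
    out = np.array([])
    for a, b in zip(In1, In2):
        out = np.concatenate((out, np.ones(a), np.zeros(b)))
    return out

def with_repeat():
    # The np.repeat approach.
    values = np.arange(1, 1 + In1.size + In2.size) & 1
    return values.repeat(np.array([In1, In2]).reshape(-1, order="F"))

def with_cumsum():
    # The cumsum approach.
    idar = np.zeros(In1.sum() + In2.sum(), dtype=int)
    starts = np.r_[0, (In1 + In2)[:-1].cumsum()]
    idar[starts] = 1
    idar[In1 + starts] = -1
    return idar.cumsum()

# Check that all three agree before timing them.
assert np.array_equal(with_loop(), with_repeat())
assert np.array_equal(with_repeat(), with_cumsum())

for fn in (with_loop, with_repeat, with_cumsum):
    print(fn.__name__, timeit.timeit(fn, number=5))
```

Varying `size` and the upper bound of `rng.integers` should show where, on a given machine, one approach overtakes another.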

score 0 · Answer 3 · answered Jul 02 '20 at 11:08

I did this with map. In my opinion, the most time-consuming part of your code is the repeated concatenation, so I replaced it with Python lists. (based on this)

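from itertools import chain

# Build one Python list per pair: In1[i] ones followed by In2[i] zeros
creator = lambda i: In1[i]*[1] + In2[i]*[0]
nested = list(map(creator, range(len(In1))))
# Flatten the nested lists and convert to an array once at the end
flatten = np.array(list(chain(*nested)))
print(flatten)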
