Interleaving two NumPy arrays efficiently

Question

Assume the following arrays are given:

a = array([1, 3, 5])
b = array([2, 4, 6])

How would one interleave them efficiently so that one gets a third array like the following?

c = array([1, 2, 3, 4, 5, 6])

It can be assumed that len(a) == len(b).

How about the same question, but interleaving matrices? That is, a and b are 3-dimensional and not necessarily the same size in the first dimension. Note: only the first dimension should be interleaved. — Geronimo, Nov 17 '17 at 21:29
Adding a comment for anyone searching for "translate Wolfram Mathematica's Riffle to Python" and not finding anything. Hopefully this gets picked up by your search engine. — Dan Oak, Sep 05 '22 at 16:59

score 194 · Accepted Answer · edited Apr 14 '23 at 09:19

I like Josh's answer. I just wanted to add a more mundane, usual, and slightly more verbose solution. I don't know which is more efficient. I expect they will have similar performance.

import numpy as np

a = np.array([1,3,5])
b = np.array([2,4,6])

c = np.empty((a.size + b.size,), dtype=a.dtype)
c[0::2] = a
c[1::2] = b

answered Mar 18 '11 at 02:41

Paul

Unless speed is really really important, I would go with this as it's much more comprehensible which is important if anyone is ever going to look at it again. – John Salvatier Mar 18 '11 at 03:02
+1 I played around with timings and your code surprisingly seems to be 2-5x faster depending on inputs. I still find the efficiency of these types of operations to be nonintuitive, so it's always worth it to use `timeit` to test things out if a particular operation is a bottleneck in your code. There is usually more than one way to do things in numpy, so definitely profile code snippets. – JoshAdel Mar 18 '11 at 03:04
@JoshAdel: I guess if `.reshape` creates an additional copy of the array, then that would explain a 2x performance hit. I don't think it always makes a copy, however. I'm guessing the 5x difference is only for small arrays? – Paul Mar 18 '11 at 03:39
Looking at `.flags` and testing `.base` for my solution, it looks like the reshape to 'F' format creates a hidden copy of the vstacked data, so it's not a simple view as I thought it would be. And strangely, the 5x is only for intermediate-sized arrays for some reason. – JoshAdel Mar 18 '11 at 14:52
Another advantage of this answer is it's not limited to arrays of the same length. It could weave `n` items with `n-1` items. – EliadL Sep 25 '19 at 11:48
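
The point in the last comment can be illustrated with a short sketch. The input values below are made up for illustration; the only assumption is that `a` has exactly one more element than `b`, so the even and odd slots of the output match the two input lengths.

```python
import numpy as np

a = np.array([1, 3, 5, 7])   # n items (illustrative values)
b = np.array([2, 4, 6])      # n - 1 items

c = np.empty(a.size + b.size, dtype=a.dtype)
c[0::2] = a   # fills slots 0, 2, 4, 6
c[1::2] = b   # fills slots 1, 3, 5
print(c)  # [1 2 3 4 5 6 7]
```

If the lengths differ by more than one, the slice assignments no longer line up and NumPy raises a broadcasting error.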

score 102 · Answer 2 · edited Apr 14 '23 at 09:28

I thought it might be worthwhile to check how the solutions compare in terms of performance. This is the result:

[Plot: run time in seconds versus array size for each solution, both axes logarithmic]

This clearly shows that the most upvoted and accepted answer (Paul's answer) is also the fastest option.

The code was taken from the other answers and from another Q&A:

# Setup
import numpy as np

def Paul(a, b):
    c = np.empty((a.size + b.size,), dtype=a.dtype)
    c[0::2] = a
    c[1::2] = b
    return c

def JoshAdel(a, b):
    return np.vstack((a,b)).reshape((-1,),order='F')

def xioxox(a, b):
    return np.ravel(np.column_stack((a,b)))

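def Benjamin(a, b):
    # The original answer used ravel([-1]); order must be 'C', 'F', 'A', or 'K'.
    return np.vstack((a,b)).ravel(order='F')

def andersonvom(a, b):
    # list() is needed because np.hstack expects a sequence, not a zip iterator.
    return np.hstack(list(zip(a,b)))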

def bhanukiran(a, b):
    return np.dstack((a,b)).flatten()

def Tai(a, b):
    return np.insert(b, obj=range(a.shape[0]), values=a)

def Will(a, b):
    return np.ravel((a,b), order='F')

# Timing setup
timings = {Paul: [], JoshAdel: [], xioxox: [], Benjamin: [], andersonvom: [], bhanukiran: [], Tai: [], Will: []}
sizes = [2**i for i in range(1, 20, 2)]

# Timing
for size in sizes:
    func_input1 = np.random.random(size=size)
    func_input2 = np.random.random(size=size)
    for func in timings:
        res = %timeit -o func(func_input1, func_input2)
        timings[func].append(res)

%matplotlib notebook

import matplotlib.pyplot as plt
import numpy as np

fig = plt.figure(1)
ax = plt.subplot(111)

for func in timings:
    ax.plot(sizes,
            [time.best for time in timings[func]],
            label=func.__name__)
ax.set_xscale('log')
ax.set_yscale('log')
ax.set_xlabel('size')
ax.set_ylabel('time [seconds]')
ax.grid(which='both')
ax.legend()
plt.tight_layout()

Just in case you have numba available you could also use that to create a function:

import numba as nb

@nb.njit
def numba_interweave(arr1, arr2):
    res = np.empty(arr1.size + arr2.size, dtype=arr1.dtype)
    for idx, (item1, item2) in enumerate(zip(arr1, arr2)):
        res[idx*2] = item1
        res[idx*2+1] = item2
    return res

It could be slightly faster than the other alternatives.

Also of note, the accepted answer is _way_ faster than a native Python solution with [`roundrobin()`](https://docs.python.org/3/library/itertools.html#itertools-recipes) from the itertools recipes. — Brad Solomon, Jan 28 '18 at 17:41
As per the chart, Paul's answer appears to be the slowest as the size of the data increases. However, @MSeifert says that 'This clearly shows that the most upvoted and accepted answer (Pauls answer) is also the fastest option.' Given MSeifert's statement, I believe I am reading the chart wrong. Could you please clarify? — user3613932, Feb 10 '23 at 03:34
@user3613932 Pauls answer is the blue line. And regarding the interpretation: Lower means faster. The blue and the yellow-greenish line (numba/Paul) are lowest and therefore fastest. The pink and purple (Tai and andersonvom) are highest and therefore slowest. I agree that the line colors are not really easy to differentiate but you should be able to easily reproduce the graph with the given code. — MSeifert, Feb 15 '23 at 15:31
Nice plot. Can you share with us how to call the benchmark with the `Timing setup` as the input, and also the `plot` function? Thank you. — Muhammad Yasirroni, Feb 21 '23 at 10:41
@MuhammadYasirroni I don't know what you mean. The code in this answer should be runnable as-is (in a jupyter notebook environment at least). :) — MSeifert, Feb 23 '23 at 09:00
@MSeifert sorry, did not see the scroll bar that show how to run the benchmark. Thanks. — Muhammad Yasirroni, Feb 24 '23 at 09:46
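
Outside a Jupyter notebook the `%timeit` magic is not available. A rough sketch of a similar comparison using the standard-library `timeit` module might look like this. The function names, array sizes, and repeat counts are illustrative choices, not the ones used for the plot in the answer, and only three of the solutions are included.

```python
import timeit
import numpy as np

def slice_assign(a, b):
    # Paul's approach: preallocate, then fill even and odd slots.
    c = np.empty(a.size + b.size, dtype=a.dtype)
    c[0::2] = a
    c[1::2] = b
    return c

def vstack_reshape(a, b):
    return np.vstack((a, b)).reshape((-1,), order='F')

def column_stack_ravel(a, b):
    return np.ravel(np.column_stack((a, b)))

candidates = [slice_assign, vstack_reshape, column_stack_ravel]
rng = np.random.default_rng(0)
results = {}
for size in (2**4, 2**10, 2**14):
    x = rng.random(size)
    y = rng.random(size)
    for func in candidates:
        # Best of 3 repeats of 100 calls each, reported per call.
        best = min(timeit.repeat(lambda: func(x, y), number=100, repeat=3)) / 100
        results[(func.__name__, size)] = best
        print(f"{func.__name__:>20} size={size:>6} {best:.2e} s")
```

Absolute numbers depend on the machine and the NumPy version, so the ranking is worth re-checking on your own data sizes.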

score 48 · Answer 3 · answered Mar 18 '11 at 01:24

Here is a one-liner:

c = numpy.vstack((a,b)).reshape((-1,),order='F')

answered Mar 18 '11 at 01:24

JoshAdel

Wow, this is so unreadable :) This is one of the cases where if you don't write a proper comment in the code, it can drive somebody crazy. – Ilya Kogan Mar 18 '11 at 01:26
It's just two common numpy commands strung together. I wouldn't think it is that unreadable, although a comment never hurts. – JoshAdel Mar 18 '11 at 01:31
@JoshAdel, well, it's not `numpy.vstack((a,b)).interweave()` :) – Ilya Kogan Mar 18 '11 at 13:52
@Ilya: I would have called the function `.interleave()` personally :) – JoshAdel Mar 18 '11 at 14:53
What does `reshape` do? – Danijel May 19 '17 at 11:23
@JoshAdel I use that `interleave()` naming in my [answer](https://stackoverflow.com/a/75519265/11671779) – Muhammad Yasirroni Feb 21 '23 at 10:38
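
As a sketch of what `reshape` does in the one-liner, the steps can be split apart. The intermediate names `stacked`, `row_major`, and `is_copy` are introduced here for illustration only.

```python
import numpy as np

a = np.array([1, 3, 5])
b = np.array([2, 4, 6])

stacked = np.vstack((a, b))
# stacked is [[1, 3, 5],
#             [2, 4, 6]]

# order='F' reads down each column first: 1, 2, then 3, 4, then 5, 6.
c = stacked.reshape((-1,), order='F')

# For comparison, the default order='C' reads row by row.
row_major = stacked.reshape((-1,))

# Whether the column-major result is a copy can be checked directly.
is_copy = not np.shares_memory(c, stacked)

print(c)          # [1 2 3 4 5 6]
print(row_major)  # [1 3 5 2 4 6]
```

The `np.shares_memory` check is one way to confirm the observation in the comments on the accepted answer that the column-major reshape does not return a view of the stacked data.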

score 26 · Answer 4 · edited May 06 '17 at 18:28

Here is a simpler answer than some of the previous ones:

import numpy as np
a = np.array([1,3,5])
b = np.array([2,4,6])
inter = np.ravel(np.column_stack((a,b)))

After this, `inter` contains:

array([1, 2, 3, 4, 5, 6])

This answer also appears to be marginally faster:

In [4]: %timeit np.ravel(np.column_stack((a,b)))
100000 loops, best of 3: 6.31 µs per loop

In [8]: %timeit np.ravel(np.dstack((a,b)))
100000 loops, best of 3: 7.14 µs per loop

In [11]: %timeit np.vstack((a,b)).ravel([-1])
100000 loops, best of 3: 7.08 µs per loop

score 10 · Answer 5 · edited Apr 16 '13 at 17:15

This will interleave/interlace the two arrays and I believe it is quite readable:

a = np.array([1,3,5])      #=> array([1, 3, 5])
b = np.array([2,4,6])      #=> array([2, 4, 6])
c = np.hstack(list(zip(a,b)))  #=> array([1, 2, 3, 4, 5, 6])

answered Apr 16 '13 at 16:47

andersonvom

I like this one as the most readable, despite the fact that it is the slowest solution. – kimstik Dec 06 '19 at 09:31
Wrap `zip` in a `list` to avoid a deprecation warning. – Milo Wielondek Sep 21 '20 at 17:28

score 8 · Answer 6 · edited Apr 14 '23 at 09:23

Improving xioxox's answer:

import numpy as np

a = np.array([1,3,5])
b = np.array([2,4,6])
inter = np.ravel((a,b), order='F')

answered Nov 29 '17 at 23:38

Will

score 4 · Answer 7 · edited Apr 14 '23 at 09:19

Maybe this is more readable than JoshAdel's solution:

c = numpy.vstack((a,b)).ravel([-1])

answered Mar 18 '11 at 15:12

Benjamin

`ravel`'s `order` argument in [the documentation](http://docs.scipy.org/doc/numpy/reference/generated/numpy.ravel.html) is one of `C`, `F`, `A`, or `K`. I think you really want `.ravel('F')`, for FORTRAN order (column first) – Nick T Feb 11 '14 at 18:10
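
Following the comment above, a version of the one-liner with a valid `order` value could look like this. It is a sketch of the suggested fix, not the answer author's own code.

```python
import numpy as np

a = np.array([1, 3, 5])
b = np.array([2, 4, 6])

# order='F' walks the stacked 2 x 3 array column by column.
c = np.vstack((a, b)).ravel(order='F')
print(c)  # [1 2 3 4 5 6]
```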

score 4 · Answer 8 · answered Nov 27 '19 at 19:33

I needed to do this but with multidimensional arrays along any axis. Here's a quick general purpose function to that effect. It has the same call signature as np.concatenate, except that all input arrays must have exactly the same shape.

import numpy as np

def interleave(arrays, axis=0, out=None):
    shape = list(np.asanyarray(arrays[0]).shape)
    if axis < 0:
        axis += len(shape)
    assert 0 <= axis < len(shape), "'axis' is out of bounds"
    if out is not None:
        out = out.reshape(shape[:axis+1] + [len(arrays)] + shape[axis+1:])
    shape[axis] = -1
    return np.stack(arrays, axis=axis+1, out=out).reshape(shape)

answered Nov 27 '19 at 19:33

clwainwright

+1 for such a generalized recipe (works with n-dim, interleaves along any axis, works for any number of input arrays, takes an optional `out` arg, and works for sub-classed arrays). Personally, I would prefer `axis` to default to `-1` rather than to `0`, but maybe that's just me. And you might want to link to this answer of yours, from [this question](https://stackoverflow.com/questions/42162300/how-to-interleave-numpy-ndarrays/42162780), which actually asked for the input arrays to be n-dimensional. – fountainhead Dec 02 '20 at 00:11
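
The idea behind the function (stack on a new axis placed right after the target axis, then merge that new axis back in) can be sketched on two small 2D arrays. The arrays and the names `rows` and `cols` are made up for illustration.

```python
import numpy as np

a = np.array([[1, 2], [5, 6]])
b = np.array([[3, 4], [7, 8]])

# Interleave rows (axis 0): stack on a new axis right after axis 0,
# then merge that new axis back into axis 0.
rows = np.stack((a, b), axis=1).reshape(-1, a.shape[1])

# Interleave columns (axis 1): same idea with the new axis after axis 1.
cols = np.stack((a, b), axis=2).reshape(a.shape[0], -1)

print(rows)  # [[1 2] [3 4] [5 6] [7 8]]
print(cols)  # [[1 3 2 4] [5 7 6 8]]
```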

score 3 · Answer 9 · edited Apr 14 '23 at 09:45

vstack sure is an option, but a more straightforward solution for your case could be the hstack:

>>> import numpy as np
>>> a = np.array([1,3,5])
>>> b = np.array([2,4,6])
>>> np.hstack((a,b))  # Remember that this function takes a tuple of arrays.
array([1, 3, 5, 2, 4, 6])
>>> np.sort(np.hstack((a,b)))
array([1, 2, 3, 4, 5, 6])

And more importantly this works for arbitrary shapes of a and b.

Also you may want to try out dstack:

>>> a = np.array([1,3,5])
>>> b = np.array([2,4,6])
>>> np.dstack((a,b)).flatten()
array([1, 2, 3, 4, 5, 6])

You’ve got options now!

answered Mar 19 '11 at 04:33

bhanukiran

-1 to first answer because question has nothing to do with sorting. +1 to second answer, which is the best I've seen so far. This is why multiple solutions should be posted as multiple answers. Please split it into multiple answers. – endolith Jan 16 '13 at 17:57
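
The objection in this comment can be made concrete with inputs that are not already in ascending, alternating order. The values below are arbitrary.

```python
import numpy as np

a = np.array([9, 1, 5])
b = np.array([2, 8, 4])

# Sorting discards the original positions entirely.
sorted_concat = np.sort(np.hstack((a, b)))
print(sorted_concat)  # [1 2 4 5 8 9]

# dstack pairs a[i] with b[i], so flattening keeps the interleaved order.
interleaved = np.dstack((a, b)).flatten()
print(interleaved)    # [9 2 1 8 5 4]
```

The sort-based approach only reproduces the interleaved result when the interleaved sequence happens to be sorted, as in the question's example.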

score 2 · Answer 10 · edited Apr 14 '23 at 09:24

One can also try np.insert (the solution was migrated from Interleave NumPy arrays).

import numpy as np

a = np.array([1,3,5])
b = np.array([2,4,6])
np.insert(b, obj=range(a.shape[0]), values=a)

Please see the documentation and tutorial for more information.
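
As a brief sketch of why this works: when `np.insert` is given several indices, they refer to positions in the original array, so each `a[i]` ends up immediately before `b[i]`. The name `positions` is introduced here for illustration.

```python
import numpy as np

a = np.array([1, 3, 5])
b = np.array([2, 4, 6])

# Indices refer to positions in the original b: a[i] is placed before b[i].
positions = np.arange(a.size)   # [0, 1, 2]
c = np.insert(b, positions, a)
print(c)  # [1 2 3 4 5 6]
```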

answered Jan 28 '18 at 16:28

Tai

score 2 · Answer 11 · edited Sep 24 '20 at 09:33

Another one-liner: np.vstack((a,b)).T.ravel()
One more: np.stack((a,b),1).ravel()

answered Jun 02 '20 at 05:53

Arty

score 0 · Answer 12 · edited Sep 21 '22 at 10:05

Another one-liner:

>>> c = np.array([a, b]).T.flatten()
>>> c
array([1, 2, 3, 4, 5, 6])

answered Sep 21 '22 at 07:43

Zhaosheng Pan

score 0 · Answer 13 · edited Apr 15 '23 at 10:42

For 2D NumPy arrays:

def interleave2d(a, b):
    """Interleave between columns of two arrays"""
    c = np.empty((len(a), a.shape[1] * 2), dtype=a.dtype)
    c[:, 0::2] = a
    c[:, 1::2] = b
    return c

answered Feb 21 '23 at 10:37

Muhammad Yasirroni

score 0 · Answer 14 · answered Jul 04 '23 at 19:49

Not the prettiest function, but I needed one that could interleave an arbitrary number of matrices. Maybe helpful?

import numpy as np

def interleave_narr(*args):
    '''Given N NumPy arrays, interleave them: array i fills slots i, i+N, i+2N, ...'''
    m_sizes = 0
    for m in args:
        m_sizes += m.size
    o = np.empty((m_sizes,), dtype=args[0].dtype)

    n_mats = len(args)
    for ii in range(n_mats):
        o[ii::n_mats] = args[ii]
    return o
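
A small sketch of the same strided-assignment idea with three made-up 1-D arrays follows. For equal-length 1-D inputs, stacking the arrays as columns and flattening gives the same result, which is shown as a cross-check.

```python
import numpy as np

x = np.array([1, 4, 7])
y = np.array([2, 5, 8])
z = np.array([3, 6, 9])
arrays = (x, y, z)

out = np.empty(sum(arr.size for arr in arrays), dtype=x.dtype)
for i, arr in enumerate(arrays):
    out[i::len(arrays)] = arr   # array i fills slots i, i+n, i+2n, ...

# For equal-length 1-D inputs, stacking as columns and flattening agrees.
same = np.stack(arrays, axis=1).ravel()

print(out)  # [1 2 3 4 5 6 7 8 9]
```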
