112

Assume the following arrays are given:

a = array([1, 3, 5])
b = array([2, 4, 6])

How would one interleave them efficiently so that one gets a third array like the following?

c = array([1, 2, 3, 4, 5, 6])

It can be assumed that len(a) == len(b).

aschipfl
D R
  • 2
    How about, same question, but you are trying to interleave matrices. That is a and b are 3 dimensional, and not necessarily the same size in the first dimension. Note:Only the first dimension should be interleaved. – Geronimo Nov 17 '17 at 21:29
  • 1
    adding a comment for anyone trying to search "translate Wolfram Mathematica's Riffle to Python" and not finding anything. hope this was picked up by your search engine – Dan Oak Sep 05 '22 at 16:59

14 Answers

194

I like Josh's answer. I just wanted to add a more mundane, usual, and slightly more verbose solution. I don't know which is more efficient. I expect they will have similar performance.

import numpy as np

a = np.array([1,3,5])
b = np.array([2,4,6])

c = np.empty((a.size + b.size,), dtype=a.dtype)
c[0::2] = a
c[1::2] = b
Peter Mortensen
Paul
  • 2
    Unless speed is really really important, I would go with this as it's much more comprehensible which is important if anyone is ever going to look at it again. – John Salvatier Mar 18 '11 at 03:02
  • 7
    +1 I played around with timings and your code surprisingly seems to be 2-5x faster depending on inputs. I still find the efficiency of these types of operations to be nonintuitive, so it's always worth it to use `timeit` to test things out if a particular operation is a bottleneck in your code. There are usually more than one way to do things in numpy, so definitely profile code snippets. – JoshAdel Mar 18 '11 at 03:04
  • @JoshAdel: I guess if `.reshape` creates an additional copy of the array, then that would explain a 2x performance hit. I don't think it always makes a copy, however. I'm guessing the 5x difference is only for small arrays? – Paul Mar 18 '11 at 03:39
  • looking at `.flags` and testing `.base` for my solution, it looks like the reshape to 'F' format creates a hidden copy of the vstacked data, so it's not a simple view as I thought it would be. And strangely the 5x is only for intermediate sized arrays for some reason. – JoshAdel Mar 18 '11 at 14:52
  • Another advantage of this answer is it's not limited to arrays of the same length. It could weave `n` items with `n-1` items. – EliadL Sep 25 '19 at 11:48
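
As the last comment notes, the slice-assignment approach also covers the case where a has exactly one more element than b. A minimal sketch, assuming 1-D inputs whose lengths differ by at most one (the function name `weave` and the length check are made up for illustration and are not part of the answer):

```python
import numpy as np

def weave(a, b):
    """Interleave 1-D arrays a and b, where len(a) is len(b) or len(b) + 1."""
    a = np.asarray(a)
    b = np.asarray(b)
    if a.size - b.size not in (0, 1):
        raise ValueError("a must have as many elements as b, or exactly one more")
    # result_type picks a dtype that can hold the values of both inputs
    c = np.empty(a.size + b.size, dtype=np.result_type(a, b))
    c[0::2] = a  # even positions: 0, 2, 4, ...
    c[1::2] = b  # odd positions: 1, 3, 5, ...
    return c

print(weave([1, 3, 5], [2, 4]))  # [1 2 3 4 5]
```

With equal lengths this reduces to the code in the answer; the only addition is that the output dtype is derived from both inputs instead of from a alone.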
102

I thought it might be worthwhile to benchmark the solutions. This is the result:

[Plot: run time in seconds versus array size for each function, both axes logarithmic]

This clearly shows that the most upvoted and accepted answer (Paul's answer) is also the fastest option.

The code was taken from the other answers and from another Q&A:

# Setup
import numpy as np

def Paul(a, b):
    c = np.empty((a.size + b.size,), dtype=a.dtype)
    c[0::2] = a
    c[1::2] = b
    return c

def JoshAdel(a, b):
    return np.vstack((a,b)).reshape((-1,),order='F')

def xioxox(a, b):
    return np.ravel(np.column_stack((a,b)))

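def Benjamin(a, b):
    # ravel() expects order 'C', 'F', 'A' or 'K'; 'F' reads column by column
    return np.vstack((a,b)).ravel(order='F')

def andersonvom(a, b):
    # np.hstack needs a sequence, so materialize the zip iterator (Python 3)
    return np.hstack(list(zip(a,b)))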

def bhanukiran(a, b):
    return np.dstack((a,b)).flatten()

def Tai(a, b):
    return np.insert(b, obj=range(a.shape[0]), values=a)

def Will(a, b):
    return np.ravel((a,b), order='F')

# Timing setup
timings = {Paul: [], JoshAdel: [], xioxox: [], Benjamin: [], andersonvom: [], bhanukiran: [], Tai: [], Will: []}
sizes = [2**i for i in range(1, 20, 2)]

# Timing
for size in sizes:
    func_input1 = np.random.random(size=size)
    func_input2 = np.random.random(size=size)
    for func in timings:
        res = %timeit -o func(func_input1, func_input2)
        timings[func].append(res)

%matplotlib notebook

import matplotlib.pyplot as plt
import numpy as np

fig = plt.figure(1)
ax = plt.subplot(111)

for func in timings:
    ax.plot(sizes,
            [time.best for time in timings[func]],
            label=func.__name__)
ax.set_xscale('log')
ax.set_yscale('log')
ax.set_xlabel('size')
ax.set_ylabel('time [seconds]')
ax.grid(which='both')
ax.legend()
plt.tight_layout()

If you have numba available, you could also use it to create a function:

import numba as nb

@nb.njit
def numba_interweave(arr1, arr2):
    res = np.empty(arr1.size + arr2.size, dtype=arr1.dtype)
    for idx, (item1, item2) in enumerate(zip(arr1, arr2)):
        res[idx*2] = item1
        res[idx*2+1] = item2
    return res

It could be slightly faster than the other alternatives:

[Plot: the same benchmark with numba_interweave included]

Peter Mortensen
MSeifert
  • 2
    Also of note, the accepted answer is _way_ faster than a native Python solution with [`roundrobin()`](https://docs.python.org/3/library/itertools.html#itertools-recipes) from the itertools recipes. – Brad Solomon Jan 28 '18 at 17:41
  • As per the chart, Paul's answer appears to be the slowest as the size of the data increases. However, @MSeifert says that 'This clearly shows that the most upvoted and accepted answer (Pauls answer) is also the fastest option.' Given MSeifert's statement, I believe I am reading the chart wrong. Could you please clarify? – user3613932 Feb 10 '23 at 03:34
  • 2
    @user3613932 Pauls answer is the blue line. And regarding the interpretation: Lower means faster. The blue and the yellow-greenish line (numba/Paul) are lowest and therefore fastest. The pink and purple (Tai and andersonvom) are highest and therefore slowest. I agree that the line colors are not really easy to differentiate but you should be able to easily reproduce the graph with the given code. – MSeifert Feb 15 '23 at 15:31
  • Nice plot. Can you share with us how to call the benchmark with the `Timing setup` as the input, and also the `plot` function? Thank you. – Muhammad Yasirroni Feb 21 '23 at 10:41
  • @MuhammadYasirroni I don't know what you mean. The code in this answer should be runnable as-is (in a jupyter notebook environment at least). :) – MSeifert Feb 23 '23 at 09:00
  • @MSeifert sorry, did not see the scroll bar that show how to run the benchmark. Thanks. – Muhammad Yasirroni Feb 24 '23 at 09:46
  • The axes text is nearly *unreadable* in dark mode. – Peter Mortensen Apr 14 '23 at 09:36
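
The timing loop in this answer relies on the IPython `%timeit` magic, so it only runs inside IPython or Jupyter. Outside a notebook, a rough equivalent can be sketched with the standard-library `timeit` module. This is only an approximation of the answer's setup: the helper `best_time`, the reduced set of two functions, and the smaller size range are choices made here to keep the example short.

```python
import timeit
import numpy as np

def paul(a, b):
    c = np.empty(a.size + b.size, dtype=a.dtype)
    c[0::2] = a
    c[1::2] = b
    return c

def column_stack_ravel(a, b):
    return np.ravel(np.column_stack((a, b)))

def best_time(func, a, b, repeat=3, number=100):
    """Best per-call time in seconds over `repeat` runs of `number` calls each."""
    timer = timeit.Timer(lambda: func(a, b))
    return min(timer.repeat(repeat=repeat, number=number)) / number

sizes = [2**i for i in range(1, 12, 2)]
rng = np.random.default_rng(0)
results = {}
for func in (paul, column_stack_ravel):
    results[func.__name__] = []
    for size in sizes:
        a, b = rng.random(size), rng.random(size)
        results[func.__name__].append(best_time(func, a, b))

for name, times in results.items():
    print(name, ["%.2e" % t for t in times])
```

The resulting lists can be passed to the plotting code from the answer by plotting `results[name]` against `sizes` instead of reading `.best` from the `%timeit` result objects.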
48

Here is a one-liner:

c = numpy.vstack((a,b)).reshape((-1,),order='F')
JoshAdel
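
To see why the one-liner interleaves, it may help to look at the intermediate array. The sketch below uses the arrays from the question; the variable name `stacked` is introduced here for illustration only.

```python
import numpy as np

a = np.array([1, 3, 5])
b = np.array([2, 4, 6])

stacked = np.vstack((a, b))         # shape (2, 3): row 0 is a, row 1 is b
c = stacked.reshape(-1, order='F')  # read column by column: a[0], b[0], a[1], ...

print(stacked.shape)  # (2, 3)
print(c)              # [1 2 3 4 5 6]

# The column-by-column read of this row-major array has no constant stride,
# so the reshape returns a copy rather than a view.
print(np.shares_memory(stacked, c))  # False
```

The extra copy is consistent with the observation in the comments on Paul's answer that this approach allocates more than the slice-assignment version.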
26

Here is a simpler answer than some of the previous ones:

import numpy as np
a = np.array([1,3,5])
b = np.array([2,4,6])
inter = np.ravel(np.column_stack((a,b)))

After this, inter contains:

array([1, 2, 3, 4, 5, 6])

This answer also appears to be marginally faster:

In [4]: %timeit np.ravel(np.column_stack((a,b)))
100000 loops, best of 3: 6.31 µs per loop

In [8]: %timeit np.ravel(np.dstack((a,b)))
100000 loops, best of 3: 7.14 µs per loop

In [11]: %timeit np.vstack((a,b)).ravel([-1])
100000 loops, best of 3: 7.08 µs per loop
Community
xioxox
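
A short sketch of what `column_stack` produces may make the approach easier to follow. The third array `d` is hypothetical and only added to show that the same idea extends to more than two equal-length inputs.

```python
import numpy as np

a = np.array([1, 3, 5])
b = np.array([2, 4, 6])
d = np.array([7, 8, 9])  # hypothetical third array, for illustration only

pairs = np.column_stack((a, b))  # shape (3, 2): row i is [a[i], b[i]]
inter = np.ravel(pairs)          # default C order reads row by row
print(pairs.shape)  # (3, 2)
print(inter)        # [1 2 3 4 5 6]

# The same idea with three equal-length inputs
inter3 = np.ravel(np.column_stack((a, b, d)))
print(inter3)       # [1 2 7 3 4 8 5 6 9]
```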
10

This will interleave/interlace the two arrays and I believe it is quite readable:

a = np.array([1,3,5])      #=> array([1, 3, 5])
b = np.array([2,4,6])      #=> array([2, 4, 6])
c = np.hstack(list(zip(a,b)))  #=> array([1, 2, 3, 4, 5, 6])
andersonvom
8

Improving xioxox's answer:

import numpy as np

a = np.array([1,3,5])
b = np.array([2,4,6])
inter = np.ravel((a,b), order='F')
Peter Mortensen
Will
4

Maybe this is more readable than JoshAdel's solution:

c = numpy.vstack((a,b)).ravel([-1])
Peter Mortensen
Benjamin
  • 4
    `ravel`'s `order` argument in [the documentation](http://docs.scipy.org/doc/numpy/reference/generated/numpy.ravel.html) is one of `C`, `F`, `A`, or `K`. I think you really want `.ravel('F')`, for FORTRAN order (column first) – Nick T Feb 11 '14 at 18:10
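
Following the comment above, a sketch of the variant with an explicit Fortran order, using the arrays from the question. The default order is shown next to it for contrast, since it concatenates instead of interleaving.

```python
import numpy as np

a = np.array([1, 3, 5])
b = np.array([2, 4, 6])

# order='F' flattens column by column, which interleaves the two rows
c = np.vstack((a, b)).ravel(order='F')
print(c)     # [1 2 3 4 5 6]

# The default order='C' flattens row by row, which just concatenates
flat = np.vstack((a, b)).ravel()
print(flat)  # [1 3 5 2 4 6]
```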
4

I needed to do this but with multidimensional arrays along any axis. Here's a quick general purpose function to that effect. It has the same call signature as np.concatenate, except that all input arrays must have exactly the same shape.

import numpy as np

def interleave(arrays, axis=0, out=None):
    shape = list(np.asanyarray(arrays[0]).shape)
    if axis < 0:
        axis += len(shape)
    assert 0 <= axis < len(shape), "'axis' is out of bounds"
    if out is not None:
        out = out.reshape(shape[:axis+1] + [len(arrays)] + shape[axis+1:])
    shape[axis] = -1
    return np.stack(arrays, axis=axis+1, out=out).reshape(shape)
clwainwright
  • 1
    +1 for such a generalized recipe (works with n-dim, interleaves along any axis, works for any number of input arrays, takes an optional `out` arg, and works for sub-classed arrays). Personally, I would prefer `axis` to default to `-1` rather than to `0`, but maybe that's just me. And you might want to link to this answer of yours, from [this question](https://stackoverflow.com/questions/42162300/how-to-interleave-numpy-ndarrays/42162780), which actually asked for the input arrays to be n-dimensional. – fountainhead Dec 02 '20 at 00:11
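
A small usage sketch of the `interleave` function from this answer on two 2-D arrays. The function body is copied unchanged so the example runs on its own; the arrays `x` and `y` are made up for illustration.

```python
import numpy as np

def interleave(arrays, axis=0, out=None):
    # Copied from the answer above
    shape = list(np.asanyarray(arrays[0]).shape)
    if axis < 0:
        axis += len(shape)
    assert 0 <= axis < len(shape), "'axis' is out of bounds"
    if out is not None:
        out = out.reshape(shape[:axis+1] + [len(arrays)] + shape[axis+1:])
    shape[axis] = -1
    return np.stack(arrays, axis=axis+1, out=out).reshape(shape)

x = np.array([[1, 2], [3, 4]])
y = np.array([[10, 20], [30, 40]])

rows = interleave([x, y], axis=0)  # rows alternate: x[0], y[0], x[1], y[1]
cols = interleave([x, y], axis=1)  # columns alternate within each row
print(rows.tolist())  # [[1, 2], [10, 20], [3, 4], [30, 40]]
print(cols.tolist())  # [[1, 10, 2, 20], [3, 30, 4, 40]]
```

Stacking on a new axis directly after the target axis and then merging the two axes is what places element i of every input next to each other.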
3

vstack sure is an option, but a more straightforward solution for your case could be the hstack:

a = array([1,3,5])
b = array([2,4,6])
hstack((a,b)) # Remember it is a tuple of arrays that this function swallows in.
array([1, 3, 5, 2, 4, 6])
sort(hstack((a,b)))
array([1, 2, 3, 4, 5, 6])

And more importantly this works for arbitrary shapes of a and b.

Also you may want to try out dstack:

a = array([1,3,5])
b = array([2,4,6])
dstack((a,b)).flatten()
array([1, 2, 3, 4, 5, 6])

You’ve got options now!

Peter Mortensen
bhanukiran
  • 8
    -1 to first answer because question has nothing to do with sorting. +1 to second answer, which is the best I've seen so far. This is why multiple solutions should be posted as multiple answers. Please split it into multiple answers. – endolith Jan 16 '13 at 17:57
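
To illustrate the objection in the comment above: sorting the concatenated arrays only matches the interleaved result when the interleaved values happen to be in ascending order. A sketch with made-up inputs where the two differ:

```python
import numpy as np

a = np.array([5, 1, 3])
b = np.array([2, 6, 4])

sorted_merge = np.sort(np.hstack((a, b)))
interleaved = np.dstack((a, b)).flatten()

print(sorted_merge)  # [1 2 3 4 5 6]  (sorted, original order lost)
print(interleaved)   # [5 2 1 6 3 4]  (a[0], b[0], a[1], b[1], ...)
```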
2

One can also try np.insert (the solution was migrated from Interleave NumPy arrays).

import numpy as np

a = np.array([1,3,5])
b = np.array([2,4,6])
np.insert(b, obj=range(a.shape[0]), values=a)

Please see the documentation and tutorial for more information.

Peter Mortensen
Tai
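
A brief sketch of why the `np.insert` call interleaves, using the arrays from the question. The name `positions` is introduced here for readability; `np.arange` is used in place of `range`, which should behave the same way.

```python
import numpy as np

a = np.array([1, 3, 5])
b = np.array([2, 4, 6])

# Each index in `positions` refers to a position in the original b,
# so a[i] is placed immediately before b[i].
positions = np.arange(a.size)
c = np.insert(b, positions, a)
print(c)  # [1 2 3 4 5 6]
```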
2

Another one-liner: np.vstack((a,b)).T.ravel()
One more: np.stack((a,b),1).ravel()

Arty
0

Another one-liner:

>>> c = np.array([a, b]).T.flatten()
>>> c
array([1, 2, 3, 4, 5, 6])
0

For 2D numpy array:

def interleave2d(a, b):
    """Interleave between columns of two arrays"""
    c = np.empty((len(a), a.shape[1] * 2), dtype=a.dtype)
    c[:, 0::2] = a
    c[:, 1::2] = b
    return c
Muhammad Yasirroni
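
A usage sketch for `interleave2d`, assuming two 2-D arrays of identical shape (the sample values are made up). The function is repeated so the example runs on its own.

```python
import numpy as np

def interleave2d(a, b):
    """Interleave between columns of two arrays"""
    c = np.empty((len(a), a.shape[1] * 2), dtype=a.dtype)
    c[:, 0::2] = a  # columns of a go to even column positions
    c[:, 1::2] = b  # columns of b go to odd column positions
    return c

a = np.array([[1, 3], [5, 7]])
b = np.array([[2, 4], [6, 8]])
result = interleave2d(a, b)
print(result.tolist())  # [[1, 2, 3, 4], [5, 6, 7, 8]]
```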
0

Not the prettiest function, but I needed one that could interleave an arbitrary number of matrices. Maybe helpful?

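import numpy as np

def interleave_narr(*args):
    """Given N NumPy arrays of equal size, interleave their elements into one 1-D array."""
    # Total number of elements across all inputs
    m_sizes = sum(m.size for m in args)
    o = np.empty((m_sizes,), dtype=args[0].dtype)

    n_mats = len(args)
    for ii in range(n_mats):
        # Array ii fills every n_mats-th slot, starting at offset ii
        o[ii::n_mats] = args[ii]
    return o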
K. W. Cooper