
I'm trying to combine NumPy arrays of different lengths, the way one can with lists using itertools.zip_longest. Say I have:

a = np.array([1, 5, 9, 13])
b = np.array([2, 6])

With itertools, one can interleave these two arrays using chain and zip_longest, filling the missing values with, say, 0:

from itertools import chain, zip_longest
list(chain(*zip_longest(*[a, b], fillvalue=0)))
# [1, 2, 5, 6, 9, 0, 13, 0]

Is there a simple way to do this using numpy that I'm missing?

yatu
  • [You mean this?](https://stackoverflow.com/a/5347492/8204776) – meowgoesthedog Apr 29 '19 at 10:05
  • As the example is put, I think I'd first allocate the output array with zeros and assign the strided slices, but I don't know if that would work in your real use case. – jdehesa Apr 29 '19 at 10:06
  • @meowgoesthedog: If you remove 6 from the `b` array from the linked answer, you are missing a 0 which is what OP wants as the fillvalue – Sheldore Apr 29 '19 at 10:06
  • I'm looking preferably for a general solution, applicable to different length arrays instead of using strided slices from the array @jdehesa. Perhaps I should make it more general – yatu Apr 29 '19 at 10:08

2 Answers


Here's an almost vectorized one -

# https://stackoverflow.com/a/38619350/3293881 @Divakar
def boolean_indexing(v):
    # Length of each input array
    lens = np.array([len(item) for item in v])
    # mask[i, j] is True where input array i has a j-th element
    mask = lens[:,None] > np.arange(lens.max())
    out_dtype = np.result_type(*[arr.dtype for arr in v])
    # Zero-padded 2D array, one row per input array
    out = np.zeros(mask.shape,dtype=out_dtype)
    out[mask] = np.concatenate(v)
    return out

v = [a,b] # list of all input arrays
out = boolean_indexing(v).ravel('F')

Sample run -

In [23]: a = np.array([1, 5, 9, 13])
    ...: b = np.array([2, 6])
    ...: c = np.array([7, 8, 10])
    ...: v = [a,b,c]

In [24]: boolean_indexing(v).ravel('F')
Out[24]: array([ 1,  2,  7,  5,  6,  8,  9,  0, 10, 13,  0,  0])
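To make the broadcasting step easier to follow, here is a minimal sketch that walks through the intermediates for the same sample inputs. It also swaps the zero padding for an arbitrary fill value via `np.full`; the `fillvalue` name and the `-1` are my own illustration, not part of the function above, and the fill value is assumed to be representable in the common dtype of the inputs.

```python
import numpy as np

a = np.array([1, 5, 9, 13])
b = np.array([2, 6])
c = np.array([7, 8, 10])
v = [a, b, c]

# Length of each input array: [4 2 3]
lens = np.array([len(item) for item in v])

# Compare a (3, 1) column of lengths against a (4,) row of positions.
# Broadcasting gives a (3, 4) mask; row i is True in its first lens[i] columns.
mask = lens[:, None] > np.arange(lens.max())

# Hypothetical fill value (assumption: it fits the common integer dtype)
fillvalue = -1
out_dtype = np.result_type(*[arr.dtype for arr in v])
padded = np.full(mask.shape, fillvalue, dtype=out_dtype)

# Boolean assignment fills the True cells row by row, in order
padded[mask] = np.concatenate(v)

# Reading the padded array column by column interleaves the inputs
interleaved = padded.ravel('F')
print(mask)
print(padded)
print(interleaved)
```

Each row of `padded` is one input array padded on the right, so flattening in column-major order (`'F'`) visits the first element of every array, then the second, and so on.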
Divakar
  • Ahh this is great!! Nice use of broadcasting here, and thanks for sharing an easier way to interleave the result array using `.ravel('F')`! – yatu Apr 29 '19 at 11:30

I think I'd do that like this:

import numpy as np

def chain_zip_longest(*arrs, fillvalue=0, dtype=None):
    arrs = [np.asarray(arr) for arr in arrs]
    if not arrs:
        return np.array([])
    n = len(arrs)
    if dtype is None:
        dtype = np.result_type(*[arr.dtype for arr in arrs])
    out = np.full(n * max(len(arr) for arr in arrs), fillvalue, dtype=dtype)
    for i, arr in enumerate(arrs):
        out[i:i + n * len(arr):n] = arr
    return out

print(chain_zip_longest([1, 2], [3, 4, 5], [6]))
# [1 3 6 2 4 0 0 5 0]
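As a sanity check, the strided assignment can be compared against the itertools version from the question. This is only a sketch: `interleave_strided` is a condensed restatement of the idea above, and both helper names and the test cases are made up for illustration.

```python
from itertools import chain, zip_longest

import numpy as np


def interleave_strided(arrs, fillvalue=0):
    # Condensed restatement of the strided-assignment approach (hypothetical name)
    arrs = [np.asarray(arr) for arr in arrs]
    n = len(arrs)
    dtype = np.result_type(*[arr.dtype for arr in arrs])
    out = np.full(n * max(len(arr) for arr in arrs), fillvalue, dtype=dtype)
    for i, arr in enumerate(arrs):
        # Array i occupies positions i, i + n, i + 2n, ... of the output
        out[i:i + n * len(arr):n] = arr
    return out


def interleave_reference(arrs, fillvalue=0):
    # Pure-Python reference using itertools, as in the question
    return list(chain.from_iterable(zip_longest(*arrs, fillvalue=fillvalue)))


# Illustrative inputs only
cases = [
    [[1, 5, 9, 13], [2, 6]],
    [[1, 2], [3, 4, 5], [6]],
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
]
for case in cases:
    assert interleave_strided(case).tolist() == interleave_reference(case)
print("all cases match")
```

The loop runs once per input array rather than once per element, so the Python-level overhead should stay small as long as the number of arrays is small relative to their lengths.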
jdehesa
  • Nice solution @jdehesa. Seems like there's no simple way around this and assigning in a for loop like here with `enumerate` is the way to go. Doesn't seem like there's any stacking method such as `np.column_stack` with some other further concatenation that supports this functionality – yatu Apr 29 '19 at 10:26
  • @yatu Yes, it's always complicated with variable-length arrays. I suppose you could also do this through a sparse matrix (each given array as a column and then flatten it), but I'm not sure if more efficiently (without copies or no looping)... – jdehesa Apr 29 '19 at 10:37