Vectorized solution to filling a 1-D numpy array given a list of start and end indices for slicing?

Question

Given an 1-D array of zeros called a:

In [38]: a

Out[38]: array([ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.])

I would like to fill certain indices with certain values. I have a list of start and end indices with the associated value that should be filled in these locations. This is stored in a list: fill_oneDim_array

[[1, 3, 500], [5, 7, 1000], [9, 15, 200]]

For example: [1, 3, 500], fill array a as such; a[1:3] = 500. Repeat for [5, 7, 100] as a[5:7] = 1000.

Is there a vectorized solution to this? I want to avoid for loops as much as possible.

My research so far: - to my knowledge there doesn't seem to be a obvious solution to this.

How are you doing this now? If your non vectorized solution is not completely naive, I'd like to see it. — Mad Physicist, Dec 10 '17 at 15:01
For this test case, the obvious iteration is fast(est): `for s,e,v in spec: target[s:e]=v`. It scales with the number of fill sets. — hpaulj, Dec 10 '17 at 17:32
The relative advantage of the 'vectorized' solutions (if any) varies with number of slices compared to their length. Vectorization as proposed in the answers is only worth it if there are many short slices. — hpaulj, Dec 10 '17 at 17:55
What is the intended behavior if the ranges overlap? Or will they never do that? — Daniel F, Dec 11 '17 at 07:43

Divakar · Answer 1 · 2017-12-10T18:26:44.243

Here's a vectorized method inspired from the trick mentioned in this post -

def fillval(a, fill):
    info = np.asarray(fill)
    start, stop, val = info.T
    id_arr = np.zeros(len(a), dtype=int)
    id_arr[start] = 1
    id_arr[stop] = -1
    a[id_arr.cumsum().astype(bool)] = np.repeat(val, stop - start)
    return a

Sample run -

In [676]: a = np.zeros(20, dtype=int)
     ...: fill = [[1, 3, 500], [5, 7, 1000], [9, 15, 200]]

In [677]: fillval(a, fill)
Out[677]: 
array([   0,  500,  500,    0,    0, 1000, 1000,    0,    0,  200,  200,
        200,  200,  200,  200,    0,    0,    0,    0,    0])

Modified/optimized version

This could be modified/optimized further to do everything on the input with minimal memory footprint, like so -

def fillval(a, fill):
    fill = np.asarray(fill)
    start, stop, val = fill[:,0], fill[:,1], fill[:,2]
    a[start] = val
    a[stop] = -val
    return a.cumsum()

Sample run -

In [830]: a = np.zeros(20, dtype=int)
     ...: fill = [[1, 3, 500], [5, 7, 1000], [9, 15, 200]]

In [831]: fillval(a, fill)
Out[831]: 
array([   0,  500,  500,    0,    0, 1000, 1000,    0,    0,  200,  200,
        200,  200,  200,  200,    0,    0,    0,    0,    0])

Benchmarking

Other approaches -

# Loopy one
def loopy(a, fill):
    for start,stop,val in fill:
        a[start:stop] = val
    return a

# @Paul Panzer's soln
def multifill(target, spec):
    spec = np.asarray(spec)    
    inds = np.zeros((2*len(spec) + 2,), dtype=int)
    inds[-1] = len(target)
    inds[1:-1] = spec[:, :2].astype(int).ravel()
    lens = np.diff(inds)
    mask = np.repeat((np.arange(len(lens), dtype=np.uint8)&1).view(bool), lens)
    target[mask] = np.repeat(spec[:, 2], lens[1::2])
    return target

Timings -

Case #1 : Tightly spaced short groups

In [912]: # Setup inputs with group lengths at maximum extent of 10
     ...: L = 10000 # decides number of groups
     ...: np.random.seed(0)
     ...: s0 = np.random.randint(0,9,(L)) + 20*np.arange(L)
     ...: s1 = s0 + np.random.randint(2,10,(len(s0)))
     ...: fill = np.c_[s0,s1, np.random.randint(0,9,(len(s0)))].tolist()
     ...: len_a = fill[-1][1]+1
     ...: a0 = np.zeros(len_a, dtype=int)
     ...: a1 = a0.copy()
     ...: a2 = a0.copy()

In [913]: %timeit loopy(a0, fill)
     ...: %timeit multifill(a1, fill)
     ...: %timeit fillval(a2, fill)
100 loops, best of 3: 4.26 ms per loop
100 loops, best of 3: 4.49 ms per loop
100 loops, best of 3: 3.34 ms per loop

In [914]: # Setup inputs with group lengths at maximum extent of 10
     ...: L = 100000 # decides number of groups

In [915]: %timeit loopy(a0, fill)
     ...: %timeit multifill(a1, fill)
     ...: %timeit fillval(a2, fill)
10 loops, best of 3: 43.2 ms per loop
10 loops, best of 3: 49.4 ms per loop
10 loops, best of 3: 38.2 ms per loop

Case #2 : Widely spaced long groups

In [916]: # Setup inputs with group lengths at maximum extent of 10
     ...: L = 10000 # decides number of groups
     ...: np.random.seed(0)
     ...: s0 = np.random.randint(0,9,(L)) + 100*np.arange(L)
     ...: s1 = s0 + np.random.randint(10,50,(len(s0)))
     ...: fill = np.c_[s0,s1, np.random.randint(0,9,(len(s0)))].tolist()
     ...: len_a = fill[-1][1]+1
     ...: a0 = np.zeros(len_a, dtype=int)
     ...: a1 = a0.copy()
     ...: a2 = a0.copy()

In [917]: %timeit loopy(a0, fill)
     ...: %timeit multifill(a1, fill)
     ...: %timeit fillval(a2, fill)
100 loops, best of 3: 4.51 ms per loop
100 loops, best of 3: 9.18 ms per loop
100 loops, best of 3: 5.16 ms per loop

In [921]: # Setup inputs with group lengths at maximum extent of 10
     ...: L = 100000 # decides number of groups

In [922]: %timeit loopy(a0, fill)
     ...: %timeit multifill(a1, fill)
     ...: %timeit fillval(a2, fill)
10 loops, best of 3: 44.9 ms per loop
10 loops, best of 3: 89 ms per loop
10 loops, best of 3: 58.3 ms per loop

So, choosing the fastest one depends on the use case, specifically on the typical group lengths and their spread within the input array.

Just finished mine ... Your solution actually looks better in my benchmarks than in yours. — Paul Panzer, Dec 10 '17 at 16:14
@PaulPanzer Yup, I assumed less variance on group lengths (just 2-10) and also tighter in input array. — Divakar, Dec 10 '17 at 16:34
@PaulPanzer Seemed like `ar.cumsum()` was faster. Also, edited benchmarks with wider group lens. — Divakar, Dec 10 '17 at 16:47
Than `np.cumsum`? Isn't @hpaulj always telling us that the function versions actually call the method versions if present? But that shouldn't make a huge difference... — Paul Panzer, Dec 10 '17 at 16:58
@PaulPanzer Nah, I was using `np.cumsum(a, out=a)` earlier. That writing back seems slower. — Divakar, Dec 10 '17 at 17:00

Paul Panzer · Accepted Answer · 2017-12-11T01:27:13.090

You can build a mask and fill values using np.repeat:

import numpy as np

def multifill(target, spec):
    inds = np.zeros((2*len(spec) + 2,), dtype=int)
    inds[-1] = len(target)
    inds[1:-1] = spec[:, :2].astype(int).ravel()
    lens = np.diff(inds)
    mask = np.repeat((np.arange(len(lens), dtype=np.uint8)&1).view(bool), lens)
    target[mask] = np.repeat(spec[:, 2], lens[1::2])

target = np.zeros((16,))
spec = np.array([[1, 3, 500], [5, 7, 1000], [9, 15, 200]])
multifill(target, spec)
print(target)

# [    0.   500.   500.     0.     0.  1000.  1000.     0.     0.   200.
#    200.   200.   200.   200.   200.     0.]

Benchmarks. Divakar2 is fastest, but it requires the template to be all zeros. PP and Divakar1 are more flexible. Update: These are all vaporized by the simple loop, "thanks" @hpaulj.

# hpaulj                0.00256890 ms
# pp                    0.01587310 ms
# D1                    0.01193481 ms
# D2                    0.00533720 ms
# n=100000
# hpaulj                0.03514440 ms
# pp                    0.57968440 ms
# D1                    0.87605349 ms
# D2                    0.34365610 ms
# n=1000000
# hpaulj                0.50301510 ms
# pp                    6.91325230 ms
# D1                    8.96669030 ms
# D2                    3.97435970 ms

Code:

import numpy as np
import types
from timeit import timeit

def f_hpaulj(target, spec):
    for s, e, v in spec:
        target[int(s):int(e)] = v

def f_pp(target, spec):
    inds = np.zeros((2*len(spec) + 2,), dtype=int)
    inds[-1] = len(target)
    inds[1:-1:2] = spec[:, 0].astype(int)
    inds[2:-1:2] = spec[:, 1].astype(int)
    lens = np.diff(inds)
    mask = np.repeat((np.arange(len(lens), dtype=np.uint8)&1).view(bool), lens)
    target[mask] = np.repeat(spec[:, 2], lens[1::2])

def f_D1(a, info):
    start, stop, val = info[:,0].astype(int), info[:,1].astype(int), info[:,2]
    id_arr = np.zeros(len(a), dtype=int)
    id_arr[start] = 1
    id_arr[stop] = -1
    a[id_arr.cumsum().astype(bool)] = np.repeat(val, stop - start)

def f_D2(a, info):
    start, stop, val = info[:,0].astype(int), info[:,1].astype(int), info[:,2]
    id_arr = np.zeros(len(a), dtype=val.dtype)
    id_arr[start] = val
    id_arr[stop] = -val
    return id_arr.cumsum()

def setup_data(n, k):
    inds = np.sort(np.random.randint(0, n-2*k, (2*k,)) + np.arange(2*k))
    return np.c_[inds.reshape(-1, 2), np.random.randint(1, 10, (k,))].astype(float)

for n in (100, 100000, 1000000):
    k = 3**(n.bit_length()>>3)
    spec = setup_data(n, k)
    ref = np.zeros((n,))
    f_pp(ref, spec)
    print(f"n={n}")
    for name, func in list(globals().items()):
        if not name.startswith('f_') or not isinstance(func, types.FunctionType):
            continue
        try:
            res = np.zeros((n,))
            ret = func(res, spec)
            if not ret is None:
                res = ret
            assert np.allclose(ref, res)
            print("{:16s}{:16.8f} ms".format(name[2:], timeit(
                'f(a, spec)', 'a=np.zeros((n,))',
                globals={'f':func, 'spec':spec, 'np':np, 'n':n}, number=10)*100))
        except Exception:
            print("{:16s} apparently failed".format(name[2:]))

Your `k` is small relative to `n`. For that the simple iteration is substantially faster. `for s,e,v in spec: target[int(s):int(e)]=v` — hpaulj, Dec 10 '17 at 17:45
@hpaulj Yup! You are right there. Edited my timings to reflect that. — Divakar, Dec 10 '17 at 18:27

Vectorized solution to filling a 1-D numpy array given a list of start and end indices for slicing?

2 Answers2

Benchmarking