Ok, after coming to my senses, here's a non-ridiculous version of window_iter_fill
. My previous version (visible in edits) was terrible because I forgot to use izip
. Not sure what I was thinking. Using izip, this works, and, in fact, is the fastest option for small inputs!
def window_iter_fill(gen, size=2, fill=None):
gens = (chain(repeat(fill, size - i - 1), gen, repeat(fill, i))
for i, gen in enumerate(tee(gen, size)))
return izip(*gens)
This one is also fine for tuple-yielding, but not quite as fast.
def window_iter_deque(it, size=2, fill=None, fill_left=False, fill_right=False):
lfill = repeat(fill, size - 1 if fill_left else 0)
rfill = repeat(fill, size - 1 if fill_right else 0)
it = chain(lfill, it, rfill)
d = deque(islice(it, 0, size - 1), maxlen=size)
for item in it:
d.append(item)
yield tuple(d)
HoverHell's newest solution is still the best tuple-yielding solution for high inputs.
Some timings:
Arguments: [xrange(1000), 5, 'x', True, True]
==============================================================================
window HoverHell's frankeniter : 0.2670ms [1.91x]
window_itertools from old itertools docs : 0.2811ms [2.02x]
window_iter_fill extended `pairwise` with izip : 0.1394ms [1.00x]
window_iter_deque deque-based, copying : 0.4910ms [3.52x]
ia_with_copy deque-based, copying v2 : 0.4892ms [3.51x]
ia deque-based, no copy : 0.2224ms [1.60x]
==============================================================================
Scaling behavior:
Arguments: [xrange(10000), 50, 'x', True, True]
==============================================================================
window HoverHell's frankeniter : 9.4897ms [4.61x]
window_itertools from old itertools docs : 9.4406ms [4.59x]
window_iter_fill extended `pairwise` with izip : 11.5223ms [5.60x]
window_iter_deque deque-based, copying : 12.7657ms [6.21x]
ia_with_copy deque-based, copying v2 : 13.0213ms [6.33x]
ia deque-based, no copy : 2.0566ms [1.00x]
==============================================================================
The deque-yielding solution by agf is super fast for large inputs -- seemingly O(n) instead of O(n, m) like the others, where n is the length of the iter and m is the size of the window -- because it doesn't have to iterate over every window. But I still think it makes more sense to yield a tuple in the general case, because the calling function is probably just going to iterate over the deque anyway; it's just a shift of the computational burden. The asymptotic behavior of the larger program should remain the same.
Still, in some special cases, the deque
-yielding version will probably be faster.
Some more timings based on HoverHell's test structure.
>>> import testmodule
>>> kwa = dict(gen=xrange(1000), size=4, fill=-1, fill_left=True, fill_right=True)
>>> %timeit -n 1000 [a + b + c + d for a, b, c, d in testmodule.window(**kwa)]
1000 loops, best of 3: 462 us per loop
>>> %timeit -n 1000 [a + b + c + d for a, b, c, d in testmodule.ia(**kwa)]
1000 loops, best of 3: 463 us per loop
>>> %timeit -n 1000 [a + b + c + d for a, b, c, d in testmodule.window_iter_fill(**kwa)]
1000 loops, best of 3: 251 us per loop
>>> %timeit -n 1000 [sum(x) for x in testmodule.window(**kwa)]
1000 loops, best of 3: 525 us per loop
>>> %timeit -n 1000 [sum(x) for x in testmodule.ia(**kwa)]
1000 loops, best of 3: 462 us per loop
>>> %timeit -n 1000 [sum(x) for x in testmodule.window_iter_fill(**kwa)]
1000 loops, best of 3: 333 us per loop
Overall, once you use izip
, window_iter_fill
is quite fast, as it turns out -- especially for small windows.