Stopping a Reduce() operation mid way. Functional way of doing partial running sum

Question

I have been doing some functional programming and had a question. Perhaps I might be missing something but is there any way to stop a "reduce()" function midway? Lets say when I reach a certain condition? The idea somehow seems anti functional. I haven't seen any such option in python or F#,

As an example, lets say I have a list such as [1,2,3,4,5]. I want to sum the elements in this list until the sum is not greater than some number (lets say 8), and return/mark/store/identify somehow, the number of elements I have actually added.

If we looked at python for example for I might try something like

reduce(lambda a,b : a if a + b > 8 else a + b, input)

This gives me the right answer 6, but how do I find that I had added 3 elements to get here. There is no counter as such. I can't do assignments inside lambdas. I think F# has the same situation.

I know I can use a for loop or use a function that can store state etc. But what would be the functional way of doing/thinking about this. Reduce() wants to run until the end, but somewhere along this line of processing, we either want to stop it (because we don't care about processing the rest of the elements) or at least make a note of the place where we stopped caring.

What's important to you, the 3 or the 6? Or both? How would you want to use this function? Return a tuple - `(num_items, result)`? It's a neat idea, but I think a loop is the most straightforward code. — Stephen, Jun 28 '10 at 06:07
They are both important. I want to know I can take 3 elements and that the closest I can get to my limit is 6. Yes, a loop would be pretty straight forward, but I wanted to see how a functional programmer would attack it / think about it. I can't return a tuple, because reduce needs another int from the function to add to the next element in the list. — Chaitanya, Jun 28 '10 at 06:14
Regarding Python, it could be possible to write a `filtered_reduce` function, but Python remains to be an imperative language whose functional-like features should not be overestimated. In Python the call to `reduce` is translated into a simple loop anyway, so you gain nothing from it. — Philipp, Jun 28 '10 at 06:29

tux21b · Accepted Answer · 2010-06-28T06:46:17.883

15

Reduce is often used in combination with map. Google for example has developed a map-reduce framework for querying their databases and this map-reduce pattern is now used in several other projects (e.g. CouchDB, Hadoop, etc).

First, you need to map the input variables [2, 1, 3, 4, 5] to something like:

[(1, 2), (1, 1), (1, 3), (1, 4), (1, 5)]

In that case, x[0] will represent the number of the elements to get the sum x[1]. Of course, the number of elements is 1 at the beginning for each single element.

The next thing then, is to operate on those tuples:

reduce(
    lambda a, b: a if a[1] + b[1] > 8 else (a[0] + b[0], a[1] + b[1]),
    map(lambda x: (1, x), input))

This will return (3, 6), meaning the partial sum is 6 using 3 elements.

I hope you got the idea behind map-reduce-algorithms.

Regards,
Christoph

edited Jun 28 '10 at 06:46

answered Jun 28 '10 at 06:24

tux21b

90,183
16
117
101

Oooohhhh.... niiice. I had read about map reduce, but I guess I didn't fully grok it. Very nicely done. – Chaitanya Jun 28 '10 at 07:08
1

Here are two links which might interest you: Google's Map-Reduce paper (http://labs.google.com/papers/mapreduce.html) and a course Map Reduce in a Week (http://code.google.com/edu/submissions/mapreduce/listing.html). – tux21b Jun 28 '10 at 07:21
And a Python framework (based on Erlang) for doing efficient map-reduce computing is Disco. With that you can use multiple cores / computers and work with (nearly) unlimited data sets... http://discoproject.org/ – tux21b Jun 28 '10 at 07:54
1

I'm not downvoting, but this can hardly be idiomatic FP..? Chaitanya has picked his golden hammer, and you're helping him/her use it to bash a square peg into a round hole. – eevar Jun 28 '10 at 07:59
Hehe, well put! Anyway, map-reduce algorithms can be more maintainable and faster because they can scale well (of course not with the Python build-ins!). And for a short Python solution where performance doesn't matter at all, they might be shorter to write. So Map-Reduce isn't that bad, but it's surely not the fastest Python-only solution... no doubt on that :) – tux21b Jun 28 '10 at 08:20
2

Nice description of map/reduce, but if the input contained a million values and we hit the exit condition after three of them that's a lot of empty work being done. When you hit the exit condition use an exception to exit the loop. – Duncan Jun 28 '10 at 09:00
@eevar - Surely this is more idiomatic FP than using a loops with counter increments. JaredPar and Tomas have similar FP'ish solutions, but I like this because its shorter and also uses map/reduce :-p. So can you explain which part of it you find non FP - idiomatic. @Duncan - Yes for a relatively small program that doesn't do partitioning map reduce might be an overkill, and python's implementation is loops anyway. I wasn't concerned about python or F# particularly, but more about thinking "Functionally". I have some data, the how, what and when of running functions over it to manipulate it. – Chaitanya Jun 28 '10 at 10:54
It might make more sense to use `reduce( lambda a, b: a if a[1] + b[1] > 8 else (b[0], a[1] + b[1]), enumerate(input, 1))`... [**Try it here**](http://repl.it/nKj/1) – mbomb007 May 08 '15 at 19:49

score 9 · Answer 2 · answered Jun 28 '10 at 10:27

I agree with JaredPar that writing your own recursive function that behaves similarly to fold, but allows you to stop the computation earlier is the best approach. The way I would write it is a bit more general (so that you can use the function for any situation where you need folding that can stop earlier):

// Generalized 'fold' function that allws you to stop the execution earlier
// The function 'f' has a type 'State -> 'T -> Option<'State>
// By returning 'None' we can stop the execution (and return the 
// current state), by returning Some(newState), we continue folding
let rec foldStop f state input = 
  match input with
  | x::xs -> 
      match f state x with
      | None -> state
      | Some(newState) -> foldStop f newState xs
  | [] -> state

// Example that stops folding after state is larger than 10
foldStop (fun st n -> if st > 10 then None else Some(st + n)) 0 [ 1 .. 10 ]

This is a very general function and you can use it for all similar scenarios. The nice thing about writing it is that you will never need to write similar explicit recursion again (because you can just use foldStop once you have it).

Note that you can use foldStop to implement fold by always wrapping the result of the accumulation function in 'Some' (so it is more general):

let fold f state input = 
  foldStop (fun st n -> Some(f st n)) state input

But I want to return the final state when I stopped as well as place where I stopped as well. My F# isn't fluent enough, but this would require changing the state and the input function as follows : foldStop (fun (st,i) n -> if st > 10 then None else Some(st + n, i + 1)) (0,0) [ 1 .. 10 ] — Chaitanya, Jun 28 '10 at 11:42
@Chaitanya: Yes, that would require changing the code a little bit (or you would need to update the condition to stop on the next state). Alternatively, you could use `Choice` instead of `Option` (that allows you to return the state, but still break the computation by returning a speical case). — Tomas Petricek, Jun 28 '10 at 12:13

tokland · Answer 3 · 2017-10-13T21:04:31.867

6

Let's imagine Python had two functions, ireduce (similar to reduce but it would yield intermediate values; it's called scanl in some languages) and ilast (get last item of an iterable):

from itertools import takewhile
from operator import add
xs = [1, 2, 3, 4, 5]
pair = ilast(enumerate(takewhile(lambda x: x < 8, ireduce(add, xs, 0))))
# (3, 6)

In Haskell:

last $ zip [0..] (takeWhile (< 8) (scanl (+) 0 xs))

edited Oct 13 '17 at 21:04

answered Jun 28 '10 at 14:09

tokland

66,169
13
144
170

Hmmm ... Haskell one of those languages I keep wanting to learn but never get around to it – Chaitanya Jun 29 '10 at 00:36
`itertools.dropwhile` is also similarly useful together with `next` to return the first item in an iterable. – Asclepius Oct 12 '17 at 17:44

score 5 · Answer 4 · answered Jun 28 '10 at 06:08

I think that the 'most functional' way to do this is probably via lazy evaluation. If you're in a lazy language like Haskell, or in an eager language but using a lazy list data structure (like LazyList in the F# PowerPack), you can create e.g. a 'scan' of the running sums, and then leave it in the hands of the consumer of the list to decide how much she wants/needs to evaluate.

Or, you know, write a simple recursive function, like @JaredPar's answer. For some reason I often get a mental block on that, preventing me from noticing that "not everything has to be a fold, you can in fact write your own recursive functions" :)

Indeed. I am in that block now ... I keep thinking there's gotta be a way to fold or partially fold this thing. I know there are other ways of doing it, but fold/reduce keeps beckoning me — Chaitanya, Jun 28 '10 at 07:12

score 3 · Answer 5 · answered Jun 28 '10 at 06:07

3

Try the following

let sumUntil list stopAfter = 
    let rec inner list sum = 
        if sum >= stopAfter then sum
        else 
            match list with
            | [] -> sum
            | h::t-> inner t (sum + h)
    inner list 0

F# interactive result

> sumUntil [1;2;3;4;5] 8;;
val it : int = 10

answered Jun 28 '10 at 06:07

JaredPar

733,204
149
1,241
1,454

In other words, not use reduce at all? I just keep thinking there must be someway in the lambda/function thats passed to a reduce that there should be a way to make some state changes and/or stop abort the processing – Chaitanya Jun 28 '10 at 06:19
Right, `reduce` is no good for this. It has the wrong type signature, and it always processes the entire list. – Brian Jun 28 '10 at 06:29
This is only returning the sum though. Not the number of elements that we had added up. But I guess it would be easy to change the inner recursive loop to take a counter and thread that counter through while incrementing it every time we call the inner recursive loop – Chaitanya Jun 28 '10 at 11:16

Stephen · Answer 6 · 2010-06-28T06:41:04.140

2

This is a function that implements that functional program:

>>> def limited_reduce(reducer, pred, lst):
...  i = 0
...  y = lst[0]
...  while pred(y) and i < len(lst):
...    i += 1
...    y = reducer(lst[i], y)
...  return (i, y)

or recursively:

>>> def limited_reduce(reducer, pred, lst):
...   def helper(i, accum, rest):
...     if not rest or not pred(accum): return (i, accum)
...     return helper(i+1, reducer(rest[0], accum), rest[1:])
...   return helper(0, lst[0], lst[1:])

There's probably a way to clean it up a bit, but you would use it like this:

>>>> limited_reduce(lambda x,y: x+y, lambda r: r < 6, [1,2,1,3,2])
(3, 7)

edited Jun 28 '10 at 06:41

answered Jun 28 '10 at 06:25

Stephen

47,994
7
61
70

Good solution, +1 from me. But note that your `reduce` is `foldr` and requires a sequence, unlike the builtin `reduce`. – Philipp Jun 28 '10 at 06:33
@Philipp : Thanks! Good point about the sequence. Now you've got me reading about `foldr` :) – Stephen Jun 28 '10 at 06:49

Johan Kullbom · Answer 7 · 2010-06-29T11:54:22.963

Another functional approch could be using a "continution"-based version of reduce/fold:

let rec foldC fn acc cont = function
    | []      -> acc
    | x :: xs -> fn x acc (fun acc -> foldC fn acc cont xs)

Call with 'id' (fun x -> x) as 'initial continuation':

foldC (fun x sum c -> 
           if (sum + x) > 8 
           then sum 
           else c (sum + x))
      0
      (fun x -> x) 
      [1; 2; 3; 4; 5]

And you will get your '6'.

Note that this version of foldC is not tail recursive - or otherwise recommended - thought...

score 2 · Answer 8 · answered Jun 28 '10 at 21:00

I think this does what you are after, using functions built-in to the F# Seq module:

let answer =
    [1; 2; 3; 4; 5]
    |> Seq.scan (fun (count,sum) x -> (count+1, sum + x) ) (0,0)
    |> Seq.find (fun (_,x) -> x > 8)

The "scan" function is similar to "fold", but returns a sequence containing intermediate (and final) states, rather than just the final state. In this case, the state is a tuple containing a count and sum of items thus far processed, starting with (0,0). This gets computed and fed, one at a time, into the "find" function, which returns the first element which matches the supplied condition (v>8), in this case (4,10).

The only issue you'd need to handle with the above is the case where the "find" condition is never satisfied, in which case a KeyNotFoundException is thrown. You could use "tryFind" which returns an option value. However, I can't see a graceful way to return the last element computed if no earlier state matches the condition, short of pre-computing the length of the sequence:

let xs = [1; 2; 3; 4; 5]
let len = Seq.length xs
let answer =
    xs
    |> Seq.scan (fun (count,acc) v -> (count+1, v + acc) ) (0,0)
    |> Seq.find (fun (count,v) -> v > 99 || count = len)

Grant Palmer · Answer 9 · 2018-04-18T14:35:31.903

If you want to avoid carrying out unnecessary computations (which you still will with a map-reduce algorithm), write your own reduce and catch StopIteration:

from functools import reduce as _reduce

def stop_iter(rv=None):
    raise StopIteration(rv)

def reduce(*args):
    try: return _reduce(*args)
    except StopIteration as e: return e.args[0]

Then, write a step function that wraps the return value in a call to stop_iter when you reach a certain condition. Using your original lambda:

reduce(lambda a, b : stop_iter(a) if a + b > 8 else a + b, input)

Similar to Duncan's answer, but allows you to use lambdas (no manually raising exceptions).

score 1 · Answer 10 · answered Feb 13 '15 at 06:47

I know you're specifically interested in python, but I thought I would chime in with respect to how Clojure accomplishes this, since it solves the problem quite elegantly and directly.

Clojure has a reduced function that returns a version of whatever it is passed, such that this version will immediately terminate within a call to reduce. This makes it trivially simple to do something like this:

(reduce (fn [a v]
          (if (< a 100) 
            (+ a v)
            (reduced a)))
        (range 20))
;; => 105

This returns the first sum which is greater than or equal to a hundred, or the greatest sum reached if none exceeds. And it's worth noting that it does this without consuming/iterating through the entirety of the collection being reduced over, which could be very large or even infinite lazy sequence. Moreover, this has a definite advantage over applying some filter operation first, as you can have your termination condition dependent on the value being constructed, not just by individual values in the collection being reduced.

You mention this idea seems somehow "anit-functional". This might seem to be the case in python, where it's unclear how you would accomplish it without resorting to some messy outside state (or at best an alternate version of reduce). However, this works cleanly and functionally (even purely so) in Clojure because it's been baked into the language. The key is that reduce knows to look for reduced values, and objects can carry that information around with them (either as a wrapped value of as metadata; not sure which actually...).

It's certainly a handy feature I've been glad to have when I've needed it.

And to complete your solution in order to match the accepted one in python from @tux21b, you can add a counter in the accumulator and get both the sum and the count: (reduce (fn [[a c] v] (if (< a 100) [(+ a v) (inc c)] (reduced [a c]))) [0 0] (range 20)) — T.Gounelle, Feb 13 '15 at 08:19

score 1 · Answer 11 · answered Jun 28 '10 at 08:58

The only way to get out of the builtin reduce part way through is to throw an exception. Fortunately it's not hard to get the desired result this way:

def interruptible_reduce(fn, *args):
    try:
        return reduce(fn, *args)
    except StopIteration, e:
        return e.args[0]

def reducefn(a, b):
    total = a[1] + b[1]
    if total > 8:
        raise StopIteration(a)
    return (a[0]+b[0], total)

input = [2, 1, 3, 4, 5]

>>> from itertools import imap
>>> interruptible_reduce(reducefn, imap(lambda x: (1,x), input))
(3, 6)

score 1 · Answer 12 · answered May 15 '18 at 08:08

First, in F#. What's the first triangle number greater than 100?

> [1..1000] |> Seq.scan (+) 0 |> Seq.find (fun x -> x > 100);;
val it : int = 105

Note that Seq.scan is lazy, so triangle numbers beyond the solution are never calculated.

To find the ordinal of the solution, we exchange find for findIndex

> [1..1000] |> Seq.scan (+) 0 |> Seq.findIndex (fun x -> x > 100);;
val it : int = 14

In Python, the analogue of F#'s List.scan is itertools.accumulate, introduced Python 3.2 (2011).

>>> from itertools import accumulate
>>> next(x for x in accumulate(range(0,1000)) if x > 100)
105
>>> next(i for (i,x) in enumerate(accumulate(range(0,1000))) if x > 100)
14

score 0 · Answer 13 · answered Jun 28 '10 at 06:41

Here is a slight variation of Stephen's code, using foldl instead of foldr (I hope) and not requiring a sequence:

#!/usr/bin/env python

import operator
import functools

def limited_reduce(op, it, start, pred):
    if not pred(start):
        return 0, start
    for i, x in enumerate(it):
        y = op(start, x)
        if pred(y):
            start = y
        else:
            break
    return i, start

print limited_reduce(operator.add, xrange(1, 6), 0,
                     functools.partial(operator.gt, 8))

Stopping a Reduce() operation mid way. Functional way of doing partial running sum

13 Answers13

Linked