29

I want to get a running total from a list of numbers.

For demo purposes, I start with a sequential list of numbers using range

a = range(20)

runningTotal = []
for n in range(len(a)):
    new = runningTotal[n-1] + a[n] if n > 0 else a[n]
    runningTotal.append(new)

# This one is a syntax error
# runningTotal = [a[n] for n in range(len(a)) if n == 0 else runningTotal[n-1] + a[n]]

for i in zip(a, runningTotal):
    print "{0:>3}{1:>5}".format(*i)

yields

  0    0
  1    1
  2    3
  3    6
  4   10
  5   15
  6   21
  7   28
  8   36
  9   45
 10   55
 11   66
 12   78
 13   91
 14  105
 15  120
 16  136
 17  153
 18  171
 19  190

As you can see, I initialize an empty list [], then append() in each loop iteration. Is there a more elegant way to this, like a list comprehension?

alain.janinm
  • 19,951
  • 10
  • 65
  • 112
Kit
  • 30,365
  • 39
  • 105
  • 149

14 Answers14

30

A list comprehension has no good (clean, portable) way to refer to the very list it's building. One good and elegant approach might be to do the job in a generator:

def running_sum(a):
  tot = 0
  for item in a:
    tot += item
    yield tot

to get this as a list instead, of course, use list(running_sum(a)).

Alex Martelli
  • 854,459
  • 170
  • 1,222
  • 1,395
28

If you can use numpy, it has a built-in function named cumsum that does this.

import numpy as np
tot = np.cumsum(a)  # returns a np.ndarray
tot = list(tot)     # if you prefer a list
Glorfindel
  • 21,988
  • 13
  • 81
  • 109
Mr Fooz
  • 109,094
  • 6
  • 73
  • 101
12

I'm not sure about 'elegant', but I think the following is much simpler and more intuitive (at the cost of an extra variable):

a = range(20)

runningTotal = []

total = 0
for n in a:
  total += n
  runningTotal.append(total)

The functional way to do the same thing is:

a = range(20)
runningTotal = reduce(lambda x, y: x+[x[-1]+y], a, [0])[1:]

...but that's much less readable/maintainable, etc.

@Omnifarous suggests this should be improved to:

a = range(20)
runningTotal = reduce(lambda l, v: (l.append(l[-1] + v) or l), a, [0])

...but I still find that less immediately comprehensible than my initial suggestion.

Remember the words of Kernighan: "Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it."

pjz
  • 41,842
  • 6
  • 48
  • 60
  • 1
    +1 for the debugging quote, emphasizing the un-readability of the reduce example :) – Kit Aug 08 '10 at 03:04
  • 1
    I would've written the `reduce` as `reduce(lambda l, v: (l.append(l[-1] + v) or l), a, [0])` – Omnifarious Aug 08 '10 at 03:08
  • @Satoru.Logic - I think dismissing `reduce` by making the code purposely more obscure than it has to be is rather disingenuous. I also think there is a bit of towing the party line that reduce is scary and you shouldn't program functionally in Python. – Omnifarious Aug 08 '10 at 20:53
  • @Omnifarious Me too. I never use FP in Python until I have to do so. – satoru Aug 09 '10 at 00:00
  • @Satoru.Logic - Well, I use it when I think it makes the solution to a problem clearer. In this case, I think it's a wash. – Omnifarious Aug 09 '10 at 11:51
  • Thank you so much for the second code (the reduce implementation). For the fun's sake I built a minimalist brainf*** interpreter with it (supports +,-,., inspired by [obfuscated perl contest](http://www.retas.de/thomas/computer/programs/useless/japh/japh_2/index.html))! ### src = '' \ print''.join(map(chr,reduce(lambda x,y:x+[x[-1]+y],[sum({'+':1,'-':-1}[c]for c in l if c in'+-')for l in src.split('.')],[0])[1:])) – cessor Mar 26 '13 at 22:05
10

Use itertools.accumulate(). Here is an example:

from itertools import accumulate

a = range(20)
runningTotals = list(accumulate(a))

for i in zip(a, runningTotals):
    print "{0:>3}{1:>5}".format(*i)

This only works on Python 3. On Python 2 you can use the backport in the more-itertools package.

Boris Verkhovskiy
  • 14,854
  • 11
  • 100
  • 103
mleyfman
  • 635
  • 4
  • 13
  • 1
    This is an old question with a lot of old answers, but in 2015 this is the best solution. – sffc Oct 02 '15 at 04:26
10

This can be implemented in 2 lines in Python.

Using a default parameter eliminates the need to maintain an aux variable outside, and then we just do a map to the list.

def accumulate(x, l=[0]): l[0] += x; return l[0];
map(accumulate, range(20))
satoru
  • 31,822
  • 31
  • 91
  • 141
  • 4
    This "exploits" a Python feature that has tripped me up before. I like it, but fear that it makes for a nasty trap if related code ever needs to be debugged! – Kaushik Ghose Oct 14 '14 at 19:11
  • 5
    More like 4 PEP-8 lines :) – user Jan 26 '15 at 05:40
  • 2
    An official "accumulate" function is now available in Python 3 as `from itertools import accumulate`. Also, while clever, satoru's "accumulate" implementation will break as soon as you try running it a second time. – sffc Oct 02 '15 at 04:23
  • 3
    downvoted, because as @sffc said, this will give you an incorrect result when running more than once – fferri Jul 29 '16 at 15:18
7

When we take the sum of a list, we designate an accumulator (memo) and then walk through the list, applying the binary function "x+y" to each element and the accumulator. Procedurally, this looks like:

def mySum(list):
    memo = 0
    for e in list:
        memo = memo + e
    return memo

This is a common pattern, and useful for things other than taking sums — we can generalize it to any binary function, which we'll supply as a parameter, and also let the caller specify an initial value. This gives us a function known as reduce, foldl, or inject[1]:

def myReduce(function, list, initial):
    memo = initial
    for e in list:
        memo = function(memo, e)
    return memo

def mySum(list):
    return myReduce(lambda memo, e: memo + e, list, 0)

In Python 2, reduce was a built-in function, but in Python 3 it's been moved to the functools module:

from functools import reduce

We can do all kinds of cool stuff with reduce depending on the function we supply as its the first argument. If we replace "sum" with "list concatenation", and "zero" with "empty list", we get the (shallow) copy function:

def myCopy(list):
    return reduce(lambda memo, e: memo + [e], list, [])

myCopy(range(10))
> [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

If we add a transform function as another parameter to copy, and apply it before concatenating, we get map:

def myMap(transform, list):
    return reduce(lambda memo, e: memo + [transform(e)], list, [])

myMap(lambda x: x*2, range(10))
> [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]

If we add a predicate function that takes e as a parameter and returns a boolean, and use it to decide whether or not to concatenate, we get filter:

def myFilter(predicate, list):
    return reduce(lambda memo, e: memo + [e] if predicate(e) else memo, list, [])

myFilter(lambda x: x%2==0, range(10))
> [0, 2, 4, 6, 8]

map and filter are sort of unfancy ways of writing list comprehensions — we could also have said [x*2 for x in range(10)] or [x for x in range(10) if x%2==0]. There's no corresponding list comprehension syntax for reduce, because reduce isn't required to return a list at all (as we saw with sum, earlier, which Python also happens to offer as a built-in function).

It turns out that for computing a running sum, the list-building abilities of reduce are exactly what we want, and probably the most elegant way to solve this problem, despite its reputation (along with lambda) as something of an un-pythonic shibboleth. The version of reduce that leaves behind copies of its old values as it runs is called reductions or scanl[1], and it looks like this:

def reductions(function, list, initial):
    return reduce(lambda memo, e: memo + [function(memo[-1], e)], list, [initial])

So equipped, we can now define:

def running_sum(list):
    first, rest = list[0], list[1:]
    return reductions(lambda memo, e: memo + e, rest, first)

running_sum(range(10))
> [0, 1, 3, 6, 10, 15, 21, 28, 36, 45]

While conceptually elegant, this precise approach fares poorly in practice with Python. Because Python's list.append() mutates a list in place but doesn't return it, we can't use it effectively in a lambda, and have to use the + operator instead. This constructs a whole new list, which takes time proportional to the length of the accumulated list so far (that is, an O(n) operation). Since we're already inside the O(n) for loop of reduce when we do this, the overall time complexity compounds to O(n2).

In a language like Ruby[2], where array.push e returns the mutated array, the equivalent runs in O(n) time:

class Array
  def reductions(initial, &proc)
    self.reduce [initial] do |memo, e|
      memo.push proc.call(memo.last, e)
    end
  end
end

def running_sum(enumerable)
  first, rest = enumerable.first, enumerable.drop(1)
  rest.reductions(first, &:+)
end

running_sum (0...10)
> [0, 1, 3, 6, 10, 15, 21, 28, 36, 45]

same in JavaScript[2], whose array.push(e) returns e (not array), but whose anonymous functions allow us to include multiple statements, which we can use to separately specify a return value:

function reductions(array, callback, initial) {
    return array.reduce(function(memo, e) {
        memo.push(callback(memo[memo.length - 1], e));
        return memo;
    }, [initial]);
}

function runningSum(array) {
    var first = array[0], rest = array.slice(1);
    return reductions(rest, function(memo, e) {
        return x + y;
    }, first);
}

function range(start, end) {
    return(Array.apply(null, Array(end-start)).map(function(e, i) {
        return start + i;
    }
}

runningSum(range(0, 10));
> [0, 1, 3, 6, 10, 15, 21, 28, 36, 45]

So, how can we solve this while retaining the conceptual simplicity of a reductions function that we just pass lambda x, y: x + y to in order to create the running sum function? Let's rewrite reductions procedurally. We can fix the accidentally quadratic problem, and while we're at it, pre-allocate the result list to avoid heap thrashing[3]:

def reductions(function, list, initial):
    result = [None] * len(list)
    result[0] = initial
    for i in range(len(list)):
        result[i] = function(result[i-1], list[i])
    return result

def running_sum(list):
    first, rest = list[0], list[1:]
    return reductions(lambda memo, e: memo + e, rest, first)

running_sum(range(0,10))
> [0, 1, 3, 6, 10, 15, 21, 28, 36, 45]

This is the sweet spot for me: O(n) performance, and the optimized procedural code is tucked away under a meaningful name where it can be re-used the next time you need to write a function that accumulates intermediate values into a list.

  1. The names reduce/reductions come from the LISP tradition, foldl/scanl from the ML tradition, and inject from the Smalltalk tradition.
  2. Python's List and Ruby's Array are both implementations of an automatically resizing data structure known as a "dynamic array" (or std::vector in C++). JavaScript's Array is a little more baroque, but behaves identically provided you don't assign to out of bounds indices or mutate Array.length.
  3. The dynamic array that forms the backing store of the list in the Python runtime will resize itself every time the list's length crosses a power of two. Resizing a list means allocating a new list on the heap of twice the size of the old one, copying the contents of the old list into the new one, and returning the old list's memory to the system. This is an O(n) operation, but because it happens less and less frequently as the list grows larger and larger, the time complexity of appending to a list works out to O(1) in the average case. However, the "hole" left by the old list can sometimes be difficult to recycle, depending on its position in the heap. Even with garbage collection and a robust memory allocator, pre-allocating an array of known size can save the underlying systems some work. In an embedded environment without the benefit of an OS, this kind of micro-management becomes very important.
April Arcus
  • 87
  • 2
  • 3
  • You just resurrected a 5-year-old thread, but, thank you! I learned a lot: especially by knowing that it's a common pattern, and that there are best practices for it. – Kit Mar 26 '15 at 07:29
  • 1
    Minor bug: you would need to fix your indexes by 1 in `reductions` in a few places; or you can rewrite reductions to automatically grab the first item of the list as an initial value (same as built-in `reduce`). Alternatively, you can just create a function that appends to and returns a list, and replace `.append` in your original `O(N^2)` with that function. – max Jan 07 '17 at 17:45
  • Also, do you think `itertools.accumulate` is essentially the same as your `reductions`, or there are some meaningful differences between the two (besides returning iterator vs list)? – max Jan 07 '17 at 18:21
  • @max - yes, they are equivalent besides their return types and evaluation style (my `reductions` implementation is strict; `itertools.accumulate` is lazy). – April Arcus Apr 02 '20 at 17:18
4

Starting Python 3.8, and the introduction of assignment expressions (PEP 572) (:= operator), we can use and increment a variable within a list comprehension:

# items = range(7)
total = 0
[(x, total := total + x) for x in items]
# [(0, 0), (1, 1), (2, 3), (3, 6), (4, 10), (5, 15), (6, 21)]

This:

  • Initializes a variable total to 0 which symbolizes the running sum
  • For each item, this both:
    • increments total by the current looped item (total := total + x) via an assignment expression
    • and at the same time returns the new value of total as part of the produced mapped tuple
Xavier Guihot
  • 54,987
  • 21
  • 291
  • 190
3

I wanted to do the same thing to generate cumulative frequencies that I could use bisect_left over - this is the way I've generated the list;

[ sum( a[:x] ) for x in range( 1, len(a)+1 ) ]
user1908017
  • 47
  • 1
  • 1
2

Here's a linear time solution one liner:

list(reduce(lambda (c,s), a: (chain(c,[s+a]), s+a), l,(iter([]),0))[0])

Example:

l = range(10)
list(reduce(lambda (c,s), a: (chain(c,[s+a]), s+a), l,(iter([]),0))[0])
>>> [0, 1, 3, 6, 10, 15, 21, 28, 36, 45]

In short, the reduce goes over the list accumulating sum and constructing an list. The final x[0] returns the list, x[1] would be the running total value.

topkara
  • 886
  • 9
  • 15
2

Another one-liner, in linear time and space.

def runningSum(a):
    return reduce(lambda l, x: l.append(l[-1]+x) or l if l else [x], a, None)

I'm stressing linear space here, because most of the one-liners I saw in the other proposed answers --- those based on the pattern list + [sum] or using chain iterators --- generate O(n) lists or generators and stress the garbage collector so much that they perform very poorly, in comparison to this.

nickie
  • 5,608
  • 2
  • 23
  • 37
  • This is very elegant! I go a bit stuck on the 'or l' part until I've realized it's short for `...; return(l)` – Palimondo Oct 11 '18 at 09:01
1

I would use a coroutine for this:

def runningTotal():
    accum = 0
    yield None
    while True:
        accum += yield accum

tot = runningTotal()
next(tot)
running_total = [tot.send(i) for i in xrange(N)]
aaronasterling
  • 68,820
  • 20
  • 127
  • 125
  • 2
    alex's answer is much cleaner but i'll leave this one up as an example of why not to use coroutines – aaronasterling Aug 08 '10 at 02:47
  • This answer does have the virtue of allowing the interpreter to allocate a list of the appropriate size to hold the result right up front. I suspect the interpreter is generally not that smart yet though. – Omnifarious Aug 09 '10 at 11:56
0

This is inefficient as it does it every time from beginning but possible it is:

a = range(20)
runtot=[sum(a[:i+1]) for i,item in enumerate(a)]
for line in zip(a,runtot):
    print line
Tony Veijalainen
  • 5,447
  • 23
  • 31
0

You are looking for two things: fold (reduce) and a funny function that keeps a list of the results of another function, which I have called running. I made versions both with and without an initial parameter; either way these need to go to reduce with an initial [].

def last_or_default(list, default):
    if len(list) > 0:
        return list[-1]
    return default

def initial_or_apply(list, f, y):
    if list == []:
        return [y]
    return list + [f(list[-1], y)]

def running_initial(f, initial):
    return (lambda x, y: x + [f(last_or_default(x,initial), y)])

def running(f):
    return (lambda x, y: initial_or_apply(x, f, y))

totaler = lambda x, y: x + y
running_totaler = running(totaler)
running_running_totaler = running_initial(running_totaler, [])

data = range(0,20)
running_total = reduce(running_totaler, data, [])
running_running_total = reduce(running_running_totaler, data, [])

for i in zip(data, running_total, running_running_total):
    print "{0:>3}{1:>4}{2:>83}".format(*i)

These will take a long time on really large lists due to the + operator. In a functional language, if done correctly, this list construction would be O(n).

Here are the first few lines of output:

0   0                      [0]
1   1                   [0, 1]
2   3                [0, 1, 3]
3   6             [0, 1, 3, 6]
4  10         [0, 1, 3, 6, 10]
5  15     [0, 1, 3, 6, 10, 15]
6  21 [0, 1, 3, 6, 10, 15, 21]
Cirdec
  • 24,019
  • 2
  • 50
  • 100
0

with Python 3.8 and above you can now use walrus operator

xs = range(20)
total = 0
run = [(total := total + d) for d in xs]
Rohit Sharma
  • 6,136
  • 3
  • 28
  • 47