Given a set where each element is a string, how can I reduce the set to an integer that is the sum of the lengths of these strings?

setA = ("hi", "hello", "bye")
reduce(lambda .... for word in setA)

Calling reduce with some lambda function should return 10 (2 + 5 + 3).

I can do it with a couple lambdas, I think, but there must be a cleaner way.

Clev3r
  • reduce was removed from builtins for a good reason... Don't use it – JBernardo Feb 13 '13 at 21:42
  • @JBernardo: There are some cases where `reduce` is the right solution. That's why it was moved to `functools` instead of scrapped. But yeah, it definitely shouldn't be the first tool you reach for; there's _usually_ a simpler and better way to do it. – abarnert Feb 13 '13 at 21:42
  • PS, @Clever, `setA` is a `tuple`, not a `set`. Is that intentional? If not, use `{}` braces instead of `()` parens. – abarnert Feb 13 '13 at 21:43
  • @abarnert No, there are not. Use a for loop instead – JBernardo Feb 13 '13 at 21:43
  • @JBernardo: Then why does 3.3 still have `functools.reduce`? Do you want me to dig up the thread where Guido agreed that it was worth keeping around for the uncommon but still existing cases where it's useful, or are you going to disagree with him too? – abarnert Feb 13 '13 at 21:45
  • @abarnert I said "removed from builtins". And you can always write a for loop with about 10 extra chars to replace an unreadable reduce. – JBernardo Feb 13 '13 at 21:49
  • @JBernardo: And you can always write a for loop with about 10 extra chars to replace a list comprehension, generator expression, `map` or `filter` call, etc. Does that mean they're never useful? Again, `reduce` was removed from `builtins` because it was an "attractive nuisance" that caused people to overuse it, but it was left in the stdlib because it is actually useful. – abarnert Feb 13 '13 at 21:53
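Two points from the comments above can be checked directly: the parentheses in the question build a tuple rather than a set, and `reduce` is gone from the builtins but still available in `functools`. A minimal sketch, assuming Python 3; the name `as_set` is illustrative and not from the question.

```python
import builtins
import functools

setA = ("hi", "hello", "bye")      # parentheses: this is a tuple
as_set = {"hi", "hello", "bye"}    # braces: this is a set

print(type(setA).__name__)     # tuple
print(type(as_set).__name__)   # set

# On Python 3, reduce is no longer a builtin but still lives in functools.
reduce_is_builtin = hasattr(builtins, "reduce")
reduce_in_functools = hasattr(functools, "reduce")
print(reduce_is_builtin)       # False on Python 3
print(reduce_in_functools)     # True
```

On Python 2, where the thread was written, `reduce` is available without any import.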

4 Answers

11

The idiomatic solution is to use a generator expression:

sum(len(s) for s in setA)

Generator expressions and list comprehensions should be preferred over map() and reduce() and lambdas. The latter are available but are considered "unpythonic".

John Kugelman
  • +1. This isn't what the OP asked for, but it's a much simpler and more idiomatic way of doing what he's trying to do. – abarnert Feb 13 '13 at 21:38
  • Aye. I like map(), too. So this is unrelated... but a `for x in y` is called a generator? Why? Is it a call to python's generator module? – Clev3r Feb 13 '13 at 21:44
  • @Clever: See [Generator Expressions](http://docs.python.org/2/tutorial/classes.html#generator-expressions) in the tutorial. The idea is that it's shorthand for a generator function. – abarnert Feb 13 '13 at 21:48
  • @Clever: Or, more precisely: just as a list comprehension is shorthand for building and calling a function that loops, accumulating a list, and returns it, a generator expression is shorthand for building and calling a generator function that loops, yielding values. – abarnert Feb 13 '13 at 21:49
  • In the new edit, I'm not sure you want to categorically say they "should be preferred". The question you link to doesn't say that. Certainly you should avoid `map` if you'd have to use `lambda` (or `partial`, or define an otherwise-unnecessary function you don't have a good name for) to wrap up an expression, but cases like `map(len, setA)` are fine; I've even seen that in Guido's code. – abarnert Feb 13 '13 at 21:57
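The description of a generator expression as shorthand for a generator function can be sketched as follows. This is only an illustration, assuming the `setA` from the question; the function name `lengths` is hypothetical.

```python
setA = ("hi", "hello", "bye")


def lengths(words):
    """Generator function: lazily yields the length of each word."""
    for s in words:
        yield len(s)


# The generator expression behaves like calling the generator function.
total_genexp = sum(len(s) for s in setA)
total_genfunc = sum(lengths(setA))

# map with an existing named function, the case mentioned in the last comment.
total_map = sum(map(len, setA))

print(total_genexp, total_genfunc, total_map)  # 10 10 10
```

All three feed `sum` one length at a time without needing a `lambda`.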
4

Here you go:

sum(map(len, setA))
Joel Cornett
3
In [4]: reduce(lambda x, y: x + y, map(lambda x: len(x), setA))
Out[4]: 10
jassinm
  • That second `lambda` is unnecessary, as you could just do `map(len, ...`. – Joel Cornett Feb 13 '13 at 21:46
  • @JoelCornett: And the first is unnecessary too, as you could just do `reduce(operator.add, map(len, setA))`. – abarnert Feb 13 '13 at 21:47
  • Agreed, `reduce(operator.add, map(len, setA))` is the nicest of all for my taste. – jassinm Feb 13 '13 at 21:50
  • @locojay: Maybe nicest within the constraints of the original question, but if you're not forced to use `reduce`, `sum` (with the same `map` call or genexp) is much nicer. – abarnert Feb 13 '13 at 21:54
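The simplifications suggested in these comments can be compared side by side. A minimal sketch, assuming Python 3 (hence the `functools` import, which Python 2 does not need) and the `setA` from the question; the variable names are illustrative.

```python
from functools import reduce
import operator

setA = ("hi", "hello", "bye")

# Original form from the answer: two lambdas.
with_lambdas = reduce(lambda x, y: x + y, map(lambda x: len(x), setA))

# Both lambdas replaced by existing functions.
with_operator = reduce(operator.add, map(len, setA))

# Dropping reduce entirely.
with_sum = sum(map(len, setA))

print(with_lambdas, with_operator, with_sum)  # 10 10 10
```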
3

If you really want to do this with a lambda and reduce, you can:

reduce(lambda x, y: x + len(y), s, 0)

But I'm not sure why you'd want to reduce from 0 instead of just using sum, in which case your lambda is just lambda y: len(y), which is equivalent to just len.
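The initial value `0` matters in the `reduce` call above. A minimal sketch of why, assuming Python 3 and assuming that `s` in the answer stands for the question's `setA`; the variable names are illustrative.

```python
from functools import reduce

setA = ("hi", "hello", "bye")

# With the initial value 0, x is always an int accumulator.
with_initial = reduce(lambda x, y: x + len(y), setA, 0)
print(with_initial)  # 10

# Without it, x starts as the first string "hi", so "hi" + 5 raises TypeError.
try:
    reduce(lambda x, y: x + len(y), setA)
    failed_without_initial = False
except TypeError:
    failed_without_initial = True
print(failed_without_initial)  # True

# The initial value also gives a sensible result for empty input.
empty_total = reduce(lambda x, y: x + len(y), (), 0)
print(empty_total)  # 0
```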

abarnert
  • I think OP wants a "cleaner" alternative to using `reduce` and `lambda`s. – Joel Cornett Feb 13 '13 at 21:38
  • I decided to go with `reduce((lambda x,y: x + y), map(len, setA), 0)`. Although now that I think about it Joel takes the cake for cleanliness. – Clev3r Feb 13 '13 at 21:39
  • @JoelCornett: I'm not sure. He explicitly says "Calling reduce with some lambda function should return 10", and his problem is that he needs "a couple lambdas" and wants something cleaner. Using a single, simple lambda is about as clean as you can get within the confines of "calling reduce". – abarnert Feb 13 '13 at 21:39
  • @JoelCornett: I do agree that John Kugelman's answer is a better way to do what the OP actually wants—which is why I say "If you really want to do this with a `lambda` and a `reduce`", and "I'm not sure why you'd want to `reduce` instead of just using `sum`" in the answer, and "it's a much simpler and more idiomatic way" in a comment to John Kugelman's answer. – abarnert Feb 13 '13 at 21:41
  • @Clever: Why write `lambda x, y: x + y` instead of just using `operator.add`—or, again, much more simply, just using `sum` instead of `reduce`. – abarnert Feb 13 '13 at 21:41
  • +1. I think I was referring to the last line of OP's question. That almost wistful, "...but there must be a cleaner way." ;) – Joel Cornett Feb 13 '13 at 21:43
  • Because until 1 minute ago I was unaware of how to use sum/add. Obviously there must be many implementations of common lambda functions, but I didn't know and digging through documentation can be difficult if you don't know exactly what to look for. – Clev3r Feb 13 '13 at 21:43
  • @Clever: If you know what you want to do, but don't know how to do it, it's better to ask "How do I do this? Here's what I tried…" than "How do I use this feature?" This isn't quite an [XY problem](http://meta.stackexchange.com/questions/66377/what-is-the-xy-problem) case, but it's in the same direction. – abarnert Feb 13 '13 at 21:46
  • @JoelCornett: That's the great thing about Python. When you think "there must be a cleaner way", there almost always is… if not in builtins/stdlib, at least on PyPI. (I suppose when you think "there must be a cleaner way" in C or JS the same is true, but only because "do it in Python" is usually the answer…) – abarnert Feb 13 '13 at 21:50
  • A for loop in this case is not only more readable but uses less code: `x = 0`; `for y in s: x += len(y)`. – JBernardo Feb 13 '13 at 21:54
  • @JBernardo: Clearly, the `sum` version is more readable than either `reduce` or an explicit `loop`. The OP asked how to do this with `reduce`, so I answered with `reduce`. And John Kugelman showed the idiomatic and most readable way to do this. There's no reason to write the loop out. – abarnert Feb 13 '13 at 21:59
  • YES! reduce is never idiomatic python. – JBernardo Feb 13 '13 at 22:01
  • @JBernardo: Your code requires a semicolon—and isn't actually legal, because you can't follow a semicolon with a colon-introduced block on the same line. It also generates twice as many opcodes. And if you wrap it up as a function, you need a third line to `return x`, because there's no expression to stick the `return` in front of. And so on. But I'm not interested in arguing against rigid dogma any further. – abarnert Feb 13 '13 at 22:06
  • 1st: you *can* write a single line of code after a colon (because of SO comment space, the two code pieces should be two different lines). 2nd: try `timeit` – JBernardo Feb 13 '13 at 22:11
  • @JBernardo: Yes, you can write a single line of code after a colon—but you can't put a block statement after a semicolon. Try it: in either 2.7 or 3.3, you will get a `SyntaxError` pointing at your `for`. – abarnert Feb 13 '13 at 22:16
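The syntax point in this exchange can be checked without an interactive session. A minimal sketch, assuming Python 3 and using `compile` to test the one-line form as a string; the names `one_line` and `one_line_ok` are illustrative.

```python
s = ("hi", "hello", "bye")

# The loop as literally written in the comment, all on one line.
one_line = "x = 0; for y in s: x += len(y)"
try:
    compile(one_line, "<comment>", "exec")
    one_line_ok = True
except SyntaxError:
    one_line_ok = False
print(one_line_ok)  # False: a for statement cannot follow a semicolon

# Split across two lines, the same loop is valid.
x = 0
for y in s: x += len(y)

print(x, sum(len(y) for y in s))  # 10 10
```

Both sides are right about their own reading: the loop is legal once it is split across two lines, and illegal as a single line joined by a semicolon.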