3

I was just going through this page and found this entry:

print sum(ord(c) for c in 'Happy new year to you!')

It is Python code, and when executed it prints 2014. Could someone help a Java developer understand exactly what's going on here?

Hele
  • Now I know why there seems to be renewed interest in this. Also, @Frg figured out how to update it for 2015. 'A Happy New Year to You!' – dansalmo Jan 08 '15 at 23:49
  • As a Java developer, you might like this also: http://stackoverflow.com/questions/10363927/the-simplest-algorithm-for-poker-hand-evaluation/#answer-20715903 – dansalmo Jan 08 '15 at 23:59

6 Answers

5

A few things to understand:

  • Strings are iterable by default, so one can simply iterate over each element in a string:

    for c in 'Hello there':
        print c
    
  • ord is a built-in function that returns the actual numerical code point for a character.

  • The expression ord(c) for c in 'Happy new year to you!' is a generator expression. Evaluating it returns a generator object (an iterator), which produces the next value each time next() is called on it (the __next__() method in Python 3, .next() in Python 2). sum makes those calls for us under the covers, and evaluation is lazy: a value is not computed until it is requested. This is useful when the expression would produce a lot of values.

    This is actually the crux of the snippet; it tersely expresses something that would have to be written more clumsily in Java.

  • sum takes an iterable of numbers as an argument and returns their total.
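
The laziness described above can be seen by pulling values out by hand. This is only a minimal sketch: the names gen, first, second and rest are made up for illustration, and it relies on the built-in next(), which should work in Python 2.6+ as well as Python 3.

```python
# Build the generator; no ord() call has happened yet.
gen = (ord(c) for c in 'Happy new year to you!')

# Each next() call computes exactly one value.
first = next(gen)   # code point of 'H'
second = next(gen)  # code point of 'a'

# sum() consumes whatever is still left in the generator.
rest = sum(gen)

print(first + second + rest)  # 2014, the same as summing in one go
```
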
Makoto
4
int s = 0;

for (char c: "Happy new year to you!".toCharArray())
  s += (int) c;

System.out.println(s);
Ignacio Vazquez-Abrams
1

ord() converts a character to its integer code point (which is the ASCII value for characters in the ASCII range). sum() adds up a collection of objects for which the addition operation is defined, mathematical scalar addition in this case.

The expression inside the sum() is a generator expression, a kind of iterable expression that had no clean equivalent in Java before the streams added in Java 8, but is similar to LINQ in .NET. Essentially, it is an inline for-each loop: it loops over each character in the string "Happy new year to you!", calculates the numeric value of the character with ord, and the resulting values are summed.
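
To make the "inline for-each loop" reading concrete, here is a rough sketch of the same computation written as an explicit loop. The variable names text and total are only for illustration.

```python
text = 'Happy new year to you!'

# Explicit equivalent of sum(ord(c) for c in text).
total = 0
for c in text:
    total += ord(c)  # add this character's code point

print(total)  # 2014
```
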

mobiusklein
1

1) The built-in function ord returns the integer value of a character.

>>> help(ord)
Help on built-in function ord in module __builtin__:

ord(...)
    ord(c) -> integer

    Return the integer ordinal of a one-character string.

2) A for loop iterates over each character of the string 'Happy new year to you!'

>>> for c in 'Happy new year to you!':
...     print ord(c)
...
72
97
112
112
...

3) (ord(c) for c in 'Happy new year to you!') is a generator expression in Python.

>>> result = (ord(c) for c in 'Happy new year to you!')
>>> result.next()
72
>>> result.next()
97

4) The built-in function sum returns the total of the integer values of the characters:

>>> help(sum)
Help on built-in function sum in module __builtin__:

sum(...)
    sum(sequence[, start]) -> value

    Returns the sum of a sequence of numbers (NOT strings) plus the value
    of parameter 'start' (which defaults to 0).  When the sequence is
    empty, returns start.

So the result of combining all these expressions is:

>>> sum(ord(c) for c in 'Happy new year to you!')
2014

Another possible solution could be:

>>> sum(map(ord, 'Happy new year to you!'))
2014
Tanveer Alam
0

print is a statement (in Python 2.x) that will print the expression that follows it.

(Note that in Python 3.x, print() is a function that prints its arguments.)

The expression is a call to a built-in function sum(). Whatever it is summing, the result is 2014, so print prints 2014.

sum() is being passed a special construct called a "generator expression". This is similar to a "list comprehension" but a bit more efficient.[1] The basic format of a generator expression is:

expression for variable in iterable

Here, variable is c. The iterable is a string, 'Happy new year to you!' The expression is a call to the built-in function ord() that returns an integer representing the character it is passed; for example, ord('A') returns 65.

So, this sums the ordinal values of all the characters in the string; the sum is 2014 and that is printed.

[1] A list comprehension builds a list of values. A generator expression doesn't build a list; it produces an iterator that yields one value at a time as it is advanced. Functions in Python that accept iterables are able to accept a generator expression and get the values from it.

You could write this with a list comprehension to build a list, then sum the list. But if you did that, the list would be constructed, looked at once, then garbage-collected. Why waste the effort to allocate and destroy the list object, when all you want is to sum the values? Thus, the generator expression.
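
As a rough illustration of the trade-off in the footnote, the sketch below computes the sum both ways. The names text, as_list, from_list, gen, from_gen and leftover are invented for the example.

```python
text = 'Happy new year to you!'

# List comprehension: the whole list is built in memory first.
as_list = [ord(c) for c in text]
from_list = sum(as_list)

# Generator expression: values are handed to sum() one at a time.
gen = (ord(c) for c in text)
from_gen = sum(gen)

# A generator can only be consumed once; summing it again adds nothing.
leftover = sum(gen)

print(from_list == from_gen)  # True
```

Both forms give the same total; the difference is only whether an intermediate list exists.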

steveha
-1

An expression of the form found in this code snippet and surrounded by "naked" ( ) is called a generator expression. It produces a specific kind of iterable known as a generator in Python.

There are other kinds of comprehensions as well. The same expression surrounded by naked brackets is a list comprehension. Example:

[char for char in "string"] 

This will produce a list:

['s','t','r','i','n','g']

And "naked" braces (aka a set comprehension) produce a set:

{char for char in "string"} 

This makes the set (the order of the elements is arbitrary):

{'s','t','r','i','n','g'}

(There are also dictionary comprehensions.)
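
For completeness, a dictionary comprehension might look like the sketch below. The name code_points is made up for the example; it maps each character of "string" to its ord value.

```python
# Dictionary comprehension: key and value are separated by a colon.
code_points = {char: ord(char) for char in "string"}

print(code_points['s'])  # 115
```
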

As I said at the start, putting just parentheses around an expression of the form something for something in something_else produces a special kind of iterator called a generator in Python (rather than a list or a set, like the above examples).

In Python, lots of things are iterable, including strings. Inside the generator, the string is iterated over and its characters are retrieved one at a time (s, t, and so on). The retrieved character is the object referred to by char for that iteration.

The ord(char) part applies the ord function to each char in turn as the string is iterated over. The ord function simply finds the Unicode code point for the particular character that has been retrieved from the string. That value is then the result of the generator for the current iteration.

To get the values out of a generator, you must iterate over it in some way - such as using next(), or a for...in statement. But usually you can also apply a generator as an argument to any function that receives an iterable for an argument. In this case, sum() (which is obviously meant to add a series of successive arguments together) is being applied to all of the results of the generator. Each yielded result of the generator is a member of the series.

So the overall effect of the code is to add together the Unicode code points of all the characters in the string. That this particular message comes out to 2014 just seems to be a coincidence. Nothing mysterious or magical is going on there.
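
The total depends on the exact characters used, including their case. As a small sketch, the hypothetical helper char_sum below (not part of the original snippet) compares the original message with the capitalized variant quoted in a comment on the question.

```python
def char_sum(text):
    """Sum the code points of every character in text (illustrative helper)."""
    return sum(ord(c) for c in text)

print(char_sum('Happy new year to you!'))    # 2014
print(char_sum('A Happy New Year to You!'))  # 2015
```
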

Rick
  • Your explanation of comprehensions is weird and confusing. I don't know what you mean by "naked" parentheses or square braces. To anyone confused by the explanation, here's a good reference to list comprehensions and generator expressions: https://docs.python.org/2/howto/functional.html#generator-expressions-and-list-comprehensions I also would say you have iterators and generators backwards: a generator is a kind of iterator, but you can't really say that iterators are "aka" generators. As for the 2014 result, my guess is someone carefully made the sentence come out that way. – steveha Jan 23 '15 at 02:44
  • i just meant it was a coincidence that particular message came out to 2014. i'm sure someone put it together on purpose. thanks for the comments. – Rick Jan 23 '15 at 02:48
  • after rereading my answer i agree it's confusing. i have the terminology way wrong here. i guess i have learned a lot in 3 weeks. – Rick Jan 23 '15 at 02:49
  • I didn't downvote you, but someone did. You might want to just delete the answer, which removes the downvote. We're always learning here (including me of course) and you can answer more questions. Three weeks, huh? Welcome to StackOverflow! – steveha Jan 23 '15 at 02:52
  • nah, what's done is done. i made an attempt to fix it up though. – Rick Jan 23 '15 at 03:10