4

Which is "more correct (logically)"? Specific to Leap Year, not in general.

  1. With Parentheses

    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)
    
  2. Without

    return year % 4 == 0 and year % 100 != 0 or year % 400 == 0
    

Additional Info

Parentheses change how the boolean operators are grouped (without parentheses, and is applied before or).

Because each larger divisor in this problem is a multiple of the smaller ones (400 of 100, 100 of 4), both versions return the correct result, but I'm still curious.

Observe the effects of parentheses:

  1. False and True or True
    #True
    
    False and (True or True)
    #False
    
  2. False and False or True
    #True
    
    False and (False or True)
    #False
    

Without parentheses, there are scenarios where, even though year is not divisible by 4 (first bool), the expression still returns True (I know that's impossible in this problem)! Isn't being divisible by 4 a MUST, and therefore isn't it more correct to include parentheses? Anything else I should be paying attention to here? Can someone explain the theoretical logic of including or omitting parentheses?
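
The claim that both versions agree for every year can be checked by brute force. This is a minimal sketch; the function names with_parens and without_parens and the year range 1 to 9999 are assumptions made for illustration, not part of the question:

```python
def with_parens(year):
    # Version 1 from the question
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

def without_parens(year):
    # Version 2 from the question
    return year % 4 == 0 and year % 100 != 0 or year % 400 == 0

# Collect every year on which the two versions disagree.
mismatches = [y for y in range(1, 10000) if with_parens(y) != without_parens(y)]
print(mismatches)  # prints []
```

The two groupings can only disagree when the first test is false and the last test is true, which cannot happen here because every multiple of 400 is also a multiple of 4.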

JBallin

7 Answers

5

The parens affect how your booleans are grouped. The and operator binds more tightly than or, so:

a and b or c

becomes:

(a and b) or c

If either BOTH a and b are truthy, OR c is truthy, we get a truthy result.

With the parentheses you get:

a and (b or c)

Now you get True if both a is truthy and either b OR c is truthy.
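
The grouping described above can be checked over every combination of three booleans. This is just a sketch; the names differs and the loop variables are illustrative:

```python
from itertools import product

differs = []
for a, b, c in product([False, True], repeat=3):
    # The unparenthesized form always matches (a and b) or c.
    assert (a and b or c) == ((a and b) or c)
    # Record the inputs where grouping the "or" first changes the result.
    if (a and b or c) != (a and (b or c)):
        differs.append((a, b, c))

print(differs)  # [(False, False, True), (False, True, True)]
```

The two groupings differ only when a is false and c is true.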


As far as "correctness," as long as your code derives the correct result then "more correct" is only a matter of opinion. I would include parens where you feel like it makes the result more clear. For instance:

if (a and b) or c:

is more clear than

if a and b or c:

However it is NOT more clear (in my opinion) than:

if some_long_identifier and some_other_long_identifier or \
   some_third_long_identifier_on_another_line:

Your guide when writing Python code should be PEP 8. PEP 8 is silent on when you should include stylistic parentheses (read: parens that follow the natural order of operations), so use your best judgement.


For leap years specifically, the logic is:

  1. If the year is evenly divisible by 4, go to step 2. ...
  2. If the year is evenly divisible by 100, go to step 3. ...
  3. If the year is evenly divisible by 400, go to step 4. ...
  4. The year is a leap year (it has 366 days).
  5. The year is not a leap year (it has 365 days).

In other words: all years divisible by 4 are leap years, unless they're divisible by 100 and NOT divisible by 400, which translates to:

return y % 4 == 0 and not (y % 100 == 0 and y % 400 != 0)
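
As a sketch of the same sentence with each clause given a name (the function name is_leap and the variable names are hypothetical, chosen only for illustration), checked against the standard library's calendar.isleap:

```python
import calendar

def is_leap(y):
    divisible_by_4 = y % 4 == 0
    # The exception: divisible by 100 and NOT divisible by 400.
    century_exception = y % 100 == 0 and y % 400 != 0
    return divisible_by_4 and not century_exception

# Compare with the standard library over an arbitrary range of years.
assert all(is_leap(y) == calendar.isleap(y) for y in range(1, 10000))
```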
Adam Smith
  • 1
    Already stated in "Additional Info", does not answer question: specifically asking about Leap Year problem, not in general – JBallin Aug 15 '16 at 18:16
  • @JBallin Then I think you should restate your question, since this exactly answers the question as-asked. – Adam Smith Aug 15 '16 at 18:20
  • Last line shows how I can separate (b or c) logically! Also - [De Morgan's Laws](https://en.wikipedia.org/wiki/De_Morgan's_laws): not (a and b) = not a or not b – JBallin Aug 16 '16 at 15:48
  • @JBallin The conversation seems to be getting a bit pedantic. Correct code is the code that A) produces the correct output, B) does so within the performance benchmarks of its function, and C) creates the least amount of technical debt. Where you put the parens in this statement affects exactly none of these things, and De Morgan's Laws are mathematical proofs, not a style guide for Python. – Adam Smith Aug 16 '16 at 16:18
  • This was a theoretical/logical question (vs. in practice). I cited DML because your last line coupled with DML was the logical bridge I was looking for - wanted to share for future readers! – JBallin Aug 16 '16 at 22:29
3

Include the parentheses. In English, the rule is:

  1. Year must be divisible by 4.
  2. Year must not be divisible by 100, unless it's divisible by 400.

The version with parentheses matches this two-pronged rule best.

return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)
       ^^^^^^^^^^^^^     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
            (1)                          (2)

As it happens, removing the parentheses does not break the code, but it leads to an unnatural version of the rules:

  1. Year must be divisible by 4, but not by 100; or
  2. Year must be divisible by 400.

That's not the way I think of the leap year rule.
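
To see the two prongs at work, the sketch below evaluates each one separately for a few sample years. The helper name leap_prongs and the chosen years are assumptions for illustration:

```python
def leap_prongs(year):
    prong_1 = year % 4 == 0                          # (1) divisible by 4
    prong_2 = year % 100 != 0 or year % 400 == 0     # (2) not by 100, unless by 400
    return prong_1, prong_2

for year in (2023, 2024, 1900, 2000):
    print(year, leap_prongs(year))
# 2023 (False, True)
# 2024 (True, True)
# 1900 (True, False)
# 2000 (True, True)
```

A year is a leap year only when both prongs are true.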

John Kugelman
2

Which answer is "more correct" and why?

It's not a matter of what is 'more correct'; rather, what logic do you wish to implement? Parentheses in boolean expressions change the order of operations. This allows you to force precedence in evaluation.

>>> (True or True) and False  # or expression evaluates first.
False
>>> True or True and False  # and evaluates first.
True

As for the logic in the leap year formula, the rules go as follows:

  1. Leap Years are any year that can be evenly divided by 4 (such as 2012, 2016, etc)

  2. Except if it can be evenly divided by 100, then it isn't (such as 2100, 2200, etc)

  3. Except if it can be evenly divided by 400, then it is (such as 2000, 2400)

Thus the exception rules must take precedence, which is why the parentheses around the or are needed for the expression to mirror the formula's rules. Otherwise the two operands of and would be grouped first.
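
The example years listed in the rules above can be checked against Python's standard-library calendar.isleap. This is only a sketch; the helper is_leap and the dictionary name example_years are illustrative, not part of the answer:

```python
import calendar

def is_leap(year):
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

# Years named in the three rules, mapped to whether they should be leap years.
example_years = {2012: True, 2016: True, 2100: False, 2200: False, 2000: True, 2400: True}

for year, expected in example_years.items():
    assert is_leap(year) == expected == calendar.isleap(year)
```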

ospahiu
  • I stated this in "Additional Info" – JBallin Aug 15 '16 at 18:25
  • @JBallin Updated with more logic explanation. – ospahiu Aug 15 '16 at 18:32
  • Are you saying: "(divisible by 4) AND (not exception)"? – JBallin Aug 16 '16 at 15:53
  • 1
    @JBallin So we know that the year has to be evenly divisible by 4; that is the first part of the expression. Then we know that if it isn't evenly divisible by 100 then it can also be a leap year if rule 1 holds true. So we need rule 1 to hold true regardless, with either the negation of rule 2 or rule 3 evaluating to true in order for the entire expression to be true. Thus, the whole expression -> (#1) and (not #2 or #3). – ospahiu Aug 16 '16 at 16:59
1

Which answer is "more correct" and why? (Specific to Leap Year Logic, not in general)

With Parentheses

return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

Without

return year % 4 == 0 and year % 100 != 0 or year % 400 == 0

Depends on your definition of "more correct". As you know, both return the correct result.

As for what "more correct" could mean: if you are referring to performance benefits, I cannot think of any, considering the current smart compilers.

If you are discussing the human readability point of view, I would go with,

return year % 4 == 0 and year % 100 != 0 or year % 400 == 0

It naturally narrows down the scope, as opposed to your other alternative, which visually seems to contain two disjoint elements.

I would suggest including parentheses, but as below:

return (year % 4 == 0 and year % 100 != 0) or year % 400 == 0

Community
1

As you noted, in operation, there is no difference, since a number being divisible by 400 implies that it is also divisible by 100, which implies that it is also divisible by 4. Operationally, whether the parentheses have any effect depends on the operator precedence rules of the language. Most languages today follow the conventions of C, which means a specified precedence of operators (and binds more tightly than or), and left-to-right otherwise. When in doubt, I always put parentheses for readability.

Stylistically, this sort of thing is hard to read when put in a long expression like that. If it must be one expression, I would prefer the logical "sum of products" to the "product of sums", so I would go with:

return (year%400 == 0) or (year%100 != 0 and year%4 == 0)

Or even

bool IsLeap = false;
if (year%4 == 0) IsLeap = true;
if (year%100 == 0) IsLeap = false;
if (year%400 == 0) IsLeap = true;

return IsLeap;

An optimizing compiler will make efficient code, anyway, and this sort of thing really helps poor humans like me to read it.
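
The step-by-step snippet above is written in C-style syntax. A rough Python equivalent, using the hypothetical function name is_leap_stepwise, might look like this:

```python
def is_leap_stepwise(year):
    is_leap = False
    if year % 4 == 0:      # divisible by 4: tentatively a leap year
        is_leap = True
    if year % 100 == 0:    # divisible by 100: the exception applies
        is_leap = False
    if year % 400 == 0:    # divisible by 400: the exception to the exception
        is_leap = True
    return is_leap

print([y for y in (1900, 2000, 2023, 2024) if is_leap_stepwise(y)])  # [2000, 2024]
```

Each later test overrides the earlier ones, so the order of the three if statements matters.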

roderick young
1

Answer: Include Parentheses


John Kugelman explains why there are 2 separate logical tests rather than 3, which means the last 2 should be grouped together:

  1. Year must be divisible by 4.
  2. (2) Year must not be divisible by 100, (3) unless it's divisible by 400.

The version with parentheses matches this two-pronged rule best.

return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)
       ^^^^^^^^^^^^^     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
            (1)                          (2)

As it happens, removing the parentheses does not break the code, but it leads to an unnatural version of the rules:

  1. Year must be divisible by 4, but not by 100; or
  2. Year must be divisible by 400.

That's not the way I think of the leap year rule.


Inspired by mrdomoboto, 100/400 are the exception!:

Year must be divisible by 4; 100 is an exception and 400 is an exception to the exception, but together they still form one exception (see above). This means that if year is not divisible by 4 then the whole thing must be False. The only way to ensure this is to put parens around the exception, because False and <any bool> always evaluates to False.

See below examples of this from JBallin

  1. False and True or True
    #True
    
    False and (True or True)
    #False
    
  2. False and False or True
    #True
    
    False and (False or True)
    #False
    

Adam Smith translated the English into code:

All years divisible by 4 are leap years, unless they're divisible by 100 and NOT divisible by 400, which translates to:

return y % 4 == 0 and not (y % 100 == 0 and y % 400 != 0)

JBallin cited De Morgan's Laws:

not(a and b) = (not a or not b)

To convert above into the desired answer:

#convert using "DML"
return y % 4 == 0 and (not y % 100 == 0 or not y % 400 != 0)
#remove "not"s by switching "==" and "!="
return y % 4 == 0 and (y % 100 != 0 or y % 400 == 0)
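
The conversion above can be sanity-checked in two steps: first De Morgan's law itself over all boolean inputs, then the two leap-year forms over a range of years. This is a sketch; the function names negated_form and parenthesized_form and the year range are assumptions for illustration:

```python
from itertools import product

# De Morgan: not (a and b) == (not a) or (not b), for every boolean pair.
for a, b in product([False, True], repeat=2):
    assert (not (a and b)) == ((not a) or (not b))

def negated_form(y):
    return y % 4 == 0 and not (y % 100 == 0 and y % 400 != 0)

def parenthesized_form(y):
    return y % 4 == 0 and (y % 100 != 0 or y % 400 == 0)

# The two forms should agree on every year in the range.
assert all(negated_form(y) == parenthesized_form(y) for y in range(1, 10000))
```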
Community
JBallin
  • The parentheses allow Python's short-circuit AND operator (https://docs.python.org/2/library/stdtypes.html#boolean-operations-and-or-not) to save CPU time. The `y % 400` calculation doesn't happen if the `y % 4` test is false. Also, get rid of some modulo (division) math... Use this instead: `((year & 3) == 0 && ((year % 25) != 0 || (year & 15) == 0))` https://stackoverflow.com/a/11595914/733805 – Kevin P. Rice Dec 10 '18 at 12:31
1

Here's the other difference: SPEED AND EFFICIENCY.

Aside from order of evaluation (already mentioned in other answers)...

Let's simplify the original expression year % 4 == 0 and (year % 100 != 0 or year % 400 == 0) to this:

A and (B or C)

If A is false, there is no reason to test B or C because and requires BOTH sides to be true.

SHORT-CIRCUIT OPERATORS

Logical operators and and or have a "short-circuit" effect in many languages, including Python, where the right side is evaluated only when the left side does not already determine the result (see https://docs.python.org/2/library/stdtypes.html#boolean-operations-and-or-not):

  • and short-circuits when the left side is false (because the right side cannot make the result be true)
  • or short-circuits when the left side is true (because the right side cannot make the result be false)

WITH PARENTHESES:

A and (B or C)

  • When A is false, the right side (B or C) doesn't get evaluated, saving CPU resources.
  • When A is true, B gets evaluated, but C only gets evaluated if B is false.

WITHOUT PARENTHESES:

A and B or C

  • When A is false, B doesn't get evaluated, but C gets (needlessly) evaluated.
  • When A is true, B gets evaluated. If B is false, C also gets evaluated.

CONCLUSION: Without parentheses, C (the year % 400 test) gets needlessly evaluated whenever A (the year % 4 test) is false. That is 3 out of every 4 years, or 75% of the time, in which the CPU could stop at A but continues to do more math needlessly.
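
The difference in what gets evaluated can be observed by recording each divisibility test as it runs. This is a minimal sketch; the helpers divisible and trace and the list evaluated are hypothetical names introduced only for this demonstration:

```python
evaluated = []

def divisible(year, n):
    """Record which divisor was tested, then run the test."""
    evaluated.append(n)
    return year % n == 0

def with_parens(y):
    return divisible(y, 4) and (not divisible(y, 100) or divisible(y, 400))

def without_parens(y):
    return divisible(y, 4) and not divisible(y, 100) or divisible(y, 400)

def trace(expression, year):
    """Return the divisors tested while evaluating expression(year)."""
    evaluated.clear()
    expression(year)
    return list(evaluated)

print(trace(with_parens, 2023))     # [4]
print(trace(without_parens, 2023))  # [4, 400]
```

For a year divisible by 4, such as 2024, both versions test the same divisors.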

MOST-EFFICIENT EXPRESSION:

Use this instead (C-style syntax; in Python, write and/or in place of &&/||): ((year & 3) == 0 && ((year % 25) != 0 || (year & 15) == 0))

This expression replaces modulo (slow division) with bitwise-AND (fast!) in two cases.

More details at: https://stackoverflow.com/a/11595914/733805
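
A Python rendering of the bitwise expression, checked against the standard library's calendar.isleap, could look like the sketch below. The function name is_leap_bitwise and the year range are assumptions; the sketch only checks that the results match and does not measure speed:

```python
import calendar

def is_leap_bitwise(year):
    # (year & 3) == 0  is true exactly when year is divisible by 4.
    # (year & 15) == 0 is true exactly when year is divisible by 16;
    # together with divisibility by 25, that means divisible by 400.
    return (year & 3) == 0 and ((year % 25) != 0 or (year & 15) == 0)

assert all(is_leap_bitwise(y) == calendar.isleap(y) for y in range(1, 10000))
```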

Kevin P. Rice