Explanation Please

Question

This is my first ever question here on this website. For a bit of background, I am very interested in applying machine learning in preventative medicine, as I believe this is what the future of medicine has in store. For this reason I have been teaching myself Python via Rosalind.info. One problem had us create a function that calculates the GC content of a sequence. One issue I had was that when I used the first code, it evaluated to one. However, when I used the second code, it evaluated to the correct answer. If anyone can explain why this is the case, that'd be much appreciated!

First try:

n = input("Paste in sequence here!").upper()

def cg_content(sequence):
    gc_count = 0
    total = len(sequence)
    for base in sequence:
        if base =='C' or 'G':
            gc_count += 1
        else:
            gc_count = gc_count
    percentage = float(gc_count)/float(total)
    print(percentage) 

cg_content(n)

Second Try:

n = input("Paste in sequence here!").upper()

def cg_content(sequence):
    gc_count = 0
    total = len(sequence)
    for base in sequence:
        if base =='C' or base == 'G':
            gc_count += 1
        else:
            gc_count = gc_count
    percentage = float(gc_count)/float(total)
    print(percentage) 

cg_content(n)

I know it has something to do with the 'or' statement but I thought that both statements are essentially equivalent regardless of whether the '==' was there once or twice.

score 2 · Answer 1 · answered Oct 13 '18 at 23:54

The reason you get different outputs resides in:

if base =='C' or 'G':

which differs from:

if base =='C' or base == 'G':

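In the first case, the condition is parsed as (base == 'C') or ('G'), so whenever base is not 'C' you are evaluating the truth value of the expression 'G', which is a string literal (type str; Python has no separate char type). The Python documentation says:

By default, an object is considered true unless its class defines either a __bool__() method that returns False or a __len__() method that returns zero, when called with the object.

Since 'G' is a non-empty string, its length is non-zero, so if 'G' always evaluates to true and gc_count is incremented for every base.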

Funny story:

Even though 'G' is treated as true in a condition, the following comparison evaluates as False, because == compares the two values themselves rather than their truth values:

if 'G' == True:
    print("I will be never printed")
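
To make the difference between "is treated as true" and "is equal to True" concrete, here is a minimal sketch. The helper name truth_report is made up for illustration and is not part of the original code:

```python
def truth_report(value):
    """Return (truth value in a condition, result of comparing with == True).

    truth_report is a hypothetical helper used only for this illustration.
    """
    return bool(value), value == True


print(truth_report('G'))  # (True, False): non-empty str is truthy, but not equal to True
print(truth_report(''))   # (False, False): the empty str is falsy
print(truth_report(1))    # (True, True): the int 1 does compare equal to True
```

So `if 'G':` runs its body, while `if 'G' == True:` does not.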

score 1 · Answer 2 · answered Oct 13 '18 at 23:53

It's not equivalent. The first code is basically:

if (base == 'C') or ('G'):

In other words: if base == 'C' is true, or if 'G' is true. Testing if 'G' on its own is not meaningful here. In Python, zero, None, and empty containers (including the empty string) are false, and everything else is true by default, so the non-empty string 'G' is always true.

Therefore, you're effectively saying if base == 'C' or True, which is always true, so the if branch always runs and every base gets counted. That is why the result is one. Anything or True is true, even False or True ;)
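
A rough sketch of the two conditions side by side may help. The function names buggy_fraction and fixed_fraction are made up here, and the functions return the value instead of printing it so the results are easy to compare:

```python
def buggy_fraction(sequence):
    """Count with the condition from the first try (hypothetical name)."""
    count = 0
    for base in sequence:
        if base == 'C' or 'G':  # parsed as (base == 'C') or ('G'), always truthy
            count += 1
    return count / len(sequence)


def fixed_fraction(sequence):
    """Count with the condition from the second try (hypothetical name)."""
    count = 0
    for base in sequence:
        if base == 'C' or base == 'G':
            count += 1
    return count / len(sequence)


# `or` returns one of its operands, not necessarily a bool.
base = 'A'
print(repr(base == 'C' or 'G'))  # 'G'

print(buggy_fraction("AATT"))  # 1.0, even though there is no C or G
print(fixed_fraction("AATT"))  # 0.0
print(fixed_fraction("AGCT"))  # 0.5
```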

Serge · Answer 3 · 2018-10-15T15:47:36.787

In Python, any non-empty string is interpreted as true in a boolean context.

Thus base == 'C' or 'G', which is the same as (base == 'C') or 'G', is always interpreted as true, regardless of the value of base.

While Python is often intuitive and elegant, it follows a strict set of rules rather than common sense when interpreting code. The tradition of interpreting non-null/non-empty values as true stems from C, with which the standard Python interpreter tightly interoperates. It is not very strict or intuitive, but it often allows for simpler and more elegant code.
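
If the goal is to write "C or G" only once, a membership test is one common way to do it. This is just a sketch: the name gc_content and the choice to return 0.0 for an empty sequence are my own assumptions, not something from the question:

```python
def gc_content(sequence):
    """Return the fraction of bases in `sequence` that are C or G.

    Assumption for this sketch: an empty sequence gives 0.0 rather than
    raising ZeroDivisionError.
    """
    sequence = sequence.upper()
    if not sequence:
        return 0.0
    # `base in ('C', 'G')` compares base against each listed value in turn.
    gc_count = sum(1 for base in sequence if base in ('C', 'G'))
    return gc_count / len(sequence)


print(gc_content("AGCTATAG"))  # 0.375
print(gc_content("ggcc"))      # 1.0
```

Here `base in ('C', 'G')` does what `base == 'C' or 'G'` was presumably meant to do.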
