Avoiding repetition of if statement

Question

I have prepared functions and sorted the data for this task (it is actually Advent of Code (AoC) day 4, but here is a quick explanation to make it clear). The data is already arranged in this structure:

byr:1991
eyr:2022
hcl:#341e13
iyr:2016
pid:729933757
hgt:167cm
ecl:gry

hcl:231d64
cid:124
ecl:gmt
eyr:2039
hgt:189in
pid:#9c3ea1

ecl:#1f58f9
pid:#758e59
iyr:2022
hcl:z
byr:2016
hgt:68
eyr:1933

[and so on for 250+ packages (by package I mean a set of byr, ecl, eyr, ... lines; packages are separated by a blank line).]

and prepared this code:

def check_fields(list):
    comparison_list = ['byr', 'iyr', 'eyr',
                       'hgt', 'hcl', 'ecl',
                       'pid']
    statement = True
    for i in comparison_list:
        statement = statement and (i in list)
    return statement


def check_byr_iyr_eyr(line):
    prefix,value = line.split(':')
    cases = {'byr':{'min':1920, 'max':2002},
             'iyr':{'min':2010, 'max':2020},
             'eyr':{'min':2020, 'max':2030} }
    return cases[prefix]['min'] <= int(value) <= cases[prefix]['max']


def check_hgt(line):
    unit = line[-2:]
    value = line[line.index(':')+1: -2]
    cases = {'cm':{'min':150, 'max':193},
             'in':{'min':59, 'max':76}}
    # A missing or unknown unit (e.g. 'hgt:68') is invalid, not a KeyError
    if unit not in cases or not value.isdigit():
        return False
    return cases[unit]['min'] <= int(value) <= cases[unit]['max']


def check_hcl(line):
    statement = True
    if line[line.index(':')+1] != '#' or len(line[line.index(':')+2:]) != 6:
        return False
    else:
        string = line[line.index('#')+1:]
        for i in string:
            statement = statement and (97 <= ord(i) <= 102 or 48 <= ord(i) <= 57)
        return statement


def check_ecl(line):
    comparison_list = ['amb', 'blu', 'brn',
                       'gry', 'grn', 'hzl',
                       'oth' ]
    if line[line.index(':') + 1:] in comparison_list:
        return True
    return False


def check_pid(line):
    if len(line[line.index(':')+1:]) != 9:
        return False
    try:
        int(line[line.index(':')+1:])
        return True
    except ValueError:
        return False


line_list = []
valid_passports = 0
with open('results.txt', 'r') as f:
    for line in f:
        if line != '\n':
            ''' add line to line_list'''
            pass
        else:
            '''
            check lines from line_list
            using above declared functions
            if every line is ok:
                valid_passports +=1
            '''

I have to check that every package contains every key except cid, and then check that the value for each key is valid.

byr (Birth Year) - four digits; at least 1920 and at most 2002.
iyr (Issue Year) - four digits; at least 2010 and at most 2020.
eyr (Expiration Year) - four digits; at least 2020 and at most 2030.
hgt (Height) - a number followed by either cm or in:
If cm, the number must be at least 150 and at most 193.
If in, the number must be at least 59 and at most 76.
hcl (Hair Color) - a # followed by exactly six characters 0-9 or a-f.
ecl (Eye Color) - exactly one of: amb blu brn gry grn hzl oth.
pid (Passport ID) - a nine-digit number, including leading zeroes.
cid (Country ID) - ignored, missing or not.

(The rules above are enforced by the functions declared earlier.)
My question is: how can I avoid repeating if statements when checking every line added to line_list (this refers to the part with the multi-line comment containing the pseudo-code)? I mean, I could do it like this:

if line[0:3] == "byr":
    check_byr(line)
# and so on, many if statement checking the first 3 letters to adjust proper function to use

but that doesn't seem like a proper, elegant solution. Maybe you could give me hints on how to deal with it, or suggest a different way to solve the problem that I haven't considered. Please help, thanks.

This might be of interest: https://stackoverflow.com/a/15112149/8881141 — Mr. T, Jan 09 '21 at 11:51

score 2 · Answer 1 · answered Jan 09 '21 at 11:39

Can't you have a mapping from prefix to target function?

Something like

line = "hgt:167cm"           # example value; any line from the data
prefix = line.split(':')[0]  # either "hgt" or "pid" or other

def check_hgt(line):
    pass
def check_pid(line):
    pass
# ... other checker functions

checker_functions_pool = {"hgt": check_hgt, "pid": check_pid}

checker_function = checker_functions_pool[prefix]
checker_function(line)
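
A slightly fuller sketch of the same idea, for illustration only. The two checkers below are simplified stand-ins for the ones in the question, `check_line` is a hypothetical helper name, and treating an unregistered prefix as invalid is an assumption rather than something the task requires.

```python
def check_ecl(line):
    # Simplified stand-in for the question's eye-colour checker
    value = line.partition(':')[2]
    return value in {'amb', 'blu', 'brn', 'gry', 'grn', 'hzl', 'oth'}


def check_pid(line):
    # Simplified stand-in: exactly nine ASCII digits
    value = line.partition(':')[2]
    return len(value) == 9 and value.isascii() and value.isdigit()


# Prefix -> checker function; each entry replaces one `if` branch
checker_functions_pool = {'ecl': check_ecl, 'pid': check_pid}


def check_line(line):
    # Hypothetical helper: pick the checker by the text before the colon
    prefix = line.partition(':')[0]
    checker = checker_functions_pool.get(prefix)
    # Assumption: a prefix without a registered checker counts as invalid
    return checker is not None and checker(line)


print(check_line('ecl:gry'))      # True
print(check_line('pid:#9c3ea1'))  # False
```

Adding a new field then means adding one function and one dictionary entry, with no change to the loop that calls `check_line`.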

score 1 · Answer 2 · answered Jan 09 '21 at 13:49

@viGor207, this is another way to approach it (part 2 example):

import re

passports = [
    dict(
        line.split(':')
        for line
        in pas.split()
    )
    for pas
    in open('input').read().split('\n\n')
]

required = {'byr', 'iyr', 'eyr', 'hgt', 'hcl', 'ecl', 'pid'}


def valid(pas):
    return bool(
        1920 <= int(pas['byr']) <= 2002 and
        2010 <= int(pas['iyr']) <= 2020 <= int(pas['eyr']) <= 2030 and
        re.fullmatch(r'[0-9]{2,3}(cm|in)', pas['hgt']) and
        (
            (pas['hgt'][-2:] == 'cm' and 150 <= int(pas['hgt'][:-2]) <= 193) or
            (pas['hgt'][-2:] == 'in' and 59 <= int(pas['hgt'][:-2]) <= 76)
        ) and
        re.fullmatch(r'#[0-9a-f]{6}', pas['hcl']) and
        pas['ecl'] in {'amb', 'blu', 'brn', 'gry', 'grn', 'hzl', 'oth'} and
        re.fullmatch(r'[0-9]{9}', pas['pid'])
    )


print(
    sum(
        all(r in pas for r in required) and valid(pas)
        for pas
        in passports
    )
)

score 1 · Answer 3 · answered Jan 09 '21 at 13:52

To make it complete, here is part one:

passports = [
    dict(
        line.split(':')
        for line
        in pas.split()
    )
    for pas
    in open('input').read().split('\n\n')
]

required = {'byr', 'iyr', 'eyr', 'hgt', 'hcl', 'ecl', 'pid'}

print(
    sum(
        all(r in pas for r in required)
        for pas in passports
    )
)
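
The presence check can also be written as a set comparison. A small sketch, using an inline sample instead of the `input` file; the string `sample` and the name `complete` are made up for illustration.

```python
sample = """byr:1991 eyr:2022 hcl:#341e13
iyr:2016 pid:729933757 hgt:167cm ecl:gry

hcl:231d64 cid:124 ecl:gmt
eyr:2039 hgt:189in pid:#9c3ea1"""

required = {'byr', 'iyr', 'eyr', 'hgt', 'hcl', 'ecl', 'pid'}

# One dict per passport; blocks are separated by a blank line
passports = [
    dict(pair.split(':', 1) for pair in block.split())
    for block in sample.split('\n\n')
]

# issubset accepts any iterable; iterating a dict yields its keys
complete = sum(required.issubset(pas) for pas in passports)
print(complete)  # 1
```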

Paddy3118 · Answer 4 · 2021-01-10T11:24:09.737

The task states:

Each passport is represented as a sequence of key:value pairs separated by spaces or newlines. Passports are separated by blank lines.

A sequence of key:value pairs is screaming for a list of dicts in Python, but your method could be used too.

You could use a dict, field_to_checker, that maps each field name to the function that checks a line for that field. I took your example input as a list of parsed lines, added a checker for cid that just returns True, and created the following code snippet:

def check_cid(line):
    return True

field_to_checker = {
  'byr': check_byr_iyr_eyr,
  'cid': check_cid,
  'ecl': check_ecl,
  'eyr': check_byr_iyr_eyr,
  'hcl': check_hcl,
  'hgt': check_hgt,
  'iyr': check_byr_iyr_eyr,
  'pid': check_pid,
  }

line_data = """byr:1991
eyr:2022
hcl:#341e13
iyr:2016
pid:729933757
hgt:167cm
ecl:gry

hcl:231d64
cid:124
ecl:gmt
eyr:2039
hgt:189in
pid:#9c3ea1

ecl:#1f58f9
pid:#758e59
iyr:2022
hcl:z
byr:2016
hgt:68
eyr:1933""".split('\n')

valid_passports = 0
ok_passport = True  # Accumulating over all fields of one passport
for line in line_data + ['\n']:  # Add empty line to force processing last passport
    line = line.rstrip()
    if line:    # Not blank line
        if ok_passport:  # One False value in a passport will prevail
            key = line[:3]
            ok_passport = (key in field_to_checker 
                           and field_to_checker[key](line))
    else:  # Blank line, end of passport record
        if ok_passport:
            valid_passports += 1
        ok_passport = True  # Reset for the next passport

In the for line in line_data + ['\n'] loop the count of valid_passports is only updated when there is a blank line ending a passport record. The last passport needs to have a blank line after it to be properly counted hence the addition of an extra blank line to the end of the line_data.

The above is untested, but should give you tips on how to extend what you have started with.
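
Note that the loop above validates only the fields that are present; it does not verify that all seven required fields appear in a passport. One possible way to add that check is sketched below, under some assumptions: the checkers take only the value (not the whole line) and are compact regex-based stand-ins rather than the question's functions, and `in_range` and `count_valid` are hypothetical names.

```python
import re

REQUIRED = {'byr', 'iyr', 'eyr', 'hgt', 'hcl', 'ecl', 'pid'}


def in_range(lo, hi):
    # Build a checker for a four-digit year between lo and hi inclusive
    return lambda v: re.fullmatch(r'[0-9]{4}', v) is not None and lo <= int(v) <= hi


def check_hgt(v):
    m = re.fullmatch(r'([0-9]+)(cm|in)', v)
    if m is None:
        return False
    lo, hi = {'cm': (150, 193), 'in': (59, 76)}[m.group(2)]
    return lo <= int(m.group(1)) <= hi


FIELD_TO_CHECKER = {
    'byr': in_range(1920, 2002),
    'iyr': in_range(2010, 2020),
    'eyr': in_range(2020, 2030),
    'hgt': check_hgt,
    'hcl': lambda v: re.fullmatch(r'#[0-9a-f]{6}', v) is not None,
    'ecl': lambda v: v in {'amb', 'blu', 'brn', 'gry', 'grn', 'hzl', 'oth'},
    'pid': lambda v: re.fullmatch(r'[0-9]{9}', v) is not None,
    'cid': lambda v: True,
}


def count_valid(lines):
    valid = 0
    seen, ok = set(), True
    for line in list(lines) + ['']:  # trailing blank flushes the last record
        line = line.strip()
        if line:
            key, _, value = line.partition(':')
            seen.add(key)
            checker = FIELD_TO_CHECKER.get(key)
            ok = ok and checker is not None and checker(value)
        else:
            # Count the record only if every required key was seen
            if seen and ok and REQUIRED <= seen:
                valid += 1
            seen, ok = set(), True
    return valid


lines = ['byr:1991', 'eyr:2022', 'hcl:#341e13', 'iyr:2016',
         'pid:729933757', 'hgt:167cm', 'ecl:gry', '',
         'hcl:#341e13', 'ecl:gry']
print(count_valid(lines))  # 1
```

The second record passes every individual check but is rejected because most required keys are missing, which is the case the accumulating flag alone would let through.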

score 0 · Answer 5 · answered Jan 11 '21 at 19:55

I would suggest placing the dictionary values in variables as early as possible to make the logic simpler to write and read.

That should allow you to write single-line conditions that are legible:

data = \
"""byr:1991
eyr:2022
hcl:#341e13
iyr:2016
pid:729933757
hgt:167cm
ecl:gry

hcl:231d64
cid:124
ecl:gmt
eyr:2039
hgt:189in
pid:#9c3ea1

ecl:#1f58f9
pid:#758e59
iyr:2022
hcl:z
byr:2016
hgt:68
eyr:1933"""

...

# iterator to get packages
def getPackages(d):
    package = dict()    
    for line in d:
        if line:
            field,value = line.split(":",1)
            package[field]=value
        else:
            yield package.copy()
            package.clear()
    if package: yield package

fields  = ['byr', 'iyr', 'eyr', 'hgt', 'hcl', 'ecl', 'pid']
for package in getPackages(data.split("\n")):
    values = [package.get(f,"") for f in fields]
    byr, iyr, eyr, hgt, hcl, ecl, pid = values

    isValid = "" not in values \
        and   int(byr) in range(1920,2002+1) \
        and   int(iyr) in range(2010,2020+1) \
        and   int(eyr) in range(2020,2030+1) \
        and   int(hgt[:-2]) in {"cm":range(150,193+1),"in":range(59,76+1)}.get(hgt[-2:],[]) \
        and   hcl.startswith("#") and len(hcl)==7 \
        and   all(97 <= ord(i) <= 102 or 48 <= ord(i) <= 57 for i in hcl[1:]) \
        and   ecl in {'amb', 'blu', 'brn', 'gry', 'grn', 'hzl', 'oth' } \
        and   pid.isdigit() and len(pid)==9

    print(pid,isValid)

"""        
    729933757 True
    #9c3ea1 False
    #758e59 False
"""
