Validating an input in Python

Question

I have a piece of code that asks for an input. The input must not contain any numbers:

def main():

    invalid = []
    foo = ""


    foo = str(input("Enter a word: ") )

    invalid = ["1","2","3","4","5","6","7","8","9","0"]

    for eachInvalid in invalid:
        if eachInvalid in foo:
            print("Not a real word!")
            main()
        else:
            pass


main()

So far it works, but I am not proud of it. What would be a better way of doing this? For example, a word does not contain characters like ~@: so I would have to add those to the list. How can I avoid this while keeping the code clean and readable?

There is also a bug. If I run the code and type something valid like hello, it works. But if I type something invalid, it takes me back to the start, and when I then type a valid word it still says it is invalid. For example:

Enter a word: hello
>>> ================================ RESTART ================================
>>> 
Enter a word: 123
Not a real word!
Enter a word: hello
Not a real word!

How would I fix this error and what would be the best way of looking for invalid characters in an input?

edit: nevermind, regular expression is fine.

Regexes are actually the best and simplest way to achieve it. — Michał Szydłowski, Feb 25 '15 at 17:17
I want to keep my code simple and not import too many modules however, if it is easier to be done in regular expression then that's ok. — Zak, Feb 25 '15 at 17:17
`if any(char.isdigit() for char in foo):`? Also, have a look at http://stackoverflow.com/q/23294658/3001761 — jonrsharpe, Feb 25 '15 at 17:21
Note too that you don't have to *"initialise variables"* in Python - the first assignments to `invalid` and `foo` are totally redundant. — jonrsharpe, Feb 25 '15 at 17:28
`re` is part of the Python standard library. I'd damn near bet precious parts of my anatomy that importing `re` will not be a performance bottleneck in your application. — , Feb 25 '15 at 17:45
@jonrsharpe While technically correct, I would temper that a bit with a recommendation to initialize variables (even if it's just `foo = None`) in cases where the variable might possibly be referenced/returned without passing through a code path that sets it to something... I've been bitten by that many times... So, sometimes, you do want to initialize a variable, even if you don't strictly *have* to... — twalberg, Feb 25 '15 at 18:55

score 6 · Accepted Answer · answered Feb 25 '15 at 17:26

While more complex validation is an appropriate use case for a regular expression, in this simple case there is a built-in string method, isalpha(), which checks whether a string is non-empty and contains only alphabetic characters.

foo.isalpha()

Returns True or False.

Note that in Python 3 this handles all Unicode characters defined as "letters".
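
A minimal sketch of how isalpha() could replace both the digit list and the recursive call to main(). In the question's code, each recursive call eventually returns into the outer loop, which still holds the old invalid input and carries on checking the remaining digits; that appears to be why a valid word is reported as invalid after an invalid one. A while loop avoids this. The name ask_word and the read parameter (there only so the function can be exercised without a keyboard) are assumptions, not part of the question.

```python
def ask_word(read=input):
    """Keep prompting until the reply consists only of letters."""
    while True:
        word = read("Enter a word: ")
        # str.isalpha() is False for the empty string and for any
        # string containing digits, punctuation or whitespace.
        if word.isalpha():
            return word
        print("Not a real word!")
```

Calling ask_word() with no argument reads from the keyboard as in the original program.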

neil


Thanks. Exactly what I was looking for, even works for words with an umlaut, etc. – Zak Feb 25 '15 at 17:35

score 0 · Answer 2 · answered Feb 27 '15 at 08:16

You could use a simpler check for a digit anywhere in the string, rather than looping over a list of invalid characters.

if any(x.isdigit() for x in foo):
    print("Invalid Word")
else:
    print("valid word")

Crafter0800


score -1 · Answer 3 · answered Feb 25 '15 at 17:27

You could reverse your approach: instead of keeping a list of invalid characters, keep a list of valid ones, then check every character of your string:

valids = ['a', 'b', 'c', 'd']
for letter in foo:
    if letter not in valids:
        print("Not a real word!")
        break  # one invalid character is enough

This is even easier with a regex, since listing all the valid options is more compact:

import re
if not re.match("^[a-zA-Z_ ]*$", foo):
    print("Not a real word!")

The regex ^[a-zA-Z_ ]*$ matches a string that contains only characters from [a-zA-Z_ ] (including the empty string).
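
A hedged variant of the whitelist check, assuming a "word" means one or more ASCII letters only. Dropping the underscore and space from the character class is an assumption made for this sketch, and the names WORD_PATTERN and is_word are hypothetical. It uses re.fullmatch (available from Python 3.4), so no ^ or $ anchors are needed.

```python
import re

# Letters only; "+" (rather than "*") also rejects the empty string.
WORD_PATTERN = re.compile(r"[A-Za-z]+")


def is_word(text):
    """Return True if text consists solely of ASCII letters."""
    return WORD_PATTERN.fullmatch(text) is not None
```

Unlike isalpha(), this pattern rejects accented letters such as ü, which may or may not be what you want.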

If you want to stay with a list of invalid characters, use a negated character class:

if re.match("[^0-9@]", foo):
    print("Not a real word!")

where [^0-9@] matches any single character other than those listed between the brackets.
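
One caveat with the snippet above: re.match only tests the start of the string, so the negated class succeeds for an ordinary word such as hello and the message is printed for valid input. A sketch of a blacklist check that instead searches the whole string for any invalid character, assuming digits and @ are the characters to reject; INVALID_CHARS and has_invalid_char are hypothetical names.

```python
import re

# Characters that make the input invalid (digits and "@").
INVALID_CHARS = re.compile(r"[0-9@]")


def has_invalid_char(text):
    """Return True if a blacklisted character occurs anywhere in text."""
    return INVALID_CHARS.search(text) is not None
```

Characters not on the blacklist, such as punctuation, still pass this check, which is the general weakness of listing invalid characters rather than valid ones.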

`^[a-zA-Z_ ]*$` might be more clear if it includes `\s` instead of whitespace (though i think it should be excluded since it would mean more than 1 word is given). Also, `[^0-9@]` wouldn't work since it matches characters like `,.!?-` etc. — user, Feb 25 '15 at 17:40
