I'm using the regular expression below to check whether a string contains alphanumeric characters, but the result I get is None.
>>> r = re.match('!^[0-9a-zA-Z]+$','_')
>>> print r
None
The ! doesn't have any special meaning in a regex. To negate a character class, put ^ as the first character inside the brackets, like this:
>>> re.match('^[^0-9a-zA-Z]+$','_')
<_sre.SRE_Match object; span=(0, 1), match='_'>
In Python 2.x,
>>> re.match('^[^0-9a-zA-Z]+$','_')
<_sre.SRE_Match object at 0x7f435e75f238>
Note: this regex gives you a match only if the entire string consists of non-alphanumeric characters.
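To make the difference concrete, here is a minimal sketch, assuming Python 3. The helper name only_non_alphanumeric is hypothetical and exists only for illustration. It shows that the pattern from the question cannot match '_', while the negated class does.

```python
import re

def only_non_alphanumeric(text):
    """Return True if text is non-empty and has no ASCII letters or digits."""
    # ^ inside the brackets negates the class; ^ and $ outside anchor the match
    return re.match(r'^[^0-9a-zA-Z]+$', text) is not None

# '!' has no special meaning, so the original pattern demands a literal '!'
# as the first character and can never match '_'.
original = re.match(r'!^[0-9a-zA-Z]+$', '_')
print(original)

print(only_non_alphanumeric('_'))
print(only_non_alphanumeric('_a'))
```

The helper returns a plain boolean, which is usually easier to work with than testing the match object at every call site.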
If you want to check whether any of the characters is non-alphanumeric, then you need to use re.search and drop the anchors (^ and $) and the +, like this:
>>> re.search('[^0-9a-zA-Z]', '123abcd!')
<_sre.SRE_Match object; span=(7, 8), match='!'>
This finds any character other than 0-9, a-z and A-Z, anywhere in the string. (re.match only tries to match at the beginning of the string; the re module documentation covers the differences between re.search and re.match.)
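A short sketch of that difference, assuming Python 3 and reusing the example string from above. The variable names are illustrative only.

```python
import re

pattern = re.compile(r'[^0-9a-zA-Z]')
text = '123abcd!'

# match only tries position 0, where '1' is alphanumeric, so it fails
at_start = pattern.match(text)
print(at_start)

# search scans forward through the string and stops at the first hit
found = pattern.search(text)
print(found.group(), found.start())
```

Here match returns None, while search reports the '!' at the end of the string.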
Note: The best solution to this problem is to use str.isalnum, like this:
>>> "123abcdABCD".isalnum()
True
>>> "_".isalnum()
False
This will return True only if the entire string consists of alphanumeric characters. But if you want to see whether any of the characters in the string is alphanumeric, then you need to use the any function, like this:
>>> any(char.isalnum() for char in "_!@#%^$()*")
False
>>> any(char.isalnum() for char in "_!@#%^a()*")
True
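One caveat, sketched below under the assumption of Python 3: str.isalnum is Unicode-aware, so it is not an exact replacement for [0-9a-zA-Z], and it returns False for the empty string. The helper name is_ascii_alnum is hypothetical. It is one possible way to keep the check limited to ASCII.

```python
import re

def is_ascii_alnum(text):
    """True only for non-empty strings made of ASCII letters and digits."""
    return re.fullmatch(r'[0-9a-zA-Z]+', text) is not None

accented = '\u00e9'  # the letter e with an acute accent

print(accented.isalnum())        # str.isalnum accepts non-ASCII letters
print(is_ascii_alnum(accented))  # the ASCII-only class rejects it
print(''.isalnum())              # the empty string is not alphanumeric
```

Which behavior is right depends on whether accented or non-Latin letters should count as alphanumeric for your input.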
That's because "_" does not match the regex, so nothing is returned. You can simply use:
import re

def contains_alphanumeric(text):
    # re.search scans the whole string; re.match would only test the start
    return re.search('[0-9a-zA-Z]', text) is not None
You didn't mention what you're trying to do with the code, specifically, but I'm a fan of regular expressions and use them frequently in my code. It may use more CPU cycles than some other options, but I do like the flexibility.
If you wanted to look at each character individually, this compares the number of matches with the length of the original string:
import re

def main():
    data = "This is a @#%(*ing string."
    # Collect every alphanumeric character, ignoring case
    match = re.findall(r"[a-z0-9]", data, re.IGNORECASE)
    # Fewer matches than characters means something non-alphanumeric is present
    if len(match) != len(data):
        print("Uh-oh, spaghettios!")
    else:
        print("All good in the hood.")

if __name__ == '__main__':
    main()
That uses re.findall() to match the expression and return a list of the results. In this particular instance, it looks only for alphanumeric characters:
>>> print(match)
['T', 'h', 'i', 's', 'i', 's', 'a', 'i', 'n', 'g', 's', 't', 'r', 'i', 'n', 'g']
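If it helps to know which characters failed the check, rather than only that the counts differ, a negated class can collect them directly. This is a sketch and not part of the code above. The function name find_non_alphanumeric is hypothetical.

```python
import re

def find_non_alphanumeric(data):
    """Return every character in data that is not an ASCII letter or digit."""
    return re.findall(r'[^a-z0-9]', data, flags=re.IGNORECASE)

offenders = find_non_alphanumeric("This is a @#%(*ing string.")
print(offenders)
```

An empty list means the string is entirely alphanumeric. Note that spaces and the final period are reported too.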
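Keep in mind that most characters inside "[]" are taken as literal characters. The exceptions are ranges (such as a-z), a leading ^ (which negates the class) and backslash escapes. You can also use "()" in the pattern to capture groups, which you can then retrieve from the match object.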
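A brief sketch of both points, assuming Python 3. The sample strings are made up for illustration.

```python
import re

# Inside brackets, '.', '+' and '*' are plain characters, not operators
literals = re.findall(r'[.+*]', 'a.b+c*d')
print(literals)

# Parentheses capture parts of the match for later retrieval
m = re.match(r'([a-zA-Z]+)([0-9]+)', 'abc123')
print(m.group(1), m.group(2))
```

group(0) would return the whole match, while group(1) and group(2) return the text captured by the first and second pair of parentheses.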
Please don't hesitate to ask more questions or take a look at the "re" module information at https://docs.python.org/2/library/re.html