3

I'm using the regular expression below to check whether a string contains alphanumeric characters, but the result is None.

>>> r = re.match('!^[0-9a-zA-Z]+$','_')
>>> print r
None
thefourtheye
user1050619
  • What is `!` doing there? – vks Mar 12 '15 at 15:57
  • I'm using `!` as "not equal to": if the string does not contain any alphanumeric characters, then give me a match object. – user1050619 Mar 12 '15 at 15:58
  • Before jumping to regex I would just do `def contains_alnum(s): return any(c.isalnum() for c in s)`. If you end up with a performance bottleneck, then by all means explore the regex option. I have no idea which approach would be faster. – Steven Rumbalski Mar 12 '15 at 15:59

3 Answers

2

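The ! doesn't have any special meaning in a regular expression; it is matched as a literal character. To negate a character class, put ^ as the first character inside the square brackets, like this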

>>> re.match('^[^0-9a-zA-Z]+$','_')
<_sre.SRE_Match object; span=(0, 1), match='_'>

In Python 2.x,

>>> re.match('^[^0-9a-zA-Z]+$','_')
<_sre.SRE_Match object at 0x7f435e75f238>

Note: this RegEx gives you a match only if the entire string consists of non-alphanumeric characters.
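
As a minimal sketch of why the pattern from the question returns None (the names original and negated and the sample strings are illustrative, not from the question): outside a character class ! is a literal and ^ asserts the start of the string, so a ^ that follows a consumed ! can never succeed.

```python
import re

# Pattern from the question: '!' is a literal, and '^' (start of string)
# cannot succeed right after a consumed '!', so this matches nothing.
original = re.compile(r'!^[0-9a-zA-Z]+$')
# Corrected pattern: '^' as the first character inside [...] negates the class.
negated = re.compile(r'^[^0-9a-zA-Z]+$')

for sample in ['_', '!abc', 'abc', '_!@']:
    print(repr(sample), bool(original.match(sample)), bool(negated.match(sample)))
```

Only '_' and '_!@' are matched by the negated pattern; the original pattern matches none of the samples.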

If you want to check whether any of the characters is non-alphanumeric, use re.search and drop the anchors (^ and $) and the +, like this

>>> re.search('[^0-9a-zA-Z]', '123abcd!')
<_sre.SRE_Match object; span=(7, 8), match='!'>

This finds any character other than 0-9, a-z and A-Z, anywhere in the string. (re.match only tries to match at the beginning of the string. The "search() vs. match()" section of the Python re documentation explains the difference.)
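
The difference between the two functions can be sketched with the same pattern and input as above (the names non_alnum, text and found are only for illustration):

```python
import re

non_alnum = re.compile(r'[^0-9a-zA-Z]')
text = '123abcd!'

# match() only tries position 0, where '1' is alphanumeric, so it fails.
print(non_alnum.match(text))        # None

# search() scans forward and finds the '!' at index 7.
found = non_alnum.search(text)
print(found.span(), found.group())  # (7, 8) !
```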

Note: The best solution to this problem is to use str.isalnum, like this

>>> "123abcdABCD".isalnum()
True
>>> "_".isalnum()
False

This returns True only if the string is non-empty and every character in it is alphanumeric. But if you want to see whether any of the characters in the string is alphanumeric, use the any function, like this

>>> any(char.isalnum() for char in "_!@#%^$()*")
False
>>> any(char.isalnum() for char in "_!@#%^a()*")
True
thefourtheye
0

That's because "_" does not match the regex, so re.match returns None. You can simply use

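import re

def contains_alphanumeric(text):
    # re.search scans the whole string; re.match would only test the start,
    # so a string such as "_a" would wrongly be reported as having no match.
    return re.search('[0-9a-zA-Z]', text) is not None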
Shivendra
0

You didn't mention specifically what you're trying to do with the code, but I'm a fan of regular expressions and use them frequently in my code. They may use more CPU cycles than some other options, but I like the flexibility.

If you want to look at each character individually, this compares the number of matches found by the expression with the length of the original string:

import re

def main():
    data = "This is a @#%(*ing string."

    match = re.findall(r"[a-z0-9]", data, flags=re.IGNORECASE)

    if len(match) != len(data):
        print("Uh-oh, spaghettios!")
    else:
        print("All good in the hood.")

if __name__ == '__main__':
    main()

That uses re.findall() to match the expression and return a list of the results. In this particular instance, it looks for alphanumeric characters only:

>>> print(match)
['T', 'h', 'i', 's', 'i', 's', 'a', 'i', 'n', 'g', 's', 't', 'r', 'i', 'n', 'g']
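
A variation on the same idea, offered as a sketch rather than as part of the original answer: negating the class lists the offending characters directly instead of comparing lengths. The name offending is illustrative. Note that spaces and the final period count as non-alphanumeric here too.

```python
import re

data = "This is a @#%(*ing string."

# Collect every character that is NOT an ASCII letter or digit.
offending = re.findall(r"[^a-z0-9]", data, flags=re.IGNORECASE)

print(offending)       # [' ', ' ', ' ', '@', '#', '%', '(', '*', ' ', '.']
print(len(offending))  # 10
```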

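Keep in mind that most characters put within "[]" are taken literally. The exceptions are "-" between two characters (a range), "^" as the first character (negation), "]" and "\". You can also use "()" in the pattern to capture groups, then retrieve them from the match object with group().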
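
A short sketch of those rules, using made-up sample strings:

```python
import re

# Inside [...], '.' and '*' are literal characters.
print(re.findall(r'[.*]', 'a.b*c'))    # ['.', '*']
# A leading '^' negates the class.
print(re.findall(r'[^a-c]', 'abcd'))   # ['d']
# An escaped '-' is a literal hyphen rather than a range operator.
print(re.findall(r'[a\-c]', 'a-b-c'))  # ['a', '-', '-', 'c']

# Parentheses create capture groups, read back with group().
m = re.match(r'([a-z]+)([0-9]+)', 'abc123')
print(m.group(1), m.group(2))          # abc 123
```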

Please don't hesitate to ask more questions or take a look at the "re" module information at https://docs.python.org/2/library/re.html

MuffintopBikini