import re
p2b = open('repattern2b.txt').read().rstrip()

I need to write a regular expression pattern that matches strings representing numbers written in scientific notation. In addition, the pattern must ensure that group 1 is the sign of the mantissa (if there is a sign), group 2 is the mantissa, but only if it is not 0 (that exception makes the pattern simpler), and group 3 is the exponent.

For example, if

m = re.match(p2b, '9.11x10^-31')

then m.groups() is

(None, '9.11', '-31'). 

There should be no more groups.

Below is the regular expression I wrote for 'repattern2b.txt':

^([+-]?)([1-9].[0-9]+)x10^([1-9][0-9]*)$

But I got these errors:

54 *Error: re.match(p2b,'0').groups() raised exception; unevaluated: (None, None, None)
55 *Error: re.match(p2b,'5').groups() raised exception; unevaluated: (None, '5', None)
56 *Error: re.match(p2b,'5.0').groups() raised exception; unevaluated: (None, '5.0', None)
57 *Error: re.match(p2b,'5.2x10^31').groups() raised exception; unevaluated: (None, '5.2', '31')
58 *Error: re.match(p2b,'5.2x10^-31').groups() raised exception; unevaluated: (None, '5.2', '-31')
59 *Error: re.match(p2b,'5.2x10^+31').groups() raised exception; unevaluated: (None, '5.2', '+31')
60 *Error: re.match(p2b,'-5.2x10^-31').groups() raised exception; unevaluated: ('-', '5.2', '-31')

It seems that my regular expression raises an exception, but I am not sure why. Can someone help me to fix it? Thanks in advance.

zhangdi

2 Answers


The exception occurs because your re.match call returns None, and None has no groups() method, so None.groups() raises an AttributeError.
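A minimal sketch of that failure, using the pattern from the question. The helper name safe_groups is hypothetical and only illustrates checking the match object before calling groups():

```python
import re

# The question's original pattern; the ^ after "10" is not escaped.
broken = r"^([+-]?)([1-9].[0-9]+)x10^([1-9][0-9]*)$"

m = re.match(broken, "5.2x10^31")
print(m)  # None, because the pattern cannot match

try:
    m.groups()
except AttributeError as e:
    # This is the exception the test harness reports.
    print("raised:", e)


def safe_groups(pattern, text):
    """Hypothetical helper: return the groups, or None when nothing matches."""
    match = re.match(pattern, text)
    return match.groups() if match is not None else None


print(safe_groups(broken, "5.2x10^31"))  # None instead of an exception
```

Guarding like this only hides the symptom; the pattern itself still has to be fixed so that it matches.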

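Why is it returning None for everything? You have a ^ in the middle of the expression, and in a regular expression an unescaped ^ anchors the match to the start of a line. That is exactly how you use it at the start of your expression. Right after the literal x10 the position can never be a line start, so the pattern can never match; to match a literal ^ it must be escaped as \^.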

Compare:

>>> print(re.match(r"^([+-]?)([1-9].[0-9]+)x10^([1-9][0-9]*)$",'5.2x10^31'))
None

with:

>>> re.match(r"^([+-]?)([1-9].[0-9]+)x10\^([1-9][0-9]*)$",'5.2x10^31')
<_sre.SRE_Match object; span=(0, 9), match='5.2x10^31'>
donkopotamus
  • Hi, after changing that, only re.match(p2b,'5.2x10^31').groups() does not raise an exception but the rest of the cases are still raising exceptions. – zhangdi Jan 25 '17 at 22:42
  • Yes ... but there were all sorts of other problems with your expression ... – donkopotamus Jan 26 '17 at 02:50

Comparing the regex with the test data, there are several issues:

  • the plus/minus of the exponent is not in the regex
  • the ^ in the middle of the string is not escaped
  • the 10^... might not be present in the data, but it is in the regex
  • the first . might not be present in the data, but it is in the regex
  • the question mark after the first plus/minus must be outside the group if you want a None when the sign is missing
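The last point can be checked in isolation. This is a small sketch with a deliberately tiny pattern (not the full solution); the variable names are only for illustration:

```python
import re

# Quantifier inside the group: the group always takes part in the match,
# so a missing sign is captured as the empty string.
inside = re.match(r"([+-]?)5", "5").groups()

# Quantifier outside the group: the whole group is optional,
# so a missing sign leaves the group unset and groups() reports None.
outside = re.match(r"([+-])?5", "5").groups()

print(inside)   # ('',)
print(outside)  # (None,)
```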

Maybe this works:

import re

# Raw string, so that \^ and \. reach the regex engine unchanged.
p2b = r'^([+-])?(([1-9]\.?[0-9]*)|0)(x10\^([+-]?[1-9][0-9]*))?$'

for s in ['-5.2', '+1.2', '0', '5.', '5.0', '5', '15',
          '5.2x10^31', '5.2x10^-31', '5.2x10^+31', '-5.2x10^-31']:
    try:
        a = re.match(p2b, s).groups()
        # Keep the sign, the inner mantissa group, and the exponent.
        a = (a[0], a[2], a[4])
        print(s, ": ", a)
    except Exception as e:
        print(s, ": ", e)

Here are some explanations:

p2b =  re.compile(r"""
        ^                       # start of line
        ([+-])?                 # maybe a sign
        (
           (
               [1-9]\.?[0-9]*   # accept 1, 2, 5., 5.2, not 0
           ) | 0                # 0 is matched, but not by the inner group
        )
        (
            x10\^               # the x10... will be skipped later
              (
                 [+-]?          # exponent might have a sign
                 [1-9][0-9]*    # one or more digits, not starting with 0
              )
        )?                      # The x10... might be missing
        $                       # end of line
        """, re.VERBOSE)

This is the output:

-5.2 :  ('-', '5.2', None)
+1.2 :  ('+', '1.2', None)
0 :  (None, None, None)
5. :  (None, '5.', None)
5.0 :  (None, '5.0', None)
5 :  (None, '5', None)
15 :  (None, '15', None)
5.2x10^31 :  (None, '5.2', '31')
5.2x10^-31 :  (None, '5.2', '-31')
5.2x10^+31 :  (None, '5.2', '+31')
-5.2x10^-31 :  ('-', '5.2', '-31')

a[1] would contain the mantissa including a bare 0, and a[3] would contain the whole exponent part, for example 'x10^-31', so I skip both. There are certainly better solutions.
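One possible improvement, sketched under assumptions: non-capturing groups (?:...) do the grouping without adding entries to groups(), so the pattern has exactly three groups and no tuple surgery is needed. The name P2B_THREE_GROUPS is hypothetical, and the accepted mantissa forms (5, 15, 5., 5.2) are an assumption taken from the test cases above, not from the assignment text:

```python
import re

P2B_THREE_GROUPS = (
    r"^([+-])?"                          # group 1: optional sign
    r"(?:0|([1-9][0-9]*(?:\.[0-9]*)?))"  # group 2: mantissa, unset for a bare 0
    r"(?:x10\^([+-]?[1-9][0-9]*))?$"     # group 3: optional exponent
)

samples = ["0", "5", "5.0", "5.2x10^31", "5.2x10^-31",
           "5.2x10^+31", "-5.2x10^-31", "9.11x10^-31"]
results = {s: re.match(P2B_THREE_GROUPS, s).groups() for s in samples}

for s, groups in results.items():
    print(s, ": ", groups)
```

With this layout a bare 0 matches through the first alternative, so group 2 stays None, which is what the question asks for.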

maij
  • Hey, your code is great but there is only one problem. 0 : should be (None, None, None) instead of (None, '0', None) – zhangdi Jan 26 '17 at 03:40