0

When I started writing a function, I got a SyntaxError. I tried executing the line at the REPL and it worked, but I want to run it in my IDE. Can somebody help me?

My code:

def sentence_splitter(file_name):
    with open(file_name) as f:
        input_str = f.read()
        period_indexes = get_periods(input_str)
        for el in period_indexes:
            sub_str = input_str[el - 14:el + 14]
            if not re.search(r'\.\s+[A-Za-z]{1,3}\w+', sub_str) and # Error here
            re.search(r'\.\d+', sub_str) and
            re.search(r'\.\s+[a-z]+', sub_str) and
            re.search(r'([A-Za-z\.]+\.\w+){1,50}', sub_str) and
            re.search(r'\w+\.[\.,]+', s):
                pass
Absolut

3 Answers

3

You need parentheses around the condition of your if statement so that it can span several lines:

def sentence_splitter(file_name):
    with open(file_name) as f:
        input_str = f.read()
        period_indexes = get_periods(input_str)
        for el in period_indexes:
            sub_str = input_str[el - 14:el + 14]
            if not (re.search(r'\.\s+[A-Za-z]{1,3}\w+', sub_str) and
                    re.search(r'\.\d+', sub_str) and
                    re.search(r'\.\s+[a-z]+', sub_str) and
                    re.search(r'([A-Za-z\.]+\.\w+){1,50}', sub_str) and
                    re.search(r'\w+\.[\.,]+', sub_str)):  # was `s`, which is undefined
                pass

Technically backslashes will work, but parentheses are more Pythonic; see PEP 8: http://www.python.org/dev/peps/pep-0008/#maximum-line-length
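
As a minimal sketch of why the parentheses help: inside an open bracket Python keeps reading the expression across line breaks, so no backslash is needed. The function name and patterns below are made up for illustration and are not from the question.

```python
import re

def looks_like_decimal(sub_str):
    # Hypothetical helper: the opening parenthesis lets the condition
    # continue over several lines without any backslashes.
    if (re.search(r'\d', sub_str) and
            re.search(r'\.\d+', sub_str)):
        return True
    return False
```

Indenting the continuation lines further than the body is the layout PEP 8 suggests, so the condition does not blend into the block below it.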

Chris
  • It depends if you want the `not` to apply to the whole statement or just one condition. `not (1==1 and 1==2)` is true, but `(not 1==1 and 1==2)` is false. – Chris Nov 11 '13 at 18:44
  • Don't tell me! ;-) Maybe you want to add this explanation to your answer, after all, the default behaviour if they were all on one line would be that the `not` applies only to the first term. I.e., you fixed the error (+1) but also changed (fixed?) the condition. – tobias_k Nov 11 '13 at 18:50
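
To make the point in these comments concrete, here is a small sketch using the same comparisons as above. `not` binds more tightly than `and`, so without parentheses it negates only the first term.

```python
# Without parentheses: parsed as (not 1 == 1) and (1 == 2)
first_term_only = not 1 == 1 and 1 == 2

# With parentheses: `not` applies to the whole conjunction
whole_condition = not (1 == 1 and 1 == 2)

print(first_term_only)   # False
print(whole_condition)   # True
```

So wrapping the whole condition in `not (...)` fixes the syntax error but also changes what the condition means compared with the one-line form.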
2

Your conditional spans multiple lines. You need to add the line continuation character \ at the end of every line that continues onto the next.

if not re.search(r'\.\s+[A-Za-z]{1,3}\w+', sub_str) and \
            re.search(r'\.\d+', sub_str) and \
            re.search(r'\.\s+[a-z]+', sub_str) and \
            re.search(r'([A-Za-z\.]+\.\w+){1,50}', sub_str) and \
            re.search(r'\w+\.[\.,]+', s):

Further information about this is available in PEP8 and this answer.
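
One caveat with backslashes, sketched below: the backslash has to be the very last character on the line. A trailing space or a comment after it (like the `# Error here` in the question) is itself a SyntaxError. The `compiles` helper is only for this illustration.

```python
def compiles(source):
    # Hypothetical helper: report whether Python can parse the source text.
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False

good = "x = 1 + \\\n    2\n"    # backslash directly before the newline
bad = "x = 1 + \\ \n    2\n"    # a space sneaks in after the backslash

print(compiles(good))
print(compiles(bad))
```

This fragility is one reason parentheses are usually the safer choice.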

One note specific to your code:

re.search(r'\w+\.[\.,]+', s)
                          ^---- This variable is not assigned 
                                (likely should be sub_str)
Andy
1

In your last regexp:

re.search(r'\w+\.[\.,]+', s)

You perform a search on s, which is not defined. All the other regexps perform a search on sub_str, which is probably what you want. That would raise a NameError though, not a SyntaxError.
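
A rough sketch of the difference: a SyntaxError is reported when the code is parsed, before anything runs, while a NameError only shows up when the offending line is executed. The `error_name` helper and the two snippets are made up for this example.

```python
def error_name(source):
    # Hypothetical helper: return the name of the exception the source raises.
    try:
        code = compile(source, "<example>", "exec")  # parsing happens here
        exec(code, {})                                # name lookup happens here
    except Exception as exc:
        return type(exc).__name__
    return None

dangling_and = "if True and\n    True:\n    pass\n"  # line ends on `and`
undefined_name = "print(s)\n"                        # `s` was never assigned

print(error_name(dangling_and))    # SyntaxError
print(error_name(undefined_name))  # NameError
```

That is why you only see the SyntaxError for now: the undefined `s` would surface once the condition parses and actually runs.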

Additionally, you probably want to refactor your code to make it easier to read, as explained in my comment to your question.
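
One possible shape for such a refactoring, as a sketch only (it is not necessarily what the comment proposed): compile the patterns once, keep them in a list, and combine them with `all()`. The names `FIRST_PATTERN`, `REQUIRED_PATTERNS` and `matches_question_condition` are assumptions; the patterns are copied from the question, and the logic follows the one-line reading where `not` applies to the first search only.

```python
import re

# Patterns copied from the question; the names are made up for this sketch.
FIRST_PATTERN = re.compile(r'\.\s+[A-Za-z]{1,3}\w+')
REQUIRED_PATTERNS = [
    re.compile(r'\.\d+'),
    re.compile(r'\.\s+[a-z]+'),
    re.compile(r'([A-Za-z\.]+\.\w+){1,50}'),
    re.compile(r'\w+\.[\.,]+'),
]

def matches_question_condition(sub_str):
    # True when the first pattern is absent and every other pattern is found.
    return (not FIRST_PATTERN.search(sub_str)
            and all(p.search(sub_str) for p in REQUIRED_PATTERNS))
```

The if statement in the loop then shrinks to `if matches_question_condition(sub_str):`, which fits on one line and avoids the continuation problem altogether.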

bitgarden