
I'm going crazy as I learn Python.

Here is a code fragment:

import re

class Spam:
    def egg(self, pat):
        print(pat)


attribute_pattern = r'(\s[a-z\-]+=".*?")*'

ok_uber_string = '<(us-patent-grant)'  r'(\s[a-z\-]+=".*?")*'   '>(.*?)(</\1>)'
bad_uber_string = '<(us-patent-grant)'  attribute_pattern   '>(.*?)(</\1>)'
pat = re.compile(bad_uber_string)

The line with bad_uber_string will not compile; I get a SyntaxError: invalid syntax.

This has to be a user error. What am I doing wrong?

Thanks, Pat

fishtoprecords

2 Answers


Python will automatically glue adjacent string literals together:

some_string = "this will " "be one string"

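In all other cases, use the + operator to concatenate a value to a string (the final literal is written as a raw string here so that \1 stays a regex backreference instead of becoming the character \x01):

bad_uber_string = '<(us-patent-grant)' + attribute_pattern + r'>(.*?)(</\1>)'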
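As a rough check that the concatenated pattern behaves as intended, here is a minimal sketch. The sample string and the names uber_string, sample, tag_name and body are made up for illustration and are not from the question.

```python
import re

attribute_pattern = r'(\s[a-z\-]+=".*?")*'

# A name cannot be glued to a literal by adjacency, so join with +.
# The tail is a raw string so \1 reaches the regex engine as a backreference.
uber_string = r'<(us-patent-grant)' + attribute_pattern + r'>(.*?)(</\1>)'
pat = re.compile(uber_string)

# Hypothetical input, invented for this example.
sample = '<us-patent-grant lang="EN" status="x">body</us-patent-grant>'
match = pat.match(sample)

tag_name = match.group(1)  # the element name captured by the first group
body = match.group(3)      # the text between the opening and closing tags
print(tag_name, body)
```

Group 2 holds only the last attribute that matched, because a repeated group keeps its final repetition.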

See also: https://stackoverflow.com/a/1732454/65295

Seth

Automatic concatenation only works for string literals. To concatenate strings that aren't string literals, use the + operator:

>>> "foo" "bar"
'foobar'
>>> bar = "bar"
>>> "foo" bar
  File "<stdin>", line 1
    "foo" bar
            ^
SyntaxError: invalid syntax
>>> "foo" + bar
'foobar'

The reason for this is simple -- the automatic concatenation is done at parse time, not runtime:

>>> import dis
>>> def foo():
...    return "foo" "bar"
...
>>> dis.dis(foo)
  2           0 LOAD_CONST               1 ('foobar')
              3 RETURN_VALUE

Due to the dynamic nature of Python, it has no way of determining (in general) whether bar contains a string, a float, or any other user-defined type until runtime. And the special case where this is simple enough to determine ahead of time isn't "special enough to break the rules" (import this).
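The same point can be checked without reading bytecode. The sketch below assumes Python 3; the names joined, folded and rejected are invented for illustration. It looks at the function's constants and asks the compiler, rather than the interpreter, to handle a literal followed by a name.

```python
def joined():
    return "foo" "bar"

# The two literals were merged before the function ever ran:
# the code object stores the single constant 'foobar'.
consts = joined.__code__.co_consts
folded = "foobar" in consts and "foo" not in consts

# A literal next to a name is rejected while compiling, so no
# line of the snippet is executed, not even the assignment.
source = 'bar = "bar"\nresult = "foo" bar\n'
try:
    compile(source, "<example>", "exec")
    rejected = False
except SyntaxError:
    rejected = True

print(folded, rejected)
```

Because the error is raised by compile(), the value of bar never comes into play, which matches the explanation above.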

mgilson