2

Hello, I am trying to retrieve the following string from a file:

neighbors= {5 7 9 11 13 14 15 16 17 }

The pattern {number1 number2 ... } varies: some are short, some are too long. I want to find every such pattern. My idea is to match the literal text "neighbors= {", followed by a number and a space repeated any number of times, up to the closing brace. Can someone help me with the syntax?

Thanks

Blender
Rkz
  • What are you trying to do? And what does **some are short some are too long** mean? – Blender May 23 '11 at 04:37
  • This question does not make much sense to me... –  May 23 '11 at 04:38
  • I mentioned that the "pattern varies", so the next statement obviously describes the pattern. To make things clear, the statements are of the form neighbors= {5 7 9 11 13 14 15 16 17 } or neighbors= {5 7 9 11 13 14 } or neighbors= {5 7 9 11 13 14 15 .....} – Rkz May 23 '11 at 04:50

3 Answers

2

I think you're looking for this:

import re
FOO = """neighbors= {5 7 9 11 13 14 15 16 17 }"""
# Raw string so that \s, \d and \{ reach the regex engine unchanged
match = re.search(r'(neighbors\s*=\s*\{\s*(\d+\s*)+\})', FOO)
print(match.group(1))

The regex is, of course, portable to many other languages.

Running that yields...

neighbors= {5 7 9 11 13 14 15 16 17 }

The regex will match any number of whitespace-separated integers inside the curly braces.

EDIT

Illustrating with re.findall() and re.compile()...

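import re
FOO = """neighbors= {5 7 9 11 13 14 15 16 17 }"""
# The inner group is non-capturing, (?:...), so findall() returns plain strings;
# with two capturing groups it would return a list of tuples instead
COMPILE = re.compile(r'(neighbors\s*=\s*\{\s*(?:\d+\s*)+\})')
matches = COMPILE.findall(FOO)
print(matches[0])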

Running the second example prints...

neighbors= {5 7 9 11 13 14 15 16 17 }

Remember, though, that .findall() is meant for finding multiple occurrences of the regex match inside a target string. The examples you provided do not show a need for .findall().
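For the case where a file does contain several such statements, here is a minimal sketch of how .findall() could collect all of them at once. The sample text and the function name find_neighbor_lists are made up for illustration; the pattern is a variation of the one above with a single capturing group around the numbers, so .findall() returns only the text between the braces.

```python
import re

# One capturing group around the numbers only; the repeated part is
# non-capturing, so findall() returns a list of plain strings.
NEIGHBORS = re.compile(r'neighbors\s*=\s*\{\s*((?:\d+\s*)+)\}')

def find_neighbor_lists(text):
    """Return one list of ints per 'neighbors= {...}' statement in text."""
    return [[int(tok) for tok in body.split()]
            for body in NEIGHBORS.findall(text)]

# Hypothetical file contents, invented for this example.
SAMPLE = """node 1 neighbors= {5 7 9 11 13 14 15 16 17 }
node 2 neighbors= {5 7 9 11 13 14 }
node 3 neighbors= {2 4 }
"""

for numbers in find_neighbor_lists(SAMPLE):
    print(numbers)
```

Each statement becomes its own list of integers, which avoids appending the results of repeated re.search() calls by hand.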

Mike Pennington
  • Thanks, it definitely works with the re.search function. I can append those results to make a list. But can I use the regex you gave, (neighbors\s*=\s*\{\s*(\d+\s*)+\}), with the re.compile and re.findall functions? I tried findall but I do not get any result. – Rkz May 23 '11 at 06:18
  • @user690682, if you use `.findall()` remember that it returns a list of strings (or a list of tuples when the pattern contains more than one capturing group) instead of an `re` match object. There is no problem using this regex with `.compile()` or `.findall()` as long as you understand the particular return values to expect from each function. – Mike Pennington May 23 '11 at 06:33
1

This is about what you asked for:

neighbors= \{ (\d+ )+\}

Making it more tolerant of optional spaces around the {} brackets:

neighbors= ?\{ ?(\d+ )+(\}|\d+\})

Or a shorter one:

neighbors\s*=\s*\{[\d\s]+\}
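To see how the three patterns differ in practice, here is a small sketch that tries each one against a few spacing variants. The labels "strict", "tolerant" and "short" and the sample lines are my own, chosen only for illustration.

```python
import re

# The three patterns from this answer, written as raw strings.
PATTERNS = {
    "strict": r'neighbors= \{ (\d+ )+\}',
    "tolerant": r'neighbors= ?\{ ?(\d+ )+(\}|\d+\})',
    "short": r'neighbors\s*=\s*\{[\d\s]+\}',
}

# Hypothetical spacing variants, invented for this example.
SAMPLES = [
    "neighbors= { 5 7 9 }",
    "neighbors= {5 7 9 11 13 14 15 16 17 }",
    "neighbors={ 5 7 9}",
    "neighbors = {5 7 9 }",
]

def matching_patterns(line):
    """Return the labels of the patterns that match somewhere in line."""
    return [name for name, pattern in PATTERNS.items()
            if re.search(pattern, line)]

for line in SAMPLES:
    print(repr(line), "->", matching_patterns(line))
```

Note that the first pattern expects a space directly after the opening brace, so it does not match the line exactly as written in the question, while the other two do. Only the last pattern allows whitespace before the equals sign.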
bw_üezi
1

I would take the whole line that contains the word neighbors, extract the string between the braces, and split it on spaces. That gives an array of strings which can then be converted to integers.
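A rough sketch of those steps in Python, without a regex, might look like this. The function name parse_neighbors is hypothetical, and the sketch assumes one statement per line with only integers between the braces.

```python
def parse_neighbors(line):
    """Return the integers between the braces of a 'neighbors' line.

    Returns None when the line has no 'neighbors' statement or no
    complete pair of braces.
    """
    if "neighbors" not in line:
        return None
    start = line.find("{")
    end = line.find("}", start + 1)
    if start == -1 or end == -1:
        return None
    # split() with no argument splits on any run of whitespace
    return [int(tok) for tok in line[start + 1:end].split()]

print(parse_neighbors("neighbors= {5 7 9 11 13 14 15 16 17 }"))
```

int() raises ValueError if anything other than integers appears between the braces, which may or may not be the behavior you want.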

Tudor Constantin