
I've looked at several posts and other forums to find an answer related to my question, but nothing has come up specific to what I need. As a heads up, I'm new to programming and don't possess the basic foundation that most would.

I know bash, a little Python, and I'm decent with regular expressions.

I'm trying to write a Python script that uses regular expressions to parse data and produce the output I need.

My output will consist of 4 values, all taken from one line. The input line is run together with no defined delimiter (hence the need for my program).

To find one of the 4 values, I have to say, in effect, "look for 123-, give me everything after it, but stop at df5". Neither 123- nor df5 is constant; each is defined by a regular expression that already works, and I have assigned each regular expression to a variable. How can I use those variables to capture what lies between the two matches? Please let me know if this makes sense.

kindall
  • Use `str.format()`. Examples here: http://stackoverflow.com/questions/1875676/python-2-6-str-format-and-regular-expressions and http://stackoverflow.com/questions/4199642/python-string-formatting-a-regex-string-that-uses-both-and-as-character – 2rs2ts Jun 20 '13 at 16:07
  • Also, please show us your code! – 2rs2ts Jun 20 '13 at 16:07

2 Answers

import re
start = '123-'
stop = 'df5'
regex = re.compile('{0}(.*?){1}'.format(re.escape(start), re.escape(stop)))

Note that the re.escape() calls aren't necessary for these example strings, but they are important if your delimiters are literal strings that can ever include characters with a special meaning in regex (., *, +, ? etc.).
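
As a minimal sketch of how the compiled pattern might be used, and of why the escaping matters for literal delimiters: the helper name `between`, the sample lines, and the `a.b+` delimiter below are made up for illustration and do not come from the question.

```python
import re

def between(start, stop, line):
    """Return the text between two literal delimiters, or None if absent."""
    regex = re.compile('{0}(.*?){1}'.format(re.escape(start), re.escape(stop)))
    match = regex.search(line)
    return match.group(1) if match else None

# Hypothetical sample line, invented for this example.
print(between('123-', 'df5', 'xx123-PAYLOADdf5yy'))  # PAYLOAD

# A delimiter containing regex metacharacters: escaping makes '.' and '+'
# match themselves instead of acting as operators.
print(between('a.b+', 'df5', 'a.b+VALUEdf5'))  # VALUE
```

Without re.escape(), the second call would treat `a.b+` as a pattern and capture `+VALUE` instead.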

Andrew Clark
  • The 123- portion is NOT a constant value, I have defined it with a variable (through regular expression). I want to use that variable to define the starting point and use df5 as the end point. Only what's between the two, is what I want. Am I doing this the hard way? – user2505945 Jun 20 '13 at 16:56
  • The last line shows how to create a regular expression using two variables `start` and `stop`, that could be any string. Replace `start` and `stop` with the names of your actual variables, I just used your example strings in my code to illustrate what it does. – Andrew Clark Jun 20 '13 at 17:00

How about a pattern "%s(.*?)%s" % (oneTwoThree, dF5)? Then you can run re.search with that pattern and call groups() on the result.

Something along the lines of:

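import re

# oneTwoThree and dF5 hold your two regex strings; text is the input line.
pattern = "%s(.*?)%s" % (oneTwoThree, dF5)
matches = re.search(pattern, text)
if matches:
    print(matches.groups())  # tuple of the captured groups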

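If you use re.findall instead of re.search, you get the captured text of every match directly as a list (provided the pattern has a single capturing group), which saves you from pulling the groups out of a match object yourself.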
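
Since the question says both boundaries are themselves regular expressions, one way to sketch this is to leave them unescaped and wrap each in a non-capturing group, so that any alternation inside them stays contained. The two boundary patterns and the sample text below are assumptions chosen for illustration; substitute your own.

```python
import re

# Hypothetical boundary patterns (assumptions, not the asker's real ones).
one_two_three = r'\d{3}-'   # matches e.g. "123-"
d_f5 = r'[a-z]{2}\d'        # matches e.g. "df5"

# Non-capturing groups (?:...) keep each sub-pattern self-contained; only
# the middle (.*?) is captured, so findall returns plain strings.
pattern = re.compile('(?:%s)(.*?)(?:%s)' % (one_two_three, d_f5))

# Made-up input line containing two start/stop pairs.
text = '123-ALPHAdf5 some noise 456-BETAxy9'

first = pattern.search(text)
if first:
    print(first.group(1))      # ALPHA

print(pattern.findall(text))   # ['ALPHA', 'BETA']
```

If your own boundary patterns contain capturing groups, findall will return tuples rather than strings; turning those groups into (?:...) avoids that.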

Atmaram Shetye
  • The 123- portion is NOT a constant value, I have defined it with a variable (through regular expression). I want to use that variable to define the starting point and use df5 as the end point. Only what's between the two, is what I want. Am I doing this the hard way? – user2505945 Jun 20 '13 at 16:55
  • That is the reason I suggested "%s(.*?)%s" % (oneTwoThree, dF5). Here, oneTwoThree and dF5 are your variables, which could hold the "123-" and "df5" patterns. Inside re.compile, you can use that expression instead of a hardcoded string. I have edited my answer to show this. – Atmaram Shetye Jun 20 '13 at 17:05