Write a regex that uses a keyword supplied at runtime (in a variable)

Question

text = """ Pratap
pandey
age
25
student
"""
keyword = "age"

re_compile = re.compile('((.*\n+){2})keyword((.*\n+){2})')
re_result = re.findall(re_compile, text)

I want to write a regex that extracts the two lines before the keyword and the two lines after it whenever the keyword is matched, with the keyword supplied through a variable.

If I try re_compile = re.compile('((.*\n+){2})age((.*\n+){2})'), with age written literally instead of keyword, it works, but I want to extract using the variable name — Pratap, Jun 28 '18 at 09:35
I don't think this is possible, once compiled, the regex can't be modified. Instead you can use regex without compiling them before, modifying the string each time you change the keyword. This topic could help you https://stackoverflow.com/questions/6930982/how-to-use-a-variable-inside-a-regular-expression — Thibault D., Jun 28 '18 at 09:45
@andrew input: Pratap pandey age 25 student, and if it finds the keyword age, then print the lines — Pratap, Jun 28 '18 at 09:53
I think your answer could be found in here: [SO Question](https://stackoverflow.com/questions/6930982/how-to-use-a-variable-inside-a-regular-expression) — ndrwnaguib, Jun 28 '18 at 09:55
@Pratap Was any of the answer helpful to you? Feel free to [upvote&accept](http://stackoverflow.com/tour). — wp78de, Jul 01 '18 at 18:47

C.Holloway · Answer 1 · 2018-07-02T04:48:04.463

0

I'm not completely sure what you are asking. I think you are asking how to insert the value of a variable named "keyword" into the pattern.

This is how you would do that:

re.compile(f"(((.*\n+){{2}})\\s*{keyword}\\s*\n((.*\n+){{2}}))")

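If keyword is defined beforehand (keyword = <some value>), the code above will work. The doubled braces {{2}} are required because single braces mark substitutions in an f-string; they produce the literal quantifier {2}.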

By the way, use group 1 when extracting: it wraps the whole pattern, so it holds the two lines before the keyword, the keyword line, and the two lines after it.
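
As a minimal sketch of the same idea (not the answerer's exact pattern): the version below uses a raw f-string, passes the keyword through re.escape so that any regex metacharacters in it are matched literally, and captures the lines before and after the keyword in two separate groups. The sample text and the keyword variable come from the question; the names pattern, before and after are illustrative assumptions.

```python
import re

text = " Pratap\npandey\nage\n25\nstudent\n"
keyword = "age"

# Raw f-string: {{2}} becomes the literal quantifier {2};
# {re.escape(keyword)} is replaced by the escaped keyword.
pattern = re.compile(rf"((?:.*\n+){{2}}){re.escape(keyword)}\n+((?:.*\n+){{2}})")

match = pattern.search(text)
if match:
    before, after = match.groups()
    print(repr(before))  # ' Pratap\npandey\n'
    print(repr(after))   # '25\nstudent\n'
```

Because the keyword is interpolated when the pattern string is built, changing keyword means building (and compiling) a new pattern; the compiled object itself is not modified.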

edited Jul 02 '18 at 04:48

answered Jun 28 '18 at 11:25


I want to extract the two lines before the keyword and the two lines after it when the keyword is matched – Pratap Jun 29 '18 at 06:00

Sven-Eric Krüger · Answer 2 · 2018-06-29T07:56:23.090

0

Possible Solution in Python 2.7

You can use the regular expression uncompiled and build the pattern with string formatting.

from __future__ import print_function

import re

text = """ Pratap
pandey
age
25
student
"""
keywords = ("age", "else")

for key in keywords:
    print(re.findall(r'(.*\n+)(.*\n+){}\n+(.*\n+)(.*\n+)'.format(key), text))

Output:

[(' Pratap\n', 'pandey\n', '25\n', 'student\n')]
[]

(*) Edited regular expression.
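
For reference, a rough Python 3 sketch of the same approach wrapped in a helper. The function name context_lines is hypothetical, and re.escape is added as a precaution for keywords containing characters such as . or +. Passing a pattern string straight to re.search is fine for a handful of keywords, since the re module caches recently compiled patterns internally.

```python
import re

def context_lines(text, keyword):
    """Return ((line1, line2), (line3, line4)) around a line equal to keyword, or None."""
    # str.format fills the single {} placeholder with the escaped keyword.
    pattern = r'(.*\n+)(.*\n+){}\n+(.*\n+)(.*\n+)'.format(re.escape(keyword))
    match = re.search(pattern, text)
    if match is None:
        return None
    first, second, third, fourth = match.groups()
    return (first, second), (third, fourth)

text = " Pratap\npandey\nage\n25\nstudent\n"
results = {key: context_lines(text, key) for key in ("age", "else")}
print(results)
```

Note that a pattern containing a quantifier such as {2} would need its braces doubled ({{2}}) before calling str.format; the pattern here avoids that by repeating the group explicitly.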

edited Jun 29 '18 at 07:56

answered Jun 28 '18 at 11:54


Thanks, but it didn't work. My main problem is extracting the two lines before the keyword and the two lines after it when the keyword is matched – Pratap Jun 29 '18 at 04:31
@Pratap Then your regular expression may be incorrect... This will be the output of the whole: Two _identical_ lines before a keyword followed by _another two_ identical lines, e.g. `"abcdef\nabcdef\nKEY\nghijk\nghijk\n"`... if and only if `keyword = "KEY\n"` – Sven-Eric Krüger Jun 29 '18 at 06:16
@Pratap Have a look at my changes. – Sven-Eric Krüger Jun 29 '18 at 07:56

wp78de · Answer 3 · 2018-06-29T08:15:38.560

To match two lines before and after the keyword use a regex like this:

(?:.*(?:\r?\n)+){2}age(?:.*(?:\r?\n|$)+){3}


Explanation:

(?:.*(?:\r?\n|$)+){3}: you actually need to match three of these blocks after the keyword. The first repetition consumes only the newline directly after the keyword (age), and the second consumes line 4 (25) together with its newline. Therefore, a third repetition is needed to reach line 5 (student).

However, since this could be the end of the string, I've added $ as an alternative. I've also added an optional \r before \n, which comes in handy if your strings may contain Windows line endings; otherwise, remove it.

Sample code:

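import re

regex = r"(?:.*(?:\r?\n)+){2}age(?:.*(?:\r?\n|$)+){3}"
test_str = (" Pratap\n"
    "pandey\n"
    "age\n"
    "25\n"
    "student")

# re.MULTILINE lets $ match at the end of each line, not only at the end of the string
matches = re.finditer(regex, test_str, re.MULTILINE)
for match in matches:
    # Print the whole matched block: two lines before, the keyword, two lines after
    print(match.group())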
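
The sample above hardcodes age in the pattern. A possible variant, assuming the keyword arrives in a variable as in the question, builds the same pattern at runtime. The helper name build_regex and the use of re.escape are additions for illustration, not part of the original answer.

```python
import re

def build_regex(keyword):
    # Same structure as the pattern above, with the keyword inserted at runtime.
    # re.escape makes any regex metacharacters in the keyword match literally.
    return re.compile(
        r"(?:.*(?:\r?\n)+){2}" + re.escape(keyword) + r"(?:.*(?:\r?\n|$)+){3}",
        re.MULTILINE,
    )

test_str = " Pratap\npandey\nage\n25\nstudent"

found = [m.group() for m in build_regex("age").finditer(test_str)]
missing = [m.group() for m in build_regex("else").finditer(test_str)]

print(found)    # one match spanning all five lines
print(missing)  # empty list: "else" does not occur in test_str
```

Plain concatenation is used instead of an f-string or str.format so that the {2} and {3} quantifiers do not need their braces doubled.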
