Select groups based on the same value in a regular expression

Question

I have the following content:

ONE
1234234534564   123
34erewrwer323   123
123fsgrt43232   123
TWO
42433412133fr   234
fafafd3234132   342
THREE
sfafdfe345233   3234
FOUR
324ereffdf343   4323
fvdafasf34nhj   4323
fsfnhjdgh342g   4323

Consider ONE, TWO, THREE and FOUR to be separate groups. I want to match only ONE and FOUR: a group should match only if the second value on every line of the group is the same and the group contains more than one line. How can I do that with a regular expression?

I have already tried the following regex, but it does not do what I need:

\w+\n\w+\t(\d+)(\n\w+\t\1){2,}

Try [`r'(?m)^[A-Z]+\r?\n\S+\s+(\d+)(?:\r?\n\S+\s+\1)+$'`](https://regex101.com/r/BTv18D/3) — Wiktor Stribiżew, Jun 25 '18 at 10:41
@WiktorStribiżew That works. How can I use that regex in Python to print only the selected groups? — pavithran G, Jun 25 '18 at 10:45

score 1 · Answer 1 · answered Jun 25 '18 at 10:46

You may use

r'(?m)^[A-Z]+\r?\n\S+\s+(\d+)(?:\r?\n\S+\s+\1)+$'

See the regex demo.

Details

(?m) - enable re.MULTILINE mode to make ^ / $ match start and end of lines respectively
^ - start of a line
[A-Z]+ - 1+ uppercase ASCII letters (adjust as you see fit)
\r?\n - a line break like CRLF or LF
\S+ - 1+ non-whitespace chars
\s+ - 1+ whitespace chars (or use \t if a tab is the field separator)
(\d+) - Capturing group 1, one or more digits
(?:\r?\n\S+\s+\1)+ - one or more repetitions of a line break followed by 1+ non-whitespace chars, 1+ whitespace chars, and the same value as in Group 1, since \1 is a backreference to the value stored in that group
$ - end of line.
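To see what the \1 backreference contributes, here is a small sketch that compares the pattern above with a looser variant in which \1 is replaced by \d+. The two-group sample text and the names strict, loose, and headers are made up for illustration; they are not part of the question or the answer.

```python
import re

# Hypothetical two-group sample: group A repeats the value 1, group B does not.
sample = "A\nx 1\ny 1\nB\nx 1\ny 2\n"

# Pattern from the answer: every following line must repeat Group 1 exactly.
strict = re.compile(r'(?m)^[A-Z]+\r?\n\S+\s+(\d+)(?:\r?\n\S+\s+\1)+$')
# Looser variant, for comparison only: any digits are accepted on later lines.
loose = re.compile(r'(?m)^[A-Z]+\r?\n\S+\s+\d+(?:\r?\n\S+\s+\d+)+$')

def headers(pattern, text):
    """Return the header (first) line of each matched block."""
    return [m.group().splitlines()[0] for m in pattern.finditer(text)]

print(headers(strict, sample))  # ['A']
print(headers(loose, sample))   # ['A', 'B']
```

Only the strict pattern rejects group B, whose second column changes from 1 to 2.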

In Python, use re.finditer:

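import re
# text is assumed to hold the whole input as a single string
for m in re.finditer(r'(?m)^[A-Z]+\r?\n\S+\s+(\d+)(?:\r?\n\S+\s+\1)+$', text):
    print(m.group())  # the whole matched block, header line included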

See the Python demo.
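For a quick local check, the loop can be run against the sample from the question. This is a minimal sketch: the variable names are illustrative, and the fields are assumed to be separated by spaces as they appear in the question.

```python
import re

# Sample content from the question (fields assumed to be space-separated).
text = """\
ONE
1234234534564   123
34erewrwer323   123
123fsgrt43232   123
TWO
42433412133fr   234
fafafd3234132   342
THREE
sfafdfe345233   3234
FOUR
324ereffdf343   4323
fvdafasf34nhj   4323
fsfnhjdgh342g   4323
"""

pattern = re.compile(r'(?m)^[A-Z]+\r?\n\S+\s+(\d+)(?:\r?\n\S+\s+\1)+$')

# Collect the header (first) line of every matched block.
matched_headers = [m.group().splitlines()[0] for m in pattern.finditer(text)]
print(matched_headers)  # ['ONE', 'FOUR']
```

TWO is skipped because its second column differs between lines, and THREE is skipped because it has only one data line.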

Wiktor Stribiżew

How can I read the whole text content from a text file in Python? – pavithran G Jun 25 '18 at 10:57
@pavithranG Use [`f.read()`](https://stackoverflow.com/questions/7409780/reading-entire-file-in-python) instead of `text`. – Wiktor Stribiżew Jun 25 '18 at 11:00
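A minimal sketch of the f.read() suggestion follows. The pattern spans several lines, so the file has to be read into a single string rather than processed line by line. The function name matching_groups and the UTF-8 encoding are assumptions made for this example.

```python
import re

PATTERN = re.compile(r'(?m)^[A-Z]+\r?\n\S+\s+(\d+)(?:\r?\n\S+\s+\1)+$')

def matching_groups(path):
    """Read the whole file into one string and return the matched blocks."""
    # The encoding is an assumption; adjust it to match the actual file.
    with open(path, encoding="utf-8") as f:
        text = f.read()
    return [m.group() for m in PATTERN.finditer(text)]
```

Calling matching_groups with the path to the input file returns a list of matched blocks, each of which can be printed as in the answer's loop.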
@RavinderSingh13 I practiced it here, on SO. Refer to rexegg.com and regular-expressions.info, lots of stuff is explained there. – Wiktor Stribiżew Jun 25 '18 at 11:21
