I wrote the following code to match pairs of words, but I can't get the output I want.

import re

pattern = re.compile(r"(\w+) (\w+)")
match = pattern.findall("Hello Chelsea Hello ManU")
print(match)

Out: [('Hello', 'Chelsea'), ('Hello', 'ManU')]

What I am trying to achieve is:

[('Hello', 'Chelsea'), ('Chelsea', 'Hello'), ('Hello', 'ManU')]

The same happens with a shorter string:

pattern = re.compile(r"(\w+) (\w+)")
match = pattern.findall("Hello Chelsea Hello")
print(match)

Out: [('Hello', 'Chelsea')]

What I am trying to achieve is:

[('Hello', 'Chelsea'), ('Chelsea', 'Hello')]

Why does the regex skip a pair of words when one of them was already part of an earlier match? How can I get the output above? Thank you.

2 Answers

Use the newer third-party regex module (installable with pip install regex), which supports overlapping matches:

import regex as re

s = "Hello Chelsea Hello ManU"

matches = re.findall(r'\b(\w+) (\w+)\b', s, overlapped=True)
print(matches)
# [('Hello', 'Chelsea'), ('Chelsea', 'Hello'), ('Hello', 'ManU')]
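
For context, the original `findall` call skips `('Chelsea', 'Hello')` because the standard `re` module returns non-overlapping matches: once `Hello Chelsea` has been consumed, the next search starts after `Chelsea`. If installing a third-party module is not an option, a zero-width lookahead with capture groups is a common workaround in plain `re`. A minimal sketch, assuming the words are separated by single spaces as in the question:

```python
import re

s = "Hello Chelsea Hello ManU"

# The lookahead (?=...) consumes no characters, so the scan advances one
# position at a time and the captured pairs are allowed to overlap.
# The leading \b keeps a match from starting in the middle of a word.
pairs = re.findall(r'(?=\b(\w+) (\w+)\b)', s)
print(pairs)
# [('Hello', 'Chelsea'), ('Chelsea', 'Hello'), ('Hello', 'ManU')]
```

The same pattern applied to "Hello Chelsea Hello" gives the two pairs asked for in the second example.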
Jan

If you just want pairs of words, why use regex?

words = "Hello Chelsea Hello ManU".split()
# Pair each word with the word that follows it
out = [(words[i], words[i + 1]) for i in range(len(words) - 1)]
print(out)
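
The same pairing can also be written with `zip`, which avoids the index arithmetic. This is only a sketch: `word_pairs` is a hypothetical helper name, and it assumes that splitting on whitespace is good enough (unlike `\w+`, `split()` keeps punctuation attached to the words).

```python
def word_pairs(text):
    """Return each word paired with the word that follows it."""
    words = text.split()
    # zip stops at the shorter sequence, so the last word starts no pair
    return list(zip(words, words[1:]))

print(word_pairs("Hello Chelsea Hello ManU"))
# [('Hello', 'Chelsea'), ('Chelsea', 'Hello'), ('Hello', 'ManU')]
print(word_pairs("Hello Chelsea Hello"))
# [('Hello', 'Chelsea'), ('Chelsea', 'Hello')]
```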
saq7