The goal is to extract up to 100 characters before and after each occurrence of the keyword "bankruptcy".
import re
# named text rather than str, to avoid shadowing the built-in
text = "The company announced bankruptcy on jan 1, 1900. Many more companies announced bankruptcy in 1920s."
pattern = r"(?i)\s*(?:\w|\W){0,100}\b(?:bankruptcy)\b\s*(?:\w|\W){0,100}"
output = re.findall(pattern, text)
Expected output:
['The company announced bankruptcy on jan 1, 1900. Many more companies announced bankruptcy in 1920s.',
'The company announced bankruptcy on jan 1, 1900. Many more companies announced bankruptcy in 1920s.']
Current output: ['The company announced bankruptcy on jan 1, 1900. Many more companies announced bankruptcy in 1920s.']
Is there a way to get overlapping matches using re.findall?
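
For context: re.findall only returns non-overlapping matches, and here the first match already consumes the whole string, so nothing is left for a second match. One possible workaround, offered as a minimal sketch rather than a definitive fix, is to locate only the keyword with re.finditer and slice the context window out of the string by index. The names `text`, `keyword` and `window` below are illustrative and not part of the original snippet.

```python
import re

text = "The company announced bankruptcy on jan 1, 1900. Many more companies announced bankruptcy in 1920s."

# Match only the keyword itself; the surrounding context is taken by slicing,
# so the windows of neighbouring matches are free to overlap.
keyword = re.compile(r"\bbankruptcy\b", re.IGNORECASE)
window = 100  # assumed context size: characters kept on each side

output = [
    # max(0, ...) stops the start index from going negative near the beginning
    text[max(0, m.start() - window): m.end() + window]
    for m in keyword.finditer(text)
]
print(output)
```

With the sample string, which is shorter than 100 characters, both windows cover the full string, so the list contains the whole sentence twice, as in the expected output.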