I'm trying to process a text file and recognize certain patterns using regular expressions. I want my program to recognize patterns such as:

Pattern\n
Pattern \n
Pattern  \n

etc. I want to be able to recognize the pattern with any number of spaces or tabs (is there a difference?) between "Pattern" and the newline.

I've looked at "How to ignore whitespace in a regular expression subject string?" but I don't understand why they have a slash at the front and back of the expression.

How do I use regex to do this?

Community
ZebraSocks

2 Answers

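In the example you linked, the slashes at the start and end are delimiters that mark where a regex literal begins and ends (as in Perl or JavaScript). They are not part of the pattern itself, and Python doesn't use them, so they aren't relevant to the regex part of the answer.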

If I understand your question, the pattern will always be contiguous. In that case, it doesn't matter how much whitespace follows it: searching for the pattern alone will always find it. If you want the match to include the trailing whitespace and the newline as well, use something like

import re

lines = 'hello\nhello     \n'
pattern = 'hello'
# \s* matches any run of whitespace (spaces, tabs, newlines) before the final \n
results = re.findall(pattern + r'\s*\n', lines)
print(results)
# Output: ['hello\n', 'hello     \n']

If you don't care about the whitespace, just search for pattern.
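
Note that \s also matches newlines, so \s*\n can run across blank lines. If you only want spaces and tabs before the line break, a character class is one option. This is a minimal sketch, assuming the literal text is "Pattern"; the sample string is made up for illustration:

```python
import re

# Hypothetical sample text: the three variants from the question plus one non-match.
text = "Pattern\nPattern \nPattern \t \nPattern and more\n"

# [ \t]* allows any number of spaces or tabs; with re.MULTILINE, $ matches
# just before each newline (and at the very end of the string).
trailing_ws = re.compile(r"Pattern[ \t]*$", re.MULTILINE)

matches = trailing_ws.findall(text)
print(matches)
```

The last line of the sample is skipped because something other than spaces or tabs sits between "Pattern" and the line break.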

BlivetWidget

Yes, there is a difference: a space and a tab are different characters, although both count as whitespace. Most regex engines have a shorthand character class for whitespace. For example, in Python re.search(r'\s', yourVar) finds the first whitespace character, because '\s' matches spaces, tabs and newlines (re.match would only check the start of the string). The two slashes you are referring to, '//', are delimiters that many languages use to mark a regular expression literal, e.g. /\s/; your expression goes between the two.
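
To see the space/tab distinction concretely, here is a small sketch that checks a few single characters against \s and against a narrower class. The names samples, is_whitespace and is_space_or_tab are just for illustration:

```python
import re

# A space and a tab are distinct characters, but \s matches both (and newlines too).
samples = {"space": " ", "tab": "\t", "newline": "\n", "letter": "x"}

# fullmatch requires the whole one-character string to match the class.
is_whitespace = {name: bool(re.fullmatch(r"\s", ch)) for name, ch in samples.items()}
is_space_or_tab = {name: bool(re.fullmatch(r"[ \t]", ch)) for name, ch in samples.items()}

print(is_whitespace)
print(is_space_or_tab)
```

So \s is the broad choice, while [ \t] is what you want if a newline must not count as part of the whitespace.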

Hope this helps