Python: How to force a string to be treated literally instead of being interpreted as a regex pattern

Question

I want to find all occurrences of a given phrase in a passage. The phrases are user inputs and cannot be predicted beforehand.

One solution is to use a regex to search for the phrase in the passage (with findall or finditer):

import re

phrase = "24C"
passage = "24C with"

inds = [m.start() for m in re.finditer(phrase, passage)]

Then the result is

inds = [0]

The phrase matches the passage at index 0, and there is only one occurrence.

However, when the phrase contains characters that have special meaning in regex, things get trickier:

import re

phrase = "24C (75F)"
passage = "24C (75F) with"

inds = [m.start() for m in re.finditer(phrase, passage)]

Then the result is

inds = []

This is because the parentheses are interpreted as a regex capturing group rather than as literal characters, which is not what I want: I only want literal matches.

Is there any way to force the phrase to be treated as a string literal, not a regex pattern?
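To illustrate what the unescaped pattern actually does, here is a minimal sketch. The second passage is an assumption added for demonstration: it shows that the pattern matches text without parentheses, because `(75F)` acts as a group around `75F`.

```python
import re

# The parentheses form a capturing group, so the pattern looks for "24C 75F".
pattern = "24C (75F)"

print(re.findall(pattern, "24C (75F) with"))  # → []
print(re.findall(pattern, "24C 75F with"))    # → ['75F']
```

With one group in the pattern, `re.findall` returns the group's content rather than the whole match, which is why the second call prints `['75F']`.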

You could always use a while loop and iterate from the last matched position + 1 like in this post: https://codereview.stackexchange.com/questions/146834/function-to-find-all-occurrences-of-substring – ctwheels Sep 28 '17 at 19:46
Thought of that but it comes with a catch: we need to handle the word boundaries ourselves. If regex can do it with one line, being bug-free and readable, I think it is a better idea to leverage existing libraries. Thank you for your input! – Yo Hsiao Sep 28 '17 at 19:57
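For reference, the loop approach from the first comment could look roughly like the sketch below. The function name `find_all` is hypothetical, and this is not the code from the linked post. It uses `str.find`, so the phrase is always matched literally, but as noted above it does nothing about word boundaries.

```python
def find_all(phrase, passage):
    """Return start indices of every (possibly overlapping) literal occurrence."""
    inds = []
    if not phrase:
        # An empty phrase would match at every position; treat it as no match.
        return inds
    start = passage.find(phrase)
    while start != -1:
        inds.append(start)
        # Resume the search one character after the last match.
        start = passage.find(phrase, start + 1)
    return inds

print(find_all("24C (75F)", "24C (75F) with 24C (75F)"))  # → [0, 15]
```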

score 4 · Accepted Answer · answered Sep 28 '17 at 19:34

You can use re.escape() to force regex to treat the string as literal:

import re
phrase = "24C (75F)"
passage = "24C (75F) with"
inds = [m.start() for m in re.finditer(re.escape(phrase), passage)]
print(inds)

Output:

[0]
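If whole-word matching is also needed (the word-boundary concern raised in the comments), one possible sketch is to wrap the escaped phrase in lookarounds. The helper name `find_phrase` and the sample passages are assumptions for illustration. Lookarounds are used instead of `\b` because `\b` only behaves as expected next to a word character, and a phrase like this one ends in `)`.

```python
import re

def find_phrase(phrase, passage):
    """Return start indices of literal, whole-word occurrences of phrase."""
    # (?<!\w) and (?!\w) reject matches glued to a word character on either side.
    pattern = r"(?<!\w)" + re.escape(phrase) + r"(?!\w)"
    return [m.start() for m in re.finditer(pattern, passage)]

print(find_phrase("24C (75F)", "24C (75F) with 124C (75F)"))  # → [0]
print(find_phrase("24C", "24C with 124C and 24C."))           # → [0, 18]
```

In both calls the occurrence inside `124C` is skipped because it is preceded by the word character `1`.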

Ajax1234

Excellent! Exactly what I was looking for. Just to add some info: the official doc says "Escape all the characters in pattern except ASCII letters, numbers and '_'." But in effect, Unicode characters will be treated literally without an issue. Sweet! – Yo Hsiao Sep 28 '17 at 19:43
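A quick way to check the Unicode behaviour described in this comment is a sketch like the following. The phrase and passage are made-up examples. The exact string returned by `re.escape` differs between Python versions, since newer versions escape fewer characters, so only the match positions are shown here.

```python
import re

phrase = "24°C (75°F)"
passage = "Today: 24°C (75°F) with sun"

# Whether or not re.escape puts a backslash before "°", it still matches literally.
inds = [m.start() for m in re.finditer(re.escape(phrase), passage)]
print(inds)  # → [7]
```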