0

I usually use re.findall(p, text) to match a pattern, but now I have run into a problem:

I want p to be matched as a plain string, not as a regex.

For example, p may contain '+' or '*', and I don't want these characters to have their special regex meanings. In other words, I want p to be matched character by character.

p is not known in advance, so I can't manually add '\' to it to escape the special characters.

Niklas B.
Zhu Shengqi
  • If you don't know `p`, how can you use it as a regex? – Marcin Apr 04 '12 at 14:36
  • @Marcin: He has no a priori information about `p`, so he can't hardcode the already escaped string. Don't see why this was downvoted? – Niklas B. Apr 04 '12 at 14:36
  • @NiklasB. Well, maybe, but why couldn't he escape special characters if he has the string? – Marcin Apr 04 '12 at 14:38
  • @Marcin: I think how this is done is the actual question here. The "I can't add '\' into it to ignore special character" is referring to the manual escaping, probably. – Niklas B. Apr 04 '12 at 14:38

3 Answers

11

You can use re.escape:

>>> p = 'foo+*bar'
>>> import re
>>> re.escape(p)
'foo\\+\\*bar'
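
Since the question is about re.findall, here is a minimal sketch of how the escaped pattern could be passed to it. The helper name find_literal and the sample text are made up for illustration; only re.escape and re.findall come from the standard library.

```python
import re

def find_literal(p, text):
    """Return every non-overlapping occurrence of p in text, matched literally."""
    # re.escape backslash-escapes regex metacharacters such as '+' and '*'
    return re.findall(re.escape(p), text)

p = "1+1*"                       # hypothetical pattern containing metacharacters
text = "1+1* and 111 and 1+1*"   # hypothetical input

print(find_literal(p, text))     # ['1+1*', '1+1*']
print(re.findall(p, text))       # unescaped: ['1', '1', '111', '1', '1']
```

The second call shows what goes wrong without escaping: '+' and '*' are read as quantifiers, so the pattern matches runs of '1' instead of the literal text.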

Or just use string operations to check if p is inside another string:

>>> p in 'blablafoo+*bar123'
True
>>> 'foo+*bar foo+*bar'.count(p)
2

By the way, re.escape is mainly useful if you want to embed p into a larger regex:

>>> re.match(r'\d.*{}.*\d'.format(re.escape(p)), '1 foo+*bar 2')
<_sre.SRE_Match object at 0x7f11e83a31d0>
Niklas B.
  • I want to use re.findall(), so I think re.escape() is best for me! :) – Zhu Shengqi Apr 04 '12 at 14:43
  • @ZhuShengqi: To search for a verbatim string, `re.findall()` is essentially useless; `re.findall("ab", "abcabcabc")` results in `["ab", "ab", "ab"]`. You probably want `str.count()`. – Sven Marnach Apr 04 '12 at 14:44
  • @Zhu: Yep, if you don't *need* regular expressions, don't use them. Circumstances where the escaping might be useful are (a) You want to integrate `p` into a more complex regex (b) You want to match against a list of regular expressions of which some are just plain text searches and some are more complex. – Niklas B. Apr 04 '12 at 14:46
2

If you don't need a regex, and just want to test if the pattern is a substring of the string, use:

if pattern in string:

If you want to test at the start or end of the string:

if string.startswith(pattern): # or .endswith(pattern)

See the string methods section of the docs for other string methods.

If you need to know all locations of a substring in a string, use str.find:

offsets = []
offset = string.find(pattern, 0)
while offset != -1:
    offsets.append(offset)
    # resume one character past the start of the previous match,
    # so overlapping occurrences are found as well
    offset = string.find(pattern, offset + 1)
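
The loop above could be wrapped in a function along these lines. This is only a sketch: the name find_all and the overlapping flag are assumptions added for illustration, not part of the original answer. Stepping by 1 reports overlapping occurrences, while stepping by len(needle) skips them, which matches what str.count counts.

```python
def find_all(haystack, needle, overlapping=True):
    """Return the start offsets of every literal occurrence of needle."""
    if not needle:
        return []  # assumption: treat an empty needle as having no matches
    # advance by 1 to allow overlaps, or by the needle length to skip them
    step = 1 if overlapping else len(needle)
    offsets = []
    offset = haystack.find(needle)
    while offset != -1:
        offsets.append(offset)
        offset = haystack.find(needle, offset + step)
    return offsets

print(find_all("aaaa", "aa"))                     # [0, 1, 2]
print(find_all("aaaa", "aa", overlapping=False))  # [0, 2]
print(find_all("1+2*3 1+2*3", "+2*"))             # [1, 7]
```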
agf
0

You can use .find on strings. This returns the index of the first occurrence of the "needle" string (or -1 if it's not found), e.g.:

>>> a = 'test string 1+2*3'
>>> a.find('str')
5
>>> a.find('not there')
-1
>>> a.find('1+2*')
12
huon