-2

I need to match only the standalone two-digit number, in this case 32, but my pattern also matches the 32 inside longer numbers such as 323 and 32222.

Code:

import re

s = """32 M 32 L 32 S 32 K 324 J 32555 A 32222 B 8888

32 small again 32 324 567 323 yes 32 else again not 323 32 32-123"""

pattern = "32"  # Also tried "32/s", which does not match a 32 at the end of the string, and "32{2}", which still does not work
# Following the answers below, I used pattern = "\\b32\\b",
# but it also matches the 32 in 32-123; in my case only a standalone 32 must be matched
result = re.findall(pattern, s)
print(result)
print(len(result))

Expected Output: ['32', '32', '32', '32', '32', '32', '32', '32'] # length is 8 because string s contains the standalone number 32 eight times

Vidya

4 Answers

3

Try using pattern = "\\b32\\b". \b is a word boundary, and in a regular (non-raw) Python string you have to double the backslash.

  • Not a Python expert; why is the double `\\` needed? – Christian Baumann Oct 01 '20 at 07:01
  • Because a single backslash is the escape character in string literals, so to get a literal backslash you have to prefix it with another backslash. Alternatively, you can use raw strings if this feels confusing. – Shashikanth Reddy Oct 01 '20 at 07:09
  • This answer looks correct, but what if the number starts something like 32-123? It matches there, but in my case it should not match. – Vidya Oct 01 '20 at 07:13
  • It will still match there: \b only requires a transition between a word character and a non-word character, and - is a non-word character. [Here](https://www.regular-expressions.info/wordboundaries.html) you can read detailed information on how it works. If you are new to regular expressions, there is an [excellent book](https://www.oreilly.com/library/view/introducing-regular-expressions/9781449338879/). It can be read in a couple of evenings, and it's really useful. – Evgenii Minikh Oct 01 '20 at 21:22
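
For the 32-123 case raised in the comments, a quick check can show what \b32\b actually does on the question's string. The sketch below also tries one possible alternative using whitespace lookarounds; it assumes that a "standalone" 32 means one delimited by whitespace or by the ends of the string, which is an interpretation of the question rather than something it states exactly.

```python
import re

s = """32 M 32 L 32 S 32 K 324 J 32555 A 32222 B 8888
32 small again 32 324 567 323 yes 32 else again not 323 32 32-123"""

# \b matches between a word character (here a digit) and a non-word
# character. "-" is a non-word character, so the 32 in "32-123" matches too.
with_boundary = re.findall(r"\b32\b", s)

# Assumption: a standalone 32 has no non-whitespace character directly
# before or after it. (?<!\S) and (?!\S) also succeed at the string edges.
standalone = re.findall(r"(?<!\S)32(?!\S)", s)

print(len(with_boundary))  # 9
print(len(standalone))     # 8
```

If other separators such as commas or full stops should also count as delimiters, the lookarounds would need to be adjusted accordingly.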
3

Use the regex r'\b\d{2}\b' to match any two-digit number delimited by word boundaries.
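
As a rough illustration of that pattern, here is a minimal sketch on a made-up sample string (the `text` value is hypothetical, not taken from the question):

```python
import re

# Hypothetical sample containing 1-, 2-, 3- and 4-digit numbers.
text = "32 M 7 L 45 K 324 J 99 B 8888"

# \d{2} requires exactly two digits, and the \b on each side rejects
# digit pairs that are part of a longer run of word characters.
two_digit = re.findall(r"\b\d{2}\b", text)

print(two_digit)  # ['32', '45', '99']
```

Note that letters and underscores are word characters as well, so a pair of digits glued to them (for example in "a32") would not match.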


Vishnudev Krishnadas
1

You can try:

pattern = r'\b32\b'

The 'r' at the start of the pattern string designates a Python "raw" string, in which backslashes are passed through unchanged.

\b allows you to perform a “whole words only” search using a regular expression in the form of \bword\b.
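
To make the raw-string point concrete, the following small sketch compares the raw form, the doubled-backslash form, and a plain string with single backslashes. The variable names are just illustrative.

```python
import re

raw = r"\b32\b"       # raw string: backslashes reach the regex engine as-is
escaped = "\\b32\\b"  # regular string with each backslash doubled
plain = "\b32\b"      # regular string: "\b" becomes a backspace character

print(raw == escaped)        # True
print(len(raw), len(plain))  # 6 4

# The plain version looks for literal backspace characters around 32,
# so it finds nothing in ordinary text.
print(re.findall(raw, "32 324 32"))    # ['32', '32']
print(re.findall(plain, "32 324 32"))  # []
```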

Shradha
0

You can apply a word boundary in your pattern like this:

import re

s = """32 M 32 L 32 S 32 K 324 J 32555 A 32222 B 8888
32 small again 32 324 567
323 yes 32 else again not 323 32"""

pattern = r"\b32\b" 
result = re.findall(pattern, s)
print(result)

To match any two-digit number, the following pattern can be used:

pattern = r"\b\d\d\b"