Regular expression matching in Python

Question

I want to check whether two strings are similar, allowing for at least one error (typo). I want to use Python's built-in re library.

Example:

import re

re.match(r"anoother","another") #this is None indeed

It should return True, i.e. detect the match even when there are one or two typos.

I have spent a long time reading the re documentation, but I have no idea how to apply it when there is a typo:

a="this is the anoother line\n"
b="this is the another line\n"
c=re.search(r"{}".format(a),b) #how to write regex code here? 
#c =True  #it should return True

I expect something like this to return True:

re.any_regex_func(r"anyregex this is anoother line anyregex","this is another line")

If it has more than one typo, it should return False.

I don't think regex is the right tool here. You might try looking at algorithms for determining edit distance. — Mark, May 18 '19 at 16:46
Google something like "python fuzzy string matching", regex is probably not what you are looking for. — benvc, May 18 '19 at 16:47
In the standard library there is "difflib" module for such tasks. — Michael Butscher, May 18 '19 at 16:48
OK. I am new to regex, which is why I asked whether it is possible. I can write the algorithm without regex. — Nursultan Beloved, May 18 '19 at 16:49
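
For reference, the edit-distance idea suggested in the comments above can be sketched with the standard library alone. This is only an illustration, not code from any of the commenters: the names edit_distance and within_typos and the max_typos parameter are made up here, and the distance is the classic Levenshtein distance (single-character insertions, deletions, and substitutions).

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b (hypothetical helper)."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] = distance between the processed prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # drop ch_a
                current[j - 1] + 1,      # add ch_b
                previous[j - 1] + cost,  # keep or substitute
            ))
        previous = current
    return previous[-1]


def within_typos(a: str, b: str, max_typos: int = 2) -> bool:
    """True if a and b differ by at most max_typos single-character edits."""
    return edit_distance(a, b) <= max_typos


print(edit_distance("anoother", "another"))
print(within_typos("this is the anoother line\n", "this is the another line\n"))
```

Setting max_typos=1 gives the stricter "more than one typo is a mismatch" behaviour described in the question.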

Casimir et Hippolyte · Accepted Answer · 2019-06-24T01:35:29.363

What you are looking for is called fuzzy matching, but unfortunately the re module doesn't provide this feature.

However, the pypi/regex module has it and is easy to use (for a group in the pattern you can set how many character insertions, deletions, substitutions, and total errors are allowed). Example:

>>> import regex
>>> regex.match(r'(?:anoother){d}', 'another')
<regex.Match object; span=(0, 7), match='another', fuzzy_counts=(0, 0, 1)>

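The {d} permits deletions within the non-capturing group; you can cap how many are allowed by writing something like {d<3} (fewer than three deletions). In the output, fuzzy_counts lists the numbers of substitutions, insertions, and deletions used, so (0, 0, 1) means one deletion.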
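
As a sketch of how the same fuzzy syntax might be applied to the whole lines from the question: the regex module also accepts {e<=N}, which caps the total number of errors of any kind. The helper name fuzzy_equal and its max_errors parameter are invented for this illustration, and the module has to be installed separately (pip install regex).

```python
import regex  # third-party module from PyPI, not the built-in re


def fuzzy_equal(expected: str, actual: str, max_errors: int = 1) -> bool:
    """Hypothetical helper: True if actual equals expected up to max_errors edits."""
    # regex.escape keeps characters such as "." or "(" in `expected` literal.
    # The doubled braces produce a literal {e<=N} fuzzy constraint.
    pattern = f"(?:{regex.escape(expected)}){{e<={max_errors}}}"
    return regex.fullmatch(pattern, actual) is not None


a = "this is the anoother line"
b = "this is the another line"
print(fuzzy_equal(a, b))                   # a single extra "o" is within the limit
print(fuzzy_equal(a, "a different line"))  # needs far more than one edit
```

fullmatch is used so that the whole line has to fit within the error budget; regex.search with the same pattern would instead look for a fuzzy occurrence anywhere inside a longer text.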

score 0 · Answer 2 · answered May 18 '19 at 17:05

I'm not sure which variants of "another" you need to accept. One option is to add a middle capturing group that lists the spellings you want, so that the desired variants pass and the undesired ones fail. For example, we could define an expression like:

^((.+?)(another?|anoother?)(.+))$

RegEx

If this isn't the expression you want, you can modify and test it on regex101.com.

RegEx Circuit

You can also visualize your expressions on jex.im.

Python Demo

# -*- coding: UTF-8 -*-
import re

string = "this is the other line\n"
expression = r'^((.+?)(another?|anoother?)(.+))$'
match = re.search(expression, string)
if match:
    print("YAAAY! \"" + match.group(1) + "\" is a match  ")
else: 
    print(' Sorry! No matches!')

Output

 Sorry! No matches!
