-1

I want to match two similar strings that differ by at least one typo, using Python's built-in re library.

Example:

import re

re.match(r"anoother","another") #this is None indeed

It should return True and tell me whether there are one or two typos.

I have read through a lot of the re documentation, but I have no idea how to apply it when there is a typo:

a="this is the anoother line\n"
b="this is the another line\n"
c=re.search(r"{}".format(a),b) #how to write regex code here? 
#c =True  #it should return True

I expect this to return True:

re.any_regex_func(r"anyregex this is anoother line anyregex","this is another line")

If there is more than one typo, it should return False.

Emma

2 Answers

1

What you are looking for is called fuzzy matching, but unfortunately the re module doesn't provide this feature.

However, the regex module on PyPI has it and is easy to use: for a group in the pattern, you can set how many character insertions, deletions, and substitutions are allowed, as well as the total number of errors. Example:

>>> import regex
>>> regex.match(r'(?:anoother){d}', 'another')
<regex.Match object; span=(0, 7), match='another', fuzzy_counts=(0, 0, 1)>

The {d} permits deletions within the non-capturing group, and the fuzzy_counts tuple in the result reports (substitutions, insertions, deletions). You can cap the number allowed by writing something like {d<3}.
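If installing a third-party module isn't an option, a standard-library-only alternative is to compute the edit distance yourself instead of using a regex at all. The following is a minimal sketch, not something re or regex provides: within_edits is a hypothetical helper name, it uses the classic Levenshtein dynamic-programming approach, and it assumes you want to compare the two strings in full rather than search for one inside the other.

```python
def within_edits(a: str, b: str, max_edits: int = 1) -> bool:
    """Return True if a can be turned into b with at most max_edits
    single-character insertions, deletions, or substitutions.

    Hypothetical helper for illustration; not part of re or regex.
    """
    # Levenshtein distance, keeping only the previous row of the table.
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1] <= max_edits


# One extra "o": a single typo.
print(within_edits("this is the anoother line\n", "this is the another line\n"))
# Two extra "o"s: more than one typo.
print(within_edits("this is the anooother line\n", "this is the another line\n"))
```

The first call prints True and the second prints False. Note that identical strings also return True here, since zero edits is within the limit; add an explicit a != b check if an exact match should not count.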

Casimir et Hippolyte
0

I'm not sure which variants of another you need to handle. If they are known in advance, we could add a middle capturing group that lists the accepted spellings as alternatives, so the desired variants pass and the undesired ones fail. The expression could look something like:

^((.+?)(another?|anoother?)(.+))$


RegEx

If this isn't the expression you wanted, you can modify it on regex101.com.

RegEx Circuit

You can also visualize your expressions on jex.im.

Python Demo

# -*- coding: UTF-8 -*-
import re

string = "this is the other line\n"
expression = r'^((.+?)(another?|anoother?)(.+))$'
match = re.search(expression, string)
if match:
    print("YAAAY! \"" + match.group(1) + "\" is a match  ")
else: 
    print(' Sorry! No matches!')

Output

 Sorry! No matches!
Emma