
Let's say I have a long string str_1:

**str_1** : 'Computer once meant a person who did computations, but now the term almost universally refers to automated electronic machinery. The first section of this article focuses on modern digital electronic computers and their design'

and I have a string str_2 to look for in the long string:

**str_2** : 'The second section of this article focuses on modern digital electronic computers and their design'

str_2 actually exists in str_1, except that the word 'second' in str_2 is written as 'first' in str_1. That is the only difference.

What I want is to search for a sentence in a string while tolerating some errors. I want to find the match that is within some percentage of errors, and then check the mistakes. Is there a way to do this? Thank you.

holdenweb
s900n
  • Use an algorithm or a library that computes string distance, e.g. Levenshtein distance. – gold_cy Jan 31 '19 at 13:42
  • Without any code, this becomes a programming problem. Unfortunately Stack Overflow primarily exists to help people get their code working. In the most general case this seems like a rather difficult problem, though one with known solutions. – holdenweb Jan 31 '19 at 13:43
  • I duped you to the canonical 'fuzzy string comparison' post. I'd go with the `fuzzywuzzy` or `fuzzyset` libraries here. – Martijn Pieters Jan 31 '19 at 13:53

2 Answers


You can use the Jaccard similarity score between two sentences to measure how similar they are: the size of the intersection of their word sets divided by the size of the union.
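As a rough sketch of that idea (not a definitive implementation), the snippet below splits str_1 into sentences on full stops, scores each one against str_2, and keeps the best. The helper name `jaccard_similarity`, the lowercase whitespace tokenisation, and the naive sentence split are all assumptions made for illustration.

```python
def jaccard_similarity(a: str, b: str) -> float:
    """Jaccard similarity of the lowercase word sets of two strings."""
    set_a = set(a.lower().split())
    set_b = set(b.lower().split())
    if not set_a and not set_b:
        return 1.0  # two empty strings are treated as identical
    return len(set_a & set_b) / len(set_a | set_b)


str_1 = ('Computer once meant a person who did computations, but now the term '
         'almost universally refers to automated electronic machinery. '
         'The first section of this article focuses on modern digital '
         'electronic computers and their design')
str_2 = ('The second section of this article focuses on modern digital '
         'electronic computers and their design')

# Naive sentence split (assumption: sentences end with a full stop).
sentences = [s.strip() for s in str_1.split('.') if s.strip()]

# Pick the sentence of str_1 that is most similar to str_2.
best = max(sentences, key=lambda s: jaccard_similarity(s, str_2))
score = jaccard_similarity(best, str_2)
print(best)
print(score)  # 14 shared words out of 16 distinct words -> 0.875
```

Because Jaccard works on sets, it ignores word order and repetition, so it is best used to shortlist candidate sentences rather than to pinpoint the mistakes.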

taurus05

You could use a simple regex like

The (?:first|second) section of this article focuses on modern digital electronic computers and their design

See a demo on regex101.com.
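For reference, here is how that pattern might be applied in Python with the standard `re` module. This is only a sketch: it assumes you already know which word may vary, which is the main limitation of the regex approach.

```python
import re

str_1 = ('Computer once meant a person who did computations, but now the term '
         'almost universally refers to automated electronic machinery. '
         'The first section of this article focuses on modern digital '
         'electronic computers and their design')

# The non-capturing group (?:first|second) accepts either word.
pattern = re.compile(
    r'The (?:first|second) section of this article focuses on '
    r'modern digital electronic computers and their design'
)

match = pattern.search(str_1)
if match:
    print(match.group(0))  # the sentence as it actually appears in str_1
```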


However, this seems like some text / corpus problem so you might narrow down the sentences and use other "fuzzy" logic.
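One possible way to do that fuzzy matching with only the standard library is sketched below. It slides a window with as many words as str_2 over str_1, scores each window with `difflib.SequenceMatcher`, and then lists the words that differ. The function name `fuzzy_find` and the 0.8 threshold are assumptions for illustration; libraries such as `fuzzywuzzy`, mentioned in the comments, offer similar scoring.

```python
from difflib import SequenceMatcher


def fuzzy_find(haystack: str, needle: str, threshold: float = 0.8):
    """Return (best_window, ratio) if some window of haystack is similar
    enough to needle, otherwise None. Windows are needle-sized in words."""
    hay_words = haystack.split()
    n = len(needle.split())
    best_ratio, best_window = 0.0, None
    for i in range(max(1, len(hay_words) - n + 1)):
        window = ' '.join(hay_words[i:i + n])
        ratio = SequenceMatcher(None, window, needle).ratio()
        if ratio > best_ratio:
            best_ratio, best_window = ratio, window
    if best_ratio >= threshold:
        return best_window, best_ratio
    return None


str_1 = ('Computer once meant a person who did computations, but now the term '
         'almost universally refers to automated electronic machinery. '
         'The first section of this article focuses on modern digital '
         'electronic computers and their design')
str_2 = ('The second section of this article focuses on modern digital '
         'electronic computers and their design')

result = fuzzy_find(str_1, str_2)
if result is not None:
    window, ratio = result
    # Word-by-word comparison to check the mistakes
    # (assumes both sides have the same number of words).
    diffs = [(a, b) for a, b in zip(window.split(), str_2.split()) if a != b]
    print(window)
    print(diffs)  # [('first', 'second')]
```

The window is fixed at the needle's word count, so this sketch handles substituted words well but is less tolerant of inserted or deleted words; widening the window range would be one way to relax that.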
Jan