-1

I have two strings that are about 90% identical, and I want to ignore the 10% where they differ. For example:

String s1 = "Hi my name is Bob. About me: Useless information. I am a male.";

String s2 = "Hi my name is Bob. About me: Different useless information. I am a male.";

Even though these two strings are different, I want a way to compare them that returns true, treating them as equal. What is the best way to approach this problem? Are there any existing string utilities that can help me achieve this?

Tushar
Ogen
  • See [Similarity String Comparison in Java](http://stackoverflow.com/questions/955110/similarity-string-comparison-in-java) – Feb 09 '16 at 03:22
  • I don't think this is a duplicate of that question. This one is not about how similar the strings are, but about whether key components of two strings match, assuming both strings follow some convention. – klog Feb 09 '16 at 03:51

2 Answers

0

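There is an implementation available in Java called LevenshteinDistance.java.

It computes the Levenshtein (edit) distance between two strings, that is, the minimum number of single-character insertions, deletions, and substitutions needed to turn one into the other. A percentage of similarity can be derived from that distance.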
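As a rough illustration of the idea, here is a minimal sketch of an edit-distance comparison. It is not the LevenshteinDistance.java mentioned above; the class name `LevenshteinSketch`, the method names, and the way similarity is normalized (one minus distance divided by the longer length) are assumptions made for this example.

```java
class LevenshteinSketch {
    // Edit distance via dynamic programming, keeping only two rows of the table.
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into b[0..j) takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning a[0..i) into "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1, prev[j] + 1), // insertion, deletion
                        prev[j - 1] + cost);                    // substitution or match
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Assumed normalization: 1.0 means identical, 0.0 means nothing in common.
    static double similarity(String a, String b) {
        int maxLen = Math.max(a.length(), b.length());
        if (maxLen == 0) {
            return 1.0;
        }
        return 1.0 - (double) distance(a, b) / maxLen;
    }

    // Hypothetical helper: treat strings as "equal" above a chosen threshold.
    static boolean roughlyEqual(String a, String b, double threshold) {
        return similarity(a, b) >= threshold;
    }
}
```

Calling `LevenshteinSketch.roughlyEqual(s1, s2, 0.8)` on the two strings from the question should return true. Keep in mind that a threshold cannot tell which part of the string differs, so a change in the name would be tolerated just as readily as a change in the "About me" text.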

Hope this helps. =)

Wiki Reference

0

Using regular expressions via the java.util.regex.Pattern and java.util.regex.Matcher classes, you can create groups in your regular expression to pick out the details that need to match.

For example, your regex could look like: "Hi my name is (.*)\. About me: (.*)\. I am a (.*)\.". The parentheses are left unescaped so that they define capturing groups, and in a Java string literal each backslash has to be doubled. Then, assuming both strings match the regex first, you could use the Matcher.group(int) method and compare the values in the groups between your two strings.

My regex example probably isn't perfect, but hopefully you get the idea.
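The approach described in this answer could be sketched as follows. The class name `TemplateCompare` and the method name `sameIgnoringAboutMe` are hypothetical, and the pattern assumes both strings follow the rigid template from the question exactly.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TemplateCompare {
    // Assumed template: group 1 = name, group 2 = "About me" text (ignored), group 3 = last field.
    private static final Pattern TEMPLATE =
            Pattern.compile("Hi my name is (.*)\\. About me: (.*)\\. I am a (.*)\\.");

    // Returns true only if both strings fit the template and groups 1 and 3 agree.
    static boolean sameIgnoringAboutMe(String s1, String s2) {
        Matcher m1 = TEMPLATE.matcher(s1);
        Matcher m2 = TEMPLATE.matcher(s2);
        if (!m1.matches() || !m2.matches()) {
            return false; // at least one string does not follow the template
        }
        return m1.group(1).equals(m2.group(1))
                && m1.group(3).equals(m2.group(3));
    }
}
```

With the two strings from the question, `TemplateCompare.sameIgnoringAboutMe(s1, s2)` returns true because only group 2 differs. Unlike a similarity threshold, this compares exactly the parts that matter, but it breaks down if the strings do not follow the template.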

klog
  • I think you are very close to an answer. How can I use the regex to ignore the about me part though? I just need a function that inputs two strings and outputs if they are equal regardless of what it says in the about me part. – Ogen Feb 09 '16 at 03:36
  • Assuming your strings are pretty rigid in form, just don't compare the about me part. A group is defined by the parentheses. So verify that both strings match the regex, then retrieve group 1 and group 3 from both Matchers. If the values of each group match, you've matched as much as you wanted. – klog Feb 09 '16 at 03:40
  • Thanks I followed your advice and it worked. – Ogen Feb 09 '16 at 03:45
  • I saw that my answer was upvoted momentarily and then downvoted. Can that person let me know why my answer isn't worthy? – klog Feb 09 '16 at 03:53