How to compare these strings in Python?

Question

I have the following string:

1679.2235398,-1555.40390834,-1140.07728186,-1999.85500108

and I'm using a steganography technique to store it in an image. When I retrieve it from the image, sometimes I get it back complete, and I have no issue with that. On other occasions the data is only partially retrieved (because the image has been modified), so the result looks something like this:

1679.2235398,-1555.I8\xf3\x1cj~\x9bc\x13\xac\x9e8I>[a\xfdV#\x1c\xe1\xea\xa0\x8ah\x02\xed\xd1\x1c\x84\x96\xe2\xfbk*8'l

Notice that only "1679.2235398,-1555." is retrieved correctly, while the rest is where the modification occurred. Now, how do I compute (as a percentage) how much I successfully retrieved? Since the lengths differ, I can't do a character-by-character comparison; it seems I need to slice or convert the modified data into some other form to match the length of the original data.

Any tips?

Does it not work to use the percentage of the original string that shows in the output? — , Jan 04 '17 at 12:26
Not sure I've got what you mean, but what I want is something like this: is 1=1, is 6=6, and so on. — amsr, Jan 04 '17 at 12:42

score 0 · Accepted Answer · edited May 23 '17 at 10:30

A lot of this is going to depend on the context of your problem, but you have a number of options here.

If your results always look like that, you could just find the longest common subsequence, then divide its length by the length of the original string to get a percentage.
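A minimal sketch of that idea, assuming Python 3 and plain str inputs. The function names lcs_length and percent_recovered are made up for illustration. Note that a subsequence does not have to be contiguous, so stray characters in the corrupted tail that happen to match can push the score slightly above the length of the intact prefix.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of a and b (row-by-row DP)."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                # Extend the subsequence that ended just before both characters.
                curr.append(prev[j - 1] + 1)
            else:
                # Otherwise carry over the better of skipping one character.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def percent_recovered(original, retrieved):
    """LCS length as a percentage of the original string's length."""
    if not original:
        return 100.0  # assumption: an empty original counts as fully recovered
    return 100.0 * lcs_length(original, retrieved) / len(original)
```

You would call it as percent_recovered(original, retrieved), with the original always in the first position so the percentage is relative to what you stored.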

Levenshtein distance is a common way of comparing strings: it is the minimum number of single-character edits (insertions, deletions, or substitutions) needed to turn one string into another. This question has several answers discussing how to turn that into a percentage.
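One common way to normalize it, sketched here assuming Python 3 and plain str inputs, is to divide the distance by the length of the longer string and subtract from 1. The names levenshtein and similarity_percent are illustrative, and this normalization is only one of several reasonable choices.

```python
def levenshtein(a, b):
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the DP row as short as possible
    prev = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        curr = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def similarity_percent(a, b):
    """100 means identical; 0 means every position had to change."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 100.0  # assumption: two empty strings are treated as identical
    return 100.0 * (1 - levenshtein(a, b) / longest)
```

Because the distance can never exceed the length of the longer string, the result always stays between 0 and 100.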

If you don't expect the strings to always come out in the same order, this answer suggests some algorithms used for DNA work.

Very interesting methods. I'm going to try both of them and see how the results look. Thanks JERM — amsr, Jan 04 '17 at 22:28

score 0 · Answer 2 · answered Jan 04 '17 at 13:06

Well, it really depends. My solution would be something like this:

I would start with the longest possible substring of the original, which is the whole string, and check whether it occurs in the new string: if original_string in new_string: 'something happens here'. That check would sit inside a loop that shrinks the candidate length and tries every substring of that length. So the next round is N-1 characters long and has 2 possible candidates (cutting off the first character or the last), and so on, until you reach a chosen threshold or substrings of length 1.
Inside the if, the loop can record the longest match found, and afterward you can just check the results. Hope that helps.
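Roughly, that loop could look like the sketch below, assuming Python 3 and str inputs. The function name longest_shared_substring and the min_length parameter (the "threshold") are made up for illustration. It returns as soon as it finds a match, since candidates are tried from longest to shortest.

```python
def longest_shared_substring(original, retrieved, min_length=1):
    """Longest contiguous piece of original that also appears in retrieved."""
    n = len(original)
    # Try candidate lengths from the full string down to the threshold.
    for size in range(n, min_length - 1, -1):
        # Slide a window of this size across the original string.
        for start in range(n - size + 1):
            candidate = original[start:start + size]
            if candidate in retrieved:
                return candidate  # first hit at this size is the longest overall
    return ""  # nothing of at least min_length characters survived


original = "1679.2235398,-1555.40390834,-1140.07728186,-1999.85500108"
retrieved = "1679.2235398,-1555.I8#j~c"  # shortened stand-in for the corrupted data
match = longest_shared_substring(original, retrieved)
percent = 100.0 * len(match) / len(original)
```

This is brute force (it can test on the order of N squared candidates), which should be fine for strings as short as the one in the question.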

It seems this is a good approach to tackle it. Many thanks I will test it. — amsr, Jan 04 '17 at 22:24
