I'm trying to calculate a weighted Levenshtein distance between two lists of numbers.

import textdistance

S1 = [1, 2, 3, 7,  9, 15, 19, 20]
S2 = [1, 2, 3, 7, 98, 99, 20]

textdistance.levenshtein.similarity(S1, S2)
textdistance.levenshtein.distance(S1, S2)

Here, however, every operation has a cost of 1, and I'd like to define different costs for replacing or removing certain numbers. For example, 1 to 20 would each have a cost of 1, but 98 would have a cost of 10, 99 a cost of 15, and so on. I did some searching and found this library, but it works with strings, and converting the ints to strings doesn't work: it iterates character by character, so 9 can no longer be compared to 98 as a whole value.

Is there any workaround or existing library that does this, or perhaps can achieve the same result, or do I need to implement it from scratch?

VLL
yvng pei
  • How about converting your values to their corresponding ASCII characters before using the weighted-levenshtein lib? For instance: `[chr(i) for i in S1]`. As long as your values are below 128 (the number of ASCII characters), this should work. – rochard4u Aug 22 '22 at 09:56
  • @rochard4u Thanks for the suggestion. I did a quick search and I'm wondering what happens to 1 to 31 and, for example, 127? I wasn't able to print those. Edit: well, it prints as `['\x01', '\x02', '\x03', '\x07', '\t', '\x0f', '\x13', '\x14']`; can each of these be treated as a single char? For example, when comparing 9 to 98, would it be comparing the two chars `'\t'` to the one char `'b'`? – yvng pei Aug 22 '22 at 11:50
  • Yes, each one is treated as a single character; `'\t'` and `'\x01'` are escape sequences for one character, and every character with a code point from 0 to 255 can be written with the hexadecimal escapes `\x00` to `\xFF`. String (char) comparisons are made using the Unicode code point values. See [How are strings compared?](https://stackoverflow.com/questions/4806911/how-are-strings-compared) – rochard4u Aug 22 '22 at 12:00
  • @rochard4u Thanks! For now, I will use this workaround and investigate further if the values go beyond the 128 ASCII characters. – yvng pei Aug 23 '22 at 08:34
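
For values outside the ASCII range, the `chr` workaround discussed above stops applying, so a from-scratch version may be worth having on hand. The following is only a minimal sketch of a weighted Levenshtein distance that works directly on lists of ints, not the method of any particular library. The names `ITEM_COSTS`, `DEFAULT_COST`, `item_cost`, `substitution_cost` and `weighted_levenshtein` are all hypothetical, and the rule that a substitution costs the larger of the two item costs is an assumption, since the question does not say how replacing one value with another should be priced.

```python
from typing import Dict, Sequence

# Hypothetical per-item costs taken from the question's example
# (98 costs 10, 99 costs 15); any value not listed costs DEFAULT_COST.
ITEM_COSTS: Dict[int, float] = {98: 10.0, 99: 15.0}
DEFAULT_COST = 1.0


def item_cost(x: int) -> float:
    """Cost of inserting or deleting the item x."""
    return ITEM_COSTS.get(x, DEFAULT_COST)


def substitution_cost(x: int, y: int) -> float:
    """Cost of replacing x with y.

    Assumption: a substitution costs as much as the more expensive
    of the two items involved; equal items cost nothing.
    """
    if x == y:
        return 0.0
    return max(item_cost(x), item_cost(y))


def weighted_levenshtein(a: Sequence[int], b: Sequence[int]) -> float:
    """Weighted edit distance between two sequences, compared item by item."""
    # prev[j] holds the cost of turning the processed prefix of a into b[:j].
    # Initially the prefix of a is empty, so b[:j] is built by insertions only.
    prev = [0.0]
    for y in b:
        prev.append(prev[-1] + item_cost(y))

    for x in a:
        # Turning a prefix of a into the empty list means deleting every item.
        curr = [prev[0] + item_cost(x)]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + item_cost(x),                  # delete x
                curr[j - 1] + item_cost(y),              # insert y
                prev[j - 1] + substitution_cost(x, y),   # replace x with y
            ))
        prev = curr
    return prev[-1]


S1 = [1, 2, 3, 7, 9, 15, 19, 20]
S2 = [1, 2, 3, 7, 98, 99, 20]
print(weighted_levenshtein(S1, S2))  # 26.0
```

With these assumed costs, the cheapest alignment replaces 9 with 98 (10), replaces 15 with 99 (15) and deletes 19 (1), giving 26. Because the items are compared as whole values rather than characters, 9 and 98 never get confused, and there is no 128-value limit. A different pricing rule for replacements only requires changing `substitution_cost`.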

0 Answers