
I need to transform a source string into a target string and express the change as an operation (D, A, TYPE), i.e. (deletion, addition, PREFIX/SUFFIX). Applying that operation to the suffix or to the prefix of the source string should produce the target string.

eg:

activities->activity
(ies,y,suffix)

center->centre
(er,re,suffix)

solutions->solution
(s,None,suffix)

solution->solutions
(None,s,suffix)

could->would
(c,w,prefix)

The following code does find the suffixes, but it also reports all the other matches, and I need this only for suffixes. Besides, it does not produce output in the format I require.

from difflib import SequenceMatcher
a = "ACTIVITY"
b = "ACTIVITIES"


s = SequenceMatcher(None, a, b)
for tag, i1, i2, j1, j2 in s.get_opcodes():
   print ("%7s a[%d:%d] (%s) b[%d:%d] (%s)" %
          (tag, i1, i2, a[i1:i2], j1, j2, b[j1:j2]))

Also, I don't need any correction to be output when there is no sufficient suffix/prefix match; e.g. a pair of strings like enough, beyond should yield no match. That could possibly be gleaned from:

difflib.SequenceMatcher(None,'especially','particularly').ratio()
Martijn Pieters
stackit

2 Answers


This doesn't need difflib as it can be done by finding the common prefix or suffix and removing that from the strings. Consider:

>>> from os.path import commonprefix
>>> c = commonprefix(['activities', 'activity'])
>>> c
'activit'
>>> [v[len(c):] for v in ['activities', 'activity']]
['ies', 'y']

As I don't have a commonsuffix function, I simply reverse the strings and then apply the commonprefix function:

>>> c = commonprefix(['could'[::-1], 'would'[::-1]])
>>> c
'dluo'
>>> [v[len(c):][::-1] for v in ['could'[::-1], 'would'[::-1]]]
['c', 'w']
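
The reversal trick above can be wrapped in a small helper. This is only a sketch: `commonsuffix` is a hypothetical name, not something `os.path` provides.

```python
from os.path import commonprefix

def commonsuffix(strings):
    """Return the longest suffix shared by every string in `strings`.

    Hypothetical helper: os.path has commonprefix but no commonsuffix.
    """
    # commonprefix compares character by character, so reversing each
    # string turns the shared suffix into a shared prefix.
    reversed_prefix = commonprefix([s[::-1] for s in strings])
    # Reverse the result again to get the suffix in reading order.
    return reversed_prefix[::-1]

print(commonsuffix(['could', 'would']))  # ould
```

Whatever is left in front of that shared suffix in each string is the prefix that differs.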

To combine that into a function:

from os.path import commonprefix

def suffix_prefix(a, b):

   pre = commonprefix([a, b])
   suf = commonprefix([a[::-1], b[::-1]])
   if len(pre) > len(suf):
       return tuple(v[len(pre):] for v in (a, b)) + ('suffix',)
   else:
       return tuple(v[len(suf):][::-1] for v in (a[::-1], b[::-1])) + ('prefix',)

print(suffix_prefix('activities', 'activity'))
print(suffix_prefix('could', 'would'))
print(suffix_prefix('solutions', 'solution'))

Prints:

('ies', 'y', 'suffix')
('c', 'w', 'prefix')
('s', '', 'suffix')

I'll leave the string formatting to you.
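
The question also asks for `None` in place of an empty part and for no result when the two words share too little. One possible way to cover both is sketched below. The name `edit_operation` and the `min_shared` cutoff of 3 characters are assumptions made for illustration, not something the question specifies; a `SequenceMatcher(...).ratio()` check could be used as the filter instead.

```python
from os.path import commonprefix

def edit_operation(source, target, min_shared=3):
    """Return (deleted, added, kind), or None when too little is shared.

    `min_shared` is an assumed threshold: the minimum length of the
    shared prefix or suffix for the pair to count as a match.
    """
    pre = commonprefix([source, target])
    suf = commonprefix([source[::-1], target[::-1]])[::-1]

    # Reject pairs such as enough/beyond that share almost nothing.
    if max(len(pre), len(suf)) < min_shared:
        return None

    if len(pre) >= len(suf):
        # The words agree at the start, so the change is in the suffix.
        deleted, added = source[len(pre):], target[len(pre):]
        kind = 'suffix'
    else:
        # The words agree at the end, so the change is in the prefix.
        deleted = source[:len(source) - len(suf)]
        added = target[:len(target) - len(suf)]
        kind = 'prefix'

    # Empty parts are reported as None, as in the question's examples.
    return (deleted or None, added or None, kind)

print(edit_operation('solutions', 'solution'))  # ('s', None, 'suffix')
print(edit_operation('enough', 'beyond'))       # None
```

The cutoff needs tuning for real data: with the assumed value of 3, especially/particularly is rejected because the pair shares only the suffix "ly".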

Dan D.

I think you might be overcomplicating this a little. It's reasonably easy to find the shared prefix of two strings in Python without using difflib. For examples see this stackexchange question. In particular, this is implemented in the os module as os.path.commonprefix. Using this we can find the difference in suffix for two strings that share a prefix, or the difference in prefix for two strings that share a suffix (the second case just by reversing the strings and applying the first).

Here is an example (which could certainly be neatened up):

import os

def find_change(string_1, string_2):
    shared_prefix = os.path.commonprefix([string_1, string_2])

    if len(shared_prefix) > 0:
        return "{}->{}\n({},{},{})".format(string_1, string_2, string_1[len(shared_prefix):],string_2[len(shared_prefix):], "SUFFIX")

    string_1_reversed = string_1[::-1]
    string_2_reversed = string_2[::-1]

    # commonprefix of the reversed strings is the shared suffix, reversed
    shared_suffix_reversed = os.path.commonprefix([string_1_reversed, string_2_reversed])

    if len(shared_suffix_reversed) > 0:
        shared_suffix = shared_suffix_reversed[::-1]
        return "{}->{}\n({},{},{})".format(string_1, string_2, string_1[:-len(shared_suffix)], string_2[:-len(shared_suffix)], "PREFIX")

    return None 

print(find_change("could", "would"))
print(find_change("hello", "helmet"))
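
As a sanity check, an operation can be applied back to the source to confirm that it yields the target. The sketch below assumes the operation is available as a (deleted, added, kind) tuple as described in the question, with `None` or an empty string for an empty part; `apply_operation` is a hypothetical name.

```python
def apply_operation(source, operation):
    """Apply a (deleted, added, kind) operation to `source`.

    Hypothetical helper for checking results; `kind` is 'suffix' or
    'prefix' in any letter case, and None stands for an empty part.
    """
    deleted, added, kind = operation
    deleted = deleted or ''
    added = added or ''
    kind = kind.lower()

    if kind == 'suffix':
        if not source.endswith(deleted):
            raise ValueError('source does not end with %r' % deleted)
        # Cut the old suffix off the end and append the new one.
        return source[:len(source) - len(deleted)] + added
    if kind == 'prefix':
        if not source.startswith(deleted):
            raise ValueError('source does not start with %r' % deleted)
        # Cut the old prefix off the front and prepend the new one.
        return added + source[len(deleted):]
    raise ValueError('unknown operation type %r' % kind)

print(apply_operation('could', ('c', 'w', 'prefix')))       # would
print(apply_operation('solution', (None, 's', 'suffix')))   # solutions
```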
or1426
  • I actually wanted ("could","c","w") as output given could, would – stackit Sep 06 '15 at 15:13
  • Thanks, the question asks to transform the two input strings into the correction output (d,e,type), which will transform the source into the target – stackit Sep 06 '15 at 15:15