I am using the Java version of Google's diff-match-patch library to create a patch between two JSON strings and store the patch in a database.

diff_match_patch dmp = new diff_match_patch();    
LinkedList<Patch> diffs = dmp.patch_make(latestString, originalString);
String patch = dmp.patch_toText(diffs); // Store patch to DB

Is there any way to use this patch to re-create originalString from latestString?

I googled this and found a very old comment on the Google diff-match-patch wiki saying:

Unpatching can be done by just looping through the diff, swapping DIFF_INSERT with DIFF_DELETE, then applying the patch.

But I did not find any useful code that demonstrates this. How could I achieve this with my existing code? Any pointers or code references would be appreciated.

Edit:

The problem I am facing is this: in the front end I show a revisions module that lists all the transactions for a particular fragment (for example, an employee's details), such as which user updated which details. I re-create the fragment JSON by applying each patch in reverse to get the data for the selected transaction, and show it as a table (using http://marianoguerra.github.io/json.human.js/). But some of the re-created strings are not valid JSON, and I get a JSON.parse error.

VishwaKumar
  • Ideally it's possible to re-create the old string if you have the next iteration and the changes, but I wouldn't recommend this approach, as diff-match-patch is not 100% accurate; there are some HTML and spacing errors. You could instead save both the old string and the new string. It's just a string anyway, so what difference does it make? Can you explain the use case? – We are Borg Jun 29 '16 at 12:42
  • Thanks for replying. I have updated the answer. Hope that makes sense. – VishwaKumar Jun 29 '16 at 13:10
  • Exactly, for this I wouldn't recommend saving the diff-match-patch object. Why add a layer of complexity? Just save the text that was changed. Or you can save the delta, which contains the information about what was changed. If you do that, you need to identify where in the text the change was made. Take the line "My name is borg", edited to "My name is stackoverflow". In this situation you would save a delta for the name change, but when re-applying it to get the original text, you would need to replace "stackoverflow" exactly with "borg", so you have to search for it. (Contd.) – We are Borg Jun 30 '16 at 10:38
  • This adds two layers of complexity to your core logic. The other problem is that diff-match-patch makes mistakes if the text is HTML, or contains certain spaces or line breaks, etc. Under these circumstances, your entire logic will go kaput. This is why I am not recommending saving deltas in the DB. – We are Borg Jun 30 '16 at 10:39

1 Answer

I was looking to do something similar (in C#), and what is working for me with a relatively simple object is the patch_apply method. This use case seems to be missing from the documentation, so I'm answering here. The code is C#, but the API is the same across languages:

static void Main(string[] args)
{
    var dmp = new diff_match_patch();

    string v1 = "My Json Object";
    string v2 = "My Mutated Json Object";

    var v2ToV1Patch = dmp.patch_make(v2, v1);
    var v2ToV1PatchText = dmp.patch_toText(v2ToV1Patch);  // Persist text to db

    string v3 = "Latest version of JSON object";

    var v3ToV2Patch = dmp.patch_make(v3, v2);
    var v3ToV2PatchTxt = dmp.patch_toText(v3ToV2Patch);  // Persist text to db

    // Time to re-hydrate the objects 

    var altV3ToV2Patch = dmp.patch_fromText(v3ToV2PatchTxt);
    var altV2 = dmp.patch_apply(altV3ToV2Patch, v3)[0].ToString(); // in Java, patch_apply also returns Object[], so cast: (String) result[0]

    var altV2ToV1Patch = dmp.patch_fromText(v2ToV1PatchText);
    var altV1 = dmp.patch_apply(altV2ToV1Patch, altV2)[0].ToString(); 

}
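The wiki comment quoted in the question describes a different route: instead of storing a reverse patch, invert a forward diff by swapping inserts and deletes. Below is a minimal Java sketch of that idea only. `Op`, `Edit`, and every method name here are hypothetical stand-ins written for illustration, not the library's own classes, and the sample diff reuses the "My name is borg" example from the comments. To do this with the real library you would presumably have to work on its own diff objects, which this sketch does not attempt.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in types for illustration; NOT the diff_match_patch classes.
final class DiffInversionSketch {
    enum Op { DELETE, INSERT, EQUAL }

    static final class Edit {
        final Op op;
        final String text;
        Edit(Op op, String text) { this.op = op; this.text = text; }
    }

    // Swap INSERT and DELETE; EQUAL runs are left unchanged.
    static List<Edit> invert(List<Edit> diff) {
        List<Edit> inverted = new ArrayList<>();
        for (Edit e : diff) {
            Op swapped = e.op == Op.INSERT ? Op.DELETE
                       : e.op == Op.DELETE ? Op.INSERT
                       : Op.EQUAL;
            inverted.add(new Edit(swapped, e.text));
        }
        return inverted;
    }

    // The text a diff starts from: EQUAL and DELETE runs, in order.
    static String sourceText(List<Edit> diff) {
        StringBuilder sb = new StringBuilder();
        for (Edit e : diff) {
            if (e.op != Op.INSERT) sb.append(e.text);
        }
        return sb.toString();
    }

    // The text a diff produces: EQUAL and INSERT runs, in order.
    static String resultText(List<Edit> diff) {
        StringBuilder sb = new StringBuilder();
        for (Edit e : diff) {
            if (e.op != Op.DELETE) sb.append(e.text);
        }
        return sb.toString();
    }

    // Hand-written diff for "My name is borg" -> "My name is stackoverflow".
    static List<Edit> sampleDiff() {
        List<Edit> diff = new ArrayList<>();
        diff.add(new Edit(Op.EQUAL, "My name is "));
        diff.add(new Edit(Op.DELETE, "borg"));
        diff.add(new Edit(Op.INSERT, "stackoverflow"));
        return diff;
    }

    public static void main(String[] args) {
        List<Edit> forward = sampleDiff();
        List<Edit> backward = invert(forward);
        System.out.println(resultText(forward));
        System.out.println(resultText(backward));
    }
}
```

Storing the reverse patch directly, as in the code above, avoids this extra step, which is probably why it is the simpler choice when only the older versions need to be rebuilt.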

I am attempting to retrofit this as an audit log, where previously the entire JSON object was saved on every change. As the audited objects have become more complex, the storage requirements have increased dramatically. I haven't yet applied this to the large, complex objects, but you can tell whether a patch was applied successfully by checking the second element of the array returned by patch_apply. It is an array of boolean values, one per patch in the list, and all of them should be true if the patch worked correctly. Checking it lets you detect a failed re-hydration directly, rather than only finding out from a JSON parsing error. My prototype C# method looks like this:

private static bool ValidatePatch(object[] patchResult, out string patchedString)
{
    patchedString = patchResult[0] as string;

    var successArray = patchResult[1] as bool[];

    foreach (var b in successArray)
    {
        if (!b)
            return false;
    }

    return true;
}
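Since the question uses the Java version, here is a rough Java counterpart of the same check. It is a sketch that assumes patch_apply hands back a two-element Object[] holding the patched String and a boolean[] of per-patch results. The class and method names (`PatchResultCheck`, `allPatchesApplied`, `patchedText`) are made up for this example, and `main` feeds it hand-built arrays rather than real library output.

```java
// Hypothetical helper; class and method names are illustrative only.
final class PatchResultCheck {

    // Assumes the shape [0] = patched text (String), [1] = boolean[] with one flag per patch.
    // Returns false if the shape is not as expected or any flag is false.
    static boolean allPatchesApplied(Object[] patchResult) {
        if (patchResult == null || patchResult.length < 2
                || !(patchResult[1] instanceof boolean[])) {
            return false;
        }
        for (boolean applied : (boolean[]) patchResult[1]) {
            if (!applied) {
                return false;
            }
        }
        return true;
    }

    // Convenience accessor for the patched text in slot 0.
    static String patchedText(Object[] patchResult) {
        return (String) patchResult[0];
    }

    public static void main(String[] args) {
        // Hand-built stand-ins for what patch_apply is assumed to return.
        Object[] clean = { "{\"name\":\"borg\"}", new boolean[] { true, true } };
        Object[] partial = { "{\"name\":\"bo", new boolean[] { true, false } };

        System.out.println(allPatchesApplied(clean));
        System.out.println(allPatchesApplied(partial));
    }
}
```

Running a check like this before JSON.parse (or its server-side equivalent) would separate "a patch failed to apply" from "the stored JSON was already invalid", which are otherwise hard to tell apart from a parse error alone.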
ste-fu