
I have a database that employees use to add comments, along with other information. The comments can get rather long and I'd like to know if there's a way to get only the text that changed.

Example:

$before_text = "This is a long piece of text where the employee has made a comment about the STARTER of their project. Notes and information go here, blah, blah, blah...";

$after_text = "This is a long piece of text where the employee has made a comment about the STATUS of their project. Notes and information go here, blah, blah, blah...";

When I compare the two, I get the fact that the text has changed from $before_text to $after_text, but I'd like to end up with a variable like this:

$change = "'STARTER' changed to 'STATUS'"; 

... so that I can put that into a log. Some of these comments are really long and I'd hate to end up with a log that has two large entries to describe what changed.

Is there a way to extract only the text that has changed between two text/string variables?

Mr_Thomas
  • What if the first and last words are changed? – Patrick Q Sep 18 '17 at 16:14
  • It doesn't matter to me where the strings are different. I just want to show the difference, not the entire changed variable. – Mr_Thomas Sep 18 '17 at 16:15
  • Right, but what would you want/expect the `$change` variable to contain in that case? – Patrick Q Sep 18 '17 at 16:16
  • *Hm...*, maybe a ternary operator with a regex (with a CASE) and another function to get the string lengths maybe, I'm thinking out loud here of course. – Funk Forty Niner Sep 18 '17 at 16:17
  • If the `$before_text` = "This is a long piece..." and the `$after_text` = "That is a long piece..." I would expect the `$change` to be "'This' changed to 'That'" – Mr_Thomas Sep 18 '17 at 16:18
  • You're still missing my point. What if _both_ the first _and_ last words are changed? – Patrick Q Sep 18 '17 at 16:19
  • Make an array of each and diff the array? This has a potential for highlighting *all* of the differences. – Jay Blanchard Sep 18 '17 at 16:21
  • https://stackoverflow.com/questions/40280782/how-to-find-diff-between-two-string-in-sql check this thread – Jacob H Sep 18 '17 at 16:21
  • @PatrickQ, you're right, I didn't understand what you were saying. Basically, what do you do if there are multiple edits. In answer to your question: I don't know. I suppose my question would change to include multiple changes between two strings. – Mr_Thomas Sep 18 '17 at 16:21
  • If it's just the words "STARTER" and "STATUS" that are in question, then that should be simple enough: compare them from an exploded string (possibly with a ternary). Thing is, are those the only words that differ between those 2 strings? – Funk Forty Niner Sep 18 '17 at 16:26
  • Another thing to consider is that they may change one occurrence of a word that appears multiple times in the comment. Without providing context of _where_ in the comment that word was changed, the log isn't particularly helpful. You might want to reconsider what you're trying to accomplish here. – Patrick Q Sep 18 '17 at 16:26
  • @PatrickQ, I'm trying to avoid storing two very long strings in a log file when only one or two words may have changed between those two strings. The people I'm doing this for want the detail: WHEN something changes they want to see WHAT changed. It's cumbersome to look through two long strings to find what changed between them. – Mr_Thomas Sep 18 '17 at 16:30
  • What if I parse every word? For instance "`This` at position 1 changed to `That`"; `STARTER` at position 16 changed to `STATUS`". I honestly don't know how easy or difficult this is. – Mr_Thomas Sep 18 '17 at 16:35
  • Personally, my suggestion would be to store both strings, and then use something like https://github.com/adaptivemedia/php-text-difference (I've never used it, but a quick read makes me think it might be suitable) to handle the differences on the _display_ side of the equation, not the storage side. – Patrick Q Sep 18 '17 at 16:37
  • Are you trying to highlight the difference between `$before_text` and `$after_text`, like how Stack Overflow shows the changes in an answer after it has been edited? – Akshay N Shaju Sep 18 '17 at 16:38
  • Maybe this will help you: http://pear.php.net/package/Text_Diff/docs/latest/Text_Diff/Text_Diff.html – siddhesh Sep 18 '17 at 16:39

2 Answers


I assume you are trying to show the difference between $before_text and $after_text:

<?php
$string_old = "hello this is a demo page";
$string_new = "hello this is a beta page";
$diff = get_decorated_diff($string_old, $string_new);
echo "<table>
<tr>
    <td>".$diff['old']."</td>
</tr>
<tr>
    <td>".$diff['new']."</td>
</tr>
</table>";

And here is the function 'get_decorated_diff':

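function get_decorated_diff($old, $new){
    // XOR-ing two strings yields "\0" wherever the bytes match, so
    // strspn() counts the length of the common prefix.
    $from_start = strspn($old ^ $new, "\0");
    // Same trick on the reversed strings gives the common suffix length.
    $from_end = strspn(strrev($old) ^ strrev($new), "\0");
    // Cap the suffix so it cannot overlap the prefix (e.g. "aaa" vs "aa").
    $from_end = min($from_end, min(strlen($old), strlen($new)) - $from_start);

    $old_end = strlen($old) - $from_end;
    $new_end = strlen($new) - $from_end;

    // Shared text before and after the changed span.
    $start = substr($new, 0, $from_start);
    $end = substr($new, $new_end);
    // The differing middle part of each string.
    $new_diff = substr($new, $from_start, $new_end - $from_start);
    $old_diff = substr($old, $from_start, $old_end - $from_start);

    $new = "$start<ins style='background-color:#ccffcc'>$new_diff</ins>$end";
    $old = "$start<del style='background-color:#ffcccc'>$old_diff</del>$end";
    return array("old"=>$old, "new"=>$new);
}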

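Which renders both strings in a table, with "demo" marked as deleted (red background) in the old string and "beta" marked as inserted (green background) in the new one.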

But when there are multiple changes it gets more complex: everything between the first and the last change is treated as one changed span.
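
For the log entry asked about in the question, the same common-prefix/common-suffix idea can be applied to whole words instead of bytes. The following is only a minimal sketch: `describe_change` is a hypothetical helper name (not part of the answer above), it assumes words are separated by single spaces, and like the function above it reports one span covering everything between the first and last changed word.

```php
<?php
// Hypothetical helper (an assumption, not from the original answer):
// trims the words both strings share at the start and at the end,
// then reports whatever is left in between.
function describe_change(string $before, string $after): string
{
    $old = explode(' ', $before);
    $new = explode(' ', $after);
    $max = min(count($old), count($new));

    // Count matching words from the start.
    $start = 0;
    while ($start < $max && $old[$start] === $new[$start]) {
        $start++;
    }

    // Count matching words from the end, without overlapping the prefix.
    $end = 0;
    while ($end < $max - $start
        && $old[count($old) - 1 - $end] === $new[count($new) - 1 - $end]) {
        $end++;
    }

    $old_part = implode(' ', array_slice($old, $start, count($old) - $start - $end));
    $new_part = implode(' ', array_slice($new, $start, count($new) - $start - $end));

    if ($old_part === '' && $new_part === '') {
        return 'no change';
    }
    return "'$old_part' changed to '$new_part'";
}

echo describe_change(
    "a comment about the STARTER of their project.",
    "a comment about the STATUS of their project."
);
// prints: 'STARTER' changed to 'STATUS'
```

Working on words avoids the byte-level result for this example, where the shared letters "STA" would be trimmed and only "RTER" and "TUS" would be reported.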

Akshay N Shaju
  • Hmm, you've hit upon an interesting idea: Why not just add color to the words that have changed? I think that would be do-able. – Mr_Thomas Sep 18 '17 at 16:57
  • Yes -- that's correct. I only need to know WHAT changed. – Mr_Thomas Sep 18 '17 at 17:00
  • So here it is: `$start` is the text before the changed word and `$end` is the text after the changed word. For your requirement, just return only `$new_diff` and `$old_diff`: `$new = "$new_diff"; $old = "$old_diff"; return array("old"=>$old, "new"=>$new);` – Akshay N Shaju Sep 18 '17 at 17:04

Here is something quick & dirty to get you started. I created an array of words from each string, diffed the arrays to get the old value, then used the index of the old value to get the new value.

$before_text = "This is a long piece of text where the employee has made a comment about the STARTER of their project. Notes and information go here, blah, blah, blah...";

$after_text = "This is a long piece of text where the employee has made a comment about the STATUS of their project. Notes and information go here, blah, blah, blah...";

$arr1 = explode(' ', $before_text);
$arr2 = explode(' ', $after_text);

$diff = array_diff($arr1, $arr2);
print_r($diff);
$new = $arr2[key($diff)];
echo $new;

this returns:

Array
(
    [16] => STARTER
)
STATUS

But here is a cautionary tale: if the user changes multiple words or does some other odd things you're going to have to do some looping and sorting to get it close to correct. YMMV
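
The looping mentioned above could look roughly like the sketch below, which compares the two word arrays using a longest-common-subsequence table so that several separate edits, insertions and deletions are each reported. This is an illustration under assumptions rather than a tested production approach: `word_changes` is a hypothetical helper name, words are assumed to be separated by single spaces, positions are 0-based indexes into the old word array, and the table needs memory proportional to the product of the two word counts.

```php
<?php
// Hypothetical helper (an assumption for illustration): word-level diff
// based on the longest common subsequence (LCS) of the two word arrays.
function word_changes(string $before, string $after): array
{
    $a = explode(' ', $before);
    $b = explode(' ', $after);
    $n = count($a);
    $m = count($b);

    // $lcs[$i][$j] = LCS length of $a from index $i and $b from index $j.
    $lcs = array_fill(0, $n + 1, array_fill(0, $m + 1, 0));
    for ($i = $n - 1; $i >= 0; $i--) {
        for ($j = $m - 1; $j >= 0; $j--) {
            $lcs[$i][$j] = $a[$i] === $b[$j]
                ? $lcs[$i + 1][$j + 1] + 1
                : max($lcs[$i + 1][$j], $lcs[$i][$j + 1]);
        }
    }

    // Walk the table; consecutive removed/added words form one change.
    $changes = [];
    $removed = [];
    $added = [];
    $at = 0;
    $i = 0;
    $j = 0;
    while ($i < $n || $j < $m) {
        if ($i < $n && $j < $m && $a[$i] === $b[$j]) {
            // A matching word closes any change that is currently open.
            if ($removed || $added) {
                $changes[] = ['position' => $at, 'old' => implode(' ', $removed), 'new' => implode(' ', $added)];
                $removed = [];
                $added = [];
            }
            $i++;
            $j++;
        } else {
            if (!$removed && !$added) {
                $at = $i; // index in the old text where this change starts
            }
            if ($j >= $m || ($i < $n && $lcs[$i + 1][$j] >= $lcs[$i][$j + 1])) {
                $removed[] = $a[$i++];
            } else {
                $added[] = $b[$j++];
            }
        }
    }
    if ($removed || $added) {
        $changes[] = ['position' => $at, 'old' => implode(' ', $removed), 'new' => implode(' ', $added)];
    }
    return $changes;
}

foreach (word_changes("This is a STARTER of project", "That is a STATUS of project") as $c) {
    echo "'{$c['old']}' at position {$c['position']} changed to '{$c['new']}'\n";
}
// prints:
// 'This' at position 0 changed to 'That'
// 'STARTER' at position 3 changed to 'STATUS'
```

An empty `old` means words were inserted and an empty `new` means words were deleted. For long comments, one of the existing diff libraries mentioned in the comments is likely the safer route.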

Jay Blanchard
  • My mileage ALWAYS varies... I'll give this a whirl and see what happens. I fully expect there **will** be multiple edits between the `before` and `after`. (I knew this wasn't going to be easy) – Mr_Thomas Sep 18 '17 at 16:50
  • I'm not sure I would encourage getting started down this path at all. Especially when there are already tools out there that do the heavy lifting for you. "other odd things" includes basic changes like adding or removing any text. Maybe as a "can I do this" project, but I wouldn't fight this fight in a real-world, work project. – Patrick Q Sep 18 '17 at 16:51
  • The `horde_text_diff` project (not with Pear anymore) suggested by @siddhesh seemed to be the closest to what I was looking for. I agree with your assessment, though -- not to be used in production. – Mr_Thomas Sep 18 '17 at 16:54