17

Currently I'm manually creating a string where I concatenate all the values in each row in my table. I'm hashing this string for each row to get a hash value for the current values (/status) of the row, which I'm later is using to determine if the row has changed.

Instead of doing this manually, is there an build-in way i mySQL to get a unique hash value for each row?

dhrm
  • 14,335
  • 34
  • 117
  • 183

4 Answers4

26

you could do something like

SELECT MD5(concat(field1, field2, field3, ...)) AS rowhash

but you can't get away from listing which fields you want, as concat(*) is not an option (syntax error).

Marc B
  • 356,200
  • 43
  • 426
  • 500
  • 3
    This won't work if one of the fields is NULL. You should use COALESCE(fieldX, 'some_NULL_placeholder') for NULLable fields. – Mrskman Jul 29 '20 at 09:10
  • @Mrskman The doc says it does skip any NULL values after the separator argument(at least it's so since MySQL 8.0). (https://dev.mysql.com/doc/refman/8.0/en/string-functions.html#function_concat-ws) – aderchox Aug 11 '20 at 03:46
  • 3
    @aderchox CONCAT_WS() is not same as CONCAT() ;-) CONCAT() returns NULL if any argument is NULL. https://dev.mysql.com/doc/refman/8.0/en/string-functions.html#function_concat – Mrskman Aug 13 '20 at 09:04
  • Depending on the content of your columns, this could be very unsave as explained here: https://stackoverflow.com/a/63407323/3735561 – LukasKroess May 11 '23 at 09:44
6

It's better to use concat_ws(). e.g. two adjacent column: 12,3 => 1,23 .

Sorry, this still has some problems. Think about the null value, empty string, string can contain ',', etc...

A program is required to generate the hash statement, which should replace null to specific value (for null-able columns), and also use the seldom used char/byte as separator.

Frank
  • 61
  • 1
  • 2
4

There are problems with CONCAT, e.g. CONCAT('ab', 'c') vs CONCAT('a', 'bc'). Two different rows, but result is the same. You could use CONCAT_WS(';', 'ab', 'c') to get ab;c but in case of CONCAT_WS(';', ';', '') vs CONCAT_WS(';', '', ';') you still get the same result.

Also CONCAT(NULL, 'c') returns NULL.

I think the best way is to use QUOTE:

SELECT MD5(CONCAT(QUOTE(c1), QUOTE(c2), QUOTE(c3))) AS row_hash FROM t1;

Result of: select (concat(quote('a'), quote('bc'), quote('NULL'), quote(NULL), quote('\''), quote('')));

is: 'a''bc''NULL'NULL'\''''

Also, don't use GROUP_CONCAT() to get hash of table, it has limit: https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html#sysvar_group_concat_max_len

Instead, CHECKSUM TABLE might be better, but you can't skip columns with CHECKSUM TABLE https://dev.mysql.com/doc/refman/5.7/en/checksum-table.html

RaDim
  • 630
  • 6
  • 6
2

Well I made a little script that could do excactly what you want, and maybe what others want... so here it goes...for PHP that is... first you have to make a list of columns of the table, then you make a "case when" statement for each column based on their type and put that in the concat_ws statement and then you hash it with sha1...i've used this method on very large tables (600000+ records) and the speed is quite good when selecting all records. also I think that it is faster to concat the required data in a concat_ws and explode it in php or whatever you are using, but that is just a hunch...

<?
$query= mysql_query("SHOW COLUMNS FROM $table", $linklive);
        while ($col = mysql_fetch_assoc($query)) {
            $columns[] = mysql_real_escape_string($col['Field']);
            if ($col['Key'] == 'PRI') {
                $key = mysql_real_escape_string($col['Field']);
            }
            $columnsinfo[$col['Field']] = $col;
        }
        $dates = array("date","datetime","time");
                    $int = array("int","decimal");
                    $implcols = array();
                    foreach($columns as $col){
                        if(in_array($columnsinfo[$col]['Type'], $dates)){
                            $implcols[] = "(CASE WHEN (UNIX_TIMESTAMP(`$col`)=0 || `$col` IS NULL) THEN '[$col EMPTY]' ELSE `$col` END)";
                        }else{
                            list($type, $rest) = explode("(",$columnsinfo[$col]['Type']);
                            if(in_array($columnsinfo[$col]['Type'], $dates)){
                                $implcols[] = "(CASE WHEN ( `$col`=0 || `$col` IS NULL ) THEN '[$col EMPTY]' ELSE `$col` END)";
                            }else{
                                $implcols[] = "(CASE WHEN ( `$col`='' || `$col` IS NULL ) THEN '[$col EMPTY]' ELSE `$col` END)";
                            }
                        }
                    }
                    $keyslive = array();
                    //echo "SELECT $key SHA1(CONCAT_WS('',".implode(",", $columns).")) as compare FROM $table"; exit;
                    $q = "SELECT $key as `key`, SHA1(CONCAT_WS('',".implode(", ",$implcols).")) as compare FROM $table";
    ?>
SomeOne_1
  • 808
  • 10
  • 10