Alternative for EXCEPT for MariaDB/MySQL comparing all columns

Question

I know MariaDB and MySQL don't support EXCEPT. I would like to find an alternative to something like this:

SELECT * FROM table
EXCEPT
SELECT * FROM backup_table

Where the table and backup_table have the same schema.

All the posts I've seen suggests that I compare a single column using "WHERE column IN (...)". The problem in my case is that I need to compare all the columns between the two tables for every table. I'm hoping to write this as procedure or function looping through all the tables, looking for any changes in the database. Basically, I want to find out all the records that have been updated or inserted in all my tables.

You can do `WHERE (column1, column2, ....) NOT IN (SELECT column1, column2, ....)` — Uueerdo, Feb 27 '17 at 23:30

spencer7593 · Accepted Answer · 2017-02-28T00:42:21.017

If I was faced with that task, I'd use an anti-join pattern. That's an outer join, to return all rows from the current table, along with "matching" rows from the backup table. Then in the WHERE clause, we exclude all rows that had an exact match. Returning rows that don't match.

  SELECT t.*
    FROM mytable t
    LEFT
    JOIN backup_mytable s
      ON s.id        <=> t.id
     AND s.col_two   <=> t.col_two
     AND s.col_three <=> t.col_three
     AND ... 
 WHERE s.id IS NULL

This assumes that the column id is guaranteed to be non-NULL. The PRIMARY KEY column (or any column that is part of the PRIMARY KEY of the table, or any column that has a NOT NULL constraint would serve.)

This query only returns the rows that don't match a row in backup table. It doesn't indicate whether its row that doesn't exist, or whether a value of a column was changed.

And to get rows in the original table that don't match rows in the backup table, just swap the table names.

For the special case of a table with all columns defined as NOT NULL, we could take a shortcut on the join predicates.

    FROM mytable t
 NATURAL
    LEFT
    JOIN backup_mytable s
   WHERE s.id IS NULL

That's equivalent to a LEFT JOIN with a USING clause of all columns that are named the same in both tables.

    FROM mytable t
    LEFT
    JOIN backup_mytable s
   USING (id, col_two, col_three, ...)
  WHERE s.id IS NULL

That's equivalent to specifying an equality comparison on every column (if both tables have the same columns)

    FROM mytable t
    LEFT
    JOIN backup_mytable s
      ON s.id        = t.id
     AND s.col_two   = t.col_two
     AND s.col_three = t.col_three

Any occurrences of NULL values in any of the columns are going to screw with the equality comparison, and return NULL.

And that's why the first query uses the null-safe comparison <=> (spaceship) operator. NULL <=> NULL will return TRUE, where NULL = NULL will return NULL.

For that first query pattern, rather than tediously typing out all of those comparisons of every column, I would use SQL to help me generate the SQL I need.

 SELECT CONCAT('   AND s.`',c.column_name,'` <=> t.`',c.column_name,'`') AS `-- stmt`
   FROM information_schema.columns c
  WHERE c.table_schema = 'mydatabase'
    AND c.table_name = 'mytable'
  ORDER BY c.ordinal_position

I'd take the rows returned by that query, and paste that in

SELECT t.*
  FROM ... t
  JOIN ... s
    ON 1=1
    -- paste here --
 WHERE s.id IS NULL
ORDER BY t.id

If I needed query that matched on just the id column, and needed to identify which columns had changed, I'd use expressions in the SELECT list. For example:

 SELECT s.`id`        <=> t.`id`         AS `match_id`
      , s.`col_one`   <=> t.`col_one`    AS `match_col_one`
      , s.`col_three` <=> t.`col_three`  AS `match_col_three`
  FROM mytable t
  JOIN backup_mytable s
    ON s.id = t.id
HAVING NOT match_col_one

Here referencing the column alias in the SELECT list in a HAVING clause, to exclude rows that have the same value of col_one; returning rows where col_one is different.

Again, I would use SQL against information_schema.columns to help speed up the query writing process.

score 1 · Answer 2 · edited May 23 '17 at 12:00

1

I don't have much experience with MySQL or MariaDB, but I found this as an answer to a different question that might be of use to you.

SELECT  *
FROM    match m
WHERE   NOT EXISTS
        (
        SELECT  1
        FROM    email e
        WHERE   e.id = m.id
        )

Credit goes to Quassnoi from the following post: MySQL SELECT x FROM a WHERE NOT IN ( SELECT x FROM b ) - Unexpected result

edited May 23 '17 at 12:00

Community

1
1

answered Feb 27 '17 at 23:28

Stewie

11
3

Your suggestion would work in case I'm looking at existence of an id in each of the two tables. Although that information is important, I am also looking to compare every column in the table for changes. But thanks for the suggestion. – Steven Mar 06 '17 at 14:32

Rainer Rillke · Answer 3 · 2018-11-10T11:22:14.990

Starting with version 10.3.0, MariaDB added support for missing set operations, including but not limited to EXCEPT.

CREATE TABLE `table` ( `item` VARCHAR(1) NOT NULL );
CREATE TABLE `backup_table` ( `item` VARCHAR(1) NOT NULL );
INSERT INTO `table` VALUES ( 'a' ), ( 'b' ), ( 'c' );
INSERT INTO `backup_table` VALUES ( 'a' ), ( 'b' ), ( 'd' );

SELECT * FROM `table`
EXCEPT
SELECT * FROM `backup_table`;

+------+
| item |
+------+
| c    |
+------+

Alternative for EXCEPT for MariaDB/MySQL comparing all columns

3 Answers3

Linked