1

I'm new to PHP as well as the field and have been tasked with finding inconsistencies amongst tables from different mySQL databases. Every table should be the same between databases (column names, column count), with the exception of the actual data held in it, which they are not. I will need to identify the offending columns, the table they are in and, the database they are in. Below is my code thus far:

<?php
chdir(<path>);
include(<file>);
include(<file>);

$db = new db($hostname, $user, $pass, $commondatabase);

// Get baseline DB columns
//$db->select_db('<baseDB>');
//$btq = "SHOW TABLES";
//$btr = $db->query($btq);
//while($trows = $db->fetch_assoc($btr) {
//      $bcq = "SHOW COLUMNS FROM ".$btr[<key>];
//      $bcr = "$db->query($bcq);
//      $basecolumns[] = $bcr;
//}

$dbq = "SELECT * FROM <commonDB>.<DBnames> ORDER BY <DBnamesID>";
$dbr = $db->query($dbq);

while($dbrow = $db->fetch_assoc($dbr)) {
echo "\n";
print_r($dbrow['dbname']);
echo "\n";
        $db->select_db($dbrow['dbname']);
        $tq = "SHOW TABLES";
        $tr = $db->query($tq);
        while($table = $db->fetch_assoc($tr)) {
/*print_r($dbtables);*/
                $cq = "SHOW COLUMNS FROM ".$table['Tables_in_'.$dbrow['dbname']];
                $cr = $db->query($cq);
                $dbcols = [];
                while($col = $db->fetch_assoc($cr)) {
/*print_r($col);*/      $dbcols = $col;
                        // Do check against baseline here
                        //if($dbcols != $basecolumns) {
                        //      $badcolumns[] = $dbcols;
                        //}
                }
        }
}


$db->close();
?>

Currently it will loop down to the columns, but the output is not visually pleasing nor very manageable. My first concern is to get this working to where I can loop down to the columns and actually get them in their own arrays to be checked against the baseline and I've hit a wall, so any direction would be much appreciated. Currently each column is being assigned to its own array versus an array of all the column names for the database and table the loop is on. TIA!

Coty
  • 11
  • 3
  • You can query `INFORMATION_SCHEMA.COLUMNS` rather than querying the tables. – Barmar Jun 25 '21 at 20:22
  • Thank you Barmar. I thought about going that route initially, but was unsure of how to extract the values since they're all crammed into one field with other information around them. – Coty Jun 25 '21 at 20:26
  • Just use subqueries to select the table name and column names, ignoring the other columns. Then do a full outer join between the tables in the base DB and the other DBs. See https://stackoverflow.com/questions/4796872/how-to-do-a-full-outer-join-in-mysql?noredirect=1&lq=1 for how to emulate that in MySQL. – Barmar Jun 25 '21 at 20:30
  • 1
    "the output is not visually pleasing" do understand that `print_r()` is a debugging tool, it is not meant for end user output. – miken32 Jun 25 '21 at 21:08
  • Miken, correct and that is what I'm using it for in this case. This script is to be ran on the backend though and the individual running it will want it to print to the terminal the array of the inconsistent columns, their table and, their DB to reference for their research into what's being held in the inconsistent columns and if that data is needed/wanted. What I mean specifically by "not visually pleasing", is that the results (columns), are being assigned to their own individual arrays versus being pushed to an encompassing array of all the columns for a table. – Coty Jun 28 '21 at 12:08

1 Answers1

0

Here is what I came up with. Not sure if it's the most DRY way to go about it, but it works and achieves the results I was looking for. Thank you to Barmar for sending me down the right path.

<?php

chdir(<path>);
include(<file>);
include(<file>);

$db = new db($hostname,$user,$pass,$commondatabase);

$db->select_db(<database>);

// Get tables from baseline database
$q="select distinct <column-name> from <table-name> where <where-param> ORDER BY <order-by-param>";
$r=$db->query($q);
while($tables=$db->fetch_assoc($r)) {
        $table = $tables[<key>];

        $x = array();
        $x["$table"]=array();

        // Get table columns from baseline database
        $q="select <column-name> from <table-name> where <where-param> and <where-param> ORDER BY <order-by-param>";
        $r2=$db->query($q);
        while ($columns=$db->fetch_assoc($r2)) {
                $x["$table"][]=$columns[<key>];
        }

        // Check other databases for this table
        $q="select * from $commondatabase.<table-with-database-names> ";
        $q.="where <where-param> order by <order-by-param>";
        $r2=$db->query($q);
        while ($databases=$db->fetch_assoc($r2)) {

                $y = array();
                $y["$table"]=array();

                // Get table columns in $databases[<key>]
                $q="select <column-name> from <table-name> where <where-param> and <where-param> ORDER BY <order-by-param>";
                $r3=$db->query($q);
                while ($columns=$db->fetch_assoc($r3)) {
                        $y["$table"][]=$columns[<key>];
                }

                // Check against baseline
                $z1 = array_diff($x["$table"],$y["$table"]);
                $z2 = array_diff($y["$table"],$x["$table"]);

                // Echo out comparison results
                if (empty($z1) && empty($z2)) {
                        //echo "all good.\n";
                } else {
                        echo "Difference found in {$databases[<key>]}.{$tables[<key>]}:";
                        if (!empty($z1)) print_r($z1);
                        if (!empty($z2)) print_r($z2);
                }
        }
}

$db->close();
?>
Coty
  • 11
  • 3