I'm writing a code to backup a MySQL database into a binary file. I'm aware of mysqldump
but for some reasons I can't use trivial methods. What I'm currently doing:
- read schema definition
- sort tables by foreign key dependencies
- select rows in all tables (100 rows each time) and write them in a binary file
Definition of Dependency: A table T1
depends on existence of table T2
if and only if at least there is one foreign key in T1
pointing to T2
's key.
A numeric value is assigned to each table. This value specifies order of tables. For tables with no dependency, this value is 0
for others it's maximum value of tables that current table depends on them; plus one. If there is a -1
in the set of values of depended tables, value of current table remains undefined (-1
). Initially value of all tables is -1
which means unspecified.
This is C++ code:
// tablesQueue: Queue of all tables
// orderedQueue: Resulting order
while(! tablesQueue.isEmpty())
{
bool satisfied = true;
foreach(TableNode* parent, tablesQueue.head()->referencedTables)
{
if(parent->degreeOfFreedom == -1)
{
satisfied = false;
break;
}
else
// handle error blah blah ...
}
if(satisfied)
{
int max =0;
foreach(TableNode* parent, tablesQueue.head()->referencedTables)
max = (max < parent->degreeOfFreedom+1) ?
parent->degreeOfFreedom+1 : max;
tablesQueue.head()->degreeOfFreedom = max;
orderedQueue.enqueue(tablesQueue.dequeue());
}
else
{
tablesQueue.enqueue(tablesQueue.dequeue());
}
}
If there is a cycle in dependency graph of tables, this algorithm does not terminate.
Normally is this OK to have such a design for tables? For example two tables having foreign key to each other. Surprisingly I found that example databases provided by Oracle for MySQL (sakila
) have a lot of such cycles. I'm supposing that it's possible to remove all that cycles by adding a third table [?]