Here three working sqlite flavored implementations (once sqlite is being used, column types not being enforced are acceptable, only integer primary keys were used in order to act as rowid):
In all cases, sqlite foreign key PRAGMA is set to true: PRAGMA foreign_keys = 1;
Simple implementation - one fixed table for each source/level (constrained by foreign keys)
The following design/implementation utilizes one table for each type of database and level. Tables references one each other with foreign keys to ensure correctness. For example, a mongo collection
can't be child of a mysql database
. Only in the connection
level all database types share the same table, but it could be different if it is expected different properties for each kind of connection
.
create table databasetype(name primary key) without rowid;
insert into databasetype values ('mysql'),('elasticsearch'),('mongo'),('sqlserver');
create table datatype(name primary key) without rowid;
insert into datatype values ('int'),('str'); -- you can differentiate varchar if you will
create table connection(id integer, hostname, databasetype, primary key(id), foreign key(databasetype) references databasetype(name));
create table mysqldatabase(id integer, connectionid, name, primary key(id), foreign key(connectionid) references connection(id));
create table mysqltable(id integer, databaseid, name, primary key(id), foreign key(databaseid) references mysqldatabase(id));
create table mysqlfield(id integer, tableid, name, datatype, datalength, primary key(id), foreign key(tableid) references mysqltable(id), foreign key(datatype) references datatype(name));
create table elasticsearchindex(id integer, connectionid, name, primary key(id), foreign key(connectionid) references connection(id));
create table elasticsearchfield(id integer, indexid, name, datatype, datalength, primary key(id), foreign key(indexid) references mysqltable(id), foreign key(datatype) references datatype(name));
create table mongodatabase(id integer, connectionid, name, primary key(id), foreign key(connectionid) references connection(id));
create table mongocollection(id integer, databaseid, name, primary key(id), foreign key(databaseid) references mongodatabase(id));
create table mongofield(id integer, collectionid, name, datatype, datalength, primary key(id), foreign key(collectionid) references mongocollection(id), foreign key(datatype) references datatype(name));
create table sqlserverdatabase(id integer, connectionid, name, primary key(id), foreign key(connectionid) references connection(id));
create table sqlserverschema(id integer, databaseid, name, primary key(id), foreign key(databaseid) references sqlserverdatabase(id));
create table sqlservertable(id integer, schemaid, name, primary key(id), foreign key(schemaid) references sqlserverschema(id));
create table sqlserverfield(id integer, tableid, name, datatype, datalength, primary key(id), foreign key(tableid) references sqlservertable(id), foreign key(datatype) references datatype(name));
Loading data representing the first table:
insert into connection(hostname, databasetype) values ('remote:1234', 'mysql');
insert into mysqldatabase(connectionid, name) select id, 'sales' from connection where hostname='remote:1234';
insert into mysqltable(databaseid, name) select id, 'user' from mysqltable where name='sales';
insert into mysqlfield(tableid, name, datatype, datalength) select id, 'name', 'str', 80 from mysqldatabase where name='product';
insert into mysqlfield(tableid, name, datatype) select id, 'age', 'i32' from mysqldatabase where name='product';
Trying invalid manipulations of data:
insert into mysqlfield(tableid, name, datatype) values (2, 'newfield', 'qubit');
-- Error: FOREIGN KEY constraint failed
In order to pretty-print the whole tree it is necessary to do a manual join of all tables involved.
Graph like implementation - one table representing the tree, other the hierarchy (constrained by triggers)
Here the element
table is used to represent each element/node in the tree. Its level
column explicitly classifies each element as an database
, table
, etc. Here sqlite's rowid is being used as the primary key, but it is easy to change it to a regular id.
In the previous implementation, foreign keys were used to ensure model correctness. Now triggers are used for this job. They decide which parent level accepts which child level, as it is allowed for the respective dbtype - those rules are specified on the element_type
table.
Lastly, an exra table element_properties
, is used to allow extra properties to be attached to any elements, such as field type.
create table db_type(name primary key) without rowid;
insert into db_type values ('mysql'),('elasticsearch'),('mongo'),('sqlserver');
create table element_type(parentlevel, childlevel, dbtype, primary key(parentlevel, childlevel, dbtype), foreign key(dbtype) references db_type(name)); --not using without rowid to be able to have null parent level
insert into element_type values
(null, 'connection', 'mysql'),
('connection', 'database', 'mysql'),
('database', 'table', 'mysql'),
('table', 'field', 'mysql'),
(null, 'connection', 'elasticsearch'),
('connection', 'index', 'elasticsearch'),
('index','field', 'elasticsearch'),
(null, 'connection', 'mongo'),
('connection', 'database', 'mongo'),
('database', 'collection', 'mongo'),
('collection', 'field', 'mongo'),
(null, 'connection', 'sqlserver'),
('connection', 'database', 'sqlserver'),
('database', 'schema', 'sqlserver'),
('schema', 'table', 'sqlserver'),
('table', 'field', 'sqlserver');
create table element(id integer, parentid, name, level, dbtype, primary key(id), foreign key(parentid) references element(id), foreign key(dbtype) references db_type(name));
create table element_property(parentid, name, value, primary key(parentid, name), foreign key(parentid) references element(id)) without rowid;
-- trigger to guarantee that new elements will conform hierarchy
create trigger element_insert before insert on element
begin
select iif(count(*)>0, 'ok', raise(abort,'invalid parent-child insertion')) from element_type etc join element_type etp on (etp.childlevel, etp.dbtype)=(etc.parentlevel, etc.dbtype) where (etc.dbtype, etc.parentlevel, etc.childlevel)=(new.dbtype, (select level from element ei where ei.rowid=new.parentid), new.level);
end;
-- trigger to guarantee that updated elements will conform hierarchy
create trigger element_update before update on element
begin
select iif(count(*)>0, 'ok', raise(abort,'invalid parent-child update')) from element_type etc join element_type etp on (etp.childlevel, etp.dbtype)=(etc.parentlevel, etc.dbtype) where (etc.dbtype, etc.parentlevel, etc.childlevel)=(new.dbtype, (select level from element ei where ei.rowid=new.parentid), new.level);
end;
-- trigger to guarantee that hierarchy removal must respect existing elements (no delete cascade used)
create trigger element_type_delete before delete on element_type
begin
select iif(count(*)>0, raise(abort,'can''t remove, entries found in the element table using this relationship'), 'ok') from element etc join element etp on etp.rowid=etc.parentid and etp.dbtype=etp.dbtype where etc.dbtype=old.dbtype and (etp.level,etc.level)=(old.parentlevel, old.childlevel);
end;
-- trigger to guarantee that hierarchy changes must respect existing elements
create trigger element_type_update before update on element_type
begin
select iif(count(*)>0, raise(abort,'can''t change, entries found in the element table using this relationship'), 'ok') from element etc join element etp on etp.rowid=etc.parentid and etp.dbtype=etp.dbtype where etc.dbtype=old.dbtype and (etp.level,etc.level)=(old.parentlevel, old.childlevel) and (etp.level,etc.level)!=(new.parentlevel, new.childlevel);
end;
Loading data representing the first table:
insert into element(name, level, dbtype) values ('remote:1234', 'connection', 'mysql');
insert into element(name, level, dbtype, parentid) values ('sales', 'database', 'mysql', (select id from element where (level, name, dbtype)=('connection', 'remote:1234', 'mysql')));
insert into element(name, level, dbtype, parentid) values ('user', 'table', 'mysql', (select id from element where (level, name, dbtype)=('database', 'sales', 'mysql')));
insert into element(name, level, dbtype, parentid) values ('name', 'field', 'mysql', (select id from element where (level, name, dbtype)=('table', 'user', 'mysql')));
insert into element(name, level, dbtype, parentid) values ('age', 'field', 'mysql', (select id from element where (level, name, dbtype)=('table', 'user', 'mysql')));
insert into element_property(name, value, parentid) values ('fieldtype', 'varchar', (select id from element where (level, name, dbtype)=('field', 'name', 'mysql')));
insert into element_property(name, value, parentid) values ('fieldlength', 80, (select id from element where (level, name, dbtype)=('field', 'name', 'mysql')));
insert into element_property(name, value, parentid) values ('fieldtype', 'integer', (select id from element where (level, name, dbtype)=('field', 'age', 'mysql')));
Trying invalid manipulations of data:
insert into element(name, level, dbtype, parentid) values ('documents', 'collection', 'mysql', (select id from element where (level, name, dbtype)=('database', 'sales', 'mysql')));
-- Error: invalid parent-child insertion
update element_type set childlevel='specialfield' where dbtype='mysql' and (parentlevel, childlevel)=('table','field');
-- Error: can't change, entries found in the element table using this relationship
Pretty-printing the tree:
create view elementree(path) as
with recursive cte(id, name, depth, dbtype, level) as (
select id, name, 0 as depth, dbtype, level from element where parentid is null
union all
select el.id, el.name, cte.depth+1 as depth, el.dbtype, el.level from element el join cte on el.parentid=cte.id
order by depth desc
)
select substring(' ',0,2*depth)||name||' ('||dbtype||'-'||level||')' from cte;
select * from elementree;
-- remote:1234 (mysql-connection)
-- sales (mysql-database)
-- user (mysql-table)
-- documents (mysql-table)
-- name (mysql-field)
-- age (mysql-field)
Minimalist DRY graph like implementation - one table with only names representing the tree and only one auxiliar table
Here again it is used an element
table to represent each element in the tree. Differently from the previous case, the table has less information and the type of each element - whether it is a database
or a table
is implicitly inferred instead of explicitly determined by a column. By simply adding an user
as a child of sales
, it is inferred that user
is a mysql table
, once it is child of a mysql database
- sales
, which is adatabase
because it is child of a mysql connection
, which is child of the mysql
root element. Dbtypes are root elements in this tree, all their children are inferred to be of this dbtype.
Here the hierarchypath
table was used to tell the hierarchy that has be followed in the element
tree. For the user confort, (s)he will only have to insert a (>
separated) string representing the hierarchy path, starting from dbtype. The hierarchy view will desconstruct this string to the hierachy structure. One example of a hirearcy path would be: mysql>connection>database>table>field
.
Note that again, sqlite's rowid is used as table id. Remember that it is not possible to see rowid by simply select * from table;
, it is hidden by default, it is needed to explicitly select it: select rowid,* from table;
.
create table element(name, parentrowid, foreign key(parentrowid) references element(rowid));
-- dbtypes are the root elements
insert into element(name) values ('mysql'),('elasticsearch'),('mongo'),('sqlserver');
create table hierarchypath(path);
insert into hierarchypath values
('mysql>connection>database>table>field'),
('elasticsearch>connection>index>field'),
('mongo>connection>database>collection>field'),
('sqlserver>connection>schema>database>table>field');
Loading data:
insert into element select 'remote:1234',rowid from element where (name,coalesce(parentrowid,-1))=('mysql',-1); --returning rowid; -- returning only works for sqlite 3.35+
insert into element select 'sales',rowid from element where rowid=5;
insert into element select 'user',rowid from element where rowid=6;
insert into element select 'name',rowid from element where rowid=7;
insert into element select 'age',rowid from element where rowid=7;
Pretty-printing:
create view hierarchy(root, depth, name) as
with recursive hierarchycte(root, depth, name, remaining) as (
select substr(path, 0, instr(path, '>')) as root, 0 as depth, substr(path, 0, instr(path, '>')) as name, substr(path, instr(path, '>')+1)||'>' as remaining from hierarchypath
union all
select root, depth+1 as depth, substr(remaining, 0, instr(remaining, '>')) as name, substr(remaining, instr(remaining, '>')+1) as remaining from hierarchycte where instr(remaining, '>') > 0
)
select root, depth, name from hierarchycte where depth>=0;
create view elementhierarchy(root, depth, name) as
with recursive elementcte(root, depth, name, rowid, parentrowid) as (
select name as root, 0 as depth, name, rowid, parentrowid from element where parentrowid is null
union all
select elcte.root, elcte.depth+1, el.name, el.rowid, el.parentrowid from elementcte elcte join element el on el.parentrowid=elcte.rowid
order by depth desc
)
select root, depth, name from elementcte;
create view elementree as
with recursive elementcte(root, depth, name, rowid, parentrowid) as (
select name as root, 0 as depth, name, rowid, parentrowid from element where parentrowid is null
union all
select elcte.root, elcte.depth+1, el.name, el.rowid, el.parentrowid from elementcte elcte join element el on el.parentrowid=elcte.rowid
order by depth desc
)
select substring(' ',0,2*h.depth-2)||eh.name||' ('||h.root||'-'||h.name||')' from (select *,row_number() over () as originalorder from elementhierarchy) eh join hierarchy h on (eh.root,eh.depth)=(h.root,h.depth) where h.depth>0 order by originalorder;
select * from elementree;
-- remote:1234 (mysql-connection)
-- sales (mysql-database)
-- user (mysql-table)
-- age (mysql-field)
-- name (mysql-field)
Triggers were not implemented here, but it would be good to do so. One example would be to avoid inserting more levels than allowed.
It would be wiser to store the hierarchy in the desconstructed form seen on the view
hierarchy, by doing the desconstruction in insertion time instead of every select query to avoid cpu consumption. Here it was left this way to differentiate it more from other implementations.
Here the last level entity, the field
have no properties as shown on previous implementations. In this model it would be necessary to add one or two extra levels to the hierarchy: ...table>field>fieldpropertyandvalue
or ...table>field>fieldproperty>fieldpropertyvalue
, in the first case an example of fieldpropertyandvalue
would be datatype=integer
and an example of separated property and values would be respectively datatype
and integer
. This approach where any properties are new nodes in the graph is closer to the approach used by RDF stores.
To conclude it must be stated that it would be possible to use specialized graph databases, using their own query languages like cypher in neo4j and sparql in others or even other languages, but since the graph design overall is simple, a relational database suffice our needs.