Using a relational database is viable solution. For your needs - fast insert, update, delete - I'd use an Adjacency List with an additional customizations as such:
id
parent_id
cardinality -- sort order for all nodes with the same parent_id
depth -- distance from the root node
Calculating cardinality
and depth
is either done with code or - preferably - a database trigger for any insert, delete or update. In addition, for retrieving an entire hierarchy with one SELECT statement, a hierarchy bridge table is called for:
id
descendent_id
This table would also be populated via the same trigger mentioned above and serves as a means for retrieving all nodes above or beneath a given id
.
See this question for additional detail around Adjacency List, Hierarchy Bridge and other approaches for storing hierarchical data in a relational database.
Finally to provide some additional clarification on the options you listed:
- Flat Files: a combination of linked lists and memory mapped files would probably serve, but you're really just rolling your own at that point, where a SQL or NoSQL solution would probably do better.
- SQL: this would be my approach - tooling is the best here for data manipulation, backup and recovery.
- XML: this is also a possibility with a database, very vendor specific, you'll need to study the syntax for node insert, update and delete. Can be very fast if the database offers an XML data type.
- NoSQL: if you're talking key-value storage, the typical approach for hierarchical data appears to be materialized path, but this would require recalculating the entire path for all affected nodes on change, which is likely slow. Instead consider the Java Content Repository (JCR) - Apache Jackrabbit is an implementation - entire API centered around representing hierarchical structured data and persisting it - perhaps too heavyweight for the problem you're trying to solve.
- voodoo: um...
Update
If you implement all pieces from this answer, add is cheap, re-sort is small cost, move is expensive. Trade-off is fast hierarchy traversal reads - for instance find a node's complete ancestry in one operation. Specifically, adding a leaf is an O(1) operation. Re-sort means updating cardinality all peer nodes coming after the moved node. Move means update of (1) cardinality for source and destination peer nodes coming after, (2) moved - and descendant - node depth, and (3) removal and addition of ancestry to hierarchy bridge table.
However, go with an Adjancency List alone (i.e. id, parent_id
) and write becomes cheap, reads for one level are cheap, but reads that traverse the hierarchy are expensive. The latter would then require using recursive SQL such Oracle's CONNECT BY or Common Table Expressions as found in SQL Server and other RDBMSs.