How to handle data scheme change on non-SQL databases?

Question

As I experienced about non-SQL databases, one of biggest problem was schema change. Adding or removing columns on SQL database is easy operation and server guarantees data stability during scheme change. So it can handle data schema changes during service advancing. But how to non-SQL databases (especially objective style systems) handle those schema change? Is there reliable method?

It depends on the database in question, they're all different. — skaffman, Feb 13 '11 at 12:34

score 2 · Accepted Answer · answered Feb 14 '11 at 21:25

I agree with Skaffman, non-SQL databases covers a broad spectrum of products. Each one tends to provide different levels of schema management.

For example, key/value pair databases, like Oracle Berkeley DB are schema-free. What is placed in the key/value pair is an opaque structure, which is known to the applications that access it. In this case, I've often seen applications implement a field within the key/value pair data structure to indicate the schema version. The application, when reading or writing the record will take the appropriate action based on the schema version it finds. This can be advantageous for some applications, since schema changes can be applied as required on a given read/write operation rather than in bulk.

Another example, XML databases, like Oracle Berkeley DB XML store data in XML format which is self describing. Although it's common for most XML documents within a collection to have the same schema, it's certainly possible and even desirable for the schema to have additional or fewer attributes for a given document(s). These non-SQL databases employ query languages like XQuery, that allow you to query the structure (attributes) of the data as well as the content.

In yet another example, object-based data stores, like the Data Persistence Layer API provided with Berkeley DB can support application-directed schema evolution as part of the underlying API, as described here.

However, even with SQL databases it's only easy to change the schema on the surface. The application usually has to be made aware of any schema changes in order to operate properly. Adding a column in a SQL database can adversely affect applications that tend to do "SELECT *", while renaming or removing a column can adversely affect applications that assumed the existence of that column. SQL databases make schema change "easy" in the sense that there's a SQL command that allows you to add, drop and rename columns. The schema management requirements up the stack still need to be thought through and implemented properly.

Bottom line, typically schema evolution is either managed by the database engine, the application or the intervening API layer. As for how "easy" it is, depends a lot on the application layers above it and how they are affected by the schema change.

If you can be more specific about the problem that you're trying to solve we might be able to provide more specific suggestions. In particular, which database are you using and how do you see your schema evolving?

Regards,

Dave

Thanks. I cared about SQL database engine feature like constraint which helps data integrity. With the features, data itself can be integral even application try to wrong operation. This is important because many kind of client should operate simultaneously for single data storage. If data integrity is gained outside of database, the system very hard to be expanded, this will limits it's application. Constraint layer may be separated from data engine, but it requires making a kind of data gateway which can handle all type of client... — eonil, Feb 15 '11 at 05:07
It is hard to imagine a situation that database engine complains like "It's prohibited because the field is designed to be referenced from other entity." about removing a field from an entity. — eonil, Feb 15 '11 at 05:11
Again, it depends on the implementation of the database engine in question. For example, although Berkeley DB is a key/value pair database at it's core, there are at least two ways in which to handle constraints: a) use the SQL API which allows you to implement SQL contraints and schemas, or b) use the Foreign Key Reference function directly in the key/value API documented here: http://bit.ly/ebGUU0. In either case, the Berkeley DB engine will enforce the constraints that you define. — dsegleau, Feb 15 '11 at 06:40

How to handle data scheme change on non-SQL databases?

1 Answers1

Linked