2

I am building a system that allows front-end users to define their own business objects. Defining a business object involves creating data fields for that business object and then relating it to other business objects in the system - fairly straight forward stuff. My question is, what is the most efficient storage strategy?

The requirements are:

  • Must support business objects with potentially 100+ fields (of all common data types)
  • The system will eventually support hundreds of thousands of business object instances
  • Business objects sometimes display data and aggregates from their relationships with other business objects
  • Users must be able to search for business objects by their data fields (and fields from related business objects)

The two possible solutions I can envisage are:

  • Have a dynamic schema such that when a new business object type is created a new table is created for storing instances of that object. The object's fields become columns in the storage table.
  • Have a fixed schema where instance data fields are stored as rows in basically a big long table.

I can see pros and cons to both approaches:

  • the dynamic schema allows me to index search columns
  • the dynamic tables are potentially limited in width by the max column size
  • dynamic schemas rule out / cause issues with replication
  • the static schema means less or even no dynamic sql generation
  • my guess is the static schema may perform like a dog when it comes to searching across 100,000+ objects

So what is the best soution? Is there another approach I haven't thought of?

Edit: The requirement I have been given is to build a generic system capable of supporting front-end user defined business objects. There will of course be restrictions on how these objects can be constructed and related, but the requirement itself is not up for negotiation.

My client is a service provider and requires a degree of flexibility in servicing their own clients, hence the need to create business objects.

Deduplicator
  • 44,692
  • 7
  • 66
  • 118
flesh
  • 23,725
  • 24
  • 80
  • 97
  • Is it a requirement that you must use a RDBMS? There are other kinds of DBs more geared towards this kind of flexibility. – nawroth Jul 03 '09 at 10:26

6 Answers6

5

I think your problem matches very well to a graph database like Neo4j, as it's built for the requested kind of flexibility from the beginning. It stores data as nodes and relationships/edges, and both nodes and relationships can hold arbitrary properties (in a key/value fashion). One important difference to a RDBMS is that a graph database won't need to lookup the relationships in a big long table (like in your fixed schema solution), so there should be a significant performance gain there. You can find out about language bindings for Neo4j in the wiki and read what others say about it in this stackoverflow thread. Disclaimer: I'm part of the Neo4j team.

Community
  • 1
  • 1
nawroth
  • 4,321
  • 1
  • 24
  • 21
4

Without much understanding of your situation...

Instead of writing a general purpose one-size-fits-all business objects system (which is the holy grail for Oracle, Microsoft, SAS, etc.), why not do it the typical way, where the requirements are gathered, and a developer designs and implements the users' business objects in an effective manner?

If your users are typical, they will create a monster, which will end up running slow, and they will hate it. Most users will view the data as an Excel sheet, and not understand relationships like: parent/child. As a result there will be some crazy objects built, and impossible-to-solve reports. You'll be forced to create scripts to manually convert many old objects to better and properly defined ones, etc...

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
KM.
  • 101,727
  • 34
  • 178
  • 212
2

Your requirements sound a little bit like an associative database with a front end to compose and edit entities.

I agree with KM above, unless you have a very compelling reason not to, you would be better off using a traditional approach. There are a lot of development tools and practices that allow you to build a robust and scalable system. Otherwise you will have to implement much of this yourself.

Alex Peck
  • 4,603
  • 1
  • 33
  • 37
2

I don't know the best way to do this, because it sounds like something that has already been implemented by others. If I were asked to implement this feature, I would recommend buying a wheel instead of reinventing it.

Perhaps there are reasons you have to invent your own? If so, then you should add those reasons to the requirements you listed.


If you absolutely must be this generic, I still recommend buying a system that has been architected for this requirement. Not just the storage requirements, which are the least of the problems your customer will have; but also: how do you keep the customer from screwing up totally when given this much freedom. Some of the commercial systems already meet this challenge without going out of business because of customers messing up.

If you still need to do this on your own, then I suggest that your requirements (or perhaps those of another vendor?) must include: allow the customer to get it right, and help keep the customer from getting it wrong. You'll need some sort of UI to allow the customer to define these business objects, and the UI should validate the model that the customer builds.

I recommend a UI that works at a conceptual level. As an example, see NORMA, a Visual Studio add-in for Object-Role Modeling (the "other" ORM). Consider it as a example only, if your end users cannot afford a Visual Studio Standard license. Otherwise, you'll find that it is extensible, already produces many types of artifact (from SQL in various dialects to code), and will validate the model to see that it makes sense. End users would also be able to enter sample data that they believe should be valid, and the system will validate the data against the model.

If your customers are producing sensible (if dynamic) business objects, then the question of storage will be much simpler.

John Saunders
  • 160,644
  • 26
  • 247
  • 397
  • john - thanks for providing at least a considered and non sarcastic answer, something you wouldnt think too difficult. something i didnt make clear, users are NOT free to do anything they like - object creation will of course be through a UI and several layers of validation and abstraction from the database. my customers ARE 'producing sensible (if dynamic) business objects' - and how best to store these objects is exactly the question i was asking. – flesh Jul 03 '09 at 08:33
  • Thank you. I suggest you edit your question to say this. Otherwise, for the next several years, you'll be seeing new answers that say, "you're crazy". Also, may I presume there's a good reason why you have to do this yourself, and not base it on something commercial, like Siebel or salesforce.com? – John Saunders Jul 03 '09 at 09:12
2

Have you thought about an XML based solution? The requirements suggested to me "Build a system that allows users to dynamically generate an XML Schema and work with XML documents based on that schema." I don't know enough about storing and querying XML documents to comment on your original question.

Another possibility might be to leverage NHibernate's ability to generate database schemas. If you can dynamically generate business objects, then you can generate XML mappings or Fluent mappings and use that to generate a normalized database schema.

Jamie Ide
  • 48,427
  • 16
  • 81
  • 117
1

Every user that I have ever talked to has always wanted "everything" in their project. Part of the job of gathering requirements is to guide the user, not just write down everything they say.

Your only hope is to build several template objects, that they can add properties to, you could code your application to handle each type of these objects, but allow the user to still slightly modify each as necessary.

You need to inform the user upfront of the major flaws this type of design has. This will help you in the end, when it runs slow, or if they screw up and need help fixing something. I'd put this in writing.

How many possible objects would they really need? Perhaps you could set these up using your system first. I have developed several very customizable systems over the years and when the user is sitting at an empty screen, it is like a deer in the headlights.

In any event, good luck.

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
racer x
  • 1,502
  • 1
  • 9
  • 3