113

Where should a JDBC-compliant application store its SQL statements and why?

So far, I managed to identify these options:

  • Hardcoded in business objects
  • Embedded in SQLJ clauses
  • Encapsulated in separate classes e.g. Data Access Objects
  • Metadata driven (decouple the object schema from the data schema - describe the mappings between them in metadata)
  • External files (e.g. Properties or Resource files)
  • Stored Procedures

What are the “Pros” and “Cons” for each one?

Should SQL code be considered “code” or “metadata”?

Should stored procedures be used only for performance optimisation or are they a legitimate abstraction of the database structure?

Is performance a key factor in the decision? What about vendor lock-in?

What is better – loose coupling or tight coupling and why?

EDITED: Thank you everyone for the answers – here is a summary:

Metadata driven i.e. Object Relational Mappings (ORM)

Pros:

  • Very abstract - DB server can be switched without the need to change the model
  • Wide-spread - practically a standard
  • Cuts down the amount of SQL needed
  • Can store SQL in resource files
  • Performance is (usually) acceptable
  • Metadata driven approach
  • (Database) vendor independence

Cons:

  • Hides the SQL and the developer's true intentions
  • SQL is difficult for a DBA to review/change
  • SQL might still be needed for odd cases
  • Can force usage of a proprietary query language e.g. HQL
  • Does not lend itself to optimisation (abstraction)
  • Can lack referential integrity
  • Substitutes for lack of SQL knowledge or lack of care to code in the DB
  • Never matches native database performance (even if it comes close)
  • Model code is very tightly coupled with the database model

Hardcoded/encapsulated in DAO layer

Pros:

  • SQL is kept in the objects that access data (encapsulation)
  • SQL is easy to write (speed of development)
  • SQL is easy to track down when changes are required
  • Simple solution (no messy architecture)

Cons:

  • SQL cannot be reviewed/changed by DBA
  • SQL is likely to become DB-specific
  • SQL can become hard to maintain

Stored Procedures

Pros:

  • SQL kept in the database (close to data)
  • SQL is parsed, compiled and optimised by the DBMS
  • SQL is easy for DBA to review/change
  • Reduces network traffic
  • Increased security

Cons:

  • SQL is tied to the database (vendor lock-in)
  • SQL code is harder to maintain

External files (e.g. Properties or Resource files)

Pros:

  • SQL can be changed without a need to rebuild the application
  • Decouples the SQL logic from the application business logic
  • Central repository of all SQL statements – easier to maintain
  • Easier to understand

Cons:

  • SQL code can become un-maintainable
  • Harder to check the SQL code for (syntax) errors

Embedded in SQLJ clauses

Pros:

  • Better syntax checking

Cons:

  • Ties too closely to Java
  • Lower performance than JDBC
  • Lack of dynamic queries
  • Not so popular
Adrian
  • Good questions but maybe a bit too much to answer all at once. It would take a few pages to answer all of these imho :p – NickDK Nov 02 '09 at 15:36
  • +1 Good question! You should add "ORM" per @ocdecio. Also add "sprinkled everywhere in your Java code" (which I've seen and has to be about the worst). – Jim Ferrans Nov 02 '09 at 15:38
  • 2
    I disagree quite strongly with "SQL code is harder to maintain" under Stored Procedures. In my experience SQL was easier to maintain once it went into the database. Partly for a reason used in External files (Central repository of all SQL statements – easier to maintain), plus the parameters are easier to manage. – Michael Lloyd Lee mlk Nov 04 '09 at 09:46
  • 1
    In my opinion, you've missed one option: the usage of views. You can express complex SQL in views and then just express simple selects on those views (using any type of abstraction: DAO, SQLJ, ORMs, etc). You'll have similar pros as with stored procedures, but I don't think you'll have any of their cons... – Lukas Eder Apr 16 '11 at 23:07

15 Answers

32

Usually, the more the application grows in terms of size and/or reusability, the greater the need to externalize/abstract the SQL statements.

Hardcoded (as static final constants) is the first step. Stored in a file (properties/xml file) is the next step. Metadata driven (as done by an ORM like Hibernate/JPA) is the last step.

Hardcoded has the disadvantage that your code is likely to become DB-specific and that you need to rewrite/rebuild/redistribute on every change. Advantage is that you have it in 1 place.

Stored in a file has the disadvantage that it can become unmaintainable when the application grows. Advantage is that you don't need to rewrite/rebuild the app, unless you need to add an extra DAO method.

Metadata driven has the disadvantage that your model code is very tightly coupled with the database model. For every change in the database model you'll need to rewrite/rebuild/redistribute code. Advantage is that it is very abstract and that you can easily switch DB servers without the need to change your model (but ask yourself now: how often would a company switch DB servers? Likely no more than once every 3 years, isn't it?).

I wouldn't call stored procedures a "good" solution for this. They have an entirely different purpose. Even then, your code would still depend on the DB / configuration used.

BalusC
22

I don't know if this is optimal, but in my experience they end up hardcoded (i.e. String literals) in the DAO layer.
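
As a minimal sketch of what "hardcoded in the DAO layer" usually looks like: the SQL lives in a constant next to the only method that runs it, and values are bound through `?` placeholders. The `CustomerDao` class, the `customer` table and its columns are hypothetical names chosen for illustration, not something from the question.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical DAO: the table and column names are assumptions for illustration.
class CustomerDao {
    // The SQL is a constant kept next to the code that executes it.
    static final String SQL_FIND_NAME_BY_ID =
            "SELECT name FROM customer WHERE id = ?";

    private final Connection connection;

    CustomerDao(Connection connection) {
        this.connection = connection;
    }

    /** Returns the customer's name, or null when no row matches. */
    String findNameById(long id) throws SQLException {
        try (PreparedStatement ps = connection.prepareStatement(SQL_FIND_NAME_BY_ID)) {
            ps.setLong(1, id); // bind the value instead of concatenating it into the SQL
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getString("name") : null;
            }
        }
    }
}
```

Changing the query means editing and rebuilding this class, which is the trade-off the comments below argue about.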

flybywire
  • 5
    It's probably not optimal but it's what I do too. Easy to write, easy to track down, and no messy architecture to worry about. – James Cronen Nov 02 '09 at 15:35
  • 21
    And all the time spent tracking down the hard-coded SQL is job security. If you're the only person who knows where the SQL is, you can't be fired. – S.Lott Nov 02 '09 at 15:49
  • 1
    Yes, using the DAO design pattern http://en.wikipedia.org/wiki/Data_Access_Object not the Microsoft Library linked in the question. – Mark Nov 02 '09 at 16:17
  • 11
    It always amazes me that many people who are careful to build nicely-architected, clean OO Java code are the same people who tolerate writing messy, non-performant SQL and just sticking it as strings in random places. If your SQL is just strings in your DAO layer, then I can pretty much guarantee you don't have a DBA on your team. At least not a *good* DBA. – Daniel Pryden Nov 02 '09 at 16:59
  • 3
    -1. It is OK to have DAOs but at the very least move the queries to a property file somewhere so a DBA can review and tweak them as appropriate! – cethegeek Nov 02 '09 at 17:45
  • 2
    Then what would a good DBA answer to this question? – svachon Nov 02 '09 at 17:56
  • 1
    Can someone please articulate why you want SQL (vs SP calls) other than simple view selects to be stored in properties files and strewn across code? This makes no sense to me. – Jé Queue Nov 02 '09 at 18:00
  • 3
    My experience has been, if you're doing straight JDBC, putting the query string in a Data Access Object layer is probably the best approach if you can't use an ORM solution. One caveat: make sure everyone is on the same page with coding standards for the DAO classes if you go that way. I've been down both the resource bundle and stored procedure routes and both have been absolute maintenance nightmares, since they spread the data access logic across multiple layers, so adding a column to a query requires you to change things in different places. – Jason Gritman Nov 02 '09 at 18:27
  • 2
    I don't think a real DBA will make a code review for you. He will not check every query, not in your code and not in your properties file. He will just help you optimize slow queries. – flybywire Nov 12 '09 at 19:15
12

I don't think anyone will give you the pro/con break down you want as it is a rather large question. So instead here is what I've used in the past, and what I will be using going forward.

I used to use SQL hardcoded in the DAL. I thought this was fine until the DBAs wanted to play with the SQL. Then you have to dig it out, format it and fire it over to the DBAs, who will laugh at it and replace it all, but without the nice question marks, or with the question marks in the wrong order, and leave you to stick it back into the Java code.

We have also used an ORM, and while this is great for developers, our DBAs hated it as there is no SQL for them to laugh at. We also used an odd ORM (a custom one from a 3rd party supplier) which had a habit of killing the database. I've used JPA since and it was great, but getting anything complicated using it past the DBAs is an uphill battle.

We now use Stored Procedures (with the call statement hardcoded). Now the first thing everyone will complain about is that you are tied to the database. You are. However, how often have you changed database? I know for a fact that we simply could not even attempt it, given the amount of other code dependent on it, plus retraining our DBAs, plus migrating the data. It would be a very expensive operation. However, if in your world changing DBs at the drop of a hat is required, SPs are likely out.
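
For readers who have not seen it, a stored procedure call "with the call statement hardcoded" can be sketched with plain JDBC as below. The procedure name `get_order_total` and its two parameters (an order id in, a decimal total out) are hypothetical; only the `{call ...}` escape syntax and the `CallableStatement` methods are standard JDBC.

```java
import java.math.BigDecimal;
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Types;

// Hypothetical wrapper: the procedure name and its parameters are assumptions.
class OrderProcedures {
    // Only the call is hardcoded in Java; the SQL itself lives in the database.
    static final String CALL_ORDER_TOTAL = "{call get_order_total(?, ?)}";

    /** Calls the procedure with an IN order id and reads the OUT total. */
    static BigDecimal orderTotal(Connection connection, long orderId) throws SQLException {
        try (CallableStatement cs = connection.prepareCall(CALL_ORDER_TOTAL)) {
            cs.setLong(1, orderId);                    // IN parameter
            cs.registerOutParameter(2, Types.DECIMAL); // OUT parameter
            cs.execute();
            return cs.getBigDecimal(2);
        }
    }
}
```

The Java side stays small, while the DBAs can change the body of the procedure without a rebuild as long as its signature is unchanged.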

Going forward I would like to use stored procedures with code generation tools to create Java classes from Oracle packages.

Edit 2013-01-31: A few years and DBAs later and we now use Hibernate, going to SQL (stored procs in the DB) only when absolutely required. This I think is the best solution. 99% of the time the DBAs don't need to worry about the SQL, and for the 1% when they do, it is in a place they are already comfortable with.

Michael Lloyd Lee mlk
  • 1
    +1 for the idea of writing stored procedures and then generating Java code from them, not the other way around. – Daniel Pryden Nov 02 '09 at 16:55
  • My take on it is that the Java layer should NOT mimic or map in any way the DB layer. I think if you're trying to abstract the Oracle package, then make another package or more wrapping procedures. I try to logically separate the two completely in practice. – Jé Queue Nov 02 '09 at 18:05
  • 2
    @Xepoch: I actually agree -- perhaps I should have worded my comment differently. Your database should be a reflection of your data model (entity relationship model), and your object model should also be a reflection of your data model (although not necessarily identical). So they should be related at least. In terms of generating Java code from stored procedures, the point is that the API for access to your database should be derived from the structure of your data model, not your data model being derived from the structure of your objects. – Daniel Pryden Nov 02 '09 at 22:20
  • You might be interested in using http://www.jooq.org. It does exactly what you said: "code generation to create Java classes from Oracle packages". Apart from that, it ships with a SQL-like DSL, similar to LINQ in C#, should you need to express SQL in Java, which you cannot put inside a stored procedure. – Lukas Eder Apr 16 '11 at 23:12
11

By using an ORM (such as Hibernate) you hopefully will have no SQL statements to worry about. Performance is usually acceptable and you get vendor independence as well.

Otávio Décio
  • 13
    -1 you'll have HQL statements and most of the issues remain regarding HQL. Will they be inside the code (string literals), named queries in annotations, named queries in xml files, stored in properties files? – flybywire Nov 02 '09 at 15:31
  • Definitely ORM. It is essentially the "metadata-driven" approach and so wide-spread it is practically a standard. – SingleShot Nov 02 '09 at 15:31
  • 1
    @flybywire - with Hibernate it is a rarity to resort to HQL. For 98% of cases, query by example and by criteria (i.e. using objects) is all that's needed. – SingleShot Nov 02 '09 at 15:33
  • 2
    @SingleShot, I don't agree. If it is something more complex than select by id, I think it **is** done with HQL. I would say criteria and example are used when doing user-interface driven search as in a search screen in a library catalog. But let's see what others think. – flybywire Nov 02 '09 at 15:38
  • 3
    @SingleShot - I very much disagree. We use a lot of HQL, especially for reporting queries. Some HQL features have not been supported by criteria at all (using custom SQL functions, constructors in select clause). QBE can sometimes lead to more problems than it solves. – javashlook Nov 02 '09 at 15:46
  • 4
    "With Hibernate it is a rarity to resort to HQL" is by far the funniest thing I've heard today. QBE is ridiculous; and while you may have to **resort to** Criteria for UI queries, well-defined queries (reporting / service interaction / etc...) should all be in HQL. – ChssPly76 Nov 02 '09 at 22:29
  • I don't think resorting to HQL is a good solution but I don't think that doing everything absolutely in Hibernate is good either. Perhaps a hybrid solution would work better - Criteria for a few cases and SQL for more complex cases. – Ravi Wallau Nov 03 '09 at 05:56
  • 1
    -1 ORM, Performance, Portability. Pick one – Andomar Nov 26 '09 at 23:13
10

Should SQL code be considered “code” or “metadata”?

Code.

Should stored procedures be used only for performance optimization or are they a legitimate abstraction of the database structure?

Stored procedures allow for reuse, including inside of other stored procedures. This means that you can make one trip to the database & have it execute supporting instructions - the least amount of traffic is ideal. ORM or sproc, the time on the wire going to the db & back is something you can't recoup.

ORM doesn't lend itself to optimization because of its abstraction. IME, ORM also means a lack of referential integrity, which makes a database difficult to report from. Whatever complexity was saved up front comes back when you need to get the data out in a workable fashion.

Is performance a key factor in the decision? What about vendor lock-in?

No, simplicity is. Vendor lock-in happens with the database as well - SQL is relatively standardized, but there are still vendor-specific ways of doing things.

OMG Ponies
  • 4
    +1 for calling SQL code. Too many ORM tools try to hide SQL, when in fact it is often the best language for expressing what you're trying to do. And I agree with your opinion that stored procedures are better than ORM, although I doubt that will be a popular opinion here. – Daniel Pryden Nov 02 '09 at 16:53
9

The fear of vendor lock-in in the Java world is interesting.

I hope you haven't paid $50,000 per CPU for Oracle Enterprise, and then only used the least common denominator in order to be able to switch to MySQL at any minute. As any good DBA will tell you, there are subtle differences between the different big name databases, especially with regard to locking models and how they achieve consistency.

So, don't decide how to implement your SQL calls based only on the principle of vendor-agnostic SQL - have a real (business) reason for doing so.

hennings
  • 1
    Oh, nothing like that! The main concern is to allow the support team to alter SQL statements (e.g. for tuning, DBA-related tasks) and to improve visibility of what the application does (in relation to the database). The support team does not have Java know-how and will not be happy to drill down into the code. The application will be a new addition to a large estate of existing ones, all using an Ingres database. – Adrian Nov 12 '09 at 22:41
6

SQL inside Stored Procedures is optimized by the database system and compiled for speed - that's its natural home. SQL is understood by the database system, parsed by the database system. Keep your SQL in the database if you can; wrap it in stored procedures or functions or whatever units of logic the database system provides, and make simple calls to it using any one of the tools you or anybody else has mentioned.

Why store SQL code for the database system outside the db? Often for speed of development. Why use ORM mapping? Some say ORM mapping provides compatibility across different database systems; however, rarely in the real world does an application ever shift away from the database platform upon which it was built, especially when it starts using advanced features like replication, and on the rare occasion that the database system is swapped out, some work is warranted. I believe one of ORM's drawbacks is that it often substitutes for lack of SQL knowledge or lack of care to code in the db. Also, ORM will never match native database performance even if it comes close.

I'm standing on the side of keeping SQL code in the database and making simple calls to it through any API or interface you wish to use. Also abstract away the point at which your database calls are made by putting those calls behind an abstract class or OO interface (expressed by methods), so if you ever do swap in a new kind of data source it will be seamless to the business layer.
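
A rough sketch of that last point, under stated assumptions: the business layer depends only on an interface, so the implementation behind it (stored procedure calls, plain JDBC, or the in-memory stand-in used here) can be swapped without touching the caller. All type names below are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical interface: the only thing the business layer knows about data access.
interface CustomerSource {
    /** Returns the customer's name, or null when the id is unknown. */
    String findName(long id);
}

// Stand-in implementation; a production one would call the database instead.
class InMemoryCustomerSource implements CustomerSource {
    private final Map<Long, String> names = new HashMap<>();

    void put(long id, String name) {
        names.put(id, name);
    }

    @Override
    public String findName(long id) {
        return names.get(id);
    }
}

// Business-layer class: it never sees SQL, JDBC or a vendor name.
class GreetingService {
    private final CustomerSource source;

    GreetingService(CustomerSource source) {
        this.source = source;
    }

    String greet(long id) {
        String name = source.findName(id);
        return name == null ? "Hello, guest" : "Hello, " + name;
    }
}
```

Replacing `InMemoryCustomerSource` with a class that calls a stored procedure would leave `GreetingService` unchanged.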

John K
  • +1 Nice point of view. You'll be interested in this blog post, I think: http://database-programmer.blogspot.com/2010/12/historical-perspective-of-orm-and.html. – Lukas Eder Apr 16 '11 at 23:16
5

The only question you ask that has a definite answer is "Is SQL code or metadata?" It is most definitely code, and as such should be kept in some kind of source code control and have a system for easily updating to the latest version and rolling back when (not if) things go wrong.

I've seen three ways of doing SQL in an application and each has its pros and cons. There is no best way; the best thing is to pick one that works well with your application and stick with it.

  • ORM - this cuts down on the amount of SQL you need to write and handles lots of details for you. You will need to do some custom SQL. Make sure you have an ORM that handles this gracefully.
  • Data Access Objects - keep the SQL in the objects that access the data. This encapsulates your database and makes it so the rest of your application doesn't need to know about the underlying DB structure, just the interface to these objects.
  • Stored Procedures - this keeps all your SQL in your database and makes it easy for your DBAs to know what is going on. All you need to do is have your code call the stored procs.
Kenny Drobnack
4

We happen to use the iBatis SQL mapper, which is closer to the metal than ORMs like Hibernate. In iBatis you put the SQL statements into resource files (XML), which need to be in the classpath.

Your list of approaches seems pretty comprehensive if you add @ocdecio's ORM option. I would say that using an ORM and using an SQL mapper and resource files are the two best approaches. I'd steer clear of SQLJ, which hasn't seen much uptake and ties you too closely to Java. Also stay away from stored procedures, since they tie you to a specific database vendor (standards are almost non-existent for stored procedures).

Jim Ferrans
4

Like most of us, I've seen the whole gamut, but we need to consider SQL a first-class language. I've even seen SQL stored in the DB that is pulled down and then executed back up.

The most successful systems I've seen employ stored procedures, functions and views.

Stored procs keep the SQL text back at the DB and allow for relatively immediate change by those DEPLOYING and CUSTOMIZING (which requires a lot of proper design to support it).

All projections should be via views and simple selects for the same reasons; all projection logic should be contained within the view.

Jé Queue
2

I suggest using DAOs with a factory layout. So the example objects you need would be:

public class CoolBusinessObject
public class DAOFactory
public interface CoolBusinessObjectDAO
public class CoolBusinessObjectDAOOracleImpl implements CoolBusinessObjectDAO

This style layers the data interaction, so you should only have to change one layer of code if you switch databases, or move to ORM technologies.
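
One possible shape for such a factory, as a sketch only: the MySQL implementation, the `firstNamesSql` method and the `cool_business_object` table are assumptions added for illustration, and each implementation merely exposes its vendor-specific SQL rather than executing it.

```java
import java.util.Locale;

// The interface the rest of the application codes against.
interface CoolBusinessObjectDAO {
    /** SQL returning at most N names; N is bound to the single '?' placeholder. */
    String firstNamesSql();
}

// Oracle flavour: limits rows with ROWNUM.
class CoolBusinessObjectDAOOracleImpl implements CoolBusinessObjectDAO {
    @Override
    public String firstNamesSql() {
        return "SELECT name FROM cool_business_object WHERE ROWNUM <= ?";
    }
}

// Hypothetical second vendor, added to show what the factory switches between.
class CoolBusinessObjectDAOMySqlImpl implements CoolBusinessObjectDAO {
    @Override
    public String firstNamesSql() {
        return "SELECT name FROM cool_business_object LIMIT ?";
    }
}

// The single place that knows which database is in use.
class DAOFactory {
    static CoolBusinessObjectDAO createCoolBusinessObjectDAO(String vendor) {
        switch (vendor.toLowerCase(Locale.ROOT)) {
            case "oracle":
                return new CoolBusinessObjectDAOOracleImpl();
            case "mysql":
                return new CoolBusinessObjectDAOMySqlImpl();
            default:
                throw new IllegalArgumentException("Unsupported database vendor: " + vendor);
        }
    }
}
```

Callers ask the factory for a `CoolBusinessObjectDAO` and never name an implementation class, so a database switch touches only the factory and the new implementation.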

Jay
2

There isn't really any substantial difference between these three:

  1. Hardcoded in business objects
  2. Embedded in SQLJ clauses
  3. Encapsulate in separate classes e.g. Data Access Objects

I'm assuming that you're going to embed SQL code in a string form directly into your Java code. While 1 and 3 will probably use JDBC directly (or some tool like Apache DbUtils), 2 adds a preprocessor technology to the stack, generating the relevant JDBC code prior to compilation.

So, essentially, if these solutions involve embedding SQL, you might as well use any of these technologies:

  • JPA Criteria API, modelling JPQL as an internal domain-specific language in Java
  • jOOQ, modelling SQL as an internal domain-specific language in Java

There might also be other tools to help you embed SQL in Java in a more typesafe manner than through SQLJ or through actual string concatenation.

Lukas Eder
1

From my experience, hardcoding SQL statements in the DAO objects is what is widely used, though I think it should be the least preferred method. The best practice should be to store the SQL statements in a properties file, and get the statements in the DAO object through an interface to properties files, say java.util.Properties. The SQL statements can be interspersed with '?'s to pass parameters through a PreparedStatement.

Such an approach helps decouple the SQL logic from the application business logic. It makes available a central repository of all SQL statements, which makes modification easier and eliminates the need to search for database statements within application logic. Understandability improves too.
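
A minimal sketch of that idea, assuming a hypothetical `SqlCatalog` helper and a made-up key `customer.findNameById`; only `java.util.Properties` itself comes from the answer.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper: a central lookup of SQL statements by key.
class SqlCatalog {
    private final Properties statements = new Properties();

    /** Loads "key=SQL" pairs; in an application the reader would wrap a .properties file. */
    SqlCatalog(Reader source) {
        try {
            statements.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the SQL text for a key, failing fast when the key is missing. */
    String sql(String key) {
        String text = statements.getProperty(key);
        if (text == null) {
            throw new IllegalArgumentException("No SQL statement for key: " + key);
        }
        return text;
    }
}
```

A DAO would then pass `catalog.sql("customer.findNameById")` to `Connection.prepareStatement` and bind its values with the `setXxx` methods. Because the values are bound to the `?` placeholders rather than concatenated into the text, moving the SQL into a file does not by itself open the door to SQL injection through those values.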

The Machine
  • Some people might object that this will introduce the risk of SQL injection. What is your opinion about that? – Adrian Nov 02 '09 at 16:14
  • 2
    If you're storing the SQL code outside the application anyway, what is the advantage of storing it as strings somewhere over storing it in the database layer (as stored procedures)? Stored procedures can make more effective use of your database's optimizer, so they will nearly always outperform prepared statements. – Daniel Pryden Nov 02 '09 at 22:23
  • 1
    Hi Daniel, I didn't mean you shouldn't write stored procs; I just meant that you call the stored procs in the way I mentioned. It helps you to have better control over passing parameters to the stored proc as well. – The Machine Nov 09 '09 at 06:10
1

Mine end up in resource bundles. I know it's not normal but it's the easiest for me and anyone "other than me" to maintain. It's straightforward and logical.

I'm actually curious to see if anyone uses my approach also.

Brett Ryan
  • Curious why don't you keep the SQL at the DB? – Jé Queue Nov 02 '09 at 18:06
  • @Xepoch - How do you mean? The statements are in resource bundles (properties files) which are within the same package as the entities, so customer.properties relates to Customer.class. Data is stored in the DB. – Brett Ryan Nov 03 '09 at 11:28
1

As rexem wrote, SQL statements are code - they should be treated like code, not externalized (unless you have a good reason) but placed with the code that processes the SQL data to/from those statements. Today's frameworks (ORMs, iBatis) offer a lot of simplifications for day-to-day JDBC development.

You'll find some answers to your question in the question itself :) How your SQL statements should be stored depends on the kind of application you have. What are your needs? High security, ease of writing code or of maintenance, cross-platform support or vendor lock-in? The next question: do you need pure SQL, or will an ORM framework be good enough?

* Hardcoded in business objects
* Encapsulate in separate classes e.g. Data Access Objects

Simplest solution (P), hard to maintain (C)

* Embedded in SQLJ clauses

Better syntax checking (P), lack of dynamic queries (C), lower performance than JDBC (C), not so popular (C)

* Metadata driven (decouple the object schema from the data schema - describe the mappings between them in metadata)

There must be a specific case for doing that (C), unless you mean ORM (P) ;)

* External files (e.g. Properties or Resource files)

Easy to maintain (P) but harder to check for errors (C)

* Stored Procedures

High security (P), code hard to maintain and vendor lock-in problems (C)

Community
cetnar