304

For some of the apps I've developed (then proceeded to forget about), I've been writing plain SQL, primarily for MySQL. Though I have used ORMs in python like SQLAlchemy, I didn't stick with them for long. Usually it was either the documentation or complexity (from my point of view) holding me back.

I see it like this: use an ORM for portability, plain SQL if it's just going to be using one type of database. I'm really looking for advice on when to use an ORM or SQL when developing an app that needs database support.

Thinking about it, it would be far better to just use a lightweight wrapper to handle database inconsistencies vs. using an ORM.

cletus
  • 616,129
  • 168
  • 910
  • 942
hydrapheetz
  • 3,540
  • 3
  • 17
  • 12
  • 2
    Standardization, security, maintainability, language abstraction, DRY, etc. – Ben Aug 06 '14 at 16:52
  • Performance with ORM can be near SQL, depends on if you use it correctly and with correct settings... See ho to make EF6.x 5x faster: https://www.linkedin.com/pulse/orm-read-performance-ef-vs-dapper-balazs-hideghety – baHI Feb 26 '17 at 10:41
  • For ORM architecture and how-to (what to avoid), here's my another link: https://www.linkedin.com/pulse/get-know-your-orm-avoid-bad-habits-balazs-hideghety – baHI Feb 26 '17 at 10:42
  • Object-Relational mapping (ORM) is already very popular in many programming languages and one of the best alternatives for SQL. I was inspired from method chaining style to create CQL for my TRIADB project. healis.eu/triadb/#latest-release – Athanassios Jul 10 '19 at 07:30
  • 4
    ORM is an excuse for not knowing SQL or being lazy to write it. If you do not know SQL, use ORM. But don't waste your time thinking u would do less work when using ORM, that would be a false belief. Portability agreed...go with ORM if your project needs portability across databases. Speed? go all the way with SQL because ORM is not fast. Reliability I'd stick with SQL. Security, at least nothing much to worry about in both. The bottom line, it depends on your project requirement, especially regarding database portability. – pasignature Oct 21 '20 at 11:18

12 Answers12

304

Speaking as someone who spent quite a bit of time working with JPA (Java Persistence API, basically the standardized ORM API for Java/J2EE/EJB), which includes Hibernate, EclipseLink, Toplink, OpenJPA and others, I'll share some of my observations.

  1. ORMs are not fast. They can be adequate and most of the time adequate is OK but in a high-volume low-latency environment they're a no-no;
  2. In general purpose programming languages like Java and C# you need an awful lot of magic to make them work (eg load-time weaving in Java, instrumentation, etc);
  3. When using an ORM, rather than getting further from SQL (which seems to be the intent), you'll be amazed how much time you spend tweaking XML and/or annotations/attributes to get your ORM to generate performant SQL;
  4. For complex queries, there really is no substitute. Like in JPA there are some queries that simply aren't possible that are in raw SQL and when you have to use raw SQL in JPA it's not pretty (C#/.Net at least has dynamic types--var--which is a lot nicer than an Object array);
  5. There are an awful lot of "gotchas" when using ORMs. This includes unintended or unexpected behavior, the fact that you have to build in the capability to do SQL updates to your database (by using refresh() in JPA or similar methods because JPA by default caches everything so it won't catch a direct database update--running direct SQL updates is a common production support activity);
  6. The object-relational mismatch is always going to cause problems. With any such problem there is a tradeoff between complexity and completeness of the abstraction. At times I felt JPA went too far and hit a real law of diminishing returns where the complexity hit wasn't justified by the abstraction.

There's another problem which takes a bit more explanation.

The traditional model for a Web application is to have a persistence layer and a presentation layer (possibly with a services or other layers in between but these are the important two for this discussion). ORMs force a rigid view from your persistence layer up to the presentation layer (ie your entities).

One of the criticisms of more raw SQL methods is that you end up with all these VOs (value objects) or DTOs (data transfer objects) that are used by simply one query. This is touted as an advantage of ORMs because you get rid of that.

Thing is those problems don't go away with ORMs, they simply move up to the presentation layer. Instead of creating VOs/DTOs for queries, you create custom presentation objects, typically one for every view. How is this better? IMHO it isn't.

I've written about this in ORM or SQL: Are we there yet?.

My persistence technology of choice (in Java) these days is ibatis. It's a pretty thin wrapper around SQL that does 90%+ of what JPA can do (it can even do lazy-loading of relationships although its not well-documented) but with far less overhead (in terms of complexity and actual code).

This came up last year in a GWT application I was writing. Lots of translation from EclipseLink to presentation objects in the service implementation. If we were using ibatis it would've been far simpler to create the appropriate objects with ibatis and then pass them all the way up and down the stack. Some purists might argue this is Bad™. Maybe so (in theory) but I tell you what: it would've led to simpler code, a simpler stack and more productivity.

ospider
  • 9,334
  • 3
  • 46
  • 46
cletus
  • 616,129
  • 168
  • 910
  • 942
  • 2
    I've been inspired to post another (although community wiki'd) question just to collect resources on things like this. Regarding the last paragraph: I like simplicity. Probably too much. – hydrapheetz Jan 30 '09 at 08:19
  • I generally agree with what you're saying, but I'm not sure you're completely right on 5). JPA by default doesn't cache everything for ever. It only caches stuff during the persistence context, which in a lot of cases is tied to a transaction. During a transaction you also don't see updates done in another transaction in raw SQL (depending on your isolation level). Therefor, the l1 cache in JPA is generally not really a problem. When you use an extended persistence context or an l2 cache, then you might miss external updates, but both are not the default in JPA. – Arjan Tijms Jul 18 '10 at 10:04
  • arjan, yes the l2 cache is not on by default. But almost every time you will turn it on. JPA without l2 is very slow and database aggressive. So in mosto of the cases l2 is on. So in enterprise systems you probably have also another sources that can change records in the db, and that is really a problem. – darpet Nov 05 '10 at 12:50
  • 3
    iBATIS is great, but maybe you'd care to try jOOQ: http://jooq.sourceforge.net. Its main focus is precisely to stay close to SQL for the 6 reasons you mentioned. – Lukas Eder Dec 19 '10 at 15:08
  • I don't agree to many of these points. I know that ORM's are not easy to use and introduce their own problems. But it really depends on the kind of software you write. Not every project can benefit from an ORM. They don't work well in combination with non-OO code. They work pretty well if you put the whole logic into the server based on an entity model. Yes you have DTO's, but you don't implement your business logic on them. There may be a bit of plain SQL to cover the last few percent which are not good enough by the ORM. Everything else is plain simple OO programming. – Stefan Steinegger Apr 05 '11 at 16:01
  • 6
    +1 for point 3. Many feel that using ORM relieves you from having a thorough understanding of SQL. Thing is that once you can/learn to do gymnastics with SQL, you'll probably find yourself moving away from ORM... very quickly. – Ryan Fernandes May 12 '11 at 05:11
  • 1
    Point 2, 3, 5 -- problems of Java, not ORM. Try Rails active record. – powerboy Aug 27 '11 at 06:58
  • 4
    So, now it's the end of 2013 and as we all know, nothing might be more misleading than "old facts" - so may I ask you if your points are still the same? If not, it would be great if you could write a blogpost / update your answer accordingly. – Dominik Dec 10 '13 at 07:56
  • 3
    var doesn't produce a dynamic type in .NET, variables with the dynamic keyword are a dynamic types in .NET. var is still statical typing. See http://stackoverflow.com/questions/961581/whats-the-difference-between-dynamicc-4-and-var – Fazi Nov 13 '14 at 15:58
  • What are the recommendations for a language like PHP where ORM comes as a part of the package when you are using a framework? – mithilatw Jun 13 '16 at 08:33
  • 2
    100% agree with this. By opinion, based on 17 years of development experience across numerous systems is that straight SQL absolutely trumps ORM by orders of magnitude in any application which is actually large/data heavy. – Manachi Jun 22 '17 at 00:40
  • 2
    I totally agree. After 5 years of using NHibernate, I really don't want to be anywhere near an ORM again. I know SQL pretty well, I can deal with a few selects and inserts here and there. – Alejandro Jun 23 '17 at 17:29
  • 1
    @Dominik It's 2020 now, I still find this post very insightful. – ospider Aug 02 '20 at 13:55
  • @ospider uh well, how long did you wait for this? ;-) you absolutely got me here! I was young, fresh from university and thought "oh cmon". 7 years later, I just want persistence, abstracted by a contract from my domain layer. May it be an ORM or else, whatever fits. (I personally prefer lightweight convenience layers without much magic like dapper) – Dominik Aug 02 '20 at 17:43
  • @Dominik me too. After 4 years of coding professionally, I finally find out that ORM is not essential, it's just a tool, it's useful somewhere, but useless elsewhere. – ospider Aug 11 '20 at 08:03
205

ORMs have some nice features. They can handle much of the dog-work of copying database columns to object fields. They usually handle converting the language's date and time types to the appropriate database type. They generally handle one-to-many relationships pretty elegantly as well by instantiating nested objects. I've found if you design your database with the strengths and weaknesses of the ORM in mind, it saves a lot of work in getting data in and out of the database. (You'll want to know how it handles polymorphism and many-to-many relationships if you need to map those. It's these two domains that provide most of the 'impedance mismatch' that makes some call ORM the 'vietnam of computer science'.)

For applications that are transactional, i.e. you make a request, get some objects, traverse them to get some data and render it on a Web page, the performance tax is small, and in many cases ORM can be faster because it will cache objects it's seen before, that otherwise would have queried the database multiple times.

For applications that are reporting-heavy, or deal with a large number of database rows per request, the ORM tax is much heavier, and the caching that they do turns into a big, useless memory-hogging burden. In that case, simple SQL mapping (LinQ or iBatis) or hand-coded SQL queries in a thin DAL is the way to go.

I've found for any large-scale application you'll find yourself using both approaches. (ORM for straightforward CRUD and SQL/thin DAL for reporting).

joeforker
  • 40,459
  • 37
  • 151
  • 246
Cameron Pope
  • 7,565
  • 2
  • 26
  • 24
  • 1
    Could you define 'large number of database rows per request'? Please :) – Mosselman Jun 23 '11 at 09:02
  • So can I integrate JPA with IBatis for example?? ANd make them work in the same transaction? – Jaime Hablutzel Oct 06 '11 at 17:23
  • 3
    Another consideration no one seems to discuss is basic state-management. This entire stack of frameworks (JSF, JPA, etc) is based on Java beans get/set methods. This is a TON of boilerplate for every table, for every column and ... here's the real anti-pattern: Just to expose every field as if it were public. In effect, having a get/set method on fields in an object/table/row is very close to violating every tenant of information hiding and encapsulation. Lastly, back to state-management ... where is the immutability option? Can or should half-set objects be allowed? No option with most. – Darrell Teague Mar 26 '13 at 19:09
  • 2
    I'd like to hone in and particularly agree on a key statement in this answer. "For applications that deal with a large number of database rows per request the ORM tax is much heavier". ORM is good only for developers and maintenance because most developers aren't very good at SQL, but if you're actually talking about performance, SQL completely trumps it. – Manachi Jun 22 '17 at 00:42
  • "most developers aren't very good at SQL"??? I would say that most developers does not know how to properly use LINQ, the power of expression trees, and ORMs in general, code generation, and many other things. But no, I do not have any basis to make such a strong statement. – Adanay Martín Nov 09 '17 at 00:10
71

I say plain SQL for Reads, ORM for CUD.

Performance is something I'm always concerned about, specially in web applications, but also code maintainability and readability. To address these issues I wrote SqlBuilder.

Max Toro
  • 28,282
  • 11
  • 76
  • 114
16

ORM is not just portability (which is kinda hard to achieve even with ORMs, for that matter). What it gives you is basically a layer of abstraction over a persistent store, when a ORM tool frees you from writing boilerplate SQL queries (selects by PK or by predicates, inserts, updates and deletes) and lets you concentrate on the problem domain.

Anton Gogolev
  • 113,561
  • 39
  • 200
  • 288
  • 3
    I was thinking of something closer to portability across database flavors. I shouldn't post questions late at night. – hydrapheetz Jan 30 '09 at 07:56
  • 1
    That's exactly what I was saying: even the most basic scenarios can potentially be subject to errors in different DBMSes - for instance, different handling of NULLs. – Anton Gogolev Jan 30 '09 at 08:17
  • 1
    An ORM gives you a layer of abstraction over the relationships between objects, but there is no great advantage with regards to the boilerplate queries you mention. In a JDBC app you can write those types of queries with a small amount of code in an abstract superclass or utility class. There is no need to repeat the boilerplate for each new table. – KevinS Jan 21 '11 at 16:03
12

Any respectable design will need some abstraction for the database, just to handle the impedance mismatch. But the simplest first step (and adequate for most cases) I would expect would be a DAL, not a heavyweight ORM. Your only options aren't those at the ends of the spectrum.


EDIT in response to a comment requesting me to describe how I distinguish DAL from ORM:

A DAL is what you write yourself, maybe starting from a class that simply encapsulates a table and maps its fields to properties. An ORM is code you don't write for abstraction mechanisms inferred from other properties of your dbms schema, mostly PKs and FKs. (This is where you find out if the automatic abstractions start getting leaky or not. I prefer to inform them intentionally, but that may just be my personal preference).

dkretz
  • 37,399
  • 13
  • 80
  • 138
10

The key that made my ORM use really fly was code generation. I agree that the ORM route isn't the fastest, in code performance terms. But when you have a medium to large team, the DB is changing rapidly the ability to regenerate classes and mappings from the DB as part of the build process is something brilliant to behold, especially when you use CI. So your code may not be the fastest, but your coding will be - I know which I'd take in most projects.

My recommendation is to develop using an ORM while the Schema is still fluid, use profiling to find bottlenecks, then tune those areas which need it using raw Sql.

Another thought, the caching built into Hibernate can often make massive performance improvements if used in the right way. No more going back to the DB to read reference data.

MrTelly
  • 14,657
  • 1
  • 48
  • 81
  • 2
    Absolutely a matter of personal taste. To me, code generation is a flaw. – dkretz Jan 30 '09 at 17:40
  • 5
    Read the second paragraph .... maybe completeness is also useful – MrTelly Mar 23 '09 at 22:34
  • Code generation is the only way to get certain tasks done quicker. Like all tools it can be powerful or lead to a disaster. Technically all languages are producing other types of code. – Banjocat Oct 06 '15 at 15:40
9

Dilemma whether to use a framework or not is quite common in modern day software development scenario.

What is important to understand is that every framework or approach has its pros and cons - for example in our experience we have found that ORM is useful when dealing with transactions i.e. insert/update/delete operations - but when it comes to fetch data with complex results it becomes important to evaluate the performance and effectiveness of the ORM tool.

Also it is important to understand that it is not compulsory to select a framework or an approach and implement everything in that. What we mean by that is we can have mix of ORM and native query language. Many ORM frameworks give extension points to plugin in native SQL. We should try not to over use a framework or an approach. We can combine certain frameworks or approaches and come with an appropriate solution.

You can use ORM when it comes to insertion, updation, deletion, versioning with high level of concurrency and you can use Native SQL for report generation and long listing

Rutesh Makhijani
  • 17,065
  • 2
  • 26
  • 22
4

There's no 'one-tool-fits-all' solution, and this is also true for the question 'should i use an or/m or not ? '.

I would say: if you have to write an application/tool which is very 'data' focused, without much other logic, then I 'd use plain SQL, since SQL is the domain-specific language for this kind of applications.

On the other hand, if I was to write a business/enterprise application which contains a lot of 'domain' logic, then I'd write a rich class model which could express this domain in code. In such case, an OR/M mapper might be very helpfull to successfully do so, as it takes a lot of plumbing code out of your hands.

Frederik Gheysels
  • 56,135
  • 11
  • 101
  • 154
1

Use an ORM that works like SQL, but provides compile-time checks and type safety. Like my favorite: Data Knowledge Objects (disclosure: I wrote it)

For example:

for (Bug bug : Bug.ALL.limit(100)) {
  int id = bug.getId();
  String title = bug.getTitle();
  System.out.println(id +" "+ title);
}

Fully streaming. Easy to set up (no mappings to define - reads your existing schemas). Supports joins, transactions, inner queries, aggregation, etc. Pretty much anything you can do in SQL. And has been proven from giant datasets (financial time series) all the way down to trivial (Android).

keredson
  • 3,019
  • 1
  • 17
  • 20
  • Your IDE can also provide such static checks directly (IDEA knows the DB structure so long you tell it where the DB is/where the DDL files are, so it can do type checks/relation checks/etc in your SQL queries/procedures/whatever) – Xenos May 20 '19 at 09:29
  • that's useful. can it do it as part of a build/CI step? how does it classify sql vs other strings? can it handle string manipulation, or only string constants? – keredson May 22 '19 at 05:14
  • I'll get blocked by abBlock, but IntelliJ parses SQL like any other language https://www.jetbrains.com/datagrip/features/ so one could integrate it to CI/CD/build (maybe by asking IJ team to isolate the SQL parsing code? maybe Sonar already has such parser). Parsing brings data type so you can add checks on them (I've done so with a custom plugin), or checks like "does JOIN columns have a FK? index?" etc. These would be neat improvments to native IJ's SQL inspections – Xenos May 22 '19 at 09:31
1

I know this question is very old, but I thought that I would post an answer in case anyone comes across it like me. ORMs have come a long way. Some of them actually give you the best of both worlds: making development more productive and maintaining performance.

Take a look at SQL Data (http://sqldata.codeplex.com). It is a very light weight ORM for c# that covers all the bases.

FYI, I am the author of SQL Data.

xcopy
  • 2,248
  • 18
  • 24
1

I'd like to add my voice to the chorus of replies that say "There's a middle ground!".

To an application programmer, SQL is a mixture of things you might want to control and things you almost certainly don't want to be bothered controlling.

What I've always wanted is a layer (call it DAL, ORM, or micro-ORM, I don't mind which) that will take charge of the completely predictable decisions (how to spell SQL keywords, where the parentheses go, when to invent column aliases, what columns to create for a class that holds two floats and an int ...), while leaving me in charge of the higher-level aspects of the SQL, i.e. how to arrange JOINs, server-side computations, DISTINCTs, GROUP BYs, scalar subqueries, etc.

So I wrote something that does this: http://quince-lib.com/

It's for C++: I don't know whether that's the language you're using, but all the same it might be interesting to see this take on what a "middle ground" could look like.

slyqualin
  • 297
  • 2
  • 12
1

One of the apps I've developed was an IRC bot written in python. The modules it uses run in separate threads, but I haven't figured out a way to handle threading when using sqlite. Though, that might be better for a separate question.

I really should have just reworded both the title and the actual question. I've never actually used a DAL before, in any language.

hydrapheetz
  • 3,540
  • 3
  • 17
  • 12
  • 4
    Well, I'm of the opinion that you should. Raw SQL all over the place is pretty abominable. – chaos Jan 30 '09 at 07:50
  • Well, yeah. There's a piece of forum software I hack on from time to time that has *tons* of mysql_query() and mysql_result() all over the place. It's nuts. – hydrapheetz Jan 30 '09 at 07:53
  • What is this "app" thing you speak of? – Zoran Pavlovic Feb 07 '13 at 18:49
  • It's funny that this question was asked over an irc bot app and became what it was (a very useful guide)! An irc bot app is on one end of the scale, and an application that has 50-100+ tables with complex joins and millions of rows of data with 20+ developers working on it is on the extreme other end of the scale. I dare say when it comes to an 'irc bot app' end of the scale, it barely matters. – Manachi Jun 22 '17 at 00:47