129

I can understand wanting to avoid having to use a cursor due to the overhead and inconvenience, but it looks like there's some serious cursor-phobia-mania going on where people are going to great lengths to avoid having to use one.

For example, one question asked how to do something obviously trivial with a cursor and the accepted answer proposed using a common table expression (CTE) recursive query with a recursive custom function, even though this limits the number of rows that could be processed to 32 (due to recursive function call limit in sql server). This strikes me as a terrible solution for system longevity, not to mention a tremendous effort just to avoid using a simple cursor.

What is the reason for this level of insane hatred? Has some 'noted authority' issued a fatwa against cursors? Does some unspeakable evil lurk in the heart of cursors that corrupts the morals of children or something?

Wiki question, more interested in the answer than the rep.

Related Info:

SQL Server Fast Forward Cursors

EDIT: let me be more precise: I understand that cursors should not be used instead of normal relational operations; that is a no-brainer. What I don't understand is people going waaaaay out of their way to avoid cursors like they have cooties or something, even when a cursor is a simpler and/or more efficient solution. It's the irrational hatred that baffles me, not the obvious technical efficiencies.

General Grievance
  • 4,555
  • 31
  • 31
  • 45
Steven A. Lowe
  • 60,273
  • 18
  • 132
  • 202
  • 1
    I think that your Edit says it all... In _almost_ all situations (that I have come across) there is a way to replace a cursor with a better performing set based situation. You say no-brainer, but you understand the diff. – StingyJack Nov 13 '08 at 18:28
  • 8
    I love the tags on this question! – sep332 Nov 14 '08 at 15:09
  • 2
    The part about recursive CTE limits being `32` is nonsense. Presumably you are thinking of recursive triggers and the max `@@NESTLEVEL` of `32`. It can be set in the query with `OPTION (MAXRECURSION N)` with default `100` and `0` meaning unlimited. – Martin Smith Nov 02 '12 at 13:33
  • @MartinSmith: the default limit is now 100, and the max is 32K http://www.sql-server-helper.com/error-messages/msg-310.aspx – Steven A. Lowe Jan 11 '16 at 23:00
  • Nope, it is still exactly the same as when I made my comment and in all versions of SQL Server that support recursive CTEs. As your link says "When 0 is specified, no limit is applied." – Martin Smith Jan 11 '16 at 23:37
  • 1
    @MartinSmith: thanks, my mistake - two mistakes, actually ;) the first was misreading the reference (I assumed 32K limit = 'unlimited') and the second was the wrong cause - in the example cited, the recursion limit of 32 came from the recursive function, not the CTE. I was probably using SQL Server 2000, or maybe 2008, at the time, hopefully it's better now :). Question edited to clarify - appreciate your correction! – Steven A. Lowe Jan 12 '16 at 03:16

13 Answers13

74

The "overhead" with cursors is merely part of the API. Cursors are how parts of the RDBMS work under the hood. Often CREATE TABLE and INSERT have SELECT statements, and the implementation is the obvious internal cursor implementation.

Using higher-level "set-based operators" bundles the cursor results into a single result set, meaning less API back-and-forth.

Cursors predate modern languages that provide first-class collections. Old C, COBOL, Fortran, etc., had to process rows one at a time because there was no notion of "collection" that could be used widely. Java, C#, Python, etc., have first-class list structures to contain result sets.

The Slow Issue

In some circles, the relational joins are a mystery, and folks will write nested cursors rather than a simple join. I've seen truly epic nested loop operations written out as lots and lots of cursors. Defeating an RDBMS optimization. And running really slowly.

Simple SQL rewrites to replace nested cursor loops with joins and a single, flat cursor loop can make programs run in 100th the time. [They thought I was the god of optimization. All I did was replace nested loops with joins. Still used cursors.]

This confusion often leads to an indictment of cursors. However, it isn't the cursor, it's the misuse of the cursor that's the problem.

The Size Issue

For really epic result sets (i.e., dumping a table to a file), cursors are essential. The set-based operations can't materialize really large result sets as a single collection in memory.

Alternatives

I try to use an ORM layer as much as possible. But that has two purposes. First, the cursors are managed by the ORM component. Second, the SQL is separated from the application into a configuration file. It's not that the cursors are bad. It's that coding all those opens, closes and fetches is not value-add programming.

S.Lott
  • 384,516
  • 81
  • 508
  • 779
  • 3
    "Cursors are how the RDBMS works under the hood." If you mean specifically SQL Server, OK, fine, I'm ignorant of that. But I have worked on the internals of multiple RDBMS (and ORDBMS) (under Stonebraker) and none of them did that. Eg: Ingres uses what amounts to "result sets" of tuples internally. – Richard T Dec 28 '08 at 16:49
  • @Richard T: I am working off second-hand information about RDBMS source; I'll amend the statement. – S.Lott Dec 28 '08 at 18:42
  • 2
    "I've seen truly epic nested loop operations written out as lots and lots of cursors." I keep seeing them too. It is hard to believe. – RussellH Dec 30 '08 at 01:26
42

Cursors make people overly apply a procedural mindset to a set-based environment.

And they are SLOW!!!

From SQLTeam:

Please note that cursors are the SLOWEST way to access data inside SQL Server. The should only be used when you truly need to access one row at a time. The only reason I can think of for that is to call a stored procedure on each row. In the Cursor Performance article I discovered that cursors are over thirty times slower than set based alternatives.

Liam
  • 27,717
  • 28
  • 128
  • 190
Galwegian
  • 41,475
  • 16
  • 112
  • 158
  • 6
    that article is 7 years old, do you think that perhaps things might have changed in the meantime? – Steven A. Lowe Nov 13 '08 at 16:46
  • 1
    I also think cursors are really slow and to be avoided, generally. However, if the OP was referring to the question I think he was, then a cursor was the correct solution there (streaming records one at a time due to memory constraints). – rmeador Nov 13 '08 at 16:49
  • the updated article does not correct the relative speed measurements, but it does provide some good optimizations and alternatives. Note that the original article says that cursors are 50 times faster than while loops, which is interesting – Steven A. Lowe Nov 13 '08 at 17:02
  • I personally think that if you need a cursor you haven't designed your database properly in the first place. – BoltBait Nov 13 '08 at 17:35
  • 6
    @BoltBait: I personally think that if you make blanket assertions like that you can't really be 45 years old :-P – Steven A. Lowe Nov 14 '08 at 03:30
  • Cursors are not that slow in Oracle, where you can use clauses like BULK-COLLECT for performance-wise code. Would be nice to do a benchmark though. – Camilo Díaz Repka Nov 14 '08 at 19:54
  • @Steven: Yeah, I'm old... and very opinionated! – BoltBait Nov 17 '08 at 19:57
  • 4
    @BoltBait: You kids get off my lawn! – Steven A. Lowe Nov 19 '08 at 23:36
  • I think it was Edwin Dijkstra quoted: premature optimization is the root of all evil... so sometimes I think, to hell with the performance argument. Use sets where sets should be applied as well in logic as well in making your algorithm understandable. But sometimes the business logic can be so much clearer and more adaptable when going row-by-row. Well that's my opinion obviously. – Paul Feb 13 '14 at 23:12
  • @Paul: that's a Knuth quote, and it's Edgar Dijkstra, but I agree with the rest of your sentence :) – Steven A. Lowe Jan 11 '16 at 23:01
  • Funny I find the business logic is almost always clearer in set-based code. – HLGEM Aug 21 '17 at 18:27
20

There's an answer above which says "cursors are the SLOWEST way to access data inside SQL Server... cursors are over thirty times slower than set based alternatives."

This statement may be true under many circumstances, but as a blanket statement it's problematic. For example, I've made good use of cursors in situations where I want to perform an update or delete operation affecting many rows of a large table which is receiving constant production reads. Running a stored procedure which does these updates one row at a time ends up being faster than set-based operations, because the set-based operation conflicts with the read operation and ends up causing horrific locking problems (and may kill the production system entirely, in extreme cases).

In the absence of other database activity, set-based operations are universally faster. In production systems, it depends.

davidcl
  • 1,187
  • 8
  • 23
  • 1
    Sounds like the exception that proves the rule. – Joel Coehoorn Nov 13 '08 at 16:51
  • 6
    @[Joel Coehoorn]: I've never understood that saying. – Steven A. Lowe Nov 13 '08 at 17:04
  • 2
    @[Steven A. Lowe] http://www.phrases.org.uk/meanings/exception-that-proves-the-rule.html understand exception as "what is left out" and note that the rule here is something like "in most situation cursors are bad". – David Lay Nov 13 '08 at 17:18
  • 1
    @delm: thanks for the link, now i understand the phrase even less! – Steven A. Lowe Nov 13 '08 at 17:20
  • 5
    @[Steven A. Lowe] Basically it's saying that if you "break a rule" with a subcase, there must be a general rule to break, ergo a rule exists. e.g. From Link: ("If we have a statement like 'entry is free of charge on Sundays', we can reasonably assume that, as a general rule, entry is charged for.") – Fry Nov 13 '08 at 17:35
  • 1
    @Fry: ok that makes sense - so how does it apply here? – Steven A. Lowe Nov 13 '08 at 17:53
  • 1
    I think hes saying that when there are locking problems / memory constraints use cursors. This implies that you shouldn't use cursors otherwise. – SapphireSun Feb 27 '10 at 04:00
  • 1
    Another example of appropriate cursor use: http://support.microsoft.com/kb/973849 . The original version used set based techniques and had locking issues. – Moe Sisko Jul 02 '13 at 04:44
  • And interestingly, the type of data set they're talking about (session state data from web applications) exactly meets the criterion I mentioned in the original post-- a data set where many rows are being affected by an operation, but the table itself receives constant production reads. The original version of this would have been death for a site with any substantial volume of traffic. – davidcl Jul 16 '13 at 17:04
9

Cursors tend to be used by beginning SQL developers in places where set-based operations would be better. Particularly when people learn SQL after learning a traditional programming language, the "iterate over these records" mentality tends to lead people to use cursors inappropriately.

Most serious SQL books include a chapter enjoining the use of cursors; well-written ones make it clear that cursors have their place but shouldn't be used for set-based operations.

There are obviously situations where cursors are the correct choice, or at least A correct choice.

davidcl
  • 1,187
  • 8
  • 23
9

The optimizer often cannot use the relational algebra to transform the problem when a cursor method is used. Often a cursor is a great way to solve a problem, but SQL is a declarative language, and there is a lot of information in the database, from constraints, to statistics and indexes which mean that the optimizer has a lot of options to solve the problem, whereas a cursor pretty much explicitly directs the solution.

Cade Roux
  • 88,164
  • 40
  • 182
  • 265
8

In Oracle PL/SQL cursors will not result in table locks and it is possible to use bulk-collecting/bulk-fetching.

In Oracle 10 the often used implicit cursor

  for x in (select ....) loop
    --do something 
  end loop;

fetches implicitly 100 rows at a time. Explicit bulk-collecting/bulk-fetching is also possible.

However PL/SQL cursors are something of a last resort, use them when you are unable to solve a problem with set-based SQL.

Another reason is parallelization, it is easier for the database to parallelize big set-based statements than row-by-row imperative code. It is the same reason why functional programming becomes more and more popular (Haskell, F#, Lisp, C# LINQ, MapReduce ...), functional programming makes parallelization easier. The number CPUs per computer is rising so parallelization becomes more and more an issue.

tuinstoel
  • 7,248
  • 27
  • 27
6

In general, because on a relational database, the performance of code using cursors is an order of magnitude worse than set-based operations.

Charles Bretana
  • 143,358
  • 22
  • 150
  • 216
  • do you have a benchmark or reference for this? i have not noticed any such drastic performance degradation... but maybe my tables don't have enough rows for it to matter (a million or less, usually)? – Steven A. Lowe Nov 13 '08 at 16:43
  • oh wait i see what you mean - but i would never advocate using cursors intead of set operations, only not going to extremes to avoid cursors – Steven A. Lowe Nov 13 '08 at 17:08
  • 3
    I remember the first time I did SQL, We had to import a 50k daily data file from a mainframe into a SQL Server database... I used a cursor, and discovered, that the import was taking about 26 hours using the cursor... When I changed to set-based operations, the process took 20 minutes. – Charles Bretana Nov 13 '08 at 17:09
6

The answers above have not emphasized enough the importance of locking. I'm not a big fan of cursors because they often result in table level locks.

Richard T
  • 4,570
  • 5
  • 37
  • 49
  • 1
    yes, thank you! Without options to prevent it (read only, forward only, etc) they certainly will, as will any (sql server) operation that proceeds to occupy several rows and then several pages of rows. – Steven A. Lowe Dec 29 '08 at 01:14
  • ?? That's a problem with your locking strategy NOT cursors. Even a SELECT statement will add read locks. – Adam Mar 13 '19 at 16:48
3

For what it's worth I have read that the "one" place a cursor will out perform its set-based counterpart is in a running total. Over a small table the speed of summing up the rows over the order by columns favors the set-based operation but as the table increases in row size the cursor will become faster because it can simply carry the running total value to the next pass of the loop. Now where you should do a running total is a different argument...

Eric Sabine
  • 1,165
  • 7
  • 8
  • 1
    If you mean by "running total" an aggregation of some kind (min, max, sum), any competent DBMS will beat the pants off of a client-side, cursor based solution, if only because the function is performed in the engine and there's no client <--> server overhead. Maybe SQL Server isn't competent? – Richard T Dec 28 '08 at 16:44
  • 1
    @[Richard T]: we're discussing server-side cursors, as within a stored procedure, not client-side cursors; sorry for the confusion! – Steven A. Lowe Dec 29 '08 at 01:15
2

Outside of the performance (non)issues, I think the biggest failing of cursors is they are painful to debug. Especially compared to code in most client applications where debugging tends to be comparatively easy and language features tend to be much easier. In fact, I contend that nearly anything one is doing in SQL with a cursor should probably be happening in the client app in the first place.

Wyatt Barnett
  • 15,573
  • 3
  • 34
  • 53
  • 2
    SQL is painful to debug, even without cursors. MS SQL step-through tools in Visual Studio don't seem to like me (they hang a lot, or don't trip breakpoints at all), so I'm usually reduced to PRINT statements ;-) – Steven A. Lowe Jun 10 '09 at 03:43
1

Can you post that cursor example or link to the question? There's probably an even better way than a recursive CTE.

In addition to other comments, cursors when used improperly (which is often) cause unnecessary page/row locks.

Gordon Bell
  • 13,337
  • 3
  • 45
  • 64
1

You could have probably concluded your question after the second paragraph, rather than calling people "insane" simply because they have a different viewpoint than you do and otherwise trying to mock professionals who may have a very good reason for feeling the way that they do.

As to your question, while there are certainly situations where a cursor may be called for, in my experience developers decide that a cursor "must" be used FAR more often than is actually the case. The chance of someone erring on the side of too much use of cursors vs. not using them when they should is MUCH higher in my opinion.

Tom H
  • 46,766
  • 14
  • 87
  • 128
  • 8
    please read more carefully, Tom - the exact phrase was "insane hatred"; "hated" was the object of the adjective "insane", not "people". English can be a bit difficult sometimes ;-) – Steven A. Lowe Nov 13 '08 at 18:55
0

basicaly 2 blocks of code that do the same thing. maybe it's a bit weird example but it proves the point. SQL Server 2005:

SELECT * INTO #temp FROM master..spt_values
DECLARE @startTime DATETIME

BEGIN TRAN 

SELECT @startTime = GETDATE()
UPDATE #temp
SET number = 0
select DATEDIFF(ms, @startTime, GETDATE())

ROLLBACK 

BEGIN TRAN 
DECLARE @name VARCHAR

DECLARE tempCursor CURSOR
    FOR SELECT name FROM #temp

OPEN tempCursor

FETCH NEXT FROM tempCursor 
INTO @name

SELECT @startTime = GETDATE()
WHILE @@FETCH_STATUS = 0
BEGIN

    UPDATE #temp SET number = 0 WHERE NAME = @name
    FETCH NEXT FROM tempCursor 
    INTO @name

END 
select DATEDIFF(ms, @startTime, GETDATE())
CLOSE tempCursor
DEALLOCATE tempCursor

ROLLBACK 
DROP TABLE #temp

the single update takes 156 ms while the cursor takes 2016 ms.

Mladen Prajdic
  • 15,457
  • 2
  • 43
  • 51
  • 3
    well yes, it proves the point that this is a really dumb way to use a cursor! but what if the update of each row depended on the value of the prior row in date order? – Steven A. Lowe Nov 13 '08 at 17:06
  • BEGIN TRAN SELECT TOP 1 baseval FROM table ORDER BY timestamp DESC INSERT table (fields) VALUES (vals, including derived value from prior record) COMMIT TRAN – dkretz Nov 13 '08 at 18:04
  • @doofledorfer: that would insert one row based on the last row by date, not update every row by a value from its prior row in date order – Steven A. Lowe Nov 14 '08 at 03:28
  • To truly use the cursor you should use WHERE CURRENT OF in the update – erikkallen Dec 28 '08 at 17:12