0

I am developing a brand new product at work that is supposed to go live soon. It's predominantly an ETL product that will be dealing with enormous volumes of data in a queue type operation. So records come in, we do work on them, and then they are picked up and sent back out. So everything that's being done in this system is done over and over again until all the records have completed processing.

I'm being told by my boss (who is open and reasonable, this isn't a demand) that I should add NOLOCK hints to all the queries.

I'm torn on this because I've always read that it's bad practice. I've also read about the READPAST hint and I'm thinking that might be a good alternative, but I want to get others' opinions as no one in my organization has used it before.

My understanding is that READPAST would pick up any records that aren't locked, and would just ignore locked records, which in some systems I could see being a problem. In this system where the job will run again a few minutes later and will likely pick up the previously locked record, I don't see that it would matter. It's not super time-sensitive, so if a record takes a few minutes longer because it's locked, that's acceptable.

Wondering what others thoughts on this are? I realize this is not a replacement for proper indexing and I'm working on that as well.

hyphen
  • 2,368
  • 5
  • 28
  • 59
  • 1
    *"I'm being told by my boss (who is open and reasonable, this isn't a demand) that I should add `NOLOCK` hints to all the queries."* Has your boss given a reason for this? There are definitely a lot considerations to think of before you add `NOLOCK` to *any* query; let along blanket add it to every single one. – Thom A Aug 27 '19 at 11:44
  • 1
    Consider Service Broker. It's explicitly designed to avoid all the pitfalls you'll naturally run into when trying to implement queues as tables. And definitely do *not* consider `NOLOCK`. The *last* thing you want in a queue system is duplicated or skipped messages, which is a definite possibility with `NOLOCK`. `NOLOCK` is only good for queries where you don't really care if the results are possibly wrong because you can easily rerun it (like a monitoring query). – Jeroen Mostert Aug 27 '19 at 11:45
  • There is another existing product where they've chosen to use NOLOCK throughout, probably for good reason, but this is a totally new product and we haven't seen run it enough to even see locking issues. – hyphen Aug 27 '19 at 11:46
  • 1
    Using `NOLOCK` throughout is rarely done for a *good* reason -- typically it's a knee-jerk reaction to people seeing their queries fail on deadlocks and not knowing how to resolve them (or considering this too onerous a task). It's a [bad habit](https://www.sentryone.com/blog/aaronbertrand/bad-habits-nolock-everywhere), and there are alternatives (like snapshot isolation). – Jeroen Mostert Aug 27 '19 at 11:47
  • I've only used a hint in the actual procedure that pulls the records for processing by the app layer, based on this post: https://www.mssqltips.com/sqlservertip/1257/processing-data-queues-in-sql-server-with-readpast-and-updlock/ – hyphen Aug 27 '19 at 11:48
  • 1
    I've never had the need to add NOLOCK to anything. If one does decide to use this table hint, one should take all consequences into account (the query may see dirty and inconsistent data). See also this question on Stack Overflow: [Is the NOLOCK (Sql Server hint) bad practice?](https://stackoverflow.com/q/1452996/243373) – TT. Aug 27 '19 at 11:49
  • I've used READPAST in combination with UPDLOCK basically in the same way they've demonstrated in the link. There are other pieces of the process that my backend etl system runs against the queue. I don't have any hints or anything on those records, but they're only operating on records that are already completed. I'm using a series of statuses to control the flow of records, and the app layer is only looking at one set of statuses and my ETL process is primarily looking at another. So I'm hoping locking won't be a huge issue(?).. – hyphen Aug 27 '19 at 11:51
  • Relevant: [Using a database table as a queue](https://stackoverflow.com/q/2177880/243373) – TT. Aug 27 '19 at 13:34

0 Answers0