0

I am the end-user of a highly updated Microsoft SQL Server DB containing dozens of tables with hunreds of millions of rows each. A banking DB is a good example for what I am working with, with the exception that in my DB UPDATE statement are rearly used and INSERT statements are used frequently (once a row as entered a table, it rarely changes). I, personally, not using any UPDATE/INSERT statement, only SELECT statement (with complex WHERE/ JOIN/ CROSS/ GROUP clues).

I have some questions about locking and using NOLOCK/READPAST.

1.how can I know if a query I am using is locking only a row or the entire table? for example, I noticed this query didn't locked other users from inserting new data to the table:

SELECT *
FROM Table

while this query did:

SELECT COUNT(Date)
FROM Table

This is of course just examples, not actual full queris I am using. As I mentioned, rows rarely changing so locking a row isn't concerning me but locking a table is highly concerning.

2.I would like to know the risks of using NOLOCK/READPAST in my queries (to revoke any concern I might have about locking a table from updating). I searched about it a lot but I could not find a full answer. I dont care If by using NOLOCK/READPAST I might get past data (that again, data i rarely changes) or I might miss some newly added data. I did read in a couple of places that using NOLOCK might cause duplicate data/ corrupted data, this is a problem for me.

3.what exactly is the diffrent between READPASY and NOLOCK? which one is "safer" regarding the concerns mentiond above

Thank you.

Yakir Shlezinger
  • 147
  • 1
  • 13
  • 1
    The risks of using `NOLOCK` are numerous but well discussed. You can find links to a "few" articles [here](https://sqlblog.org/nolock). – Thom A Feb 09 '22 at 12:37
  • 1
    `NOLOCK` could even get you an entire page read twice, if there is a page split or an index rebuild underway. A page split could happen even on a single row insert. – Charlieface Feb 09 '22 at 14:47

1 Answers1

1
  1. This is highly dependent on your servers settings. Generally speaking, you want to lock records, even when you are just reading them because you don't want data to change while you are reading it. This isn't just something that affects updated records, but also inserts. You can learn more about read commits and snapshop isolation here: https://learn.microsoft.com/en-us/dotnet/framework/data/adonet/sql/snapshot-isolation-in-sql-server

  2. Both NOLOCK/READPAST should be avoided at all cost. There are a very small handful of scenarios where these make sense, but they are exceedingly rare. You are better off optimizing your query to perform better and reduce the amount of records being locked and the time that the records spend being locked. One case that I can see NOLOCK being useful would be a log table that only has inserts, and your query doesn't join the data to other tables, AND a dirty record wouldn't cause problems.

  3. NOLOCK doesn't lock records that it reads. The risk here is that records you are reading can literally change mid read. This means you can begin reading a record and get some values for some columns before the update was made and some column values from after the update. If another transaction rolls back you could end up reading records that were never actually committed to the database. READPAST skips any rows that are locked. If another query runs and the criteria causes rows 1-25 of 100 to be locked while you are querying the same data you are only going to see records 26-100. To your query locked rows don't exist. Great article with the details: https://www.mssqltips.com/sqlservertip/4468/compare-sql-server-nolock-and-readpast-table-hints/

You would be far better served by spending time learning to optimize your queries to reduce the number of records they need to lock, and improving the performance so that the amount of time those locks exist is kept to a minimum.

Nick Fotopoulos
  • 531
  • 5
  • 15
  • Thank you. I do spend a lot of time in optimizing my queries but when the table is hundreds of millions rows long, it is not an easy task, this is why NOLOCK/READPAST can be really helpful in some situations. I do understand thatNOLOCK can result in a corrupted data. taking into account we are delaing with an INSERT only table, and I dont have any query that involves more than 1 table. is there any real risk using READPAST? beside maybe miss some newly added rows? – Yakir Shlezinger Feb 09 '22 at 14:23
  • You will miss any records that are locked. Records can be locked for many reasons outside of INSERT/UPDATE. Other SELECTs can lock records also. If you have tight control over this process and know that nothing else really queries this table then yes you would only miss newly inserted records, but if someone else write a query in the future that pulls from this it can cause this query to start producing unexpected results. What issue is causing you to want to use NOLOCK/READPAST? They might be better solved by a well designed index. – Nick Fotopoulos Feb 09 '22 at 17:16
  • But as mentioned by @charlieface above, there are things like page splits that could cause you to read the same records multiple times and all kinds of other crazy things. – Nick Fotopoulos Feb 09 '22 at 17:23
  • 1
    Thank you very much, that covered all my questions, I understand that NOLOCK/READPAST is not a good practice and should not be used "by default" when creating a query. I dont have a problem right now regarding a too long query, I did had one a couple of days ago which got me to explore the locking mechanisem. The original problem I had has been solved (also with the help of this forum). – Yakir Shlezinger Feb 09 '22 at 21:31