2

I just came across some strange behaviour in a query I'm running against a SQL Server 2014 instance.

SELECT * 
FROM ExampleTable 
WHERE ExampleTimestamp < '22.06.2016 15:35:00'

The strange thing is that it returns a row twice, which I did not expect to happen.

From the data I can see that two columns of the duplicated row differ between the two occurrences (one of them is the ExampleTimestamp column).

It is likely that the data row was changed by a different thread, but is it normal for the row to then be returned twice in one DataTable?

How can I prevent getting a row twice if it was changed during the query?

Thomas Voß
  • You can try to use `DISTINCT` – Rahul Tripathi Jun 22 '16 at 13:59
  • I don't think that `DISTINCT` will do the job here, as I need all columns returned. As there are differences between the two rows, it will still return both of them. Additionally, if I used `DISTINCT` on the PK of that table, which of the two versions of the data should SQL Server return? – Thomas Voß Jun 22 '16 at 14:09
  • You're not making much sense. The reason it's returning the 2 rows is that they are different, you've just said they're different. DISTINCT also has nothing to do with removing columns – Rich Benner Jun 22 '16 at 14:16
  • @Rich: Yes they are different, I just thought that by `SELECT * FROM ` the result would be one row per PK and that it is not possible to have the same PK twice in one result. – Thomas Voß Jun 22 '16 at 14:29
  • so your issue is that you have duplicate data in your primary key? – Rich Benner Jun 22 '16 at 14:47
  • No. My PK is OK, but I have the same PK twice in one query result, with different data. I think the row got updated while the query was running and was therefore moved within the index on `ExampleTimestamp` to a later position, where the query found it again and added it to the result a second time. So it seems to be a concurrent read/write issue (a repro sketch follows these comments). – Thomas Voß Jun 22 '16 at 14:53
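A minimal two-session sketch of that hypothesis. Only the table name and the `ExampleTimestamp` column come from the question; the `ExampleId` key column, the `Payload` padding column, and the nonclustered index are assumptions made up for illustration:

-- Session 1: hypothetical schema
CREATE TABLE ExampleTable (
    ExampleId        INT IDENTITY PRIMARY KEY,        -- assumed key column
    ExampleTimestamp DATETIME NOT NULL,
    Payload          CHAR(4000) NOT NULL DEFAULT ''   -- padding so the scan takes a while
);
CREATE INDEX IX_ExampleTimestamp ON ExampleTable (ExampleTimestamp);
-- (fill the table with enough rows that the SELECT below runs for a few seconds)

-- Session 1: the original query under the default READ COMMITTED level
-- (the effect requires the plan to read the rows through IX_ExampleTimestamp)
SELECT *
FROM ExampleTable
WHERE ExampleTimestamp < '2016-06-22T15:35:00';

-- Session 2, while session 1 is still scanning: move a row that was already read
-- to a later position in the index, but keep it below the filter value,
-- so the ongoing scan can encounter the same row a second time
UPDATE ExampleTable
SET ExampleTimestamp = '2016-06-22T15:34:59'
WHERE ExampleId = 1;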

3 Answers

2

I might be wrong, but to the best of my knowledge an UPDATE to a data row inserts a new row and deletes the old one. Your query was probably running exactly in the window in between.

It is possible to work around this behaviour by wrapping any write access to your database in a transaction. That way you will either get the state from before or from after the commit of the transaction, but never an undefined state in between.
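A minimal sketch of that idea. Apart from `ExampleTable` and `ExampleTimestamp`, the column and key names are made up for illustration:

BEGIN TRANSACTION;
    -- hypothetical write: both columns that were seen to differ are changed
    -- inside one transaction, so they become visible together at commit time
    UPDATE ExampleTable
    SET ExampleTimestamp = SYSDATETIME(),
        SomeOtherColumn  = 'new value'   -- assumed second column
    WHERE ExampleId = 42;                -- assumed key column
COMMIT TRANSACTION;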

I hope this will help.


Saxxon
1

Are you saying that there is one row that fits the criteria but you get 2 back in the results?

Perhaps you need to set the TRANSACTION ISOLATION LEVEL to something different to what you have.

https://msdn.microsoft.com/en-GB/library/ms173763.aspx
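For example, one of the options described on that page is SNAPSHOT isolation, which gives readers a consistent, versioned view of the data without blocking writers. A sketch, assuming the database is called ExampleDb and that enabling row versioning is acceptable for it:

-- one-time setting on the database (ExampleDb is an assumed name)
ALTER DATABASE ExampleDb SET ALLOW_SNAPSHOT_ISOLATION ON;

-- per session, before running the query
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;

SELECT * 
FROM ExampleTable 
WHERE ExampleTimestamp < '22.06.2016 15:35:00';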

AntDC
1

You need to set the transaction isolation level correctly, e.g.:

SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;

Just be very careful with it, because these levels have a significant impact on performance. In some cases speed is more important than consistency; in other cases it's the other way around.

You probably need to run your query in a transaction:

BEGIN TRANSACTION ExampleTran;  -- the name is optional and must be a plain identifier
  SELECT * 
  FROM ExampleTable 
  WHERE ExampleTimestamp < '22.06.2016 15:35:00'  
COMMIT TRANSACTION ExampleTran;
GO  

Update: REPEATABLE READ addresses this problem, READ COMMITTED doesn't.

SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;

According to the isolation level definitions, REPEATABLE READ is the least restrictive isolation level that prevents the Non Repeatable Read phenomenon.

Tamas Rev
  • I've checked what the setting is for my current session with this query: http://stackoverflow.com/questions/1038113/how-to-find-current-transaction-level. It's already set to `ReadCommitted`. At least that's what it is for my SSMS session. As I'm not changing the setting from my program, can I expect it to be the same there? – Thomas Voß Jun 22 '16 at 14:25
  • In fact, if I take the WHERE clause out of that query, all sessions have `ReadCommitted` set – Thomas Voß Jun 22 '16 at 14:26
  • I would expect that. Just remember that `READ COMMITTED` still allows overlapping threads. It still allows the non-repeatable read and phantom read problems. For your case you need to prevent 'non repeatable read' events, so `REPEATABLE READ` would suit you better. See the [MSDN docs](https://msdn.microsoft.com/en-us/library/ms378149(v=sql.110).aspx) or the related [wikipedia page](https://en.wikipedia.org/wiki/Isolation_(database_systems)#Repeatable_reads) – Tamas Rev Jun 22 '16 at 14:32
  • Updated the answer using the right trx isolation level. – Tamas Rev Jun 22 '16 at 14:34
  • 1
    My first thought was that I should change it server wide / database wide in my test environment to see the impact, but as this https://blogs.msdn.microsoft.com/ialonso/2012/11/26/how-to-set-the-default-transaction-isolation-level-server-wide/ article tells us, there is no way of doing it. It has to be done for each opened session individually. – Thomas Voß Jun 22 '16 at 14:48