9

I plan to add to most tables in my DB rowversion to track changes in those tables. I know that adding it will affect performance of queries.

Does anyone knows if it affect performance a little bit (few percent slower) or I should not to add rowversion to many tables, because it make DB much slower.

Martin Smith
  • 438,706
  • 87
  • 741
  • 845
Marek Kwiendacz
  • 9,524
  • 14
  • 48
  • 72
  • 2
    The additional column will make all your rows 8 bytes wider which might mean less rows per page which means more pages which means less data fits in cache and possibly deeper indexes. Percentage effect will depend upon current table definitions. I guess there will also be some overhead from generating the `rowversion` itself. For `identity` this overhead turns out to [be easily measurable](http://stackoverflow.com/questions/5999416) I've never looked at it for `rowversion` – Martin Smith Aug 27 '11 at 20:44
  • 1
    @Martin's spot on, please take note that it can also impact concurrency in different ways - depending on the existing key structure of the table it may make some updates more compatible with current usage patterns, and some updates less compatible. They won't break they just may block differently. – Aaron Bertrand Aug 27 '11 at 21:53
  • are you building datawarehouse? – pkmiec Oct 01 '11 at 14:43

2 Answers2

5

Performance difference of just adding a rowversion/timestamp column is that your rows are now 8 bytes wider.

The actual performance difference comes when you start actually using them for something. But as I point out in my answer to a similar question: RowVersion and Performance

If you're not going use the rowVersion field for checking for updated items and instead you're going to use it for consistency to ensure that the record isn't updated since you last read, then this is going to be a perfectly acceptable use and will not impact.

Such As:

UPDATE MyTable SET MyField = ' @myField
WHERE Key = @key AND rowVersion = @rowVersion

So performance when just checking the row to make sure that it's not been updated since the application last read is going to be trivial performance difference (it has to read the row to update it anyway).

But performance when trying to use a rowversion/timestamp column as a means to get all updated items since last time we checked is going to be very poor.

Community
  • 1
  • 1
Seph
  • 8,472
  • 10
  • 63
  • 94
  • Why would it be slow to get updated item since the last check? After all ROWVERSION is nothing but a BIGINT. I think it's quite the contrary in the sense that, otherwise you'd have to use DATETIME.. – Ostati Oct 01 '14 at 16:52
  • 1
    @Ostati ROWVERSION is not indexed by default and on large transaction tables the index will become heavily fragmented and require a maintenance plan to frequently rebuild the index which will impact your operations if you load is 24/7 – Seph Oct 02 '14 at 08:23
  • @Seph What would be a better alternative to ROWVERSION + INDEX for querying for all records that were updated since the last time we've checked? – Sal Feb 07 '18 at 14:54
  • If you are considering using ROWVERSION to find updated records, take a look at Change Tracking. – user1843640 Jan 21 '20 at 19:14
  • Not sure that Change Tracking is much different from a performance perspective. [this](https://www.mssqltips.com/sqlservertip/4035/sql-server-change-tracking-performance-troubleshooting/) says that performance overhead is basically the same as having an index so using rowversion and index could be equivalent. Also not having to do snapshot transactions (with the complexities on temp table and index management) could also mean that rowversion is a lighter option. Worth considering both for your specific situation – Brett Dec 07 '20 at 00:37
2

Of interest to me, the versioning we use is populated by trigger, so when I saw this I had to learn more about the performance. So, I set about setting up a test scenario. I wanted to compare what we currently use (trigger) versus a rowversion column versus another table with no versioning.

Not surprisingly, the trigger definitely performed worse than the rowversion. Rowversion was essentially identical to the update time on a table with no versioning; Some runs show the table with no versioning is faster, but about an equal # show rowversion to be faster. To me this means that there is so little overhead in using it, that random CPU and disk I/O hide the real performance difference.

SET NOCOUNT ON
GO

CREATE TABLE _TEST_BaseTest(myKey bigint PRIMARY KEY,myValue bigint,UselessColumn bigint)
CREATE TABLE _TEST_RowVersionTest(myKey bigint PRIMARY KEY,myValue bigint, RV rowversion)
CREATE TABLE _TEST_ModifiedVersionTest(myKey bigint PRIMARY KEY,myValue bigint, MV bigint)

SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TRIGGER dbo._TEST_ModifiedVersionTest_Trigger
   ON  dbo._TEST_ModifiedVersionTest
   AFTER UPDATE
AS 
BEGIN
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;

    -- Insert statements for trigger here
    UPDATE tbl
    SET tbl.MV=tbl.MV+1
    FROM _TEST_ModifiedVersionTest tbl
    INNER JOIN inserted i on i.myKey=tbl.myKey
END
GO

INSERT INTO _TEST_BaseTest (myKey,myValue)
    SELECT TOP 50000 <FIELD1>,<FIELD2> FROM <SOME_TABLE>

INSERT INTO _TEST_RowVersionTest (myKey,myValue)
    SELECT myKey,myValue
    FROM _TEST_BaseTest

INSERT INTO _TEST_ModifiedVersionTest (myKey,myValue,MV)
    SELECT myKey,myValue,1
    FROM _TEST_BaseTest

DECLARE     @StartTimeBase DATETIME, @EndTimeBase DATETIME
        ,   @StartTimeRV DATETIME, @EndTimeRV DATETIME
        ,   @StartTimeMV DATETIME, @EndTimeMV DATETIME
        ,   @CNT INT=0, @Iterations INT=25

--BASE
    SET @StartTimeBase=GETDATE()
    SET @CNT=1
    WHILE @CNT<=@Iterations
    BEGIN
        UPDATE _TEST_RowVersionTest SET myValue=myValue

        SET @CNT=@CNT+1
    END
    SET @EndTimeBase=GETDATE()

--RV
    SET @StartTimeRV=GETDATE()
    SET @CNT=1
    WHILE @CNT<=@Iterations
    BEGIN
        UPDATE _TEST_RowVersionTest SET myValue=myValue

        SET @CNT=@CNT+1
    END
    SET @EndTimeRV=GETDATE()

--MV
    SET @StartTimeMV=GETDATE()
    SET @CNT=1
    WHILE @CNT<=@Iterations
    BEGIN
        UPDATE _TEST_ModifiedVersionTest SET myValue=myValue

        SET @CNT=@CNT+1
    END
    SET @EndTimeMV=GETDATE()

DECLARE @Rows INT
SELECT @Rows=COUNT(*) FROM _TEST_BaseTest

PRINT CONVERT(VARCHAR,@Rows) + ' rows updated ' + CONVERT(VARCHAR,@Iterations) + ' time(s)'
PRINT CONVERT(VARCHAR,DATEDIFF(MS,@StartTimeBase,@EndTimeBase)) + ' Base Time Elapsed (ms)'
PRINT CONVERT(VARCHAR,DATEDIFF(MS,@StartTimeRV,@EndTimeRV))     + ' Rv Time Elapsed (ms)'
PRINT CONVERT(VARCHAR,DATEDIFF(MS,@StartTimeMV,@EndTimeMV))     + ' Mv Time Elapsed (ms)'

drop TABLE _TEST_BaseTest
drop TABLE _TEST_RowVersionTest
drop table _TEST_ModifiedVersionTest
UnhandledExcepSean
  • 12,504
  • 2
  • 35
  • 51