Changing the Clustering key on an Existing Table - SQL Server 2008

Question

I've been handed a task to improve query performance on a table.

The primary key is a GUID that is created by application code, so is not sequential, and there is no separate sequential clustering key on the table.

My feeling is that it is the choice of a non-sequential GUID as the primary and clustering key that is the main culprit for poor performance. I intend to drop the clustered index on the GUID and add an INT IDENTITY as the clustering key.

The table has ~3 million rows in it.

Is it better to try and alter the table, or to create a new table, copy the existing data to it, remove the old table, and rename the new table?

EDIT: Copying 3 million rows is taking a very long time. Would dropping the index be quicker?

EDIT 2: Decided to solve the slow copying problem with hardware, and threw 20 cores at it instead of 4. It's a lot quicker now, though still far slower than i'd expect. I'd estimate it will take 30 minutes to copy the 3 million rows.

Would still appreciate a solution though as this is just a test, i still have to do it on the production server, i'd i'd rather not have it down for longer than necessary.

For info, based on advice by @ughai, my autogrowth settings are now at 500Mb.

This kind of thing is not really my specialty so would appreciate some advice as to what is the best way of going around this.

If it's relevant, the bulk of the queries that take place on this table have no joins.

EDIT: Original Table Schema

 CREATE TABLE [dbo].[IODBTaskHistory](
    [Id] [uniqueidentifier] NOT NULL,
    [Tag] [nvarchar](250) NULL,
    [Type] [int] NOT NULL,
    [SourceFilePath] [nvarchar](max) NOT NULL,
    [DestinationFilePath] [nvarchar](max) NULL,
    [Priority] [int] NOT NULL,
    [State] [int] NOT NULL,
    [SubState] [int] NOT NULL,
    [StateDescription] [nvarchar](max) NULL,
    [Progress] [decimal](5, 2) NOT NULL,
    [Date_Created] [datetime] NOT NULL,
    [Date_Queued] [datetime] NULL,
    [Date_Started] [datetime] NULL,
    [Date_Finished] [datetime] NULL,
    [Date_LastUpdated] [datetime] NULL,
    [Optional_ParentDependancyTaskId] [uniqueidentifier] NULL,
    [Optional_isParentSuccessRequired] [bit] NULL,
    [Transfer_ProgressBytes] [float] NULL,
    [Transfer_SpeedCurrentBps] [float] NULL,
    [Transfer_SpeedIntervals] [nvarchar](max) NULL,
    [IODrone_Id] [uniqueidentifier] NULL,
    [IODrone_Version] [nvarchar](max) NULL,
    [Action] [int] NOT NULL,
    [Date_TransferStarted] [datetime] NULL,
    [Optional_NotificationEmails] [nvarchar](max) NULL,
    [MaxRetryCount] [int] NULL,
    [CurrentRetryCount] [int] NULL,
    [Impersonation_Username] [nvarchar](200) NOT NULL,
    [Impersonation_Password] [nvarchar](max) NOT NULL,
    [AllowRewrite] [bit] NOT NULL CONSTRAINT [DF_IODBTaskHistory_AllowRewrite]  DEFAULT ((0)),
    [SubTag] [nvarchar](255) NULL,
    [SourceLengthBytes] [bigint] NULL CONSTRAINT [DF_IODBTaskHistory_SourceLengthBytes2]  DEFAULT ((0)),
    [IODrone_Thread] [int] NULL,
    [Date_FileSizeFetched] [datetime] NULL,
    [Date_StornextTapeRetrievalStarted] [datetime] NULL,
    [Date_StornextTapeRetrievalFinished] [datetime] NULL,
    [IOServiceAddress] [nvarchar](20) NULL,
    [LogString] [nvarchar](max) NULL,
    [NotesString] [nvarchar](max) NULL,
    [TX_Date] [datetime] NULL,
    [SlowDownUpload] [bit] NULL CONSTRAINT [DF_IODBTaskHistory_SlowDownUpload]  DEFAULT ((0)),
 CONSTRAINT [PK_IODBTaskHistory] PRIMARY KEY CLUSTERED 
(
    [Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]

Destination Table Schema

CREATE TABLE [dbo].[IODBTaskHistoryNew](
    [Id] [uniqueidentifier] NOT NULL,
    [ClusterKey] [int] IDENTITY(1,1) NOT NULL,
    [Tag] [nvarchar](250) NULL,
    [Type] [int] NOT NULL,
    [SourceFilePath] [nvarchar](max) NOT NULL,
    [DestinationFilePath] [nvarchar](max) NULL,
    [Priority] [int] NOT NULL,
    [State] [int] NOT NULL,
    [SubState] [int] NOT NULL,
    [StateDescription] [nvarchar](max) NULL,
    [Progress] [decimal](5, 2) NOT NULL,
    [Date_Created] [datetime] NOT NULL,
    [Date_Queued] [datetime] NULL,
    [Date_Started] [datetime] NULL,
    [Date_Finished] [datetime] NULL,
    [Date_LastUpdated] [datetime] NULL,
    [Optional_ParentDependancyTaskId] [uniqueidentifier] NULL,
    [Optional_isParentSuccessRequired] [bit] NULL,
    [Transfer_ProgressBytes] [float] NULL,
    [Transfer_SpeedCurrentBps] [float] NULL,
    [Transfer_SpeedIntervals] [nvarchar](max) NULL,
    [IODrone_Id] [uniqueidentifier] NULL,
    [IODrone_Version] [nvarchar](max) NULL,
    [Action] [int] NOT NULL,
    [Date_TransferStarted] [datetime] NULL,
    [Optional_NotificationEmails] [nvarchar](max) NULL,
    [MaxRetryCount] [int] NULL,
    [CurrentRetryCount] [int] NULL,
    [Impersonation_Username] [nvarchar](200) NOT NULL,
    [Impersonation_Password] [nvarchar](max) NOT NULL,
    [AllowRewrite] [bit] NOT NULL,
    [SubTag] [nvarchar](255) NULL,
    [SourceLengthBytes] [bigint] NULL,
    [IODrone_Thread] [int] NULL,
    [Date_FileSizeFetched] [datetime] NULL,
    [Date_StornextTapeRetrievalStarted] [datetime] NULL,
    [Date_StornextTapeRetrievalFinished] [datetime] NULL,
    [IOServiceAddress] [nvarchar](20) NULL,
    [LogString] [nvarchar](max) NULL,
    [NotesString] [nvarchar](max) NULL,
    [TX_Date] [datetime] NULL,
    [SlowDownUpload] [bit] NULL,
PRIMARY KEY NONCLUSTERED 
(
    [Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY],
UNIQUE CLUSTERED 
(
    [ClusterKey] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]

GO

ALTER TABLE [dbo].[IODBTaskHistoryNew] ADD  CONSTRAINT [DF_IODBTaskHistory_AllowRewriteNew]  DEFAULT ((0)) FOR [AllowRewrite]
GO

ALTER TABLE [dbo].[IODBTaskHistoryNew] ADD  CONSTRAINT [DF_IODBTaskHistory_SourceLengthBytes2New]  DEFAULT ((0)) FOR [SourceLengthBytes]
GO

ALTER TABLE [dbo].[IODBTaskHistoryNew] ADD  CONSTRAINT [DF_IODBTaskHistory_SlowDownUploadNew]  DEFAULT ((0)) FOR [SlowDownUpload]
GO

MY Copy query

    INSERT INTO [dbo].[IODBTaskHistoryNew]
           ([Id]
           ,[Tag]
           ,[Type]
           ,[SourceFilePath]
           ,[DestinationFilePath]
           ,[Priority]
           ,[State]
           ,[SubState]
           ,[StateDescription]
           ,[Progress]
           ,[Date_Created]
           ,[Date_Queued]
           ,[Date_Started]
           ,[Date_Finished]
           ,[Date_LastUpdated]
           ,[Optional_ParentDependancyTaskId]
           ,[Optional_isParentSuccessRequired]
           ,[Transfer_ProgressBytes]
           ,[Transfer_SpeedCurrentBps]
           ,[Transfer_SpeedIntervals]
           ,[IODrone_Id]
           ,[IODrone_Version]
           ,[Action]
           ,[Date_TransferStarted]
           ,[Optional_NotificationEmails]
           ,[MaxRetryCount]
           ,[CurrentRetryCount]
           ,[Impersonation_Username]
           ,[Impersonation_Password]
           ,[AllowRewrite]
           ,[SubTag]
           ,[SourceLengthBytes]
           ,[IODrone_Thread]
           ,[Date_FileSizeFetched]
           ,[Date_StornextTapeRetrievalStarted]
           ,[Date_StornextTapeRetrievalFinished]
           ,[IOServiceAddress]
           ,[LogString]
           ,[NotesString]
           ,[TX_Date]
           ,[SlowDownUpload])
    SELECT [Id]
      ,[Tag]
      ,[Type]
      ,[SourceFilePath]
      ,[DestinationFilePath]
      ,[Priority]
      ,[State]
      ,[SubState]
      ,[StateDescription]
      ,[Progress]
      ,[Date_Created]
      ,[Date_Queued]
      ,[Date_Started]
      ,[Date_Finished]
      ,[Date_LastUpdated]
      ,[Optional_ParentDependancyTaskId]
      ,[Optional_isParentSuccessRequired]
      ,[Transfer_ProgressBytes]
      ,[Transfer_SpeedCurrentBps]
      ,[Transfer_SpeedIntervals]
      ,[IODrone_Id]
      ,[IODrone_Version]
      ,[Action]
      ,[Date_TransferStarted]
      ,[Optional_NotificationEmails]
      ,[MaxRetryCount]
      ,[CurrentRetryCount]
      ,[Impersonation_Username]
      ,[Impersonation_Password]
      ,[AllowRewrite]
      ,[SubTag]
      ,[SourceLengthBytes]
      ,[IODrone_Thread]
      ,[Date_FileSizeFetched]
      ,[Date_StornextTapeRetrievalStarted]
      ,[Date_StornextTapeRetrievalFinished]
      ,[IOServiceAddress]
      ,[LogString]
      ,[NotesString]
      ,[TX_Date]
      ,[SlowDownUpload]
  FROM [dbo].[IODBTaskHistory]

Execution Plan

enter image description here

If it's not clear form the picture 99% of the plan is spent on the clustered index insert on the new Identity column

I believe `INSERT INTO SELECT` should be quicker if you are not ordering the `SELECT` and there is no auto-growth. is there any DML query running in the background against this table? can you confirm there is no blocking — ughai, Apr 13 '15 at 10:58
Not understanding your comment. I'm already using INSERT INTO SELECT, like you asked me first. — MrBliz, Apr 13 '15 at 11:01
what I meant was if there is an `ORDER BY` in your `INSERT INTO SELECT` ? can you post your query which you are using — ughai, Apr 13 '15 at 11:03
This is running on my local machine to test. Nothing else is being done at the moment. There is no order by in the select. Source table is 18GB of data so i'd expect it to take a a while, but not all day. AUTO-Growth settings are By 1MB, Unlimited Would SELECT INTO FROM be quicker? — MrBliz, Apr 13 '15 at 11:15
Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/75097/discussion-between-ughai-and-mrbliz). — ughai, Apr 13 '15 at 11:17
3 million rows is relatively small for sql server. you should try running the queries that takes a long time directly from SSMS and look at the query plan. this will provide you some data to start from. — Zohar Peled, Apr 13 '15 at 11:28
Just to rule out the obvious: did you verify that the PK actually is defined as "CLUSTERED"? — , Apr 13 '15 at 13:10
Normally copying rows isn't "very slow". 100k rows per second per index are a typical rate. — usr, Apr 13 '15 at 13:26
That's what i'd expect, but i'm looking at 100,000 rows a minute. — MrBliz, Apr 13 '15 at 13:27
The optimal strategy to execute this migration depends on the schema. The copy strategy certainly will not be excessively slow. Are you OK with 5min of downtime? This should never take longer than that. Post schema, code and execution plan. — usr, Apr 13 '15 at 13:29
Offtopic - re "My feeling is that it is the choice of a non-sequential GUID as the primary and clustering key that is the main culprit for poor performance." - is your feeling backed by hard data? What kind of queries you perform on your table? If for example you always query by some date field/range, then consider using that date field as clustered index instead. — Arvo, Apr 13 '15 at 13:51
There are non clustered indexes on the fields that are regularly searched. — MrBliz, Apr 13 '15 at 13:55
The query plan is fine. Rebuild the existing CI prior to performing the migration. That way we can make sure that fragmentation does not artificially lengthen the copy. Run the query and report the speed in rows/sec. If the speed is really low we need to dig deeper. — usr, Apr 13 '15 at 14:39

score 0 · Answer 1 · answered Apr 13 '15 at 12:46

In my experience the fastest way is to drop the existing clustered index

drop index index_name on tablename;

Then recreate the clustered index:

create clustered index indexname on tablename(columnname1, columnanme2);

If you are going to copy the data then copy into a target table that does not contain any indexes. After all the inserts are done then create the clustered index first and then all the other indexes. If you create the indexes before the inserts you will likely have lots, and lots, of page splits which can take a long time.

Page splits are not going to be an issue on the clustered index since it is on an identity column, which also means that creating the clustered index *after* inserting the data does not make sense, may as well create the clustered index first and only have to write the data twice. — GarethD, Apr 13 '15 at 13:08

Changing the Clustering key on an Existing Table - SQL Server 2008

1 Answers1