9

I'm looking for the fastest way to copy a table and its contents on my SQL Server: just a simple copy of the table, with the source and destination in the same server/database.

Currently, a stored procedure using a SELECT * INTO statement takes 6.75 minutes to copy over 4.7 million records. This is too slow.

CREATE PROCEDURE [dbo].[CopyTable1]
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @mainTable   sysname,
            @backupTable sysname,
            @sql         nvarchar(max);

    SET @mainTable = 'Table1';
    SET @backupTable = @mainTable + '_Previous';

    -- Drop the previous backup copy if it exists
    IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(@backupTable) AND type = N'U')
    BEGIN
        SET @sql = N'DROP TABLE [' + @backupTable + ']';
        EXEC (@sql);
    END

    -- Copy the main table into the backup table
    IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(@mainTable) AND type = N'U')
    BEGIN
        SET @sql = N'SELECT * INTO dbo.[' + @backupTable + '] FROM dbo.[' + @mainTable + ']';
        EXEC (@sql);
    END
END
RobDog888
  • Do you copy data so frequently that you need a generic function to do it? With 4.7 million rows you will be FAR better off doing this in batches instead of all at once. Your transaction log is going to be huge during this operation. Splitting this into batches will ease the pressure on the log and allow other processes to continue in between batches. – Sean Lange Jul 22 '14 at 21:50
  • Have you looked into setting up replication instead of a generic procedure? If you are doing it often enough that you needed to build a dynamic SQL statement for it, I can't see why mirroring or transactional replication wouldn't be better. Edit to add: if this is going into a data warehouse setup... have you considered an ETL tool? – Twelfth Jul 22 '14 at 21:54
  • It's part of a process for a report that is generated/run each week. 4.7M is the number in dev, but in prod I think it's 6M+. I hoped there was something like an "sp_copytable" function that would be lightning fast lol. I think selecting and inserting is the wrong way to go with this. Perhaps an internal copy function or technique? – RobDog888 Jul 22 '14 at 21:57
  • Btw, it's running on MS SQL Server 2005 – RobDog888 Jul 22 '14 at 21:58
  • If it's a weekly thing... look into replication. These numbers will only get higher and this copy will continue to take longer and longer. The only thing you can really try to disable from a performance perspective in SQL is logging... otherwise you might have to look into the hardware. Have you looked into server optimization and run a few traces to discover where your bottleneck is? I'm guessing that your hard drive writes are your bottleneck. – Twelfth Jul 22 '14 at 22:26
  • It will only grow minimally each week. The table is dropped before the SELECT INTO statement in the SP. Yeah, it's an old server, both hardware-wise and running SQL 2005. Not much chance to improve on those. – RobDog888 Jul 22 '14 at 22:36
  • Transactional replication might be a better idea then... if there is little change, then all dropping and re-copying is going to do is recopy the exact same data you just dropped, plus a few rows. This way, only the few changes made to the table get 'copied' over. – Twelfth Jul 22 '14 at 23:03
  • If you knew which rows were new, you could just copy those in rather than dropping the table. This is the same idea as transactional replication. It might be faster – Nick.Mc Jan 02 '18 at 13:27
  • Compared two solutions: 1) INSERT INTO WITH (TABLOCK) SELECT FROM WITH (NOLOCK); 2) SELECT INTO FROM. Both take the same time (ca. 100 sec.). Source table has ca. 10M records (ca. 100 fields), few indexes; log size was not an issue; 2008 R2. No significant difference, not worth writing more code than SELECT INTO. – DueGe Jan 02 '18 at 12:21

2 Answers

8

If you are concerned about speed, you have two alternatives: copying in blocks, or the BCP/bulk insert method.

Block Transfer

DECLARE
    @CurrentRow bigint, @RowCount bigint, @CurrentBlock bigint

SET @CurrentRow = 1

-- Total number of rows to copy
SELECT @RowCount = COUNT(*)
FROM oldtable WITH (NOLOCK)

WHILE @CurrentRow < @RowCount
BEGIN
    -- Upper bound of this block (1,000,000 rows per batch)
    SET @CurrentBlock = @CurrentRow + 1000000

    INSERT INTO newtable
        (FIELDS, GO, HERE)
    SELECT
        FIELDS, GO, HERE
    FROM (
        SELECT
            FIELDS, GO, HERE,
            ROW_NUMBER() OVER (ORDER BY SomeColumn) AS RowNum
        FROM oldtable WITH (NOLOCK)
    ) AS MyDerivedTable
    WHERE MyDerivedTable.RowNum BETWEEN @CurrentRow AND @CurrentBlock

    SET @CurrentRow = @CurrentBlock + 1
END

How to copy a huge table data into another table in SQL Server

BCP/Bulk Insert

-- Create an empty copy of the table's structure (WHERE 1=2 returns no rows)
SELECT *
INTO NewTable
FROM OldTable
WHERE 1 = 2

-- Bulk load the previously exported native-format data file
BULK INSERT NewTable
FROM 'c:\temp\OldTable.txt'
WITH (DATAFILETYPE = 'native')
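For completeness, the native-format data file loaded above would first be exported with the bcp command-line utility. A minimal sketch, assuming a trusted connection; the server and database names are placeholders:

REM Export OldTable in native format (-n); -T uses a trusted (Windows) connection
bcp MyDatabase.dbo.OldTable out c:\temp\OldTable.txt -n -S MyServer -T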

What is the fastest way to copy data from one table to another

http://www.databasejournal.com/features/mssql/article.php/3507171/Transferring-Data-from-One-Table-to-Another.htm

AmmarCSE
  • None of these options are suitable to achieve the highest possible throughput. They are not too bad but not optimal. – usr Jul 22 '14 at 22:06
  • xp_cmdshell is disabled by our policies, so I can't use bcp out to a file and back in to the destination table – RobDog888 Jul 22 '14 at 22:07
  • Block transfer seems better than what I'm doing now, but I don't have an id field. I have a 3-field composite primary key set up. – RobDog888 Jul 22 '14 at 22:08
  • @usr which method would you recommend? – RobDog888 Jul 22 '14 at 22:11
  • @RobDog888, that is interesting. Trying to figure out a solution to your composite key set up. – AmmarCSE Jul 22 '14 at 22:13
  • @usr, the OP was asking for a 'faster' way. While my answer may not be the most optimal, it should be faster. I really don't feel that my answer deserved a downvote. – AmmarCSE Jul 22 '14 at 22:14
  • @RobDog888, I have revised my solution to use row count instead of id. Sorry about the horrible formatting. I plan on improving it shortly. – AmmarCSE Jul 22 '14 at 22:35
  • be careful with `(NOLOCK)` it can result in duplicate rows (even duplicate primary key rows) – Alex from Jitbit Jul 03 '21 at 09:04
  • Sorry, but were `@startRow` and `@endRow` ever specified anywhere, or is it just assumed that you're using your batch start and end for those values? – DtechNet Sep 10 '21 at 21:33
6

You seem to want to copy a table that is a heap and has no indexes. That is the easiest case to get right. Just do a

INSERT INTO Target WITH (TABLOCK) SELECT * FROM Source

Make sure that minimal logging for bulk operations is enabled (search for that term), and switch to the simple recovery model.
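As a sketch, the recovery-model switch is a one-line ALTER DATABASE; the database name below is a placeholder:

-- Simple (or bulk-logged) recovery is a prerequisite for minimally
-- logged INSERT ... WITH (TABLOCK) SELECT into a heap
ALTER DATABASE MyDatabase SET RECOVERY SIMPLE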

This will take up almost no log space because only allocations are logged with minimal logging.
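One way to confirm this while the copy runs (available on SQL Server 2005 as well):

-- Reports log size and percent used for every database on the instance
DBCC SQLPERF(LOGSPACE)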

This just scans the source in allocation order and appends new bulk-allocated pages to the target.

Again, you have asked about the easiest case. Things get more complicated when indexes come into play.

Why not insert in batches? It's not necessary: log space is not an issue, and because the target is not sorted (it is a heap), we don't need sort buffers.

usr
  • Just added the WITH (TABLOCK) and it already helped a lot! Down from 6.75 mins to 3 mins! Still looking into the switch to the simple recovery model... – RobDog888 Jul 22 '14 at 22:34
  • @RobDog888 when you're done implementing this you should see one CPU core pegged and the disk writing at a steady pace. That's the sign that you have achieved a "straight copy". – usr Jul 22 '14 at 22:37
  • @RobDog888 "It will only grow minimally each week." This peequed my interest... Consider MERGE'ing into the target. Reading and joining both tables, and then writing only small changes can be faster by far than rewriting an entire table. – usr Jul 22 '14 at 22:39
  • Oh, so a differential copy based upon changes to each field in each row? – RobDog888 Jul 22 '14 at 22:41
  • @RobDog888 yeah I have answered this a few years ago. Can't seem to find it. – usr Jul 22 '14 at 22:45
  • Delete all your destination indexes and use TABLOCK, one transaction: 223 million rows in 57 seconds. – DtechNet Sep 10 '21 at 21:44
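A minimal sketch of the differential copy usr describes above. Note that MERGE requires SQL Server 2008 or later, and the table, key, and column names here are hypothetical placeholders (the 3-field composite key stands in for the one mentioned in the comments):

-- Differential copy: update changed rows, insert new ones, delete removed ones
MERGE dbo.Table1_Previous AS target
USING dbo.Table1 AS source
    ON target.Key1 = source.Key1
   AND target.Key2 = source.Key2
   AND target.Key3 = source.Key3
WHEN MATCHED AND target.SomeCol <> source.SomeCol
    THEN UPDATE SET target.SomeCol = source.SomeCol
WHEN NOT MATCHED BY TARGET
    THEN INSERT (Key1, Key2, Key3, SomeCol)
         VALUES (source.Key1, source.Key2, source.Key3, source.SomeCol)
WHEN NOT MATCHED BY SOURCE
    THEN DELETE;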