I am writing a data pump that gets information from the source_table
with columns code
, some_data
, oz1
, oz2
, oz3
, oz4
, and I do store it in target_table
with the same structure. In parallel, I would like to update also another table with different structure (say oz_table
) -- records code
, oz
-- that is four records instead of one record with four values (at maximum because the empty and NULL values for ozX
are not to be stored).
I am using the MERGE
command (Microsoft T-SQL) for the existing target_table
(four oz's in one record -- the old approach). The INSERTed/UPDATEd records are collected using the OUTPUT
mechanism into a table variable @info_table
. (The target records are intentionally never deleted if the source records disappear; so, there is no DELETE action.)
So far, I have a code like this:
CREATE PROCEDURE dbo.data_pump
AS
BEGIN
SET NOCOUNT ON
DECLARE @result int = -555 -- init (number of affected records)
DECLARE @info_table TABLE (
action nvarchar(10),
code int,
oz1 nvarchar(40),
oz2 nvarchar(40),
oz3 nvarchar(40),
oz4 nvarchar(40)
)
BEGIN TRANSACTION tran_data_pump
BEGIN TRY
MERGE target_table AS target
USING (SELECT code, some_data, oz1, oz2, oz3, oz4
FROM source_table) AS source
ON target.code = source.code
WHEN MATCHED AND (COALESCE(target.some_data, '') != COALESCE(source.some_data, '')
OR COALESCE(target.oz1, '') != COALESCE(source.oz1, '')
OR COALESCE(target.oz2, '') != COALESCE(source.oz2, '')
OR COALESCE(target.oz3, '') != COALESCE(source.oz3, '')
OR COALESCE(target.oz4, '') != COALESCE(source.oz4, '')
) THEN
UPDATE
SET target.some_data = source.some_data,
target.oz1 = source.oz1,
target.oz2 = source.oz2,
target.oz3 = source.oz3,
target.oz4 = source.oz4
WHEN NOT MATCHED THEN
INSERT (code, some_data,
oz1, oz2, oz3, oz4)
VALUES (source.code, source.some_data,
source.oz1, source.oz2, source.oz3, source.oz4)
OUTPUT
$action AS action, -- INSERT or UPDATE
inserted.code AS code,
inserted.oz1 AS oz1,
inserted.oz2 AS oz2,
inserted.oz3 AS oz3,
inserted.oz4 AS oz4
INTO @info_table;
SET @result = @@ROWCOUNT
COMMIT TRANSACTION tran_data_pump
END TRY
BEGIN CATCH
ROLLBACK TRANSACTION tran_data_pump
SET @result = -1 -- transaction-failed indication
END CATCH
RETURN @result -- OK, number of the transfered records
END
It works fine, so far. Now I would like to process the @info_table
to insert/update the oz_table
. For the action UPDATE
, the records with the code
should be deleted first, and the new ones should be inserted. The order is not important, and the new number of the inserted records may be different. The NULL
or empty strings in oz's should not produce any record. For the INSERT
action, the situation is simpler just to insert the new records.
Update: The question was slightly modified to clarify the core of the question. The data tables can be defined like that:
CREATE TABLE dbo.source_table (
ID int IDENTITY PRIMARY KEY NOT NULL,
code int,
some_data nvarchar(50),
oz1 nvarchar(40),
oz2 nvarchar(40),
oz3 nvarchar(40),
oz4 nvarchar(40)
)
CREATE TABLE dbo.target_table (
ID int IDENTITY PRIMARY KEY NOT NULL,
code int,
some_data nvarchar(50),
oz2 nvarchar(40),
oz3 nvarchar(40),
oz1 nvarchar(40),
oz4 nvarchar(40)
)
CREATE TABLE dbo.oz_table (
ID int IDENTITY PRIMARY KEY NOT NULL,
code int,
oz nvarchar(40) NOT NULL
)
See the full testing script (creates database, the tables, calls the data_pump
at http://pastebin.com/wBz3Tzwn
How to do it nicely? I need the the efficient solution as the amount of date may be big, and the operation should be as fast as possible.