Split a delimited string into a table with multiple rows and columns

Question

Please could you assist? I am new to SQL and am faced with the scenario below. I have searched Google for a solution but have not found one.

I have a temporary table named #TEMP with a single column named results; the number of rows depends on how long the CSV string is. SELECT * FROM #TEMP returns data like this:

results

88.47,1,263759,10.00|303.53,2,264051,13.00|147.92,3,264052,6.00|43.26,4,268394,10.00| 127.7,5,269229,4.00|

The link below shows what the results look like directly from the database:
http://design.northdurban.com/DatabaseResult.png

I need a solution that reads this data from the existing temporary table and inserts it into another temporary table with rows and columns. The required output is shown at the link below:

http://design.northdurban.com/capture.png

Please could you help? I am sure this post will assist many other users, as I have not found any existing solution.

What version of SQL Server are you using? You can use a window function like `row_number()` to generate the `ID` column rows. — chridam, Jan 23 '15 at 11:54
In the temp table, do you have 1 column with merged data like 88.47,1,263759,10.00 in each row? Or do you have 1 column and one row holding one big merged string 88.47,1,263759,10.00| 303.53,2,264051,13.00|.......? — Giorgi Nakeuri, Jan 23 '15 at 11:56
I have one column in the temp table with 88.47,1,263759,10.00 and have multiple rows depending on how long the string is. — Zack, Jan 23 '15 at 12:03
See this question [Split function equivalent in T-SQL?](http://stackoverflow.com/questions/697519/split-function-equivalent-in-t-sql). — Rhys Jones, Jan 23 '15 at 12:04
You want to insert data from #TEMP1 table to #TEMP2 table...this is what you want to do? — Pranav Bilurkar, Jan 23 '15 at 12:07
Hi all, thank you for all your responses and suggestions. I will try each one of them and get back to you all. — Zack, Jan 23 '15 at 14:59

Pரதீப் · Answer 1 · 2015-01-26T12:56:56.277

First convert the string to rows using the delimiter |

DECLARE @str VARCHAR(max)='88.47,1,263759,10.00| 303.53,2,264051,13.00| 147.92,3,264052,6.00| 43.26,4,268394,10.00| 127.7,5,269229,4.00|'

SELECT Rtrim(Ltrim(Split.a.value('.', 'VARCHAR(100)')))
FROM   (SELECT Cast ('<M>' + Replace(@str, '|', '</M><M>') + '</M>' AS XML) AS Data) AS A
       CROSS APPLY Data.nodes ('/M') AS Split(a)
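
To see what the cast above hands to `nodes()`, the intermediate XML can be inspected on its own. This is only a minimal sketch using a shortened, made-up sample string (`@s` is not from the question). Note that the cast to XML fails if the data contains characters that are illegal in XML text, such as `&` or `<`.

```sql
-- Hypothetical short sample; the trailing '|' yields one empty <M> element.
DECLARE @s VARCHAR(100) = '88.47,1|303.53,2|';

-- Each '|' closes one <M> element and opens the next.
SELECT CAST('<M>' + REPLACE(@s, '|', '</M><M>') + '</M>' AS XML) AS Data;

-- One row per <M> element; the empty last element comes back as ''.
SELECT Split.a.value('.', 'VARCHAR(100)') AS record
FROM   (SELECT CAST('<M>' + REPLACE(@s, '|', '</M><M>') + '</M>' AS XML) AS Data) AS A
       CROSS APPLY Data.nodes('/M') AS Split(a);
```

The second query returns three rows, the last of them empty, which is why a split of a string ending in | needs a filter to discard that final row.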

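Then split each row into separate columns using the PARSENAME trick: temporarily replace the decimal points with ;, turn the commas into dots, let PARSENAME pick out each part, and restore the decimal points.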

SELECT Id,c1,c2,c3
FROM  (SELECT Id=Replace(Parsename(Replace(Replace(Rtrim(Ltrim(Split.a.value('.', 'VARCHAR(100)'))), '.', ';'), ',', '.'), 4), ';', '.'),
              C1=Replace(Parsename(Replace(Replace(Rtrim(Ltrim(Split.a.value('.', 'VARCHAR(100)'))), '.', ';'), ',', '.'), 3), ';', '.'),
              c2=Replace(Parsename(Replace(Replace(Rtrim(Ltrim(Split.a.value('.', 'VARCHAR(100)'))), '.', ';'), ',', '.'), 2), ';', '.'),
              c3=Replace(Parsename(Replace(Replace(Rtrim(Ltrim(Split.a.value('.', 'VARCHAR(100)'))), '.', ';'), ',', '.'), 1), ';', '.')
       FROM   (SELECT Cast ('<M>' + Replace(@str, '|', '</M><M>') + '</M>' AS XML) AS Data) AS A
              CROSS APPLY Data.nodes ('/M') AS Split(a)) a
WHERE  id IS NOT NULL

SQLFIDDLE DEMO
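
The PARSENAME trick is easier to follow on a single record. A minimal sketch, assuming one record taken from the question's sample data; the variable names `@rec` and `@dotted` are only illustrative. PARSENAME numbers the parts from the right, so part 4 is the first field.

```sql
DECLARE @rec VARCHAR(100) = '88.47,1,263759,10.00';

-- Step 1: hide the decimal points ('.' -> ';').
-- Step 2: turn the field separators into dots (',' -> '.').
DECLARE @dotted VARCHAR(100) = REPLACE(REPLACE(@rec, '.', ';'), ',', '.');

SELECT @dotted                                  AS dotted, -- 88;47.1.263759.10;00
       REPLACE(PARSENAME(@dotted, 4), ';', '.') AS c1,     -- 88.47
       REPLACE(PARSENAME(@dotted, 3), ';', '.') AS c2,     -- 1
       REPLACE(PARSENAME(@dotted, 2), ';', '.') AS c3,     -- 263759
       REPLACE(PARSENAME(@dotted, 1), ';', '.') AS c4;     -- 10.00
```

All four columns come back as strings, so they still need a CAST or CONVERT if numeric types are required.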

Update: For better performance, try this.

SELECT c1,c2,c3,c4
FROM   (SELECT C1=Replace(Parsename(Replace(Replace(col, '.', ';'), ',', '.'), 4), ';', '.'),
               C2=Replace(Parsename(Replace(Replace(col, '.', ';'), ',', '.'), 3), ';', '.'),
               C3=Replace(Parsename(Replace(Replace(col, '.', ';'), ',', '.'), 2), ';', '.'),
               C4=Replace(Parsename(Replace(Replace(col, '.', ';'), ',', '.'), 1), ';', '.')
        FROM   (SELECT Split.a.value('.', 'VARCHAR(100)') col
                FROM   (SELECT Cast ('<M>' + Replace(@str, '|', '</M><M>') + '</M>' AS XML) AS Data) AS A
                       CROSS APPLY Data.nodes ('/M') AS Split(a))v) a
WHERE  c1 IS NOT NULL;

Update2: To parse more than one row from the table use this code.

Sample table with data

create table #test(string varchar(8000))
insert into #test values
('88.47,1,263759,10.00| 303.53,2,264051,13.00| 147.92,3,264052,6.00| 43.26,4,268394,10.00| 127.7,5,269229,4.00|'),
('88.47,1,263759,10.00| 303.53,2,264051,13.00| 147.92,3,264052,6.00| 43.26,4,268394,10.00| 127.7,5,269229,4.00|')

Query

SELECT c1,c2,c3,c4
FROM   (SELECT C1=Replace(Parsename(Replace(Replace(col, '.', ';'), ',', '.'), 4), ';', '.'),
               C2=Replace(Parsename(Replace(Replace(col, '.', ';'), ',', '.'), 3), ';', '.'),
               C3=Replace(Parsename(Replace(Replace(col, '.', ';'), ',', '.'), 2), ';', '.'),
               C4=Replace(Parsename(Replace(Replace(col, '.', ';'), ',', '.'), 1), ';', '.')
        FROM   (SELECT Split.a.value('.', 'VARCHAR(100)') col
                FROM   (SELECT Cast ('<M>' + Replace(string, '|', '</M><M>') + '</M>' AS XML)
                 AS Data from #test) AS A
                       CROSS APPLY Data.nodes ('/M') AS Split(a))v) a
WHERE  c1 IS NOT NULL;
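
The question asks for the rows to land in a second temporary table. A minimal sketch of that last step, assuming the `#test` sample table above; the target name `#parsed`, the output column names and the column types are guesses based on the sample data, not something stated in the question.

```sql
-- Same split as the query above, with typed columns written to a new temp table.
SELECT CAST(LTRIM(RTRIM(a.c1)) AS DECIMAL(10, 2)) AS Column1,
       CAST(a.c2 AS INT)                          AS Column2,
       CAST(a.c3 AS INT)                          AS Column3,
       CAST(a.c4 AS DECIMAL(10, 2))               AS Column4
INTO   #parsed   -- hypothetical target table, created by SELECT ... INTO
FROM   (SELECT C1=Replace(Parsename(Replace(Replace(col, '.', ';'), ',', '.'), 4), ';', '.'),
               C2=Replace(Parsename(Replace(Replace(col, '.', ';'), ',', '.'), 3), ';', '.'),
               C3=Replace(Parsename(Replace(Replace(col, '.', ';'), ',', '.'), 2), ';', '.'),
               C4=Replace(Parsename(Replace(Replace(col, '.', ';'), ',', '.'), 1), ';', '.')
        FROM   (SELECT Split.a.value('.', 'VARCHAR(100)') col
                FROM   (SELECT Cast ('<M>' + Replace(string, '|', '</M><M>') + '</M>' AS XML)
                 AS Data from #test) AS A
                       CROSS APPLY Data.nodes ('/M') AS Split(a))v) a
WHERE  a.c1 IS NOT NULL;

SELECT * FROM #parsed;
```

The LTRIM/RTRIM on the first column removes the space that follows each | in the sample strings before the value is converted.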

Interesting. I wonder how quick this would be? It requires scanning over the string a huge number of times. — Martin Brown, Jan 23 '15 at 14:08
Downvoter, can you please comment so that I can improve my answer? — Pரதீப், Jan 23 '15 at 15:28
@MartinBrown - After converting it to rows, each row holds only a small piece of data, not the entire string — Pரதீப், Jan 23 '15 at 15:31
First you are doing a split which scans the string once. Then for each part of the split you are doing a split which scans the string again. Then for each part of that you are doing two replaces and a ParseName which is scanning each part three times. In total you are scanning the whole string five times. Also there are going to be a large number of intermediate strings created which all require memory allocations which if memory is fragmented may be slow. While this won't be an issue if there is only one string, if you have a couple of million strings to process that is going to add up. — Martin Brown, Jan 23 '15 at 18:02
Actually I slightly misread the solution, and I see that you are converting the string to XML with a Replace and then splitting the XML instead of doing two splits. But that still counts as two iterations over the string. — Martin Brown, Jan 23 '15 at 18:11
@MartinBrown - That is way better than looping character by character — Pரதீப், Jan 23 '15 at 18:12
No but then my solution only takes 38% of the time that yours takes. I just ran a performance test just to check I was correct. Notes on the technique at the end of my answer. — Martin Brown, Jan 25 '15 at 22:47
@MartinBrown - Yeah, I see that. I have updated my answer and the time is now reduced 60% of the previous time; you can check. But seriously, this is not so bad a solution that it deserves a downvote — Pரதீப், Jan 26 '15 at 01:36
Hello NoDisplayName, @MartinBrown and everyone else assisting. I am faced with the following issue when testing @NoDisplayName's solution. Please click on the link; I really need your assistance. http://design.northdurban.com/RevisedQuestion.png — Zack, Jan 26 '15 at 12:50
@Zack - The problem is caused by this query: `declare @str varchar(max) = (select results from #resultsz)`. The results table has more than one row, so the subquery cannot be assigned to a single variable and you got that error. To parse all the rows in the table I have updated my answer; check `update2` — Pரதீப், Jan 26 '15 at 12:59

score 0 · Answer 2 · answered Jan 23 '15 at 12:21

This will only work if you have 4 columns, because PARSENAME handles at most four parts. In this situation you can do the following:

SELECT REPLACE(PARSENAME(REPLACE(REPLACE(ColumnName, '.', '~'), ',', '.'), 4), '~', '.'),
 REPLACE(PARSENAME(REPLACE(REPLACE(ColumnName, '.', '~'), ',', '.'), 3), '~', '.'),
 REPLACE(PARSENAME(REPLACE(REPLACE(ColumnName, '.', '~'), ',', '.'), 2), '~', '.'),
 REPLACE(PARSENAME(REPLACE(REPLACE(ColumnName, '.', '~'), ',', '.'), 1), '~', '.')
From #TEMP
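
The four-column limit can be checked directly. A minimal sketch with made-up literals: PARSENAME is designed for object names of up to four parts (server.database.schema.object), so a fifth field makes it return NULL instead of a value.

```sql
SELECT PARSENAME('a.b.c.d', 4)   AS first_of_four,  -- a
       PARSENAME('a.b.c.d.e', 1) AS last_of_five;   -- NULL: more than four parts
```

For records with five or more fields, a different splitting method is needed, such as the XML split or a parsing function.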

Giorgi Nakeuri

Just changed `~` symbol instead of `;` in my answer – Pரதீப் Jan 23 '15 at 12:36
Sorry, but you have cross applies, FOR XML and other unneeded stuff. I didn't rewrite or copy-paste your solution. I wrote it myself. – Giorgi Nakeuri Jan 23 '15 at 12:40
This is just half a solution – Pரதீப் Jan 23 '15 at 12:44

Martin Brown · Answer 3 · 2015-01-25T22:44:55.370

You can write a table-valued function to parse the string like this:

CREATE FUNCTION dbo.parseData ( @stringToSplit VARCHAR(MAX) )
RETURNS
    @return TABLE (ID int, Column1 real, Column2 int, Column3 int, Column4 real)
AS
BEGIN

    DECLARE @char char;
    DECLARE @len int = LEN(@stringToSplit);    

    DECLARE @buffer varchar(50) = '';

    DECLARE @field int = 1;

    DECLARE @Column1 real
    DECLARE @Column2 int
    DECLARE @Column3 int
    DECLARE @Column4 real

    DECLARE @row int = 1

    DECLARE @i int = 1;
    WHILE @i <= @len BEGIN

        SELECT @char = SUBSTRING(@stringToSplit, @i, 1)

        IF @char = ','
        BEGIN
            IF @field = 1
                SET @Column1 = CONVERT(real, @buffer);
            ELSE IF @field = 2
                SET @Column2 = CONVERT(int, @buffer);
            ELSE IF @field = 3
                SET @Column3 = CONVERT(int, @buffer);    
            SET @buffer = '';
            SET @field = @field + 1
        END
        ELSE IF @char = '|'
        BEGIN
            SET @Column4 = CONVERT(real, @buffer);
            INSERT INTO @return (ID, Column1, Column2, Column3, Column4)
            VALUES (@row, @Column1, @Column2, @Column3, @Column4);
            SET @buffer = '';
            SET @row = @row + 1
            SET @field = 1
        END
        ELSE
        BEGIN
            SET @buffer = @buffer + @char
        END

        SET @i = @i + 1;
    END

    RETURN
END
GO

And then call that function like this:

SELECT Col1 = '88.47,1,263759,10.00| 303.53,2,264051,13.00| 147.92,3,264052,6.00| 43.26,4,268394,10.00| 127.7,5,269229,4.00|'
INTO #Temp1;

INSERT INTO #Temp1
VALUES ('88.47,1,263759,10.00| 303.53,2,264051,13.00| 147.92,3,264052,6.00| 43.26,4,268394,10.00| 127.7,5,269229,4.00|')

SELECT data.*
INTO #Temp2
FROM #Temp1 CROSS APPLY parseData(#Temp1.Col1) as data

SELECT *
FROM #Temp2

DROP TABLE #Temp1
DROP TABLE #Temp2

Performance:

I ran a performance test of this technique against the one described by NoDisplayName. Over 10,000 iterations my technique took 13,826 ms and NoDisplayName's took 36,176 ms, so mine takes only 38% of the time NoDisplayName's does.

To test this I used an Azure database and ran the following script.

-- First two queries to check the results are the same.
-- Note the Parsename technique returns strings rather than reals which is why
-- the last column has .00 at the end of the numbers in the Parsename technique.
DECLARE @str VARCHAR(max)='88.47,1,263759,10.00| 303.53,2,264051,13.00| 147.92,3,264052,6.00| 43.26,4,268394,10.00| 127.7,5,269229,4.01|'

SELECT c1,c2,c3, c4
    FROM  (SELECT C1=Replace(Parsename(Replace(Replace(Rtrim(Ltrim(Split.a.value('.', 'VARCHAR(100)'))), '.', ';'), ',', '.'), 4), ';', '.'),
                  C2=Replace(Parsename(Replace(Replace(Rtrim(Ltrim(Split.a.value('.', 'VARCHAR(100)'))), '.', ';'), ',', '.'), 3), ';', '.'),
                  C3=Replace(Parsename(Replace(Replace(Rtrim(Ltrim(Split.a.value('.', 'VARCHAR(100)'))), '.', ';'), ',', '.'), 2), ';', '.'),
                  C4=Replace(Parsename(Replace(Replace(Rtrim(Ltrim(Split.a.value('.', 'VARCHAR(100)'))), '.', ';'), ',', '.'), 1), ';', '.')
           FROM   (SELECT Cast ('<M>' + Replace(@str, '|', '</M><M>') + '</M>' AS XML) AS Data) AS A
                  CROSS APPLY Data.nodes ('/M') AS Split(a)) a
    WHERE  c1 IS NOT NULL;

SELECT *
FROM dbo.parseData(@str)
GO

-- Now let's time the Parsename method over 10,000 iterations
SET NOCOUNT ON;

DECLARE @str VARCHAR(max)='88.47,1,263759,10.00| 303.53,2,264051,13.00| 147.92,3,264052,6.00| 43.26,4,268394,10.00| 127.7,5,269229,4.00|'

DECLARE @i int = 0
declare @table table (c1 decimal, c2 int, c3 int, c4 decimal)

DECLARE @Start datetime = GETDATE();

while @i < 1000
begin

    INSERT INTO @table
    SELECT c1,c2,c3, c4
    FROM  (SELECT C1=Replace(Parsename(Replace(Replace(Rtrim(Ltrim(Split.a.value('.', 'VARCHAR(100)'))), '.', ';'), ',', '.'), 4), ';', '.'),
                  C2=Replace(Parsename(Replace(Replace(Rtrim(Ltrim(Split.a.value('.', 'VARCHAR(100)'))), '.', ';'), ',', '.'), 3), ';', '.'),
                  C3=Replace(Parsename(Replace(Replace(Rtrim(Ltrim(Split.a.value('.', 'VARCHAR(100)'))), '.', ';'), ',', '.'), 2), ';', '.'),
                  C4=Replace(Parsename(Replace(Replace(Rtrim(Ltrim(Split.a.value('.', 'VARCHAR(100)'))), '.', ';'), ',', '.'), 1), ';', '.')
           FROM   (SELECT Cast ('<M>' + Replace(@str, '|', '</M><M>') + '</M>' AS XML) AS Data) AS A
                  CROSS APPLY Data.nodes ('/M') AS Split(a)) a
    WHERE  c1 IS NOT NULL;

    DELETE FROM @table;

    set @i = @i + 1;
end

DECLARE @End datetime = GETDATE()
PRINT CONVERT(nvarchar(50),@Start,126) + ' - ' + convert(nvarchar(50),@End,126) + ' - ' + convert(nvarchar(50), DATEDIFF(ms, @start, @end))
GO

-- Now my technique over 10,000 iterations
SET NOCOUNT ON;

DECLARE @str VARCHAR(max)='88.47,1,263759,10.00| 303.53,2,264051,13.00| 147.92,3,264052,6.00| 43.26,4,268394,10.00| 127.7,5,269229,4.00|'

DECLARE @i int = 0
declare @table table (c1 decimal, c2 int, c3 int, c4 decimal)

DECLARE @Start datetime = GETDATE();

while @i < 1000
begin

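    INSERT INTO @table
    -- parseData also returns an ID column, so list the four data columns explicitly
    SELECT Column1, Column2, Column3, Column4
    FROM dbo.parseData(@str)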
    DELETE FROM @table;

    set @i = @i + 1;
end
DECLARE @End datetime = GETDATE()
PRINT CONVERT(nvarchar(50),@Start,126) + ' - ' + convert(nvarchar(50),@End,126) + ' - ' + convert(nvarchar(50), DATEDIFF(ms, @start, @end))
GO
