2

The following code converts the CSV string to a table with its original order (the order is returned in the column Pos).

declare @input nvarchar(max) = 'x,y,z';

declare @x xml = '<s>' + replace(@input, ',', '</s><s>') + '</s>';

select  c.value('.', 'nvarchar(max)') Col, identity(int) Pos
into    #t
from    @x.nodes('/s') x ( c );

select * from #t;
drop table #t

returns

Col Pos
y   1
x   2
z   3

There are two problem

  1. The Pos is not right, y got 1 and x got 2.
  2. identity(int) requires select into

For input of 'z,y,x', z, y, and x should get 1, 2, and 3 respectively.

ca9163d9
  • 27,283
  • 64
  • 210
  • 413

1 Answers1

3

This is not guaranteed to be stable at all:

declare @input nvarchar(max) = 'x,y,z';
declare @x xml = '<s>' + replace(@input, ',', '</s><s>') + '</s>';

select  c.value('.', 'nvarchar(max)') Col, 
        ROW_NUMBER() OVER(ORDER BY (SELECT 1)) AS pos
from    @x.nodes('/s') x ( c )
OPTION (MAXDOP 1);

DBFiddle Demo

  1. Disable parallelism
  2. Identity(int) -> ROW_NUMBER

Modified split function:

CREATE FUNCTION dbo.Split(@origString varchar(max), @Delimiter char(1))     
returns @temptable TABLE (pos INT IDENTITY(1,1), items varchar(max))     
as     
begin  


declare @idx int     
declare @split varchar(max)     

select @idx = 1     
    if len(@origString )<1 or @origString is null  return     

while @idx!= 0     
begin     
    set @idx = charindex(@Delimiter,@origString)     
    if @idx!=0     
        set @split= left(@origString,@idx - 1)     
    else     
        set @split= @origString

    if(len(@split)>0)
        insert into @temptable(Items) values(@split)     

    set @origString= right(@origString,len(@origString) - @idx)     
    if len(@origString) = 0 break     
end 
return     
end

Call:

SELECT *
FROM dbo.Split('z,y,x', ',');

DBFiddle Demo2

Lukasz Szozda
  • 162,964
  • 23
  • 234
  • 275
  • Can you elaborate "This is not guaranteed to be stable at all" while your solution works? – ca9163d9 Apr 20 '18 at 19:43
  • @ca9163d9 The point is you have arbitrary CSV `c,b,a`. With ROW_NUMBER() I order by SELECT 1 which is not stable sort. I disable parallelism so the rows "should" be processed from first to last but it is not guaranteed. If you want stable solution you should define order like `ROW_NUMBER() OVER(ORDER BY c.value('.', 'nvarchar(max)'))` but it will not be the same as defined in CSV – Lukasz Szozda Apr 20 '18 at 19:44
  • Actually, I want the column `Pos` be the physical position of the elements in the string. I tried `z,y,x` and it does return 1,2,3 for z,y,x, as expected. – ca9163d9 Apr 20 '18 at 19:49
  • @ca9163d9 This is exactly what I've written. You can't be sure how optimizer chooses to process rows after parsing XML (it "could" do it from first to last but it is free to do it in any way) – Lukasz Szozda Apr 20 '18 at 19:50
  • So I cannot rely on the code? And the only option maybe write a function to parse the string? – ca9163d9 Apr 20 '18 at 19:52
  • To get "guaranteed solution" based on order in CSV, you could write table-function with WHILE LOOP – Lukasz Szozda Apr 20 '18 at 19:53
  • I wish the `row_number()` can be guaranteed because I may need to run the query on a server I don't have permission to create an object. – ca9163d9 Apr 20 '18 at 20:04
  • 2
    You really shouldn't offer loop based splitters. They are horrifically slow. Here is my personal choice. http://www.sqlservercentral.com/articles/Tally+Table/72993/ Some other excellent alternatives can be found here. https://sqlperformance.com/2012/07/t-sql-queries/split-strings – Sean Lange Apr 20 '18 at 20:35