17

I have a need to create a function the will return nth element of a delimited string.

For a data migration project, I am converting JSON audit records stored in a SQL Server database into a structured report using SQL script. Goal is to deliver a sql script and a sql function used by the script without any code.

(This is a short-term fix will be used while a new auditing feature is added the ASP.NET/MVC application)

There is no shortage of delimited string to table examples available. I've chosen a Common Table Expression example http://www.sqlperformance.com/2012/07/t-sql-queries/split-strings

Example: I want to return 67 from '1,222,2,67,888,1111'

Shnugo
  • 66,100
  • 9
  • 53
  • 114
Gary Kindel
  • 17,071
  • 7
  • 49
  • 66

14 Answers14

42

This is the easiest answer to retrieve the 67 (type-safe!!):

SELECT CAST('<x>' + REPLACE('1,222,2,67,888,1111',',','</x><x>') + '</x>' AS XML).value('/x[4]','int')

In the following you will find examples how to use this with variables for the string, the delimiter and the position (even for edge-cases with XML-forbidden characters)

##The easy one

This question is not about a string split approach, but about how to get the nth element. The easiest, fully inlineable way would be this IMO:

This is a real one-liner to get part 2 delimited by a space:

DECLARE @input NVARCHAR(100)=N'part1 part2 part3';
SELECT CAST(N'<x>' + REPLACE(@input,N' ',N'</x><x>') + N'</x>' AS XML).value('/x[2]','nvarchar(max)')

##Variables can be used with sql:variable() or sql:column()

Of course you can use variables for delimiter and position (use sql:column to retrieve the position directly from a query's value):

DECLARE @dlmt NVARCHAR(10)=N' ';
DECLARE @pos INT = 2;
SELECT CAST(N'<x>' + REPLACE(@input,@dlmt,N'</x><x>') + N'</x>' AS XML).value('/x[sql:variable("@pos")][1]','nvarchar(max)')

##Edge-Case with XML-forbidden characters

If your string might include forbidden characters, you still can do it this way. Just use FOR XML PATH on your string first to replace all forbidden characters with the fitting escape sequence implicitly.

It's a very special case if - additionally - your delimiter is the semicolon. In this case I replace the delimiter first to '#DLMT#', and replace this to the XML tags finally:

SET @input=N'Some <, > and &;Other äöü@€;One more';
SET @dlmt=N';';
SELECT CAST(N'<x>' + REPLACE((SELECT REPLACE(@input,@dlmt,'#DLMT#') AS [*] FOR XML PATH('')),N'#DLMT#',N'</x><x>') + N'</x>' AS XML).value('/x[sql:variable("@pos")][1]','nvarchar(max)');

##UPDATE for SQL-Server 2016+

Regretfully the developers forgot to return the part's index with STRING_SPLIT. But, using SQL-Server 2016+, there is JSON_VALUE and OPENJSON.

With JSON_VALUE we can pass in the position as the index' array.

For OPENJSON the documentation states clearly:

When OPENJSON parses a JSON array, the function returns the indexes of the elements in the JSON text as keys.

A string like 1,2,3 needs nothing more than brackets: [1,2,3].
A string of words like this is an example needs to be ["this","is","an"," example"].
These are very easy string operations. Just try it out:

DECLARE @str VARCHAR(100)='Hello John Smith';
DECLARE @position INT = 2;

--We can build the json-path '$[1]' using CONCAT
SELECT JSON_VALUE('["' + REPLACE(@str,' ','","') + '"]',CONCAT('$[',@position-1,']'));

--See this for a position safe string-splitter (zero-based):

SELECT  JsonArray.[key] AS [Position]
       ,JsonArray.[value] AS [Part]
FROM OPENJSON('["' + REPLACE(@str,' ','","') + '"]') JsonArray

In this post I tested various approaches and found, that OPENJSON is really fast. Even much faster than the famous "delimitedSplit8k()" method...

##UPDATE 2 - Get the values type-safe

We can use an array within an array simply by using doubled [[]]. This allows for a typed WITH-clause:

DECLARE  @SomeDelimitedString VARCHAR(100)='part1|1|20190920';

DECLARE @JsonArray NVARCHAR(MAX)=CONCAT('[["',REPLACE(@SomeDelimitedString,'|','","'),'"]]');

SELECT @SomeDelimitedString          AS TheOriginal
      ,@JsonArray                    AS TransformedToJSON
      ,ValuesFromTheArray.*
FROM OPENJSON(@JsonArray)
WITH(TheFirstFragment VARCHAR(100) '$[0]'
    ,TheSecondFragment INT '$[1]'
    ,TheThirdFragment DATE '$[2]') ValuesFromTheArray
danronmoon
  • 3,814
  • 5
  • 34
  • 56
Shnugo
  • 66,100
  • 9
  • 53
  • 114
  • While this works, on some string I get the error "JSON text is not properly formatted. Unexpected character 'F' is found at position 105." Any idea what's going on there? – Schoof Jun 25 '21 at 14:43
  • Example of such a string that fails: `qsdfqsdfsdf"qsdfqsdf` Basically when the string contains a quote. – Schoof Jun 25 '21 at 15:02
  • @Schoof You might [use `STRING_ESCAPE()`](https://learn.microsoft.com/en-us/sql/t-sql/functions/string-escape-transact-sql?view=sql-server-ver15), which is currently supporting only JSON anyway :-) – Shnugo Jun 25 '21 at 15:04
  • Thanks! This works but makes the end result a lot slower than using STRING_SPLIT unfortunately. – Schoof Jun 28 '21 at 09:04
  • 1
    @Schoof You might use a simple replace() instead. Strings are always tricky when it comes to *magic characters* or *markup*... – Shnugo Jun 28 '21 at 11:35
  • @Shnugo STRING_SPLIT returns the index in SQL Server 2022. Arguably the only new non-Azure feature, which makes it a pretty expensive feature – Panagiotis Kanavos Jul 19 '23 at 19:09
6

On Azure SQL Database, and on SQL Server 2022, STRING_SPLIT now has an optional ordinal parameter. If the parameter is omitted, or 0 is passed, then the function acts as it did before, and just returns a value column and the order is not guaranteed. If you pass the parameter with the value 1 then the function returns 2 columns, value, and ordinal which (unsurprisingly) provides the ordinal position of the value within the string.

So, if you wanted the 4th delimited value from the string '1,222,2,67,888,1111' you could do the following:

SELECT [value]
FROM STRING_SPLIT('1,222,2,67,888,1111',',',1)
WHERE ordinal = 4;

If the value was in a column, it would look like this:

SELECT SS.[value]
FROM dbo.YourTable YT
     CROSS APPLY STRING_SPLIT(YT.YourColumn,',',1) SS
WHERE SS.ordinal = 4;
Thom A
  • 88,727
  • 11
  • 45
  • 75
5

Here is my initial solution... It is based on work by Aaron Bertrand http://www.sqlperformance.com/2012/07/t-sql-queries/split-strings

I simply changed the return type to make it a scalar function.

Example: SELECT dbo.GetSplitString_CTE('1,222,2,67,888,1111',',',4)

CREATE FUNCTION dbo.GetSplitString_CTE
(
   @List       VARCHAR(MAX),
   @Delimiter  VARCHAR(255),
   @ElementNumber int
)
RETURNS VARCHAR(4000)
AS
BEGIN

   DECLARE @result varchar(4000)    
   DECLARE @Items TABLE ( position int IDENTITY PRIMARY KEY,
                          Item VARCHAR(4000)
                         )  

   DECLARE @ll INT = LEN(@List) + 1, @ld INT = LEN(@Delimiter);  

   WITH a AS
   (
       SELECT
           [start] = 1,
           [end]   = COALESCE(NULLIF(CHARINDEX(@Delimiter, 
                       @List, @ld), 0), @ll),
           [value] = SUBSTRING(@List, 1, 
                     COALESCE(NULLIF(CHARINDEX(@Delimiter, 
                       @List, @ld), 0), @ll) - 1)
       UNION ALL
       SELECT
           [start] = CONVERT(INT, [end]) + @ld,
           [end]   = COALESCE(NULLIF(CHARINDEX(@Delimiter, 
                       @List, [end] + @ld), 0), @ll),
           [value] = SUBSTRING(@List, [end] + @ld, 
                     COALESCE(NULLIF(CHARINDEX(@Delimiter, 
                       @List, [end] + @ld), 0), @ll)-[end]-@ld)
       FROM a
       WHERE [end] < @ll
   )
   INSERT @Items SELECT [value]
   FROM a
   WHERE LEN([value]) > 0
   OPTION (MAXRECURSION 0);

   SELECT @result=Item
   FROM @Items
   WHERE position=@ElementNumber

   RETURN @result;
END
GO
Gary Kindel
  • 17,071
  • 7
  • 49
  • 66
  • 7
    It is quite an overhead first to use a heavy recursive CTE to split your string, just to pick out *the nth element*. This can be done much easier... – Shnugo Jul 13 '16 at 11:28
5

How about:

CREATE FUNCTION dbo.NTH_ELEMENT (@Input NVARCHAR(MAX), @Delim CHAR = '-', @N INT = 0)
RETURNS NVARCHAR(MAX)
AS
BEGIN
RETURN (SELECT VALUE FROM STRING_SPLIT(@Input, @Delim) ORDER BY (SELECT NULL) OFFSET @N ROWS FETCH NEXT 1 ROW ONLY)
END
Antoine Thiry
  • 2,362
  • 4
  • 28
  • 42
  • 5
    `STRING_SPLIT()` does not guarantee to return the elements in the given order. As this needs v2016 it is a better approach to use `OPENJSON`, which returns a `[key]` containing the element's index in a JSON-array. You might [read this](https://stackoverflow.com/a/51401270/5089204) – Shnugo Nov 29 '18 at 22:33
  • Thanks @Schugo - yet another use for JSON that I hadn't thought of! – David Johnston Dec 01 '18 at 11:12
  • 4
    Too bad you can't down vote the author of the STRING_SPLIT() - not ordered? that nearly completely kills it. It should have returned idx,value, so you can sort on the index... (or wow: select value from string_Split(str,delim) where idx=@idx) and implement what people want in seconds.. would be a one liner if they had spent a few minutes and designed it before implementing it. Amazing. I was going use the heck out of this function, but now I can't think of a use for it, unless, you happen to have non-ordered set of things which tends to be rare. – Traderhut Games Jan 29 '19 at 23:23
  • 2
    So using suggestion by @Schnugo, my proposed solution becomes `CREATE OR ALTER FUNCTION dbo.NTH_ELEMENT (@Input NVARCHAR(MAX), @Delim CHAR = '-', @N INT = 0) RETURNS NVARCHAR(MAX) AS BEGIN RETURN (SELECT value FROM OPENJSON('["' + REPLACE(@Input, @Delim, '","') + '"]') WHERE [key] = @N) END` – David Johnston Jan 31 '19 at 09:44
3

@a - the value (f.e. 'a/bb/ccc/dddd/ee/ff/....')

@p - the desired position (1,2,3...)

@d - the delimeter ( '/' )

trim(substring(replace(@a,@d,replicate(' ',len(@a))),(@p-1)*len(@a)+1,len(@a)))

only problem is - if desired part has trailing or leading blanks they get trimmed.

Completely Based on article from https://exceljet.net/formula/split-text-with-delimiter

Lior
  • 41
  • 1
2

In a rare moment of lunacy I just thought that split is far easier if we use XML to parse it out for us:

(Using the variables from @Gary Kindel's answer)

declare @xml xml
set @xml = '<split><el>' + replace(@list,@Delimiter,'</el><el>') + '</el></split>'

select
    el = split.el.value('.','varchar(max)')
from  @xml.nodes('/split/el') split(el))

This lists all elements of the string, split by the specified character.

We can use an xpath test to filter out empty values, and a further xpath test to restrict this to the element we're interested in. In full Gary's function becomes:

alter FUNCTION dbo.GetSplitString_CTE
(
   @List       VARCHAR(MAX),
   @Delimiter  VARCHAR(255),
   @ElementNumber int
)
RETURNS VARCHAR(max)
AS
BEGIN
       -- escape any XML https://dba.stackexchange.com/a/143140/65992  
       set @list = convert(VARCHAR(MAX),(select @list for xml path(''), type));

       declare @xml xml
       set @xml = '<split><el>' + replace(@list,@Delimiter,'</el><el>') + '</el></split>'

       declare @ret varchar(max)
       set @ret = (select
              el = split.el.value('.','varchar(max)')
       from  @xml.nodes('/split/el[string-length(.)>0][position() = sql:variable("@elementnumber")]') split(el))

       return @ret

END
Matthew Lock
  • 13,144
  • 12
  • 92
  • 130
Jon Egerton
  • 40,401
  • 11
  • 97
  • 129
  • Great solution. I wonder what is more expensive table variable with identity or XML. Jon, I'll have to create your function and run side by side with CTE solution on a large result set and see which uses more resources. – Gary Kindel Oct 18 '13 at 13:22
  • Would be interested to know - the CTE does a lot of string operations. For your example (with numbers only in the elements) I didn't bother parsing out illegals and replacing with xml entities (eg `' => @apos;`). With more complex strings you might need to (however its often not the case with split operations) – Jon Egerton Oct 18 '13 at 13:50
  • Just re-read your comment.If you want to store the stuff you can still save the output to the table variable easily enough. The comparison is between the two parsing mechanisms. – Jon Egerton Oct 19 '13 at 21:12
  • @JonEgerton This can be done much easier... There's no need first to create a list with `.nodes()` and then pick out the right element. You can use the position directly. I placed an answer myself, might be interesting for you... – Shnugo Jul 09 '16 at 15:38
  • I added some XML escaping in case @list has any XML chars like & in it – Matthew Lock Oct 31 '19 at 08:11
1

you can put this select into UFN. if you need you can customize it for specifying delimiter as well. in that case your ufn will have two input. number Nth and delimiter to use.

    DECLARE @tlist varchar(max)='10,20,30,40,50,60,70,80,90,100'
    DECLARE @i INT=1, @nth INT=3
    While len(@tlist) <> 0
    BEGIN
            IF @i=@nth
            BEGIN
              select Case when charindex(',',@tlist) <> 0 Then LEFT(@tlist,charindex(',',@tlist)-1)
                          Else @tlist
                    END
            END

              Select @tlist = Case when charindex(',',@tlist) <> 0 Then substring(@tlist,charindex(',',@tlist)+1,len(@tlist))
                          Else ''
                          END

            SELECT @i=@i+1
    END
Anup Shah
  • 1,256
  • 10
  • 15
1

Alternatively, one can use xml, nodes() and ROW_NUMBER. We can order the elements based on their document order. For example:

DECLARE @Input VARCHAR(100) = '1a,2b,3c,4d,5e,6f,7g,8h'
       ,@Number TINYINT = 3

DECLARE @XML XML;
DECLARE @value VARCHAR(100);

SET @XML = CAST('<x>' + REPLACE(@Input,',','</x><x>') + '</x>' AS XML);

WITH DataSource ([rowID], [rowValue]) AS
(
    SELECT ROW_NUMBER() OVER (ORDER BY T.c ASC) 
            ,T.c.value('.', 'VARCHAR(100)')
    FROM @XML.nodes('./x') T(c)
)
SELECT @value = [rowValue]
FROM DataSource
WHERE [rowID] = @Number;

SELECT @value;
gotqn
  • 42,737
  • 46
  • 157
  • 243
0

I would rather create a temp table with an identity column and fill it up with output from the SPLIT function.

  CREATE TABLE #tblVals(Id INT IDENTITY(1,1), Val NVARCHAR(100))
  INSERT INTO #tblVals (Val)
  SELECT [value] FROM STRING_SPLIT('Val1-Val3-Val2-Val5', '-')
  SELECT * FROM #tblVals

Now you can easily do something like below.

DECLARE @val2 NVARCHAR(100) = (SELECT TOP 1 Val FROM #tblVals WHERE Id = 2)

See the snapshot below:

see the snapshot

C. Peck
  • 3,641
  • 3
  • 19
  • 36
  • 1
    You don't have any guarantee that STRING_SPLIT will return the values in the same order they are in the input string. I quote [the documentation](https://learn.microsoft.com/en-us/sql/t-sql/functions/string-split-transact-sql?view=sql-server-ver15) for STRING SPLIT: `The output rows might be in any order. The order is not guaranteed to match the order of the substrings in the input string. ` Also, you don't need a temp table, auto int etc, you could just `SELECT[value],ROW_NUMBER()OVER() as i FROM STRING_SPLIT...`, but as noted.. there is no order guarantee.. – Caius Jard Jun 03 '21 at 09:20
0

We have the answer over below url.

DECLARE @ AS VARCHAR(MAX) = 'Pawan1,Pawan2,Pawan4,Pawan3' 
SELECT VALUE FROM 
(
    SELECT VALUE , ROW_NUMBER() OVER (ORDER BY (SELECT null)) rnk FROM STRING_SPLIT(@, ',')
)x where rnk = 3
GO

https://msbiskills.com/2018/06/15/sql-puzzle-multiple-ways-to-split-a-string-and-get-nth-row-xml-advanced-sql/

Prabhat
  • 772
  • 1
  • 8
  • 22
-1

You can use STRING_SPLIT with ROW_NUMBER:

SELECT value, idx FROM
(
  SELECT
    value,
    ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) idx
  FROM STRING_SPLIT('Lorem ipsum dolor sit amet.', ' ')
) t
WHERE idx=2

returns second element (idx=2): 'ipsum'

tibx
  • 840
  • 13
  • 20
  • 2
    The [fine manual](https://learn.microsoft.com/en-us/sql/t-sql/functions/string-split-transact-sql?view=sql-server-ver15) states `The output rows might be in any order. The order is not guaranteed to match the order of the substrings in the input string. ` – Caius Jard Jun 03 '21 at 09:25
-1

you can create simple table variable and use it as below

Declare @tbl_split Table (Id INT IDENTITY(1,1), VAL VARCHAR(50))
INSERT @tbl_split SELECT VALUE
FROM string_split('999999:01', ':')

Select val from @tbl_split
WHERE Id=2
buddemat
  • 4,552
  • 14
  • 29
  • 49
-2

I don't have enough reputation to comment, so I am adding an answer. Please adjust as appropriate.

I have a problem with Gary Kindel's answer for cases where there is nothing between the two delimiters

If you do select * from dbo.GetSplitString_CTE('abc^def^^ghi','^',3) you get ghi instead of an empty string

If you comment out the WHERE LEN([value]) > 0 line, you get the desired result

hanksterr7
  • 105
  • 1
  • 4
-2

I cannot comment on Gary's solution because of my low reputation

I know Gary was referencing another link.

I have struggled to understand why we need this variable

@ld INT = LEN(@Delimiter)

I also don't understand why charindex has to start at the position of length of delimiter, @ld

I tested with many examples with a single character delimiter, and they work. Most of the time, delimiter character is a single character. However, since the developer included the ld as length of delimiter, the code has to work for delimiters that have more than one character

In this case, the following case will fail

11,,,22,,,33,,,44,,,55,,,

I cloned from the codes from this link. http://codebetter.com/raymondlewallen/2005/10/26/quick-t-sql-to-parse-a-delimited-string/

I have tested various scenarios including the delimiters that have more than one character

alter FUNCTION [dbo].[split1]
(
    @string1 VARCHAR(8000) -- List of delimited items
    , @Delimiter VARCHAR(40) = ',' -- delimiter that separates items
    , @ElementNumber int
)
RETURNS varchar(8000)
AS
BEGIN
    declare @position int
    declare @piece varchar(8000)=''
    declare @returnVal varchar(8000)=''
    declare @Pattern varchar(50) = '%' + @Delimiter + '%'
    declare @counter int =0
    declare @ld int = len(@Delimiter)
    declare @ls1 int = len (@string1)
    declare @foundit int = 0

    if patindex(@Pattern , @string1) = 0
        return  ''

    if right(rtrim(@string1),1) <> @Delimiter
        set @string1 = @string1  + @Delimiter

    set @position =  patindex(@Pattern , @string1) + @ld  -1  
    while @position > 0
    begin
        set @counter = @counter +1 
        set @ls1  = len (@string1)
        if (@ls1 >= @ld)
            set @piece = left(@string1, @position - @ld)
        else
            break
        if (@counter = @ElementNumber)
        begin
            set @foundit = 1
                break
        end
        if len(@string1) > 0
        begin
            set @string1 = stuff(@string1, 1, @position, '')
            set @position =  patindex(@Pattern , @string1) + @ld  -1  
        end
        else
        set @position = -1
    end 


    if @foundit =1
        set @returnVal = @piece
    else 
        set @returnVal =  ''
    return @returnVal
user12345
  • 181
  • 1
  • 2
  • 18
  • 2
    It seems like you're asking a question here. Are you? If not, please remove the parts where you asking things. – Artjom B. Oct 03 '15 at 16:19