0

I have read numerous posts about how to do exactly this and I can not get this to work. I have a table with a nvarchar(max) column that can potentially contain dozens of pages worth of text. I need to split this into separate rows where each carriage return is entered (to split out the paragraphs).

I found the below function:

CREATE FUNCTION [dbo].[udf-Str-Parse] 
    (@String varchar(max), @Delimiter varchar(10))
Returns Table 
As
    Return 
        (Select 
             RetSeq = Row_Number() over (Order By (Select null)),
             RetVal = LTrim(RTrim(B.i.value('(./text())[1]', 'varchar(max)')))
         From 
             (Select x = Cast('<x>'+ Replace(@String, @Delimiter, '</x><x>') + '</x>' as xml).query('.')) as A 
         Cross Apply 
             x.nodes('x') AS B(i)
);

I have tested this on a simple dummy table using spaces as a delimiter and it works, but when I try this on my actual table with text using the following code query;

Select 
    A.AssignmentID, Notes = B.RetVal
From  
    dbo.Assignments A
Cross Apply 
    [dbo].[udf-Str-Parse](A.TitleNotes, char(13)) B
Where 
    AssignmentID = 20

I get the following message:

Msg 9411, Level 16, State 1, Line 1
XML parsing: line 2, character 127, semicolon expected

I have tried using different delimiters other than char(13), including spaces like with my simple table example, but I just get errors expecting semicolons.

Does anyone see what I am doing wrong?

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
  • 1
    Does your actual data contain any embedded `<` or `>` characters? Or, worse yet, any embedded XML-like syntax (e.g. ``)? – Forty3 Dec 13 '17 at 19:31
  • The text entered wouldn't be any sort of code and contains a lot of comments about title documents. It may be possible for there to be some < or > characters, though I would be surprised to see them. – user2638607 Dec 13 '17 at 19:37
  • I checked checked the text in the particular AssigmentID=20 used in the code above and there aren't any. – user2638607 Dec 13 '17 at 19:38
  • I came across this SO answer [XML Semicolon needed](https://stackoverflow.com/a/40295835/3485669) which may shed some light. tl;dr; - check to make sure your text doesn't contain ampersands. – Forty3 Dec 13 '17 at 19:41
  • Awesome, I changed my select statement to include a replace function to change '&' to '&' and it works beautifully. I suppose I'll need to also make sure < > get replaced if they are entered. Thanks so much. – user2638607 Dec 13 '17 at 19:52
  • Or perhaps it's time to choose [another string splitting function...](https://sqlperformance.com/2012/07/t-sql-queries/split-strings) – Zohar Peled Dec 13 '17 at 19:53
  • with SQL Server 2012 you don't need to reinvent the wheel. Just use `string_split` function. – Alex Kudryashev Dec 13 '17 at 20:45
  • @AlexKudryashev `strig_split` was added on 2016 version... – Zohar Peled Dec 14 '17 at 09:31
  • @ZoharPeled In my SQL 2012-sp3 (v11.0.6251) `string_split` exists and works. – Alex Kudryashev Dec 14 '17 at 16:59
  • @AlexKudryashev That is strange. According to [official documentation](https://learn.microsoft.com/en-us/sql/t-sql/functions/string-split-transact-sql) it was released in 2016 version. Are you sure this isn't just the version of your SSMS? – Zohar Peled Dec 14 '17 at 17:23

1 Answers1

0

It's my function:

CREATE FUNCTION [dbo].[FN_SPLIT](@Long_str varchar(max),@split_str varchar(100))    
RETURNS  @tmp TABLE(        
    ID      inT IDENTITY PRIMARY KEY,      
    SPLIT   varchar(max)    
)    
AS   
BEGIN 
    DECLARE @sxml XML


    SET @sxml='<r><n>'+REPLACE(@Long_str,@split_str,'</n><n>')+'</n></r>'
        INSERT INTO @tmp([SPLIT])
        SELECT b.value('.','varchar(max)') FROM @sxml.nodes('r/n') AS s(b)
    RETURN     
END
Nolan Shang
  • 2,312
  • 1
  • 14
  • 10