6

Is there a way to split a string (from a specific column) to n-number chars without breaking words, with each result in its own row?

Example:

2012-04-24 Change request #3 for the contract per terms and conditions and per John Smith in the PSO department  Customer states terms should be Net 60 not Net 30.  Please review signed contract for this information.

Results:

2012-04-24 Change request #3 for the contract per terms and conditions and per John Smith in the
PSO department  Customer states terms should be Net 60 not Net 30.
Please review signed contract for this information.

I know I can use charindex to find the last space, but im not sure how i can get the remaining ones and return them as rows.

Daniel A. White
  • 187,200
  • 47
  • 362
  • 445
  • Check out the code for PrintMax, which I found here . . . http://weblogs.asp.net/bdill/archive/2007/09/29/sql-server-print-max.aspx. It looks for line breaks but you can just as easily look for spaces. – Gordon Linoff Jun 01 '12 at 15:03
  • Take a look at Jeff Moden's Split8k function [here](http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=153458). That could probably be modified to do what you want. – Chad Jun 01 '12 at 15:04
  • You could always use .NET Regex via SQL CLR UDF. – SliverNinja - MSFT Jun 01 '12 at 15:05

4 Answers4

4

Try something like this. May be your can create a SQL function of following implementation.

DECLARE @Str VARCHAR(1000)
SET @Str = '2012-04-24 Change request #3 for the contract per terms and conditions and per John Smith in the PSO department  Customer states terms should be Net 60 not Net 30.  Please review signed contract for this information.'

DECLARE @End INT
DECLARE @Split INT

SET @Split = 100

declare @SomeTable table
(
  Content varchar(3000)
)


WHILE (LEN(@Str) > 0)
BEGIN
    IF (LEN(@Str) > @Split)
    BEGIN
        SET @End = LEN(LEFT(@Str, @Split)) - CHARINDEX(' ', REVERSE(LEFT(@Str, @Split)))
        INSERT INTO @SomeTable VALUES (RTRIM(LTRIM(LEFT(LEFT(@Str, @Split), @End))))
        SET @Str = SUBSTRING(@Str, @End + 1, LEN(@Str))
    END
    ELSE
    BEGIN
        INSERT INTO @SomeTable VALUES (RTRIM(LTRIM(@Str)))
        SET @Str = ''
    END
END

SELECT *
FROM @SomeTable

Output will be like this:

2012-04-24 Change request #3 for the contract per terms and conditions and per John Smith in the
PSO department  Customer states terms should be Net 60 not Net 30.  Please review signed contract
for this information.
Firoz Ansari
  • 2,505
  • 1
  • 23
  • 36
  • i don't want it printed, i want it returned as separate rows in a record set. – Daniel A. White Jun 01 '12 at 16:45
  • This is so slow and useless in small size. For example when i want to split every 3 character. – QMaster Oct 19 '14 at 18:17
  • 1
    Thanks for this... this is one of those cookbook snippets that you need in your back pocket. i used this as a base for something i needed but also learned a lot while messing with this. – yedevtxt Apr 30 '15 at 02:30
2

Just to see if it could be done, I came up with a solution that doesn't loop. It's based on somebody else's function to split a string based on a delimiter.

Note: This requires that you know the maximum token length ahead of time. The function will stop returning lines upon encountering a token longer than the specified line length. There are probably other bugs lurking as well, so use this code at your own caution.

CREATE FUNCTION SplitLines
(
    @pString    VARCHAR(7999),
    @pLineLen   INT,
    @pDelim     CHAR(1)
)
RETURNS TABLE
   WITH SCHEMABINDING
AS  
RETURN
WITH
      E1(N) AS ( --=== Create Ten 1's
                 SELECT 1 UNION ALL SELECT 1 UNION ALL
                 SELECT 1 UNION ALL SELECT 1 UNION ALL
                 SELECT 1 UNION ALL SELECT 1 UNION ALL
                 SELECT 1 UNION ALL SELECT 1 UNION ALL
                 SELECT 1 UNION ALL SELECT 1 --10
               ),
      E2(N) AS (SELECT 1 FROM E1 a, E1 b),   --100
      E4(N) AS (SELECT 1 FROM E2 a, E2 b),   --10,000
cteTally(N) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT N)) FROM E4),
lines AS (
  SELECT TOP 1
         1 as LineNumber,
         ltrim(rtrim(SUBSTRING(@pString, 1, N))) as Line,
         N + 1 as start
  FROM cteTally
  WHERE N <= DATALENGTH(@pString) + 1
    AND N <= @pLineLen + 1
    AND SUBSTRING(@pString + @pDelim, N, 1) = @pDelim
  ORDER BY N DESC
UNION ALL
  SELECT LineNumber, Line, start
  FROM (
    SELECT LineNumber + 1 as LineNumber,
           ltrim(rtrim(SUBSTRING(@pString, start, N))) as Line,
           start + N + 1 as start,
           ROW_NUMBER() OVER (ORDER BY N DESC) as r
    FROM cteTally, lines
    WHERE N <= DATALENGTH(@pString) + 1 - start
      AND N <= @pLineLen
      AND SUBSTRING(@pString + @pDelim, start + N, 1) = @pDelim
  ) A
  WHERE r = 1
)
SELECT LineNumber, Line
FROM lines

It's actually quite fast and you can do cool things like join on it. Here's a simple example that gets the first 'line' from every row in a table:

declare @table table (
  id int,
  paragraph varchar(7999)
)
insert into @table values (1, '2012-04-24 Change request #3 for the contract per terms and conditions and per John Smith in the PSO department  Customer states terms should be Net 60 not Net 30.  Please review signed contract for this information.')
insert into @table values (2, 'Is there a way to split a string (from a specific column) to n-number chars without breaking words, with each result in its own row?')

select t.id, l.LineNumber, l.Line, len(Line)
from @table t
cross apply SplitLines(t.paragraph, 42, ' ') l
where l.LineNumber = 1
Chad
  • 7,279
  • 2
  • 24
  • 34
  • A big error. Try to set pLineLen as small value for example 5 and you will see it can't get result. I checked it against variable length as pString and found when pString is enough long this is working for pLineLen => 11 and and can be change by pString length but in any length you can't set pLineLen smaller than 11. This is so big trouble and i Think useless in most of scenarios. – QMaster Oct 20 '14 at 09:10
  • After a bit of experimentation, I think `@pLineLen` needs to be set to at least the largest token size plus two. This makes sense because you'd be splitting a word in half otherwise. In the example above, "information." is length 12, so `@pLineLen` needs to be set to at least 14 to get everything. – Bret May 30 '17 at 15:05
  • 1
    @Bret, thanks for pointing that out. I fixed some bugs in the code, so now it should behave (mostly) correct as long as `@pLineLen` is at least as big as the maximum token length. – Chad May 30 '17 at 17:49
2

I read some articles and each of them has error or bad performance or not working in small or big length of chunk we want. You can read my comments even in this article below of any answer. Finally i found a good answer and decided to share it in this question. I didn't check performance in various scenarios but i think is acceptable and working fine for small and big chunk length. This is the code:

CREATE function SplitString
(   
    @str varchar(max),
    @length int
)
RETURNS @Results TABLE( Result varchar(50),Sequence INT ) 
AS
BEGIN

DECLARE @Sequence INT 
SET @Sequence = 1

    DECLARE @s varchar(50)
    WHILE len(@str) > 0
    BEGIN
        SET @s = left(@str, @length)
        INSERT @Results VALUES (@s,@Sequence)

        IF(len(@str)<@length)
        BREAK

        SET @str = right(@str, len(@str) - @length)
        SET @Sequence = @Sequence + 1
    END
    RETURN 
END

and source is @Rhyno answer on this question: TSQL UDF To Split String Every 8 Characters

Hope this help.

Community
  • 1
  • 1
QMaster
  • 3,743
  • 3
  • 43
  • 56
  • 2
    This doesn't answer OP's question, because it doesn't preserve words (IE if there is a word at the 50-character mark, it will slice it in half instead of moving the whole word to the next line). But... it does happen to be exactly what I needed. :) – Keiki Aug 17 '17 at 21:30
  • @Keiki Thanks for your heeds, I'll check it asap. – QMaster Aug 17 '17 at 23:04
0

I know this is a bit late but a recursive cte would allow to achieve this.

Also you could make use of a seed table containing a sequence of numbers to feed into the substring as a multiplier for the start index.

Chris Moutray
  • 18,029
  • 7
  • 45
  • 66
  • @Chris i answered a question in Information Security site and got 5 minus point just for not explain answer in detail even i wrote some useful links additionally. Your answer is not useful without explain, code, sample and link. i don't minus point in it but please careful and do the best when you answer a question. – QMaster Oct 20 '14 at 09:16
  • @QMaster I appreciate your comment and really do wish I had the time to give full explanations to my answers - unfortunately posting answers is *mostly* a thank-less hobby. I bothered to answer in this case because I though putting a different view point would be useful, and I think Chad Henderson provided an answer after which shows CTE example. Also personally I think negative votes should be saved for misleading or incorrect answers. Based on the number of votes my answer has received, it sits on the fence of being neither useful nor incorrect. – Chris Moutray Oct 20 '14 at 11:21
  • @ChrisMoutray i see your good idea and try to help. I just want to hint you because of my experience. Can you imagine -5 points? :) Finally an expert user decided to delete answer and even i wrote new answer with full explanation no one point +1 :) you can look at this to know my honesty: http://security.stackexchange.com/questions/49315/escaping-rich-text-editor-output/66146#66146 Anyway i appreciate your try and being here. – QMaster Oct 20 '14 at 13:10