-1

everyone I'm trying to separate a name column into 4 different parts. As of right now all the name parts are separated by spaces ' '. I am having trouble with my @thirdString populating the fourth part of the name(usually a suffix) which I want to be considered the @fourthString. I will be running this with different names of different lengths. I'm just using Robert Dobson Bud jr as an example. Other names could be two parts or more.

-- Code for parsing a name with multiple parts
-- You should be able to copy and paste this into any MS-SQL Environment it doesn't use a certain table.

DECLARE     @nameString as varchar(max),
        @firstSpaceLoc as smallint,
        @secondSpaceLoc as smallint,
        @thirdSpaceLoc as smallint,
        @forthSpaceLoc as smallint,
        @firstString as varchar(max),
        @secondString as varchar(max),
        @thirdString as varchar(max),
        @fourthString as varchar(max) 


 -- Create some type of loop or case statement to run through the entire table. 
SET @nameString = 'Robert Dobson Bud jr'

SET @firstSpaceLoc = CHARINDEX(' ',@namestring,1)

SET @secondSpaceLoc = CHARINDEX(' ', @namestring, CHARINDEX(' ',@nameString,1)+1)

SET @thirdSpaceLoc = 
CASE
  WHEN CHARINDEX(' ', 
                          @namestring, 
                          CHARINDEX(' ',@nameString,1)+1) = 0 THEN 0
  WHEN CHARINDEX(' ', 
                          @namestring, 
                          CHARINDEX(' ',@nameString,1)+1) > 0 THEN
                                CHARINDEX(' ', @namestring, 
                                CHARINDEX(' ', @namestring, 
                                CHARINDEX(' ',@nameString,1)+1)+1)
END
SET @forthSpaceLoc =
CASE
WHEN CHARINDEX(' ', 
                          @namestring, 
                          CHARINDEX(' ',@nameString,1)+1) = 0 THEN 0
WHEN CHARINDEX(' ', 
                          @namestring, 
                          CHARINDEX(' ',@nameString,1)+1) > 0 THEN 0
WHEN CHARINDEX(' ', 
                          @namestring, 
                          CHARINDEX(' ',@nameString,1)+1) > 0 THEN
                            CHARINDEX(' ', 
@namestring, 
                                CHARINDEX(' ', @namestring,

CHARINDEX(' ', @nameString, 
                                CHARINDEX(' ',@nameString,1)+1)+1)+1)
END

SELECT

   @firstString = 
   CASE
        WHEN @firstSpaceLoc > 0 THEN LEFT(@nameString,CHARINDEX(' ',@namestring,1)-1)
        ELSE @nameString
   END,
   @secondString =   
   CASE
        WHEN @firstSpaceLoc = 0 THEN ''
        WHEN @secondSpaceLoc = 0 THEN 
                    RIGHT(@namestring, LEN(@namestring)- CHARINDEX(' ',@namestring,1))
        WHEN @secondSpaceLoc > 0 THEN
                    REPLACE     (
                    SUBSTRING   (
                                   @nameString, CHARINDEX(' ',@namestring,1)+1, CHARINDEX(' ', @namestring, CHARINDEX(' ',@nameString,1)+1) 
                                         - CHARINDEX(' ',@namestring,1)),' ',''
                                 )
        ELSE ''
  END,



                                     @thirdString =
  CASE
        WHEN @firstSpaceLoc = 0 OR @secondSpaceLoc = 0  THEN ''
        WHEN @secondSpaceLoc > 0 THEN
                    SUBSTRING   (
                                   @nameString,
                                   CHARINDEX(' ', @namestring, 
                                   CHARINDEX(' ',@nameString,1)+1),
                                   LEN(@nameString)
                                 )             
  END,
  @fourthString =
  CASE
        WHEN @firstSpaceLoc = 0 OR @secondSpaceLoc = 0 OR @thirdSpaceLoc = 0 THEN ''
        WHEN @secondSpaceLoc > 0 AND @thirdSpaceLoc = 0 THEN ''
        WHEN @thirdSpaceLoc > 0 THEN
                        SUBSTRING( 
                                    @nameString,
                                   CHARINDEX(' ', @namestring,
                                   CHARINDEX(' ', @namestring, 
                                   CHARINDEX(' ',@nameString,1)+1)+1),
                                   LEN(@nameString)
                                    )

 END

-- Report names
SELECT
        @nameString sourceString,
        @firstString [First string],
        @secondString [Second string],
        @thirdString [Third string],
        @fourthString [Fourth String]

I would like to get rid of the jr in the 3rd column. The intention is to have 4 different columns with 4 different parts of the name.

Ilyes
  • 14,640
  • 4
  • 29
  • 55
NeoAer
  • 327
  • 3
  • 15
  • 3
    [Falsehoods Programmers Believe About Names](https://www.kalzumeus.com/2010/06/17/falsehoods-programmers-believe-about-names/) – Thom A Mar 28 '19 at 15:36

4 Answers4

1

The reason why you're getting "jr" in the third string is somewhat mystifying. It's in this part of the code:

@thirdString = CASE
    WHEN @firstSpaceLoc = 0 OR @secondSpaceLoc = 0  THEN ''
    WHEN @secondSpaceLoc > 0 THEN
                SUBSTRING   (
                               @nameString,
                               CHARINDEX(' ', @namestring, 
                               CHARINDEX(' ',@nameString,1)+1),
                               LEN(@nameString)
                             )             

Why are you using LEN(@nameString) for the third parameter of the SUBSTRING? Of course that will return the rest of the string, including the "Jr". You clearly knew not to do it that way when getting the @secondString value, how could you not know to do it that way when getting the @thirdString?

To get the @thirdString you need to use the same technique that you used for getting the @secondString.

Tab Alleman
  • 31,483
  • 7
  • 36
  • 52
  • Yes Sir, I know this is where the problem is, but when I tried to apply the same logic as I did in secondString it doesn't come out correctly. I get some of the secondString text or just the incorrect format in general – NeoAer Mar 28 '19 at 15:54
  • You should edit your question to show your attempt to use the same logic in the thirdString, so we can debug why it didn't work for you. By the way, you could make your code easier to read by using the variables instead of repeating their calculations. In other words, instead of `CHARINDEX(' ',@nameString,1)+1` you could simply do `@firstSpaceLoc+1` – Tab Alleman Mar 28 '19 at 16:59
1

This script will do the job

DECLARE     @namestring as varchar(max)
        SET @namestring = 'Robert Dobson Bud jr'
        --SET @namestring = 'Robert Dobson'

        ;with cte as (
            select cast(0 as int) [start],CHARINDEX(' ',@namestring,0) [end] ,@namestring namestring
            union all
            select cast(cte.[end] as int) [start],CHARINDEX(' ',@namestring,cte.[end]+1) [end] ,@namestring namestring from cte where [end]>0
        ),cte2 as (
        select * ,ROW_NUMBER() over (order by cte.[start]) seq
        ,substring(@namestring,cte.[start]+1,(case when cte.[end]=0 then len(@namestring)+1 else cte.[end] end)-cte.[start]-1) part from cte
        )
        select 
         (select part from cte2 where seq=1) [First String]
        ,(select part from cte2 where seq=2) [Second String]
        ,(select part from cte2 where seq=3) [Third String]
        ,(select part from cte2 where seq=4) [Fourt String]

for 4 part name result will be as below

First String    Second String   Third String    Fourt String
Robert          Dobson          Bud         jr

for 2 part name result will be as below

First String    Second String   Third String    Fourt String
Robert          Dobson          NULL            NULL
Ali Al-Mosawi
  • 783
  • 6
  • 12
1

Does this what you want?

DECLARE @Str VARCHAR(45) = 'Robert Dobson Bud jr';

WITH CTE AS
(
  SELECT Value V,
         'Str' + CAST(ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS VARCHAR(10)) RN
  FROM STRING_SPLIT(@Str, ' ')
)
SELECT *
FROM
     (
       SELECT *
       FROM CTE
     ) X
     PIVOT
     (
       MAX(V) FOR RN IN ([Str1], [Str2], [Str3], [Str4])
     ) P;

Returns:

+--------+--------+------+------+
|  Str1  |  Str2  | Str3 | Str4 |
+--------+--------+------+------+
| Robert | Dobson | Bud  | jr   |
+--------+--------+------+------+

Live Demo

Ilyes
  • 14,640
  • 4
  • 29
  • 55
0

Using a splitting function, this can be arranged very simply.

SELECT firstString  = MAX(CASE WHEN ItemNumber = 1 THEN Item END),
       secondString = MAX(CASE WHEN ItemNumber = 2 THEN Item END),
       thirdString  = MAX(CASE WHEN ItemNumber = 3 THEN Item END),
       fourthString = MAX(CASE WHEN ItemNumber = 4 THEN Item END)
FROM dbo.DelimitedSplit8K_LEAD( @nameString, ' ');

The code of the function was initially published and explained here. But I'm copying the definition.

CREATE FUNCTION [dbo].[DelimitedSplit8K_LEAD]  
--===== Define I/O parameters  
        (@pString VARCHAR(8000), @pDelimiter CHAR(1))  
RETURNS TABLE WITH SCHEMABINDING AS  
 RETURN  
--===== "Inline" CTE Driven "Tally Table” produces values from 0 up to 10,000...  
     -- enough to cover VARCHAR(8000)  
 WITH E1(N) AS (  
                 SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL   
                 SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL   
                 SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1  
                ),                          --10E+1 or 10 rows  
       E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows  
       E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max  
 cteTally(N) AS (--==== This provides the "zero base" and limits the number of rows right up front  
                     -- for both a performance gain and prevention of accidental "overruns"  
                 SELECT 0 UNION ALL  
                 SELECT TOP (DATALENGTH(ISNULL(@pString,1))) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4  
                ),  
cteStart(N1) AS (--==== This returns N+1 (starting position of each "element" just once for each delimiter)  
                 SELECT t.N+1  
                   FROM cteTally t  
                  WHERE (SUBSTRING(@pString,t.N,1) = @pDelimiter OR t.N = 0)   
                )  
--===== Do the actual split. The ISNULL/NULLIF combo handles the length for the final element when no delimiter is found.  
 SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY s.N1),  
        Item = SUBSTRING(@pString,s.N1,ISNULL(NULLIF((LEAD(s.N1,1,1) OVER (ORDER BY s.N1) - 1),0)-s.N1,8000))  
   FROM cteStart s  
;  
Luis Cazares
  • 3,495
  • 8
  • 22