0

I need to parse a full name in the format, prefix first middle last suffix, but not all parts may be included. I have the prefix first middle and last working, but Jr gets stuffed in with the last name. How do I get the suffix to come out in a suffix column? Example includes data.

SELECT
  FIRST_NAME.INPUT_DATA
 ,FIRST_NAME.PREFIX
 ,FIRST_NAME.FIRST_NAME
 ,CASE WHEN 0 = CHARINDEX(' ',FIRST_NAME.REMAINING)
       THEN NULL  --no more spaces found, consider remaining to be last name
       ELSE SUBSTRING(
                       FIRST_NAME.REMAINING
                      ,1
                      ,CHARINDEX(' ',FIRST_NAME.REMAINING)-1
                     )
       END AS MIDDLE_NAME
 ,SUBSTRING(
             FIRST_NAME.REMAINING
            ,1 + CHARINDEX(' ',FIRST_NAME.REMAINING)
            ,LEN(FIRST_NAME.REMAINING)
           ) AS LAST_NAME
FROM
  (  
  SELECT
    PREFIX.PREFIX
   ,CASE WHEN 0 = CHARINDEX(' ',PREFIX.REMAINING)
         THEN PREFIX.REMAINING --no space found, return the entire string
         ELSE SUBSTRING(
                         PREFIX.REMAINING
                        ,1
                        ,CHARINDEX(' ',PREFIX.REMAINING)-1
                       )
    END AS FIRST_NAME
   ,CASE WHEN 0 = CHARINDEX(' ',PREFIX.REMAINING)  
         THEN NULL  --no spaces found, consider to be first name
         ELSE SUBSTRING(
                         PREFIX.REMAINING
                        ,CHARINDEX(' ',PREFIX.REMAINING)+1
                        ,LEN(PREFIX.REMAINING)
                       )
    END AS REMAINING
   ,PREFIX.INPUT_DATA
  FROM
    (   
    SELECT --CLEAN_DATA
      --if  first three characters match list,
      --parse as a "PREFIX".  else return NULL for PREFIX.
      CASE WHEN SUBSTRING(CLEAN_DATA.FULL_NAME,1,3) IN ('MR ','MS ','DR ','MRS')
           THEN LTRIM(RTRIM(SUBSTRING(CLEAN_DATA.FULL_NAME,1,3)))
           ELSE NULL
           END AS PREFIX
     ,CASE WHEN SUBSTRING(CLEAN_DATA.FULL_NAME,1,3) IN ('MR ','MS ','DR ','MRS')
           THEN LTRIM(RTRIM(SUBSTRING(CLEAN_DATA.FULL_NAME,4,LEN(CLEAN_DATA.FULL_NAME))))
           ELSE LTRIM(RTRIM(CLEAN_DATA.FULL_NAME))
           END AS REMAINING
     ,CLEAN_DATA.INPUT_DATA
    FROM
      (
      SELECT
        --trim leading & trailing spaces to prepare for processing
        --replace extra spaces in name
        REPLACE(REPLACE(LTRIM(RTRIM(FULL_NAME)),'  ',' '),'  ',' ') AS FULL_NAME
       ,FULL_NAME AS INPUT_DATA
      FROM
        (
        --test with test data, or table
        --table
            --SELECT        CONTACT AS FULL_NAME
            --FROM            CONTACT

        --test data
        --/*
             SELECT 'Andy D Where' AS FULL_NAME
        UNION SELECT 'Cathy T Landers' AS FULL_NAME
        UNION SELECT 'Ms Annie Wint There' AS FULL_NAME
        UNION SELECT 'Frank Fields' AS FULL_NAME
        UNION SELECT 'Howdy U Pokes Jr.' AS FULL_NAME
        --*/

        ) SOURCE_DATA
      ) CLEAN_DATA
    ) PREFIX
  ) FIRST_NAME

--credits to JStyons of course

Blocks
  • 351
  • 4
  • 12

1 Answers1

0

Hope this helps. I have only added Generational SUFFIX titles(Sr, Jr), If more are needed you could add to the Case statement as needed. I am also assuming that your Db is case insensitive.


Assumption (Business Rules):

  1. First Name has no spaces
  2. Middle Name has no spaces
  3. Last name has no spaces
  4. Prefix's are only of the form 'MR ','MS ','DR ','MRS' with no period "."
  5. Suffix's are only of the form 'Sr', 'Jr', 'Sr.', 'Jr.'
  6. The Database is case insensitive

IF OBJECT_ID('tempdb..#cte_SpaceFix') IS NOT NULL
    DROP TABLE #cte_SpaceFix

;WITH cte_OriginalData (FullName)
AS (
    SELECT 'Andy D Where'

    UNION

    SELECT 'Cathy T  Landers'

    UNION

    SELECT 'Ms Annie    Wint There'

    UNION

    SELECT 'Ms Annie    Wint There Jr'

    UNION

    SELECT 'Mrs Annie     There Jr'

    UNION

    SELECT 'Frank     Fields'

    UNION

    SELECT 'Howdy      U Pokes Jr.'

    UNION

    SELECT 'Howdy U Pokes Sr.'

    UNION

    SELECT 'Cathy T Landers Jr'

    UNION

    SELECT 'Landers Jr'
    )
    ,cte_FullNameRemoveTail AS
    (
    SELECT LTRIM(RTRIM(FullName)) AS FullName
    FROM cte_OriginalData
    )
    ,cte_Parse_Prefix(Prefix,FullFirst_Prefix,FullName) AS
    (
    SELECT CASE 
            WHEN SUBSTRING(FullName, 1, 3) IN ('MR ','MS ','DR ','MRS')
                THEN LTRIM(RTRIM(SUBSTRING(FullName, 1, 3)))
            ELSE NULL
            END AS Prefix,
        CASE 
            WHEN SUBSTRING(FullName, 1, 3) IN ('MR ','MS ','DR ','MRS')
                THEN LTRIM(RTRIM(SUBSTRING(FullName, 4, 8000)))
            ELSE LTRIM(RTRIM(FullName))
            END AS FullFirst_Prefix,
        FullName
    FROM cte_FullNameRemoveTail
    )
    ,cte_Parse_Suffix(Prefix,FullFirst_Prefix_Suffix,Suffix,FullName) AS
    (
    SELECT Prefix,
        CASE 
            WHEN RIGHT(FullFirst_Prefix,3) = ' JR'  THEN LTRIM(RTRIM(SUBSTRING(FullFirst_Prefix,1,LEN(FullFirst_Prefix)-3)))
            WHEN RIGHT(FullFirst_Prefix,4) = ' JR.' THEN LTRIM(RTRIM(SUBSTRING(FullFirst_Prefix,1,LEN(FullFirst_Prefix)-4)))
            WHEN RIGHT(FullFirst_Prefix,3) = ' SR'  THEN LTRIM(RTRIM(SUBSTRING(FullFirst_Prefix,1,LEN(FullFirst_Prefix)-3)))
            WHEN RIGHT(FullFirst_Prefix,4) = ' SR.' THEN LTRIM(RTRIM(SUBSTRING(FullFirst_Prefix,1,LEN(FullFirst_Prefix)-4)))
        ELSE LTRIM(RTRIM(FullFirst_Prefix))
        END AS FullFirst_Prefix_Suffix,
        CASE 
            WHEN RIGHT(FullFirst_Prefix,3) = ' JR'
                OR RIGHT(FullFirst_Prefix,4) = ' JR.'
            THEN 'Jr'
            WHEN RIGHT(FullFirst_Prefix,3) = ' SR' 
                OR RIGHT(FullFirst_Prefix,4) = ' SR.' 
            THEN 'Sr'
            ELSE NULL
        END AS Suffix,
        FullName
    FROM cte_Parse_Prefix
    )
    ,cte_SpaceFix(Prefix, FullFirst_Prefix_Suffix, Suffix, FullName) AS
    (
    SELECT Prefix,
        CASE 
            WHEN LEN(FullFirst_Prefix_Suffix) - LEN(REPLACE(FullFirst_Prefix_Suffix, ' ', '')) > 2 THEN REPLACE(REPLACE(REPLACE(REPLACE(FullFirst_Prefix_Suffix,SPACE(5), SPACE(1)),SPACE(4), SPACE(1)),SPACE(3), SPACE(1)),SPACE(2), SPACE(1))
            ELSE FullFirst_Prefix_Suffix
        END AS FullFirst_Prefix_Suffix,
        Suffix,
        FullName
    FROM cte_Parse_Suffix
    )
    SELECT * INTO #cte_SpaceFix
    FROM cte_SpaceFix

    ;WITH cte_Parse_FirstName(Prefix, FirstName, Suffix, FullFirst_Prefix_Suffix_FirstName, FullName) AS
    (
    SELECT Prefix,
        CASE 
            WHEN FullFirst_Prefix_Suffix IS NULL THEN NULL
            WHEN LEN(FullFirst_Prefix_Suffix) - LEN(REPLACE(FullFirst_Prefix_Suffix, ' ', '')) >= 1 THEN LEFT(FullFirst_Prefix_Suffix,CHARINDEX(' ',FullFirst_Prefix_Suffix))
            WHEN LEN(FullFirst_Prefix_Suffix) - LEN(REPLACE(FullFirst_Prefix_Suffix, ' ', '')) = 0 THEN FullFirst_Prefix_Suffix
            ELSE NULL
        END AS FirstName,
        Suffix,
        CASE 
            WHEN FullFirst_Prefix_Suffix IS NULL THEN NULL
            WHEN LEN(FullFirst_Prefix_Suffix) - LEN(REPLACE(FullFirst_Prefix_Suffix, ' ', '')) >= 1 THEN LTRIM(RTRIM(REPLACE(FullFirst_Prefix_Suffix,LEFT(FullFirst_Prefix_Suffix,CHARINDEX(' ',FullFirst_Prefix_Suffix)),'')))
            WHEN LEN(FullFirst_Prefix_Suffix) - LEN(REPLACE(FullFirst_Prefix_Suffix, ' ', '')) = 0 THEN NULL
            ELSE NULL
        END AS FullFirst_Prefix_Suffix_FirstName,
        FullName
    FROM #cte_SpaceFix
    )
    ,cte_Parse_LastName(Prefix, FirstName, LastName, Suffix, MiddleName, FullName) AS
    (
    SELECT Prefix,
        FirstName,
        CASE 
            WHEN FullFirst_Prefix_Suffix_FirstName IS NULL THEN NULL
            WHEN LEN(FullFirst_Prefix_Suffix_FirstName) - LEN(REPLACE(FullFirst_Prefix_Suffix_FirstName, ' ', '')) >= 1 THEN SUBSTRING(FullFirst_Prefix_Suffix_FirstName,CHARINDEX(' ',FullFirst_Prefix_Suffix_FirstName)+1,8000)
            WHEN LEN(FullFirst_Prefix_Suffix_FirstName) - LEN(REPLACE(FullFirst_Prefix_Suffix_FirstName, ' ', '')) = 0 THEN FullFirst_Prefix_Suffix_FirstName
            ELSE NULL
        END AS LastName,
        Suffix,
        CASE 
            WHEN FullFirst_Prefix_Suffix_FirstName IS NULL THEN NULL
            WHEN LEN(FullFirst_Prefix_Suffix_FirstName) - LEN(REPLACE(FullFirst_Prefix_Suffix_FirstName, ' ', '')) >= 1 THEN LEFT(FullFirst_Prefix_Suffix_FirstName,CHARINDEX(' ',FullFirst_Prefix_Suffix_FirstName))
            ELSE NULL
        END AS MiddleName,
        FullName
    FROM cte_Parse_FirstName
    )
    SELECT Prefix, FirstName, MiddleName, LastName, Suffix--, FullName 
    FROM cte_Parse_LastName

IF OBJECT_ID('tempdb..#cte_SpaceFix') IS NOT NULL
    DROP TABLE #cte_SpaceFix
Gouri Shankar Aechoor
  • 1,561
  • 1
  • 8
  • 8
  • Thanks, that's some great code. The Suffix is still showing up in the last name. How would I get that out. Howdy U Pokes Jr. NULL Howdy U Pokes Jr. JR – Blocks Mar 21 '17 at 14:51
  • Apologies, forgot about the Last Name. Edited code now – Gouri Shankar Aechoor Mar 21 '17 at 15:19
  • It looks like it only works when all 5 parts are in the name. When one is left off then it doesn't parse the other parts.INPUT_DATA PREFIX FIRST_NAME MIDDLE_NAME LAST_NAME SUFFIX Andy D Where NULL Andy D NULL NULL Cathy T Landers NULL Cathy T NULL NULL Cathy T Landers Jr NULL Cathy T Landers Jr Frank Fields NULL Frank NULL NULL NULL Howdy U Pokes Jr. NULL Howdy U Pokes Jr Howdy U Pokes Sr. NULL Howdy U Pokes Sr Ms Annie Wint There Ms Annie Wint NULL NULL – Blocks Mar 21 '17 at 17:35
  • Andy D Where NULL Andy D NULL NULL, it only parsed out Andy – Blocks Mar 21 '17 at 17:59
  • Andy D Where NULL Andy D NULL NULL, it only parsed out Andy and D – Blocks Mar 21 '17 at 18:56
  • Hi @ERPise, I do not understand your question, As per your question prefix first middle and last working and you only had issues with parsing suffix. Once you fix the parsing the prefix first middle and last, my code will automatically work. Are you asking me to try and fix JStyons's original code? – Gouri Shankar Aechoor Mar 22 '17 at 04:38
  • The fact is that there is no consistent way of reverse engineering a FullName to fit in a Fname, MName and LName Construct consistently Different languages and cultures can have their own variations. What i would suggest is that you define a rule and stick to it with a clear understanding that deviation and errors are expected. Best example is my name where Gouri Shankar is my First name but in this code the name fill be split into two where Shankar becomes my middle name which is not correct for me but is correct as per defined logic. Any way, made the changes as per original logic. – Gouri Shankar Aechoor Mar 22 '17 at 06:01
  • Ok, thanks that's what I wanted to find out. I've seen the other method for using input parameters and it works. So, I guess this method doesn't work entirely. – Blocks Mar 22 '17 at 17:12