2

I am trying to do a CASE statement that combines two columns that are not alike. However, these two columns are names and the name could actually be the same except it might sometimes contain a suffix. So for instance, I want to bring back the following:

LastName1   | LastName2
------------------------
Ross        | Ross, Jr.
Lee         | Lee
Smith       | Collins
Martin      | Martin
Pierce, Sr. | Pierce

So, my statement looks like this:

SELECT CASE WHEN LastName1 <> LastName2 THEN LastName1 + ', ' + LastName2 ELSE LastName1 END AS LastName

This returns the results as such:

LastName
---------
Ross, Ross, Jr.
Lee
Smith, Collins
Martin
Pierce, Pierce, Sr.

But I would like to use a NOT LIKE (or something similar) and if the first few characters of the LastName1 column are similar to LastName2, only bring back one instance of the name. So my result set would look like:

LastName
---------
Ross
Lee
Smith, Collins
Martin
Pierce
PicoDeGallo
  • 608
  • 9
  • 19
  • probably better to use a fuzzy matching algorithm eg Jaro-Winkler. http://stackoverflow.com/questions/16448297/fuzzy-grouping-in-sql – Johnny Fitz May 02 '14 at 18:39

6 Answers6

2

You can use left to take the first n characters. It is pretty clean LEFT(col, n)

SELECT 
    CASE 
       WHEN left(LastName1,4) <> left(LastName2,4) 
          THEN LastName1 + ', ' + LastName2 
       ELSE LastName1 END AS LastName

But this will be prone to getting false positives, be carful with your data set and the number of characters you select on. In you example you have , which break the data from the suffix. If that is consistent you could select all the text before ,. Here is an example

Community
  • 1
  • 1
TheNorthWes
  • 2,661
  • 19
  • 35
2

I would extend your comparison using like comparison:

SELECT (CASE WHEN LastName1 like LastName2 + '%' THEN LastName1
             WHEN LastName2 like LastName1 + '%' THEN LastName2
             ELSE LastName1 + ', ' + LastName2 
        END) AS LastName
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
0

You could use CHARINDEX to determine whether one of the LastNames subsumes the other. e.g. it appears you want the 'lessor' of the last names if they are similar:

SELECT 
CASE 
  WHEN CHARINDEX(LastName1, LastName2 ) > 0
    THEN LastName1
  WHEN CHARINDEX(LastName2, LastName1 ) > 0
    THEN LastName2
  ELSE 
    LastName1 + ', ' + LastName2
END AS LastName

This will obviously backfire if a customer has a double-barelled name, e.g. Fred James Jameson would wind up as just Fred James (or Jameson).

StuartLC
  • 104,537
  • 17
  • 209
  • 285
0

Would the following modification work?

   SELECT 
        CASE 
            WHEN LastName1 <> SUBSTRING(LastName2, 1, LEN(LastName1)) THEN LastName1 + ', ' + LastName2 
        ELSE LastName1 
        END AS LastName
Joseph B
  • 5,519
  • 1
  • 15
  • 19
0

Not sure if this is what you want, but how about:

Select case 
   When charIndex(lastName1, LastName2) <> 0 then lastName2
   When charIndex(lastName2, LastName1) <> 0 then lastName1
   Else lastName1 + lastName2 End
Charles Bretana
  • 143,358
  • 22
  • 150
  • 216
0

You can use SUBSTRING or LEFT

SELECT CASE WHEN SUBSTRING(LastName1, 1, 4) <> SUBSTRING(LastName2, 1, 4) THEN LastName1 + ', ' + LastName2 ELSE LastName1 END AS LastName

VinB
  • 1
  • 1