5

I have a complex nested replace which I am using to join two tables in MSSQL.

select * from A
  left outer join 
select * from B
on
  replace(
     replace(
        replace(
           replace(
              replace(A.Column1, '1114', ''),
             '1160', ''), 
          '1162', ''),
        '1167', ''),
      '1176', ''),
    '1177', '')  = B.Column1

The whole reason I am doing this is because data in Table1 contains of some noise - numbers like 1160, 1162 etc wheres Table2 is clean characters.

Eg. - Table 1 - 'HELLO1160WORLD'
      Table 2 - 'HELLOWORLD'

Now in my situation I should be able to match them as one entry.

My current approach of nested replace does work but I am not convinced that this is an elegant way to do this. Any help will be much appreciated. Thanks

dopplesoldner
  • 8,891
  • 12
  • 44
  • 55

3 Answers3

4

Maybe use a function to strip the non-numeric characters:

Create Function [dbo].[RemoveNonAlphaCharacters](@Temp VarChar(1000))
Returns VarChar(1000)
AS
Begin

    While PatIndex('%[^a-z]%', @Temp) > 0
        Set @Temp = Stuff(@Temp, PatIndex('%[^a-z]%', @Temp), 1, '')

    Return @Temp
End

Then you will reference this function in your join:

select a.col1 a, b.col1 b
from tablea a
left join tableb b
  on dbo.RemoveNonAlphaCharacters(a.col1) = b.col1

See SQL Fiddle with Demo

Community
  • 1
  • 1
Taryn
  • 242,637
  • 56
  • 362
  • 405
  • Thanks, that would be an elegant solution. The only problem is that I don't want to remove all non-apha numerics. Sorry for not being clear. There could be cases such as `HELLO1114WORLD1` and `HELLOWORLD1` I have a very specific set of values (transaction codes to be specific) which I would like to remove – dopplesoldner Oct 26 '12 at 14:47
  • @dopplesoldner so those two values would not be a match then? Can you post a larger data sample and then the expected result? – Taryn Oct 26 '12 at 14:48
  • Hi @bluefeet I would like the following values to be treated as **matching** `HELLO1114WORLD1 HELLOWORLD1` `M125ICKY1114MOUSE M125ICKYMOUSE` `RONA1160DINHO10 RONALDINHO10` Currently by using my nested replace, I am replacing certain known identifiers - eg `1114`, `1160` to give me exact matches. Hope it helps and thanks again! – dopplesoldner Oct 26 '12 at 15:03
  • @dopplesoldner can you apply the function to both columns? similar to this demo -- http://sqlfiddle.com/#!3/11e59/1 – Taryn Oct 26 '12 at 15:11
  • But that would match `M125ICKY1114MOUSE` to `M999ICKYMOUSE` ? – dopplesoldner Oct 26 '12 at 15:21
  • 1
    Btw thanks for showing me SQL FIddle, I had never heard of it! – dopplesoldner Oct 26 '12 at 15:21
4

The problem is that T-SQL does not easily allow to mark expressions with a name so you can refer to them from a different place. There is a way to do this though:

select replaceN
from T
cross apply (select replace1 = replace(T.col, 'x', 'y')) r1
cross apply (select replace2 = replace(replace1, 'x', 'y')) r2
cross apply (select replace3 = replace(replace2, 'x', 'y')) r3
...

This at least gets rid of the crazy nesting. It has no negative performance impact.

usr
  • 168,620
  • 35
  • 240
  • 369
  • 1
    This is a good solution... once I understood it. "replaceN" being the highest numbered replace in the sequence, in this example it is replace3 – Jonnie Mar 15 '18 at 16:48
0

Bluefeet's suggestion would do definitely do a good job of making your query much simpler. However, if you don't want to bother with a function and keep all your code one place, try this. Before you do the join, you could dump table A into a staging table:

DECLARE @TmpA TABLE(
     Column1 [nvarchar] (50)),
     ...
     )

Insert into @tmpA select * from A

Update @TmpA set Column1=Replace(Replace(Replace(Column1,...)))

Select * from tmpA
  left outer join 
Select * from B
  on tmpA.Column1=B.Column1
PowerUser
  • 11,583
  • 20
  • 64
  • 98