0

I need to correct Names of users by removing prefixes before I can process the names.

For example, my list of prefixes is:

am, auf, auf dem, aus der, d, da, de, de l’, del, de la, de le, di, do, dos, du, 
im, la, le, mac, mc, mhac, mhíc, mhic giolla, mic, ni, ní, níc, o, ó, 
ua, ui, uí, van, van de, van den, van der, vom, von, von dem, von den, von der

I want to remove any of these prefixes from the First Name if they are present.

For example - inputs:

enter image description here

Outputs:

enter image description here

I know I can take a brute force approach and do a replace 40 odd times, but was wondering if there is a better/smarter way to do this, given the list of names that need to be processed can be in the tens of thousands, daily.

Thank you

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
user1372603
  • 47
  • 1
  • 6
  • If want to pay the time to implement SQL CLR, you can add regex replace function - https://stackoverflow.com/a/31693412/1080354 – gotqn Apr 08 '20 at 10:34
  • 1
    if the text you want removing is at the end of the string this is a suffix not a prefix – Martin Smith Apr 08 '20 at 10:43

1 Answers1

1

You could use apply:

select t.*, v.prefix_free_first_name
from t outer apply
     (select top (1) left(t.first_name, len(t.first_name) - len(v.prefix) - 1) as prefix_free_first_name
      from (values ('am'), ('auf'), . . .
           ) v(prefix)
      where t.first_name like '% ' + v.prefix
      order by len(v.prefix) desc
     ) v;

Note: This handles the situation where multiple prefixes match a name, such as "de le" and "le".

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • Hi Gordon, I am trying to use this in a function which I can apply in every row by passing the "raw" first name and get back the first name without the prefix. Its not working out for me. I'm obviously not using the apply correctly. Additional assistance would be appreciated. Thanks. It wont let me paste my function in the comment, so dont know how to show you – user1372603 Apr 08 '20 at 12:07
  • @user1372603 . . . This answers the question that you have asked here. If you have a question about a UDF, then you should ask that as a *new* question. – Gordon Linoff Apr 08 '20 at 12:09
  • K, I'll post another question when it lets me. regarding your answer though. Is this correct v(prefix). Should that be be v.prefix? – user1372603 Apr 08 '20 at 12:39
  • @user1372603 . . . After the `values` statement, that is the correct syntax. – Gordon Linoff Apr 08 '20 at 13:20
  • That doesnt seem to work. I tried replacing the t.first_name with text 'am john' and also replacing t with the clients table i.e. client t. I just get an error – user1372603 Apr 08 '20 at 20:20