2

I would like to remove all diacritics from a string column, using Snowflake SQL. For example: ä, ö, é, č => a, o e, c

I acknowledged that the TRANSLATE function would work but only for single case while there are lots of letters with an accent that need to be translated.

Could you please give me some hints? Thanks so much in advance, Hanh

  • I voted to close this question as duplicate (Does this answer your question? [Removing accents from string in Snowflake](https://stackoverflow.com/questions/66606576/removing-accents-from-string-in-snowflake)), while Greg's answer below is great. – Felipe Hoffa Sep 02 '21 at 17:10

1 Answers1

4

Perhaps the safest way to make sure it covers all of them is to draft on the work in ES2015/ES6 to cover all characters like this:

create or replace function REPLACE_DIACRITICS("str" string)
returns string
language javascript
strict immutable
as
$$
    return str.normalize("NFD").replace(/\p{Diacritic}/gu, "");
$$;

select REPLACE_DIACRITICS('ö, é, č => a, o e, c');

JS for the UDF is courtesy of this post: Remove accents/diacritics in a string in JavaScript

Greg Pavlik
  • 10,089
  • 2
  • 12
  • 29