I am trying to remove all accents from a string column in Snowflake. In MSSQL I can use the collate function to do this. For example I have this string in a column: 'JESÚSSMITH12345'. I want to remove the accent from above the U:
In MSSQL I can do the following:
select ('JESÚSSMITH12345' Collate SQL_Latin1_General_CP1253_CI_AI) as identifier
returns: JESUSSMITH12345
However, I can't find a way to do this in Snowflake, other than using the TRANSLATE function. Using the TRANSLATE function isn't a good solution because, I would have to define every possible letter with an accent to do the translation. For example, this works in Snowflake:
select translate('JESÚSSMITH12345', 'Ú', 'U');
returns: JESUSSMITH12345
However, if a column contains any value other than U with an accent, I have to add that to my translation. It's not ideal.
Using the collate function in Snowflake, even with "ai" (a.k.a accent-insensitive), this still returns the string with the accented U.
For example in Snowflake:
select collate('JESÚSSMITH12345', 'en-ai');
returns: JESÚSSMITH12345