I want to have a function in Redshift that removes accents from words. I found a question on Stack Overflow with Python code for doing this. I have tried a few solutions, one of them being:

import unicodedata
def remove_accents(accented_string):
    nfkd_form = unicodedata.normalize('NFKD', input_str)
    return u"".join([c for c in nfkd_form if not unicodedata.combining(c)])

Then I create the function in Redshift as follows:

create function remove_accents(accented_string varchar)
returns varchar
immutable
as $$
import unicodedata
def remove_accents(accented_string):
    nfkd_form = unicodedata.normalize('NFKD', input_str)
    return u"".join([c for c in nfkd_form if not unicodedata.combining(c)])
$$ language plpythonu;

And I apply it to a column with:

SELECT remove_accents(city) FROM info_geo

This returns only null values. The column city is of type varchar. Why am I getting null values, and how can I solve it?

Javier Lopez Tomas

1 Answer

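The code between the $$ markers is already the body of the UDF, so you don't need to define a Python function inside it. Your version only defines remove_accents and never calls it or returns anything, so the UDF returns None, which shows up as null. It also refers to input_str, which is never defined; the argument is named accented_string. Either add a call to the function and return its result, or write the body as a plain expression: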

create function remove_accents(accented_string varchar)
returns varchar
immutable
as $$
  import unicodedata
  nfkd_form = unicodedata.normalize('NFKD', accented_string)
  return u"".join([c for c in nfkd_form if not unicodedata.combining(c)])
$$ language plpythonu;
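
To see the difference outside Redshift, here is a minimal local sketch in plain Python. It assumes the UDF body behaves like the body of an ordinary Python function, which is my reading of the behaviour you describe; the names udf_body_without_call and udf_body_with_call are made up for illustration and are not part of Redshift.

```python
import unicodedata


def remove_accents(accented_string):
    # NFKD splits each accented character into a base character followed by
    # combining marks; dropping the combining marks leaves the base character.
    nfkd_form = unicodedata.normalize('NFKD', accented_string)
    return u"".join(c for c in nfkd_form if not unicodedata.combining(c))


def udf_body_without_call(accented_string):
    # Hypothetical stand-in for the original UDF body: a helper is defined
    # but never called, and nothing is returned, so the result is None.
    def helper(s):
        return remove_accents(s)


def udf_body_with_call(accented_string):
    # Hypothetical stand-in for the "add a call" option: define the helper,
    # then return the result of calling it on the argument.
    def helper(s):
        return remove_accents(s)
    return helper(accented_string)


if __name__ == "__main__":
    print(udf_body_without_call(u"M\u00e1laga"))
    print(udf_body_with_call(u"M\u00e1laga"))
```

The first call returns None because the function body never reaches a return statement, which matches the null values in your query. The second one returns the string with the accent removed.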
Yann