I used the translate function in Spark, but I would like to achieve the same result using Spark's regex functions (replace etc.). Could you please help me rewrite it?

df.withColumn("name_surname", translate(col("name_surname"), "ĄąĆcĘeŁłŹźŻŚśÓóŃń", "AaCcEeLlZzZSsOoNn"))
SebastianK

1 Answer

I think there's no Spark function to do that, but you can always use one of the plain Java methods, e.g. as suggested in the answers to this question, and then wrap it in a UDF.

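import org.apache.spark.sql.functions.udf
// Wrap the commons-lang3 helper in a UDF; $"..." assumes spark.implicits._ is in scope (as in spark-shell)
val stripAccents = udf(org.apache.commons.lang3.StringUtils.stripAccents(_))
df.withColumn("name_surname", stripAccents($"name_surname")).show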

+-----------------+
|     name_surname|
+-----------------+
|AaCcEeLlZzZSsOoNn|
+-----------------+
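
Since the question asks for something regex-based, here is a minimal sketch of that idea in plain Scala, using only the JDK. The name stripAccentsRegex is hypothetical (it is not part of Spark or commons-lang3). It decomposes each accented letter with java.text.Normalizer and then removes the combining marks with a regex. Ł and ł have no Unicode decomposition, so they are mapped by hand.

```scala
import java.text.Normalizer

// Hypothetical helper for illustration; not a Spark or commons-lang3 API.
def stripAccentsRegex(s: String): String =
  if (s == null) null
  else Normalizer
    .normalize(s, Normalizer.Form.NFD)    // e.g. "ą" becomes "a" + combining ogonek
    .replaceAll("\\p{M}", "")             // regex: drop all combining marks
    .replace('Ł', 'L').replace('ł', 'l')  // no decomposition exists for Ł/ł

// Assumed Spark usage, wrapped in a UDF like the snippet above:
//   val stripUdf = udf(stripAccentsRegex _)
//   df.withColumn("name_surname", stripUdf($"name_surname"))
```

As far as I know this still has to go through a UDF, because the normalization step has no built-in Spark SQL equivalent, so regexp_replace alone would not be enough.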
Kombajn zbożowy