I need to remove accented Latin characters such as "é" or "ñ"
from a string in Ruby. I tried using force_encoding('UTF-8'),
but it didn't work.

the Tin Man

jgiunta
- Why do you want to remove them? – SLaks Apr 24 '12 at 22:04
- What about non-Latin characters? – SLaks Apr 24 '12 at 22:04
- Do you actually want to convert them to their non-accented ASCII equivalent? – Mark Thomas Apr 24 '12 at 22:14
- Duplicate of [Transliteration in ruby](http://stackoverflow.com/questions/1726404/transliteration-in-ruby) or [Transliteration with Iconv in Ruby](http://stackoverflow.com/questions/4410340/transliteration-with-iconv-in-ruby) or [How Do I Replace Accented Latin Characters in Ruby](http://stackoverflow.com/questions/225471/how-do-i-replace-accented-latin-characters-in-ruby). – Phrogz Apr 24 '12 at 22:23
- Exactly, I need to convert é to e and ñ to n! Is the best solution to use gsub rather than forcing an explicit encoding? – jgiunta Apr 25 '12 at 12:23
1 Answer
This snippet, which I have used in other answers about Ruby encoding, works most of the time. Make sure the script itself is saved as UTF-8:
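t = "doña"
# Relabel the bytes with the locale's encoding, then transcode them to UTF-8.
# The exact output depends on the locale's charmap.
p t.force_encoding(Encoding.locale_charmap).encode('UTF-8')
#=> "do\u251C\u2592a"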
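To see why force_encoding on its own does not convert anything, here is a minimal sketch. It assumes the input bytes are ISO-8859-1, which is only an illustration since the question does not say what encoding the data is in. force_encoding relabels the existing bytes, while encode transcodes them into a new string:

```ruby
# Assumption: the raw bytes are ISO-8859-1, where "ñ" is the single byte 0xF1.
raw = "do\xF1a".b                          # binary string, no useful encoding label

# force_encoding relabels the same bytes; nothing is converted.
latin1 = raw.force_encoding('ISO-8859-1')

# encode transcodes the bytes into a new UTF-8 string.
utf8 = latin1.encode('UTF-8')

p latin1.bytes  #=> [100, 111, 241, 97]
p utf8.bytes    #=> [100, 111, 195, 177, 97]
```

If the label passed to force_encoding does not match the real encoding of the bytes, the later encode call produces mojibake rather than an error.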
If you want to replace the characters rather than re-encode them, there are libraries for that, but a simple regular expression also works:
t = "déjà"
puts t.gsub(/[éèàùµñçêï]/, '?') #=> d?j?
EDIT: I noticed in the comments that you want to replace each accented character with its plain equivalent. You can do that as follows:
# Each character in the first string is replaced by the character at the same position in the second.
p string_with_special_chars.tr(
  "ÀÁÂÃÄÅàáâãäåĀāĂăĄąÇçĆćĈĉĊċČčÐðĎďĐđÈÉÊËèéêëĒēĔĕĖėĘęĚěĜĝĞğĠġĢģĤĥĦħÌÍÎÏìíîïĨĩĪīĬĭĮįİıĴĵĶķĸĹĺĻļĽľĿŀŁłÑñŃńŅņŇňŉŊŋÒÓÔÕÖØòóôõöøŌōŎŏŐőŔŕŖŗŘřŚśŜŝŞşŠšſŢţŤťŦŧÙÚÛÜùúûüŨũŪūŬŭŮůŰűŲųŴŵÝýÿŶŷŸŹźŻżŽž",
  "AAAAAAaaaaaaAaAaAaCcCcCcCcCcDdDdDdEEEEeeeeEeEeEeEeEeGgGgGgGgHhHhIIIIiiiiIiIiIiIiIiJjKkkLlLlLlLlLlNnNnNnNnnNnOOOOOOooooooOoOoOoRrRrRrSsSsSsSssTtTtTtUUUUuuuuUuUuUuUuUuUuWwYyyYyYZzZzZz")

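As an alternative to a hand-written table, the sketch below uses Unicode normalization. This is a different technique from the tr call above and is not part of the original answer. It assumes Ruby 2.2 or later (for String#unicode_normalize), and strip_accents is a hypothetical helper name chosen for illustration.

```ruby
# Hypothetical helper: decompose each character into a base letter plus
# combining marks (NFD), then delete the combining marks.
def strip_accents(text)
  text.unicode_normalize(:nfd).gsub(/\p{Mn}/, '')
end

puts strip_accents("doña")  # prints dona
puts strip_accents("déjà")  # prints deja
```

Characters that have no canonical decomposition, such as "ø" or "đ", pass through unchanged, so the tr table still covers cases this approach misses.
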
peter