I am trying to read a text file that contains many strings with accented characters (diacritics), and fill a database with those strings stripped of their accents, using plain Ruby (not Rails).
For example, I have:
J'ai été mise au courant des éventualités à temps.
I want to convert the whole line into the following string:
J'ai ete mise au courant des eventualites a temps.
For that, I found this method, which should work:
def convert_to_ascii(s)
  # Replacement used for any character that has no entry in the table below.
  undefined = ''
  fallback = { 'À'=>'A', 'Á'=>'A', 'Â'=>'A', 'Ã'=>'A', 'Ä'=>'A',
               'Å'=>'A', 'Æ'=>'AE', 'Ç'=>'C', 'È'=>'E', 'É'=>'E',
               'Ê'=>'E', 'Ë'=>'E', 'Ì'=>'I', 'Í'=>'I', 'Î'=>'I',
               'Ï'=>'I', 'Ñ'=>'N', 'Ò'=>'O', 'Ó'=>'O', 'Ô'=>'O',
               'Õ'=>'O', 'Ö'=>'O', 'Ø'=>'O', 'Ù'=>'U', 'Ú'=>'U',
               'Û'=>'U', 'Ü'=>'U', 'Ý'=>'Y', 'à'=>'a', 'á'=>'a',
               'â'=>'a', 'ã'=>'a', 'ä'=>'a', 'å'=>'a', 'æ'=>'ae',
               'ç'=>'c', 'è'=>'e', 'é'=>'e', 'ê'=>'e', 'ë'=>'e',
               'ì'=>'i', 'í'=>'i', 'î'=>'i', 'ï'=>'i', 'ñ'=>'n',
               'ò'=>'o', 'ó'=>'o', 'ô'=>'o', 'õ'=>'o', 'ö'=>'o',
               'ø'=>'o', 'ù'=>'u', 'ú'=>'u', 'û'=>'u', 'ü'=>'u',
               'ý'=>'y', 'ÿ'=>'y' }
  # The fallback lambda is called for each character that cannot be represented in ASCII.
  s.encode('ASCII', fallback: lambda { |c| fallback.key?(c) ? fallback[c] : undefined })
end
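As a sanity check, here is a minimal sketch of the same technique applied to a string literal that is known to be valid UTF-8. The names `FALLBACK_SAMPLE` and `convert_to_ascii_sample` are hypothetical, and the mapping is deliberately abbreviated. It assumes the source file itself is saved as UTF-8.

```ruby
# Abbreviated, hypothetical mapping for illustration only;
# the full table in the method above works the same way.
FALLBACK_SAMPLE = { 'é' => 'e', 'à' => 'a' }.freeze

def convert_to_ascii_sample(s)
  # Characters that are missing from the table are dropped (replaced by '').
  s.encode('ASCII', fallback: ->(c) { FALLBACK_SAMPLE.fetch(c, '') })
end

# Assumes this source file is saved as UTF-8, so the literal is valid UTF-8.
input = "J'ai été mise au courant des éventualités à temps."
result = convert_to_ascii_sample(input)

puts input.valid_encoding?
puts result
```

With a valid UTF-8 input, the fallback is called once per accented character, so the method itself behaves as expected in this case.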
But it just gives me the following string:
J'ai t mise au courant des ventualits temps.
Or even:
J'ai �t� mise au courant des �ventualit�s temps.
I don't understand why it does not work...
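EDIT:
I was using
file = File.open(i_FileName, 'r:utf-8')
to read the file, but the file is apparently encoded in ISO-8859-1. I replaced it with
file = File.open(i_FileName, 'r:iso-8859-1:utf-8')
which reads the file as ISO-8859-1 and transcodes it to UTF-8, and now it works like a charm!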
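To illustrate the difference between the two open modes, here is a small sketch. It writes a sample line as ISO-8859-1 bytes to a temporary file (the path and variable names are hypothetical) and reads it back both ways. It assumes the input file really is ISO-8859-1, as the fix suggests.

```ruby
require 'tmpdir'

sample = "J'ai été mise au courant des éventualités à temps."

# Hypothetical temporary file holding the line as raw ISO-8859-1 bytes.
path = File.join(Dir.tmpdir, "latin1_sample_#{Process.pid}.txt")
File.binwrite(path, sample.encode('ISO-8859-1'))

# Labelled as UTF-8: the bytes are kept as-is, so accented characters
# (single bytes such as 0xE9) form invalid UTF-8 sequences.
wrong = File.open(path, 'r:utf-8') { |f| f.read }

# External encoding ISO-8859-1, internal encoding UTF-8:
# Ruby transcodes the data to UTF-8 while reading.
right = File.open(path, 'r:iso-8859-1:utf-8') { |f| f.read }

puts wrong.valid_encoding?
puts right.valid_encoding?
puts right == sample

File.delete(path)
```

Calling `valid_encoding?` on the string right after reading is a cheap way to detect this kind of mismatch before passing the data to `encode`.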