I want to convert a Windows UTF-8 file containing a special apostrophe to a Unix ISO-8859-1 file. This is how I am doing it:
# -- convert to unix line endings (strip carriage returns)
tr -d '\015' < my_utf8_file.xml > t_my_utf8_file.xml
# -- get rid of special apostrophe
sed "s/’/'/g" t_my_utf8_file.xml > temp_my_utf8_file.xml
# -- change the xml header
sed "s/UTF-8/ISO-8859-1/g" temp_my_utf8_file.xml > my_utf8_file_temp.xml
# -- the actual character set conversion
iconv -c -f UTF-8 -t ISO8859-1 my_utf8_file_temp.xml > my_file.xml
Everything works except for one thing in one of my files. It seems there is an invisible character at the beginning of the original file. When I open my_file.xml in Notepad++, I see a SUB at the beginning of the file. In vi on Unix I see ^Z.
What should I add to my Unix script, and where, to delete this kind of character?
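One possible explanation, not verified for this file, is that the invisible character is a UTF-8 byte order mark (bytes EF BB BF), which some Windows editors write at the start of a file. Under that assumption, a minimal sketch of an extra first step could look like the following. The function name `strip_bom` is hypothetical and not part of the script above.

```shell
# Hypothetical helper: print the file, dropping a leading UTF-8 BOM if present.
# Assumption: the invisible character is the 3-byte sequence EF BB BF.
strip_bom() {
    # Read the first three bytes and render them as hex, e.g. "efbbbf"
    first3=$(dd if="$1" bs=3 count=1 2>/dev/null | od -An -tx1 | tr -d ' \t\n')
    if [ "$first3" = "efbbbf" ]; then
        tail -c +4 "$1"    # start output at byte 4, skipping the BOM
    else
        cat "$1"           # no BOM found, pass the file through unchanged
    fi
}
```

It could be run before the other steps, for example `strip_bom my_utf8_file.xml | tr -d '\015' > t_my_utf8_file.xml`. If the first three bytes turn out to be something else, this sketch leaves the file untouched.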
Thank you