Can we convert a file from ASCII to BIG5?

I have to generate a file in BIG5 encoding from a file that was written as ASCII, and I cannot find a way to change the file's encoding. The file contains Chinese data, which is not displayed when the file is read as ASCII and can only be displayed as BIG5. So once I have created the ASCII file, I need to convert it to BIG5.

Pankaj Anand
  • This question may be useful: http://stackoverflow.com/questions/13281899/display-big5-encoding-from-dbase-on-the-web – Mihai8 Jan 31 '13 at 09:03
  • What does the ASCII file contain? Please show some examples. If the file contains Chinese as transcribed in Latin letters (using e.g. pinyin or Wade–Giles), then it is something completely different from character encoding conversion. Rewriting transcribed Chinese into normal Chinese “ideographs” (Han characters) is rather complicated, because the same transcribed word may map to many different Chinese words. – Jukka K. Korpela Jan 31 '13 at 09:20
  • @user1929959 : Thanks, but that question is a little different. In my case I want to convert a file from ASCII to BIG5. – Pankaj Anand Feb 06 '13 at 08:59
  • @JukkaK.Korpela : I have some traditional Chinese in the file, for example " 松大道 " and similar characters. – Pankaj Anand Feb 06 '13 at 09:03
  • " 松大道 " cannot be written in ASCII. ASCII is a 7-bit encoding that has 128 code positions only, for control characters, basic Latin letters, common digits, and a handful of other characters. – Jukka K. Korpela Feb 06 '13 at 10:36
  • @JukkaK.Korpela : OK, so basically we can convert BIG5 into ASCII, but the characters won't show up. Thanks. – Pankaj Anand Feb 12 '13 at 12:13
  • No, you cannot convert BIG5 into ASCII. You should rephrase your problem, starting from a description of how (which software, which settings?) the purportedly ASCII file was created. – Jukka K. Korpela Feb 12 '13 at 12:22
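The point made in the comments above, that " 松大道 " has no ASCII representation, can be checked with a minimal Python sketch. The helper name `is_pure_ascii` is hypothetical and introduced only for illustration; the codec names `ascii` and `big5` are standard Python codecs.

```python
# Sketch: ASCII covers only byte values 0x00-0x7F, so Han characters
# cannot be encoded as ASCII, while Big5 uses two bytes per character.
text = "松大道"


def is_pure_ascii(data: bytes) -> bool:
    """Hypothetical helper: True if every byte fits in 7 bits."""
    return all(b < 0x80 for b in data)


try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    # Raised because none of these characters exist in ASCII.
    ascii_ok = False

big5_bytes = text.encode("big5")

print(ascii_ok)                    # whether strict ASCII encoding succeeded
print(len(big5_bytes))             # number of bytes in the Big5 form
print(is_pure_ascii(big5_bytes))   # whether the Big5 bytes are 7-bit only
```

Running the same `is_pure_ascii` check on the bytes of the real file is a quick way to confirm whether it is ASCII at all.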

1 Answer

I have no idea how a file in ASCII encoding could contain Chinese data, but if it were possible, this would be the command:

iconv -f ASCII -t BIG5 asciifile -o big5file.txt

It will convert your file in ASCII encoding to BIG5 and write the output to big5file.txt.

But most likely the original file is not actually ASCII. Make sure you detect its exact encoding and then pass that encoding to -f in the command. Use iconv -l to list all available encodings.
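For readers without iconv, the same conversion can be sketched in Python. This is only an illustration: the function name `transcode` is hypothetical, and UTF-8 as the source encoding is an assumption made for the demo, not something the question establishes.

```python
import tempfile
from pathlib import Path


def transcode(src: Path, dst: Path, src_encoding: str, dst_encoding: str = "big5") -> None:
    """Decode src with src_encoding and write it to dst in dst_encoding."""
    # Strict mode: bytes that are invalid in src_encoding, or characters that
    # dst_encoding cannot represent, raise an error instead of becoming "?".
    text = src.read_bytes().decode(src_encoding)
    dst.write_bytes(text.encode(dst_encoding))


# Demo on a throwaway file; UTF-8 input is an assumption for illustration.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "input.txt"
    dst = Path(tmp) / "big5file.txt"
    src.write_bytes("松大道\n".encode("utf-8"))
    transcode(src, dst, "utf-8")
    result = dst.read_bytes()

print(result == "松大道\n".encode("big5"))
```

Working on raw bytes rather than text-mode files keeps line endings untouched, which is closer to what iconv does.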

You can try to figure out the real encoding with chardet or cchardet. If not available in your terminal, you can install it with pip install chardet (or pip install cchardet).

Once installed, pass the file name as the first argument:

 chardet Tian.Jiang.Xiong.Shi.srt 
      >>> Tian.Jiang.Xiong.Shi.srt: GB2312 with confidence 0.99

If you install with pip3, the script may instead be named chardet3 or chardetect3.
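If neither tool can be installed, a much cruder check is possible with the standard library alone: try decoding the bytes with a few candidate encodings and keep those that do not fail. This is a different technique from the statistical detection chardet performs, and it can return several candidates at once, so treat it as a rough filter. The function name `plausible_encodings` and the candidate list are assumptions chosen for illustration.

```python
def plausible_encodings(data: bytes, candidates=("ascii", "utf-8", "big5", "gb2312")):
    """Return the candidate encodings that decode data without errors."""
    matches = []
    for name in candidates:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            # These bytes are not valid in this encoding; rule it out.
            continue
        matches.append(name)
    return matches


sample = "松大道".encode("big5")
print(plausible_encodings(sample))         # ASCII is ruled out for these bytes
print(plausible_encodings(b"plain text"))  # pure ASCII bytes decode everywhere
```

A successful decode only shows that the bytes are valid in that encoding, not that the result is the intended text, so the output still needs a visual check.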

ccpizza
  • Thanks, I converted the file from ASCII to BIG5, but why do the Chinese characters appear as "?????"? I really cannot understand it. – Pankaj Anand Feb 06 '13 at 09:05
  • If you get ????, that means the actual encoding is not in fact ASCII but something else. You need to find out what the actual encoding is, for example using some of these tools: http://stackoverflow.com/questions/3759356/what-is-the-most-accurate-encoding-detector – ccpizza Feb 07 '13 at 09:48
  • @JukkaK.Korpela : The answer was accepted because it did convert the BIG5 to ASCII, as verified from the size of the file. The characters won't show because ASCII cannot represent Chinese characters. – Pankaj Anand Feb 12 '13 at 12:16
  • @ccpizza : I am sure that the file was ASCII, because when I was writing the file I explicitly specified in the parameters that ASCII encoding should be used. But the link you provided was also very helpful. Thanks. – Pankaj Anand Feb 12 '13 at 12:18
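One plausible explanation for the "?????" discussed in these comments, sketched in Python: if the writing software was told to use ASCII and substitutes a placeholder for unencodable characters, each Chinese character is stored as a literal question mark and the original text is gone. Whether the asker's software behaved this way is an assumption; `errors="replace"` is simply Python's way of reproducing that behaviour.

```python
text = "松大道"

# Lossy write: every character ASCII cannot represent becomes "?".
lossy = text.encode("ascii", errors="replace")
print(lossy)

# Converting the damaged ASCII bytes to Big5 keeps the question marks;
# the original characters cannot be recovered from them.
recovered = lossy.decode("ascii").encode("big5")
print(recovered)

# A real Big5 encoding of the text has a different size (two bytes per character).
print(len(lossy), len(text.encode("big5")))
```

This also accounts for the file-size difference mentioned above: the size changes because characters were replaced, not because the text was converted.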