I'm guessing that you'd like to convert the character set (alias code page) of a file so you can open and read it.
I'm assuming you are using a Windows computer.
Let's say that your file is russian.txt and when you open it with notepad, the characters doesn't make any sense. The russian.txt file's character encoding is most propably ANSI and it's code page is Windows-1251.
Some words about character encoding:
- In ANSI one character is one byte long.
- Different languages have different code pages: Windows-1251 = Russian, Windows-1252 = Western Languages (English, German, Swedish...), Windows-1253 = Greek ...
- In UTF-8 English characters are one byte long and non-English characters two bytes long.
- In Unicode all characters are two bytes long.
- UTF-8 and Unicode doesn't need code pages.
You can check the encoding by opening the file in notepad and clicking File, Save As. At the right bottom corner beside the Save-button you can see the encoding.
With some googling I found a site where you can do the character encoding conversion online. I Haven't tested it, but here's the address:
I've made a script (= a small program) which changes the character encoding from any ANSI and code page combination to UTF-8 or Unicode or vice versa.
Let's say you have and English Windows computer and want to convert the russian.txt (ANSI / Windows-1251) to UTF-8.
Here's how:
- Open this web-page and copy the script in it to the clipboard:
- Create a new file named ConvertCharset.vbs to the same folder, where the russian.txt is, say C:\Temp.
- Open the ConvertCharset.vbs in notepad (right click+edit) and paste.
- Open CMD (Windows-button+R, cmd, Enter).
- In CMD-window type (hit Enter-key at each end of the line):
cd C:\Temp\
cscript ConvertCharset.vbs /InputCharset:Windows-1251 /OutputCharset:utf-8 /InputFile:russian.txt /OutputFile:russian_utf-8.txt
Now the you can open the russian_utf-8.txt in notepad and you'll see the Russian characters OK.
More info: