Saving file as UTF-8

Question

I am trying to write a program that translates English to Greek. I take the ASCII code of an English character (e.g. a), add 128 to it, and then save the new char to a file. However, the file still shows 'á' and not 'α', because they have the same decimal value.

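    #include <stdio.h>

    int main(int argc, char *argv[]) {
        FILE *fp1, *fp2;
        int ch;   /* int, not char, so EOF can be told apart from a data byte */

        if (argc < 2) {
            fprintf(stderr, "usage: %s input-file\n", argv[0]);
            return 1;
        }

        fp1 = fopen(argv[1], "r");
        fp2 = fopen("Translated.txt", "w");
        if (fp1 == NULL || fp2 == NULL) {
            perror("fopen");
            return 1;
        }

        /* Add 128 to every input byte and write the result as a single byte. */
        while ((ch = fgetc(fp1)) != EOF) {
            putc(ch + 128, fp2);
        }

        fclose(fp1);
        fclose(fp2);
        printf("File copied Successfully!\n");

        return 0;
    }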

How can I save the file as UTF-8 so that it displays Greek characters? Or is there another way of converting ISO8859-1 to ISO8859-7?

It is not about saving; as you say, the decimal number is the same. It is about displaying. You need to change the encoding for whatever does the displaying. ... or write a graphical renderer yourself. — Yunnosch, Feb 17 '18 at 12:59
There are several classes available that encompass serialization of UTF-8 encoded data files. For example on codeproject.com. — Andrew Truckle, Feb 17 '18 at 13:00
Could you help with understanding the "add 128 to it, **kai** then saving the new char"? — Yunnosch, Feb 17 '18 at 13:00
I don't see any reference to UTF-8 in your code. You are converting from one *codepage* to another. And how that other codepage displays depends on your system: i.e., if you don't do anything more then you'd still "see" the characters in your local, non-Greek code page. — Jongware, Feb 17 '18 at 13:04
@Yunnosch OP just means **and** ;-) which is **kai** in greek (**και** actually) ;-). UTF-8 are 2 bytes while ascii are 1 byte long so for every ascii char one would need to write 2 "chars" length for UTF-8. Take a look here https://stackoverflow.com/questions/30388085/how-to-use-utf-8-in-c-code — PKey, Feb 17 '18 at 13:04
Thanks, I think I will work on my UTF-8 understanding for now! Thank you all, and I know what "and" means ;-) — Andrew Zacharakis, Feb 17 '18 at 13:08
@Plirkee UTF-8 isn't "2 bytes". The length (which is 1-4 bytes) depends on the character you want to encode. — Marco, Feb 17 '18 at 13:28
@d3l I had Greek characters in my mind. But can't argue with what you said - you are absolutely right: 1-4 bytes in general. — PKey, Feb 17 '18 at 14:41
I would suggest keeping to one encoding (UTF-8) and doing a straightforward character substitution rather than magic hat tricks with cross-encoding character code arithmetic. You might want to use the blue table from this curious page on [English-to-Greek transliteration](http://www.sjsu.edu/faculty/watkins/transliteration.htm) (though, it says it is for classical Greek rather than modern). — Tom Blodget, Feb 18 '18 at 16:34
I have a project: I have to create a program that translates Greek to English. The logic is simple, but I can't get the character codes. For example, Greek a (α) has a decimal value of 225 in iso8859-7, but when I print its value I get -59-76, even when I have iso8859-7 encoding. Anything I miss? — Andrew Zacharakis, Feb 20 '18 at 00:18
Start reading [this](https://www.joelonsoftware.com/2003/10/08/the-absolute-minimum-every-software-developer-absolutely-positively-must-know-about-unicode-and-character-sets-no-excuses/). — n. m. could be an AI, Mar 13 '18 at 18:42
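
To make the substitution idea from the comments concrete, here is a minimal sketch, assuming the output file should be UTF-8 rather than ISO8859-7. The function names `to_greek`, `utf8_encode` and `transliterate` and the five-letter table are hypothetical and only for illustration; a real table would have to cover the whole alphabet. Input bytes above 0x7F are treated as ISO8859-1 code points.

```c
#include <stdio.h>

/* Illustrative, deliberately incomplete table (an assumption, not from the
   question): maps a few Latin letters to Greek code points and passes every
   other character through unchanged. */
unsigned to_greek(int c)
{
    switch (c) {
    case 'a': return 0x03B1; /* α */
    case 'b': return 0x03B2; /* β */
    case 'g': return 0x03B3; /* γ */
    case 'd': return 0x03B4; /* δ */
    case 'e': return 0x03B5; /* ε */
    default:  return (unsigned)c;
    }
}

/* Encode a code point below U+0800 as UTF-8.
   Returns the number of bytes written to out (1 or 2). */
int utf8_encode(unsigned cp, unsigned char out[2])
{
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;          /* ASCII stays a single byte */
        return 1;
    }
    out[0] = (unsigned char)(0xC0 | (cp >> 6));   /* lead byte: 110xxxxx */
    out[1] = (unsigned char)(0x80 | (cp & 0x3F)); /* continuation: 10xxxxxx */
    return 2;
}

/* Read bytes from in, substitute letters, and write UTF-8 to out. */
void transliterate(FILE *in, FILE *out)
{
    int c; /* int so that EOF is distinguishable from any byte value */
    unsigned char buf[2];

    while ((c = fgetc(in)) != EOF) {
        int n = utf8_encode(to_greek(c), buf);
        fwrite(buf, 1, (size_t)n, out);
    }
}
```

Called as `transliterate(fp1, fp2)` with the two streams from the question, this writes α as two bytes, so whatever displays Translated.txt has to open it as UTF-8. When inspecting individual bytes, going through `unsigned char` avoids the negative values that appear on platforms where plain `char` is signed.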
