
I wrote a C program with the following requirements for an input string and its substrings/characters:

  • Must accept user input from the console and/or read input from a text file
  • Must support reading, comparing, modifying and writing different inputs
  • Must work (at least) on a Windows machine

Because of those requirements, I made the initial version of my program work with ASCII input only, but now I am looking for ways to extend it to support more character sets.

At first, I tried to define my input as a char[] array. With that approach, text-file input was corrupted: each Unicode character came through as multiple bytes. For user input on the console it kind of worked, but only if my code page was correct (e.g. chcp 1252 for extended Latin characters). I do not like that solution, because I cannot predict which language the user will try next.

I also tried wchar_t[] input. That worked for both text-file and user input and only required chcp 65001. But it still required predicting the next input language, because foreign scripts were printed correctly only with matching system settings (for example, for Cyrillic: system("powershell Set-Culture -CultureInfo ru-RU"); ).

Defining the input as a Unicode string seems to work better for just echoing the same input back, but because characters have different lengths, it is hard to access individual characters.
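
To illustrate the variable-length problem, here is a minimal sketch of how individual characters could be reached in a char[] buffer, assuming the Unicode string is UTF-8 encoded. The function names utf8_decode and utf8_count are hypothetical helpers, not part of any library, and the sketch works on code points only; it does not handle combining marks or other multi-code-point characters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: decode one code point from the UTF-8 bytes at s
   (n bytes available). Stores it in *cp and returns the number of bytes
   consumed (1 to 4), or 0 if the bytes are not a well-formed sequence. */
size_t utf8_decode(const unsigned char *s, size_t n, uint32_t *cp)
{
    size_t len, i;
    uint32_t c, min;

    assert(s != NULL && cp != NULL);
    if (n == 0) return 0;

    if (s[0] < 0x80) { *cp = s[0]; return 1; }               /* ASCII */
    else if ((s[0] & 0xE0) == 0xC0) { len = 2; c = s[0] & 0x1F; min = 0x80; }
    else if ((s[0] & 0xF0) == 0xE0) { len = 3; c = s[0] & 0x0F; min = 0x800; }
    else if ((s[0] & 0xF8) == 0xF0) { len = 4; c = s[0] & 0x07; min = 0x10000; }
    else return 0;              /* stray continuation or invalid lead byte */

    if (n < len) return 0;      /* sequence is cut off */
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80) return 0;  /* not a continuation byte */
        c = (c << 6) | (uint32_t)(s[i] & 0x3F);
    }
    /* Reject overlong forms, surrogates and values beyond U+10FFFF. */
    if (c < min || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF)) return 0;

    *cp = c;
    return len;
}

/* Hypothetical helper: count the code points in a NUL-terminated UTF-8
   string. Returns (size_t)-1 if the string is malformed. */
size_t utf8_count(const char *str)
{
    const unsigned char *s = (const unsigned char *)str;
    size_t n = strlen(str), count = 0;

    while (n > 0) {
        uint32_t cp;
        size_t used = utf8_decode(s, n, &cp);
        if (used == 0) return (size_t)-1;
        s += used;
        n -= used;
        count++;
    }
    return count;
}
```

With helpers like these, strlen still reports bytes, while utf8_count reports code points, and a loop over utf8_decode gives each character as a uint32_t that can be compared or modified independently of its encoded length.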

What would be the easiest way to work with different alphabets within the same run of the program?

tanler
  • 9
  • 3
  • 1
    Instead of guessing the language, using Unicode is probably the best thing to do, as it is designed to cover all languages. However, to work with it you will probably need some kind of library, so I suggest looking into that. This related question offers a few options: https://stackoverflow.com/questions/313555/light-c-unicode-library – Sander Apr 19 '23 at 11:33
  • 3
    One approach is described in the [UTF-8 Everywhere Manifesto](http://utf8everywhere.org). (It's unfortunately more difficult under Windows, but they've got lots of specific advice.) – Steve Summit Apr 19 '23 at 11:36
