
I have a file with a non-English name. Its content is also non-English.

Now I would like to open a stream on that file so that I can read and write data.

Also, if possible, I would like to detect which language the text is written in.

For example:

I have a file "հայերեն.txt", and inside the text file is written "Բարև". The file is Unicode-encoded. Now I want to read both the filename and the text into RAM.

Also, suppose I have some other text "Վահագն". Now I want to create a file "Վահագն.txt" and write some other Unicode text into it.

OS: Windows 7+.

C++: vc120 or vc140.

mbaros
  • Hi, what do you mean by non-English? Do you mean it's in UTF-8? You can usually just create a stream from the file using ifstream, but you need to be careful when interpreting the bytes. – user1582024 Jul 05 '16 at 13:45
  • [How to open an std::fstream (ofstream or ifstream) with a unicode filename?](http://stackoverflow.com/questions/821873/how-to-open-an-stdfstream-ofstream-or-ifstream-with-a-unicode-filename) Also, please tell us which OS you are working on (Windows, Linux, MS-DOS) – mvidelgauz Jul 05 '16 at 13:46
  • I think you want to use a wide character stream. Please look here: http://stackoverflow.com/questions/12789273/read-write-unicode-c – Christopher Crowe Jul 05 '16 at 13:47
  • @ChristopherCrowe not necessarily, if the file is UTF-8 encoded – mvidelgauz Jul 05 '16 at 13:48
  • Please post the relevant part of your code – mvidelgauz Jul 05 '16 at 13:49
  • Reliable detection of the encoding is only possible if a [BOM](https://en.wikipedia.org/wiki/Byte_order_mark) is present in the file. Otherwise you need heuristic methods – mvidelgauz Jul 05 '16 at 14:02
  • I have updated the question with examples – mbaros Jul 05 '16 at 14:30
  • @mbaros Why don't you want to show us your code? Is it protected IP? – mvidelgauz Jul 05 '16 at 14:50
  • These are really two distinct questions, with separate answers. **(1)** [how to use Unicode file names](http://stackoverflow.com/questions/821873/how-to-open-an-stdfstream-ofstream-or-ifstream-with-a-unicode-filename), and **(2)** How to read the text. That last answer depends on how your text file is encoded. – roeland Jul 05 '16 at 23:38

1 Answer


Windows has pretty good native support for Unicode, at least for the Basic Multilingual Plane, and has had it since at least Windows NT 3.1 through the wide-character (Unicode) part of the API.

You just have to use wide characters for the name of the file and for the functions that process it, and a wide character stream. Without seeing your code, I'm afraid I cannot say much more, except that with reasonably recent C++ compilers you can do this either with WinAPI functions or with the C++ standard library (std::wstring and std::wfstream). Note that opening a stream by a wide-character file name is a Microsoft extension to the standard library. In either case, you will have to know which encoding the files use.
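To make this more concrete, here is a minimal sketch, not a definitive implementation. It assumes the text is stored as UTF-16LE with a BOM (what Notepad calls "Unicode") and handles only the Basic Multilingual Plane properly on platforms where `wchar_t` is 16 bits, as it is on Windows. The names `Bytes`, `Encoding`, `detect_bom`, `decode_utf16le`, `encode_utf16le`, `read_file` and `write_file` are hypothetical helpers invented for this illustration. The two file functions rely on the Microsoft extension that lets `std::ifstream` and `std::ofstream` take a `const wchar_t*` file name, so they are compiled only under MSVC.

```cpp
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

typedef std::vector<unsigned char> Bytes;

enum class Encoding { Unknown, Utf8, Utf16LE, Utf16BE };

// Look at the first bytes for a byte order mark (BOM).
// Unknown only means "no BOM found"; the file may still be Unicode.
// A UTF-32LE BOM (FF FE 00 00) also starts with FF FE; this sketch
// does not try to tell the two apart.
Encoding detect_bom(const Bytes& b) {
    if (b.size() >= 3 && b[0] == 0xEF && b[1] == 0xBB && b[2] == 0xBF)
        return Encoding::Utf8;
    if (b.size() >= 2 && b[0] == 0xFF && b[1] == 0xFE)
        return Encoding::Utf16LE;
    if (b.size() >= 2 && b[0] == 0xFE && b[1] == 0xFF)
        return Encoding::Utf16BE;
    return Encoding::Unknown;
}

// Turn UTF-16LE bytes, starting at `offset` (2 to skip the BOM),
// into one wchar_t per 16-bit code unit.
std::wstring decode_utf16le(const Bytes& b, std::size_t offset) {
    std::wstring out;
    for (std::size_t i = offset; i + 1 < b.size(); i += 2)
        out.push_back(static_cast<wchar_t>(b[i] | (b[i + 1] << 8)));
    return out;
}

// Produce a UTF-16LE byte sequence with a leading BOM.
// Assumption: every wchar_t in `text` is a single 16-bit code unit.
Bytes encode_utf16le(const std::wstring& text) {
    Bytes out;
    out.push_back(0xFF);
    out.push_back(0xFE);
    for (std::size_t i = 0; i < text.size(); ++i) {
        unsigned long c = static_cast<unsigned long>(text[i]);
        out.push_back(static_cast<unsigned char>(c & 0xFF));
        out.push_back(static_cast<unsigned char>((c >> 8) & 0xFF));
    }
    return out;
}

#ifdef _MSC_VER
// MSVC only: the wide-character file name overload is a Microsoft extension.
Bytes read_file(const std::wstring& name) {
    std::ifstream in(name.c_str(), std::ios::binary);
    return Bytes(std::istreambuf_iterator<char>(in),
                 std::istreambuf_iterator<char>());
}

bool write_file(const std::wstring& name, const Bytes& data) {
    std::ofstream out(name.c_str(), std::ios::binary);
    out.write(reinterpret_cast<const char*>(data.data()),
              static_cast<std::streamsize>(data.size()));
    return out.good();
}
#endif
```

With these helpers, reading would look like `Bytes raw = read_file(L"հայերեն.txt");` followed by `detect_bom(raw)` and, for `Encoding::Utf16LE`, `decode_utf16le(raw, 2)`. Writing would be `write_file(L"Վահագն.txt", encode_utf16le(text))`. If the compiler's source encoding is in doubt, the file name can be spelled with `\uXXXX` escapes instead of literal Armenian characters. Working on raw bytes in binary mode sidesteps the locale conversion that `std::wfstream` applies, at the cost of doing the decoding by hand. Files without a BOM, and detection of the language itself, are outside what this sketch covers.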

Serge Ballesta