I am trying to read a UTF-8 encoded .txt file and need to validate its contents.
I am working on Windows 10, but I need the solution to behave the same way on Linux. I am using Dev-C++ 6.3 with the TDM-GCC 9.2.0 64-bit compiler, and I am compiling as GNU C++11.
At the moment I am reading the following .txt file:
Inicio
D1
Biatlón
S1
255
E1
Esprint 7,5 km (M); 100; 200
E2
Persecucion 10 km (M); 100; 200
ff
This is my code:
#include <cstdlib>    // exit, system
#include <iostream>
#include <locale.h>
#include <locale>
#include <fstream>
#include <stdexcept>  // std::runtime_error
#include <string>
#include <windows.h>
#define CP_UTF8 65001
#define CP_UTF32 12000
#include <codecvt>
using std::cout;

std::wstring utf8_to_ws(std::string const&);

int main(){
    std::ifstream file;
    std::string text;
    // Switch the Windows console to UTF-8 output.
    if (!SetConsoleOutputCP(CP_UTF8)) {
        std::cerr << "error: UTF-8 codigo.\n";
        return 1;
    }
    file.open("entryDisciplineESP.txt");
    int line = 0;
    if (file.fail()){
        cout << "Error. \n";
        exit(1);
    }
    while (std::getline(file, text)){
        // Third line of the file ("Biatlón"): print the char at index 5,
        // then the whole line converted to a wide string.
        if (line == 2){
            std::cout << text[5] << "\n";
            auto a = utf8_to_ws(text);
            std::wcout << a << "\n";
        }
        std::cout << text << "\n";
        line++;
    }
    cout << "\n";
    system("Pause");
    return 0;
}

// Convert a UTF-8 encoded std::string to std::wstring.
std::wstring utf8_to_ws(std::string const& utf8)
{
    std::wstring_convert<std::codecvt_utf8<wchar_t>, wchar_t> cnv;
    std::wstring s = cnv.from_bytes(utf8);
    if (cnv.converted() < utf8.size())
        throw std::runtime_error("incomplete conversion");
    return s;
}
And this is what I get on the console:
Inicio
D1
Biatln
Biatlón
S1
255
E1
Esprint 7,5 km (M); 100; 200
E2
Persecucion 10 km (M); 100; 200
ff
When I print the whole line, the "ó" is displayed correctly, but I cannot get at it as a separate character. I need to work with that character to do validations: I have to check that the line contains no digits or special characters such as "!", "?" or ":". I also need to store that name in a string, work with it, and display results on the console.
Thanks in advance.