1

How can I do this in C++ using Boost.Locale? I found an example in another question, "Cross-platform iteration of Unicode string (counting Graphemes using ICU)":

#include <iostream>
#include <string_view>

#include <boost/locale.hpp>

using namespace std::string_view_literals;

int main()
{
    boost::locale::generator gen;
    auto string = "noël "sv;
    boost::locale::boundary::csegment_index map{
        boost::locale::boundary::character, std::begin(string),
        std::end(string), gen("")};
    for (const auto& i : map)
    {
        std::cout << i << '\n';
    }
}

This code does not compile. How can I fix it? The errors are:

E0289 no instance of constructor "boost::locale::boundary::segment_index::segment_index [with BaseIterator=const char *]" matches the argument list

C2440 'initializing': cannot convert from 'initializer list' to 'boost::locale::boundary::segment_index<const char *>'


Boost version: 1.81.0. I use Visual Studio with the pre-release C++23 and C17 language standards. Boost is linked statically, ICU is installed, and the source file is encoded as UTF-8. I build the project as Release x64.

MaKeSter

2 Answers

1

It does work for me with gen("") as well as gen("en_US.utf8").

If it doesn't work for you, check that:

  • your source file is encoded as UTF-8 (assumed here)
  • Boost is built with ICU support
  • your system supports Unicode locales

sehe
  • The code does not compile on Windows with Visual Studio – MaKeSter Jun 09 '23 at 16:01
  • @MaKeSter was that your original problem? Why don't you include the error message that you see, and the Boost/compiler versions that you use? "This code turned out to be non-working" is completely useless for troubleshooting. – sehe Jun 09 '23 at 16:21
  • Version of boost: 1.81.0, I use a pre-release version of the C++23 and C17 standard. – MaKeSter Jun 09 '23 at 18:43
  • Error: E0289 no instance of constructor "boost::locale::boundary::segment_index::segment_index [with BaseIterator=const char *]" matches the argument list – MaKeSter Jun 09 '23 at 18:46
  • @MaKeSter question details go in the question! Please use the edit button there – sehe Jun 09 '23 at 19:41
  • My crystal ball says you might be using wide-char literals. Consider trying `wcsegment_index` – sehe Jun 09 '23 at 19:44
  • With wcsegment_index the error persists. Source file encoded as UTF-8: yes. Boost built with ICU support: how do I check this? System supports Unicode locales: how do I check this? – MaKeSter Jun 09 '23 at 20:24
  • Since you're on Windows, I assume Unicode is supported. Regardless, we now finally know that you have a compilation error, so that point is not relevant here. Checking for ICU could be done in many (platform specific) ways, or using https://www.boost.org/doc/libs/1_79_0/libs/locale/doc/html/using_localization_backends.html or the `BOOST_LOCALE_WITH_ICU` define – sehe Jun 09 '23 at 20:50
  • Can't this be fixed? Do I have to switch to Linux? – MaKeSter Jun 09 '23 at 20:53
  • We're still [finding the problem](https://stackoverflow.com/questions/76440884/output-all-line-graphemes-to-the-console/76441802?noredirect=1#comment134791217_76440884) – sehe Jun 09 '23 at 20:55
0

The problem is most probably encoding handling.

If you are using MSVC, make sure that:

  • the source file is encoded as UTF-8
  • the compiler receives the /utf-8 switch, so that it reads the source code as UTF-8 and also uses UTF-8 for string literals in the generated executable
  • the locale is configured properly:
#include <iostream>
#include <locale>
#include <string_view>

#include <boost/locale.hpp>

using namespace std::string_view_literals;

int main()
{
    std::locale::global(std::locale(".utf-8")); // inform standard library that executable is using UTF-8 encoding
    std::cout.imbue(std::locale("")); // use system locale on standard output

    boost::locale::generator gen;
    auto string = "noël "sv;
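    // csegment_index is segment_index<const char*>, so pass raw pointers:
    // string_view iterators are not guaranteed to be pointers (on MSVC they are not).
    boost::locale::boundary::csegment_index map{
        boost::locale::boundary::character, string.data(),
        string.data() + string.size(), gen(".utf-8")};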
    for (const auto& i : map)
    {
        std::cout << i << '\n';
    }
}
Marek R