I found this great solution for counting words in text. My problem is that I can't properly understand the STL "magic" the solution relies on. If I understand correctly, letter_only is a struct used to build a locale object that recognizes only letters and excludes punctuation. When main() calls imbue(), letter_only's constructor calls get_table()... The implementation of that function is deep dark magic to me.

I spent a lot of time reading cppreference but still can't understand exactly what those 3 lines do.

My goal is to adapt this solution to Unicode characters.

Can someone explain to me how it works? Thank you.

Alex_H

1 Answer

A locale is not a simple object; it's built from multiple facets. One of those facets is ctype (character type). Your linked solution creates a locale with a custom ctype<char> facet called letter_only.

The implementation of letter_only works by building a table indexed by character value, so whether a char c is a letter can be determined by a single lookup.
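
A minimal sketch of such a table-based facet follows. The linked solution is not reproduced in this thread, so this is a reconstruction under assumptions rather than the original code; count_words and check_count_words are hypothetical helpers added for illustration. Note that the characters are the table index, and the table content is their classification.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <locale>
#include <sstream>
#include <string>
#include <vector>

// Assumed reconstruction of a letter_only facet; details of the linked
// solution may differ.
struct letter_only : std::ctype<char> {
    // Hand our own classification table to the base class.
    letter_only() : std::ctype<char>(get_table()) {}

    static const mask* get_table() {
        // Built once; one slot per possible char value.
        static const std::vector<mask> table = [] {
            // Start with every character classified as whitespace ...
            std::vector<mask> t(std::ctype<char>::table_size,
                                std::ctype_base::space);
            // ... then reclassify the slots indexed by letters as alpha.
            std::fill(t.begin() + 'A', t.begin() + 'Z' + 1,
                      std::ctype_base::alpha);
            std::fill(t.begin() + 'a', t.begin() + 'z' + 1,
                      std::ctype_base::alpha);
            return t;
        }();
        return table.data();
    }
};

// Hypothetical helper: counts runs of letters in text.
std::size_t count_words(const std::string& text) {
    std::istringstream in(text);
    // The locale takes ownership of the facet and deletes it later.
    in.imbue(std::locale(std::locale(), new letter_only));
    std::size_t n = 0;
    std::string word;
    // operator>> splits on whatever the ctype facet calls whitespace,
    // which here is every character that is not a letter.
    while (in >> word) ++n;
    return n;
}

// Usage example: punctuation and digits act as separators.
void check_count_words() {
    assert(count_words("Hello, world! 123 foo") == 3);
}
```

Because everything that is not a letter counts as whitespace, stream extraction skips punctuation and digits for free; no explicit filtering code is needed.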

This will probably have to work differently in your case, as a table for wchar_t would be much bigger. That's why, when inheriting from ctype<wchar_t>, you typically override virtual bool do_is(mask m, wchar_t c) const instead.
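
A rough sketch of that approach, under stated assumptions: wide_letter_only, is_letter, count_words_w and check_count_words_w are hypothetical names, and is_letter only covers ASCII letters plus the basic Cyrillic range U+0410..U+044F as a stand-in for a real Unicode classification. Only the single-character overload of do_is is overridden; the range overload keeps its inherited behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <locale>
#include <sstream>
#include <string>

// Illustrative placeholder: ASCII letters plus Cyrillic U+0410..U+044F.
// Real Unicode support needs a proper classification source.
inline bool is_letter(wchar_t c) {
    return (c >= L'A' && c <= L'Z') || (c >= L'a' && c <= L'z') ||
           (c >= 0x0410 && c <= 0x044F);
}

struct wide_letter_only : std::ctype<wchar_t> {
protected:
    // Keep the inherited range overload of do_is visible.
    using std::ctype<wchar_t>::do_is;

    // Computed per character instead of looked up in a table:
    // letters are alpha, everything else is treated as whitespace.
    bool do_is(mask m, wchar_t c) const override {
        const mask actual = is_letter(c) ? alpha : space;
        return (m & actual) != 0;
    }
};

// Hypothetical helper: counts runs of letters in a wide string.
std::size_t count_words_w(const std::wstring& text) {
    std::wistringstream in(text);
    // The locale takes ownership of the facet and deletes it later.
    in.imbue(std::locale(std::locale(), new wide_letter_only));
    std::size_t n = 0;
    std::wstring word;
    while (in >> word) ++n;
    return n;
}

// Usage example: one Cyrillic word and one ASCII word.
void check_count_words_w() {
    assert(count_words_w(L"\u043f\u0440\u0438\u0432\u0435\u0442, world! 42") == 2);
}
```

The stream asks the facet whether each character is whitespace, so replacing is_letter with a fuller classification is the only change needed to widen the set of recognized letters.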

MSalters
  • Thank you. Am I right that std::fill just fills the vector (rc) with "english" characters, thereby creating the "table"? – Alex_H May 31 '17 at 09:43
  • No, that's not how the lookup table works. The "english" characters are not the content, but the table _index_. The table content is their type (`alpha` = alphabetic). – MSalters May 31 '17 at 10:05
  • That's the clue. Thank you. – Alex_H May 31 '17 at 10:09