There's no such thing as "character".
Or, more precisely, what "character" is depends on whom you ask.
If you look in the Unicode glossary you will find that the term has several not fully compatible meanings. As a smallest component of written language that has semantic value (the first meaning), á
is a single character. If you take á
and count basic unit of encoding for the Unicode character encoding (the third meaning) in it, you may get either one or two, depending on what exact representation (normalized or denormalized) is being used.
Or maybe not. This is a very complicated subject and nobody really knows what they are talking about.
Coming down to earth, you probably need to count code points, which is essentially the same as characters (meaning 3). mblen
is one method of doing that, provided your current locale has UTF-8 encoding. Modern C++ offers more C++-ish methods, however, they are not supported on some popular implementations. Boost has something of its own and is more portable. Then there are specialized libraries like ICU which you may want to consider if your needs are much more complicated than counting characters.