I need to find out how many symbols are to be printed from a string. Let's say I have this code:

char    buf[200];

strcpy(buf, "\033[31m");     //Red color control sequence
strcat(buf, to_utf8(L'漢'));
strcat(buf, "a");
printf("%s", buf);

where

char *to_utf8(wchar_t c);

transforms the given wide character into its UTF-8 representation and returns it as a string.

Only 2 red symbols will be printed ("漢a").

If I were to run:

strlen(buf);

I would get a length of 9 (5 bytes for the escape sequence, 3 for 漢, and 1 for a).

What I need is a function that counts the number of symbols that will actually be printed; in this case, 2.

I need a solution without any external libraries.

Any ideas on this matter?

Cœur
Emil Terman
  • `wcslen()` is for `wchar_t`; otherwise it depends on the encoding. `wchar_t` *is* an encoding to some extent. – 0andriy Aug 18 '17 at 15:07
  • "*I need a solution without any external libraries.*" <- which means to write one, and it must be aware of ANSI escape sequences and proper UTF8 multibyte characters. Quite some work. –  Aug 18 '17 at 15:08
  • @FelixPalmen Also it needs to be aware of Unicode combining characters, and who knows what other headaches. EmilTerman: Just use a library :) I [hear](https://stackoverflow.com/questions/55641/unicode-processing-in-c) ICU is good. – Thomas Aug 18 '17 at 15:10
  • What gets printed depends entirely on the terminal. Some may interpret the escape sequences and print `漢a` in red, while others won't and will print `\033[31m漢a`. So you would need to know what the terminal is. – dbush Aug 18 '17 at 15:10
  • Or better, not use these sequences manually but **another library** instead (like, `curses`, e.g. in the `ncurses` implementation for \*nix) –  Aug 18 '17 at 15:13
  • How is "number of symbols" a useful data point? – melpomene Aug 18 '17 at 15:19
  • If it needs to parse ESC [ sequences, I guess it also ought to parse CSI sequences in the same way for completeness, i.e. interpret the CSI character U+009B (which is encoded as \xc2\x9b in UTF-8) the same as the ESC [ sequence. – Ian Abbott Aug 18 '17 at 15:20
  • @melpomene I'm currently creating my own shell and I need to know exactly how many symbols are on the screen to know exactly where to place the cursor and do other stuff like that – Emil Terman Aug 18 '17 at 15:21
  • If it's a shell for a terminal, with fixed-width fonts, you should also be aware that CJK characters are typically rendered at double the width of latin characters on such terminals. – Ian Abbott Aug 18 '17 at 15:24
  • @EmilTerman Then you need to know how many screen cells are occupied, not how many symbols there are. Some symbols have width 0, others have width 2. – melpomene Aug 18 '17 at 15:27
  • @melpomene how do I find out how many cells are occupied then? Is there an easy way to do that? – Emil Terman Aug 18 '17 at 15:35
  • Isn't it the job of the terminal, not the shell, to know where the cursor is? – dbush Aug 18 '17 at 15:41
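
As a rough illustration of what the comments describe, here is a minimal sketch of a counter that skips `ESC [` sequences and counts UTF-8 lead bytes. The name `visible_symbols` is made up for this example. It assumes valid UTF-8 input and treats every code point as one symbol, so it does not handle combining characters, double-width CJK cells, the single-byte CSI form (U+009B), or other escape sequence types. It counts code points, not screen cells.

```c
#include <stddef.h>

/* Hypothetical helper: count the code points a terminal would display,
 * skipping CSI sequences of the form ESC [ params final-byte.
 * Assumes valid UTF-8; every code point counts as one symbol. */
size_t visible_symbols(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;
    size_t count = 0;

    while (*p) {
        if (p[0] == 0x1B && p[1] == '[') {
            p += 2;
            /* parameter and intermediate bytes: 0x20..0x3F */
            while (*p >= 0x20 && *p <= 0x3F)
                p++;
            /* final byte: 0x40..0x7E (for example 'm' for colors) */
            if (*p >= 0x40 && *p <= 0x7E)
                p++;
            continue;
        }
        /* UTF-8 continuation bytes look like 10xxxxxx; count the rest */
        if ((*p & 0xC0) != 0x80)
            count++;
        p++;
    }
    return count;
}
```

For the buffer in the question (`"\033[31m"` followed by 漢 and `a`), this returns 2, while `strlen` reports 9.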

1 Answer

In case anyone still has this question:

The solution to my problem was to find out where the cursor is. This guy here has a nice piece of low-level code that can read the cursor position.

This is not a direct answer to my question, but knowing where my cursor is, before and after I press a key, resolved my particular problem.
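
The code referred to above is not reproduced here. One common way to read the cursor position on a POSIX terminal is to send the Device Status Report request `ESC [ 6 n` and parse the reply `ESC [ row ; col R`. The sketch below assumes that approach and a terminal that supports it; the function names `parse_cursor_report` and `get_cursor_position` are made up for this example, and it uses only `termios` and `unistd`, not an external library.

```c
#include <stdio.h>
#include <termios.h>
#include <unistd.h>

/* Parse a reply of the form ESC [ row ; col R.
 * Returns 0 on success, -1 if the reply does not match. */
int parse_cursor_report(const char *reply, int *row, int *col)
{
    char final;

    if (sscanf(reply, "\033[%d;%d%c", row, col, &final) != 3 || final != 'R')
        return -1;
    return 0;
}

/* Ask the terminal attached to stdin/stdout where the cursor is.
 * Returns 0 on success, -1 on failure (for example, stdin is not a tty). */
int get_cursor_position(int *row, int *col)
{
    struct termios saved, raw;
    char reply[32];
    size_t len = 0;

    if (tcgetattr(STDIN_FILENO, &saved) == -1)
        return -1;
    raw = saved;
    raw.c_lflag &= ~(ICANON | ECHO);   /* byte-wise input, no echo */
    raw.c_cc[VMIN] = 0;
    raw.c_cc[VTIME] = 10;              /* give up after about 1 second */
    if (tcsetattr(STDIN_FILENO, TCSANOW, &raw) == -1)
        return -1;

    if (write(STDOUT_FILENO, "\033[6n", 4) == 4) {
        /* read one byte at a time until the terminating 'R' */
        while (len < sizeof reply - 1 &&
               read(STDIN_FILENO, &reply[len], 1) == 1) {
            if (reply[len++] == 'R')
                break;
        }
    }
    reply[len] = '\0';

    tcsetattr(STDIN_FILENO, TCSANOW, &saved);  /* restore terminal settings */
    return parse_cursor_report(reply, row, col);
}
```

Calling `get_cursor_position` before and after printing shows how far the cursor actually moved, which sidesteps counting symbols or cell widths by hand.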

Emil Terman