std::string characters somehow turned into Unicode/ASCII numbers

Question

I have a function ls() which parses a vector of string and puts it into a comma-separated list, wrapped within parentheses ():

std::string ls(std::vector<std::string> vec, std::string wrap="()", std::string sep=", ") {
    std::string wrap_open, wrap_close;
    wrap_open = std::to_string(wrap[0]);
    wrap_close = std::to_string(wrap[1]);
    std::string result = wrap_open;
    size_t length = vec.size();
    if (length > 0) {
        if (length == 1) {
            result += vec[0];
            result += wrap_close;
        }
        else {
            for (int i = 0; i < vec.size(); i++) {
                if (i == vec.size() - 1) {
                    result += sep;
                    result += vec[i];
                    result += wrap_close;
                }
                else if (i == 0) {
                    result += vec[i];
                }
                else {
                    result += sep;
                    result += vec[i];
                }
            }
        }
    }
    else {
        result += wrap_close;
    }

    return result;
}

If I pass this vector

std::vector<std::string> vec = {"hello", "world", "three"};

to the ls() function, I should get this string:

std::string parsed_vector = ls(vec);

// AKA

std::string result = "(hello, world, three)"

The parsing works fine, however the characters in the wrap string turn into numbers when printed.

std::cout << result << std::endl;

Will result in the following:

40hello, world, three41

When it should instead result in this:

(hello, world, three)

The ( is turned into 40, and the ) is turned into 41.

My guess is that the characters are being turned into the Unicode/ASCII number values or something like that, I do not know how this happened or what to do.

[`std::to_string`](https://en.cppreference.com/w/cpp/string/basic_string/to_string) doesn't work the way you think it does. — Captain Obvlious, Dec 18 '22 at 22:29
It is never too early to [learn how to run your code in a debugger](https://stackoverflow.com/questions/25385173). Nobody writes perfect code. Stepping through this code line-by-line in a debugger is how programmers discover exactly where your code deviates from your expectations. — Drew Dormann, Dec 18 '22 at 22:30
`sts::to_string` Converts a **numeric** value to `std::string`. — sklott, Dec 18 '22 at 22:30
What if I want the wrap open to be `"[>"` and the wrap close to be `"<]"`? Or `""` and `""`? Or `"⟦"` and `"⟧"`? — Eljay, Dec 18 '22 at 22:41

score 2 · Answer 1 · answered Dec 18 '22 at 22:30

2

The problem here is std::to_string converts a number to a string. There is no specialization for char values. So here, you're converting the ASCII value to a string:

wrap_open = std::to_string(wrap[0]);
wrap_close = std::to_string(wrap[1]);

Instead, you could simply do:

std::string wrap_open(1, wrap[0]);
std::string wrap_close(1, wrap[1]);

Note that you can greatly simplify your function by using std::ostringstream:

std::ostringstream oss;
oss << wrap[0];
for (size_t i = 0; i < vec.size(); i++)
{
    if (i != 0) oss << sep;
    oss << vec[i];
}
oss << wrap[1];
return oss.str();

answered Dec 18 '22 at 22:30

paddy

60,864
6
61
103

I don't think it's a good idea to support the use of [] operator. As there is no bound checking there. I would recommend amending your answer to use the .at() method. – Milan Š. Dec 18 '22 at 22:40
1

For all we know it may be a documented precondition that the caller always provides a string with exactly 2 characters. I personally think the interface is poor to begin with, and requiring exception handling with `.at` is only a marginal improvement. Better off returning an empty string if `wrap` is not length 2. – paddy Dec 18 '22 at 22:47
I am not disputing that the function could use some improvements, however it is obvious that the one asking the question is learning about C++. In such a case I really do believe there is no reason not to use `.at` or to at the very least - point to its existence. :) EDIT: Just to clarify, it was just a suggestion. – Milan Š. Dec 18 '22 at 22:50
1

I have a pretty hard-line approach when it comes to `.at`. I have seen far more code where it is used _incorrectly_ than code where it's actually appropriate. For that reason, I typically avoid recommending it to beginners and instead focus on ensuring correctly-bounded accesses to begin with. For this particular function I don't think throwing out-of-range exceptions is appropriate. Perhaps edit your own answer to provide an explanation of why you used `.at` and what it does. – paddy Dec 19 '22 at 00:01
Interesting, I've usually had the opposite experience with beginners. As in (without exception handling) the program just crashes and gives them a simple to understand message rather than RAV or weird results. I might edit my answer to suggest that. – Milan Š. Dec 19 '22 at 08:34

Milan Š. · Answer 2 · 2022-12-19T09:22:41.120

1

I won't be commenting on how you could improve the function and that passing a vector by value (as an argument in the function) is never a good idea, however I will tell you how to fix your current issue:

std::string ls(std::vector<std::string> vec, std::string wrap = "()", std::string sep = ", ") {
    std::string wrap_open, wrap_close;
    wrap_open = wrap.at(0); //<----
    wrap_close = wrap.at(1); //<----
    std::string result = wrap_open;
    size_t length = vec.size();
    if (length > 0) {
        ... // Rest of the code

You don't need to use std::to_string, just use one of std::string's constructors to create a string with one character from the wrap string. This constructor is invoked via the = operator.

I recommend reading about std::string, it is apparent that you aren't using the full potential of the STL library : std::string

EDIT: After discussing the usage of .at() vs [] operator in the comments. I've decided to add the bit into this answer: The main difference between .at() and [] is the bounds checking feature. .at will throw an std::out_of_range exception because it is performing a bounds check. The [] operator (IMHO) is present in STL containers due to backwards compatibility (imagine refactoring old C code into a C++ project). Point being it behaves like you would expect [] to behave and doesn't do any bounds checking.

In general I recommend the usage of .at() especially to beginners and especially if you are relying on human input. The uncaught exception will produce an easy to understand error, while untested [] will either produce weird values or RAV (read access violation) depending on the type stored in the container and from experience beginners usually have a harder time debugging this.

Bare in mind that this is just an opinion of one programmer and opinions may vary (as is visible in the discussion).

Hope it helps!

edited Dec 19 '22 at 09:22

answered Dec 18 '22 at 22:36

Milan Š.

1,353
1
2
11

Using `at` rather than `[]` here is a workaround for a badly designed interface. Fix the interface. – Pete Becker Dec 19 '22 at 02:06
Not disputing that here and I am mentioning that the design of the function is not ideal in the first few sentences. However I in general recommend to beginners the usage of 'at' due to its bound checking feature. The uncaught exception produces an easy to understand error and based on my experience the person in question learns to fix these issues by himself. – Milan Š. Dec 19 '22 at 08:40
You've seen, I assume, the code that beginners post here where every access uses `at`, sometimes three time with the same index, inside a `for` loop that simply runs through all the correct indices? That's horrible, and it's the result of learning to always us `at` instead of thinking. Beginners need to learn to think. (Incidentally, your version does a redundant test...) – Pete Becker Dec 19 '22 at 14:10
There are definitely countless examples of bad use of `.at` I am not disputing that. However you will see countless examples of bad use of the `[] operator` as well. In the end I believe the best thing for a new programmer is to learn to skim the documentation and know that such things exist. I firmly believe skill comes with practice and part of that practice is going through documentation and arriving at the question of what to use yourself. That all being said I do not dispute that the interface is bad, but for the given interface `.at()` is a better option than the `[] op` – Milan Š. Dec 19 '22 at 14:20
`assert(wrap.size() >= 2)` says it better. – Pete Becker Dec 19 '22 at 14:26
Are you suggesting to include a runtime assert in a release build of something that requires human input? I agree with you that this assertion is good for debugging, but I highly doubt you want it in release versions of your code. EDIT: Just to clarify, if you wish I can add an update to the EDIT to include these suggestions for debug builds. But I like to pretend that every code presented here is "release" code. – Milan Š. Dec 19 '22 at 14:35
Sigh. Of course not. You validate input **when it comes in**. That's part of **designing** code. If you don't do proper design, you end up putting all sorts of redundant tests into your code, giving you an incomprehensible mess and horrible performance. For "release" code, errors in user input are expected, and exceptions are the wrong mechanism for communicating them. – Pete Becker Dec 19 '22 at 14:41
Well I still don't understand your point in how this would help in release builds when you are relying on human input. But then again if what you are trying to say is that this is part of the actual development cycle where you add these asserts when you are writing the code (i.e. you iterate over your actively developed code) then I would agree with you. However that is not apparent from a sentence like : "`assert(wrap.size() >= 2)` says it better. " – Milan Š. Dec 19 '22 at 14:45
When your code takes human input it should **immediately** check for valid input; if the input isn't valid, notify the user and ask again. If there's a requirement that a particular string have a length of at least two, check the string when it's read, not later on when it's used. Asserts are used in debug versions to catch mistakes in **validation**, not mistakes in input. Mistakes in input shouldn't reach the point where an assert triggers; they should be caught earlier. – Pete Becker Dec 19 '22 at 15:00
Again,... no one disputes that, but as I mentioned at the beginning of my answer - I'm not going to suggest improvements to the function (and 2 `.at()` calls provide 0 overhead). If you are fine with that, I would leave this discussion with your last statement (as we both agree with it). Another reason being that we started this discussion on the matter of how beginners should approach learning and we've diverted with the discussion into basic and general program/library design. – Milan Š. Dec 19 '22 at 15:08

std::string characters somehow turned into Unicode/ASCII numbers

2 Answers2