1

This is an old problem, which I have observed in past. So thought of getting a clarification once & for all. There are many standard / orthodox C library functions, which deal only with C-style strings. For example, my current implementation looks like this:

std::string TimeStamp (const time_t seconds)  // version-1
{
  auto tm = *std::localtime(&seconds);  // <ctime>
  char readable[30] = {}; 
  std::strftime(&readable[0], sizeof(readable) - 1, "%Y-%h-%d %H:%M:%S:", &tm);
  return readable;
}

Above works as expected. But as you can see, that the readable is copied from stack array to std::string. Now this function is called very frequently for logging & other purposes.

Hence, I converted it to following:

std::string TimeStamp (const time_t seconds)  // version-2
{
  auto tm = *std::localtime(&seconds);  // <ctime>
  std::string readable(30,0);
  std::strftime(&readable[0], readable.length(), "%Y-%h-%d %H:%M:%S:", &tm);
  return readable;
}

At unit test level, it apparently seems to work. But for overall logging in my much larger code, it somehow gets messed up. A new line character appears after this output & many of the output strings which are called outside this function are not printed. Such issue happens only when the "version-1" is changed to "version-2".
Even following modification also doesn't help:

readable.resize(1 + std::strftime(&readable[0], readable.length(), "%Y-%h-%d %H:%M:%S:", &tm));

Is there anything wrong in my code? What is the correct way of directly using std::string in the C-style string functions?

iammilind
  • 68,093
  • 33
  • 169
  • 336
  • `\0` can appears in `std::string`, you have to resize the string correctly. (I think the `1 + ` is incorrect). – Jarod42 Feb 01 '17 at 10:47
  • @Jarod42, actually removing `1 +` resolves the problem. However, I put it as a kind of convention that a string should end with a `NUL` character. BTW, if you have an explanation for why `1 + ` is wrong & how will it affect the output (due to addon `NUL`), then feel free to put it as an answer. Thanks. – iammilind Feb 01 '17 at 10:52
  • You should check the return value of `strftime` and handle failure – M.M Feb 01 '17 at 10:52
  • 1
    It sounds like the problem is that you're returning a string with a bunch of nulls on the end and the rest of your program doesn't correctly handle that sort of string – M.M Feb 01 '17 at 10:55
  • @M.M, yes that's the problem IMO as well. However, those printing is happening with `std::cout` only. Why do you feel that `std::string` with `\0`-s at end should misbehave? Shouldn't they be just ignored. Also if that's causes the issues, then also it should be consistent, however many of the logs are still printed. Is it causing some kind of UB? – iammilind Feb 01 '17 at 10:58
  • 1
    _"A new line character appears after this output & many of the output strings which are called outside this function are not printed."_ Show us, with a [MCVE]. – Lightness Races in Orbit Feb 01 '17 at 10:58
  • 1
    @iammilind: A string may contain NULLs, but there are near-infinite ways to handle such a string, and apparently yours is wrong. We cannot comment on how or why, without knowing what your code is. – Lightness Races in Orbit Feb 01 '17 at 10:59
  • 1
    @iammilind I'm suggesting that the rest of your code contains bugs when this sort of string is supplied as input. So you either have to make the rest of your code accept this string, or not supply this string. – M.M Feb 01 '17 at 10:59
  • @LightnessRacesinOrbit, unfortunately I can't show with minimal example, because it doesn't show up in the unit tests. But it does show up in the overall program, which uses simple `std::cout` for outputting these. May be there are bug in my code as M.M suggested. That's why the code examples are just preface of the main Qn, which is: *"What is the correct way to deal with `std::string` in the C-style string functions?"* Consider that as something to be answered. I am not asking to fix the apparent bug in my overall code. – iammilind Feb 01 '17 at 11:06
  • I can't believe I have to say this to a 38.5k user but, yes, you can show with a minimal example, by spending some time constructing one, and we can't/won't solve your problem without it. _"What is the correct way to deal with std::string in the C-style string functions?"_ This is a non-question. The real question underneath this post is waiting for an MCVE; both MM and myself (twice now) have indicated that the bug lies elsewhere. – Lightness Races in Orbit Feb 01 '17 at 11:08
  • C strings are 0-terminated. C++ strings are not. `'\0'` is just a normal character in a C++ string. `std::string readable(30,0)` is no better or worse than `std::string readable(30,'@')`. Do you expect the latter to work correctly with the rest of your program? – n. m. could be an AI Feb 01 '17 at 11:11
  • Not related to the main question, but `std::localtime` is not re-entrant and not thread-safe. And this is not an easy problem to solve. [See this thread](http://stackoverflow.com/questions/15957805/extract-year-month-day-etc-from-stdchronotime-point-in-c) – M.M Feb 01 '17 at 11:15
  • You should use `std::put_time` or similar if you want to get rid of C strings. – n. m. could be an AI Feb 01 '17 at 11:21
  • 1
    @M.M Actually very easy --- just use localtime_s. – n. m. could be an AI Feb 01 '17 at 11:22
  • @n.m., I tried with `put_time`, but it was not implemented in my machine yet. I am using the latest Clang. There was some issue, which I couldn't figure out easily, hence I had to resort on the C-style function. Regarding your comment on `\0` vs `@` thing, I expected that while `std::cout`, the `\0` will not have any effect as such. may be they really don't have & that's causing issue due to something in my code. I think, I will take your input of `localtime_s()`, thanks. – iammilind Feb 01 '17 at 11:32
  • In fact nobody implements localtime_s except Microsoft, despite it being in the C11 standard (not C++11 though). But the rest of the world has localtime_r. Just lobby for its inclusion in the next standard. – n. m. could be an AI Feb 01 '17 at 12:02

2 Answers2

1

Your first function is correct. There is no point mucking around with the troublesome details in the second function because even once you get it right, it is no improvement on the first function.

In fact it might even perform worse, because of the need to over-allocate the string and resize it down. For example perhaps the size 30 exceeds the size for Small String Optimization but the actual length of data doesn't.

M.M
  • 138,810
  • 21
  • 208
  • 365
0

std::string can have \0 in it.

so

std::string s1 = "ab\0\0cd";   // s1 contains "ab"       -> size = 2
std::string s2{"ab\0\0cd", 6}; // s2 contains "ab\0\0cd" -> size = 6

Your first snippet use constructor 1 whereas the second is similar to the second one (string of size 30 filled with \0).

So you have to resize correctly your string to avoid trailling \0.

Jarod42
  • 203,559
  • 14
  • 181
  • 302
  • Why would there be a leading `\0`? – Lightness Races in Orbit Feb 01 '17 at 10:58
  • @LightnessRacesinOrbit: OP overwrites beginning of a string of size `30` with a shorter buffer. – Jarod42 Feb 01 '17 at 10:59
  • Yes, so why would there be a leading `\0`? – Lightness Races in Orbit Feb 01 '17 at 11:00
  • 2
    You must have meant a trailing `\0`. – Anton Feb 01 '17 at 11:00
  • Okay, you're talking about trailing NULLs. Now, as you've pointed out yourself, this is perfectly valid, so the problem is a bug in code that the OP did not post (thus this question should be closed pending MCVE, not answered). However, I would agree that it makes the most sense to trim this string to size. – Lightness Races in Orbit Feb 01 '17 at 11:02
  • @LightnessRacesinOrbit: typo: trailing versus leading... sorry – Jarod42 Feb 01 '17 at 11:02
  • Can you elaborate more on why trailing `\0` should be avoided? Do they cause any undefined behaviour themselves? I thought that `std::cout` will simply ignore them, if they come across. – iammilind Feb 01 '17 at 11:13
  • `std::cout << s2` is different than `std::cout << s2.c_str()` (not the same overload), the former would display all characters in it, and displaying some (unprintable) characters (as `\0`) is UB if I remember correctly. – Jarod42 Feb 01 '17 at 12:26
  • @Jarod42, if you can paste a standard reference or some other answer reference which may contain such info, then I would be happy to accept this answer. – iammilind Feb 02 '17 at 04:01
  • According to [what-is-the-behavior-of-writing-a-non-printing-character-in-c-c](http://stackoverflow.com/questions/41700322/what-is-the-behavior-of-writing-a-non-printing-character-in-c-c), it is implementation specific, not UB. – Jarod42 Feb 02 '17 at 08:48