
I've seen in code a direct read into std::string where the contents are intended to be interpreted as a string as follows:

    std::string note;
    note.resize(n);
    read( &note[0], n );

Assume that n is of a fixed size, as in a parsing scenario.

Are there any issues with reading directly into a string? I have seen a lot of uses of ifstreams, but that seems excessive in this case.

Kara
Aaron Swan
  • It's basically the same as reading it into a `char` buffer. If you're sure your buffer is big enough there's nothing wrong with it. – Hatted Rooster Sep 07 '16 at 15:55
  • It really depends on the use case. If the data is not really a string then you may want to consider a `std::vector` so you do not "lie" to yourself. – NathanOliver Sep 07 '16 at 15:56
  • There is no such thing as *good practice*. There are techniques suitable to achieve a particular goal or not suitable for a specific set of reasons. Stop thinking about best practices, think like an engineer. – SergeyA Sep 07 '16 at 15:59
  • @SergeyA I understand what you mean. By good practice, I meant, in this scenario, are there specific reasons not to do this. – Aaron Swan Sep 07 '16 at 16:05
  • @NathanOliver Thank you. I've clarified for the intent of interpreting as a string. – Aaron Swan Sep 07 '16 at 16:07
  • @GillBates It looks like you're right in practice. I found the following link that says until C++11, a string wasn't defined to have contiguous storage, but in practice it does: http://stackoverflow.com/questions/1986966/does-s0-point-to-contiguous-characters-in-a-stdstring – Aaron Swan Sep 07 '16 at 21:22

2 Answers


First, if it's a text file made up of several lines, I don't find std::string a good choice of "container"; I would prefer a std::vector<char>, or, if you want to do some additional parsing and break the file into its individual lines, a std::vector<std::string>.

I'd also pay attention to the encoding used by the file: is it UTF-8? Is it some other char-based encoding?

For example, if the file is UTF-16, reading it as a raw sequence of bytes into a std::string would be very misleading (and bug-prone): each character occupies at least two bytes, and ASCII-range text contains embedded zero bytes.
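To illustrate the kind of bug this can cause, here is a small sketch. It assumes little-endian UTF-16 input, and the helper names `c_string_length` and `looks_like_utf16` are invented for this example. The byte-order-mark check is only a heuristic: files without a BOM are not detected, and UTF-32LE also begins with the bytes FF FE.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: length as seen by C-style string functions,
// which stop at the first zero byte.
std::size_t c_string_length(const std::string& bytes) {
    return std::strlen(bytes.c_str());
}

// Hypothetical heuristic: does the buffer start with a UTF-16
// byte-order mark (FF FE for little-endian, FE FF for big-endian)?
bool looks_like_utf16(const std::string& bytes) {
    if (bytes.size() < 2) {
        return false;
    }
    const unsigned char b0 = static_cast<unsigned char>(bytes[0]);
    const unsigned char b1 = static_cast<unsigned char>(bytes[1]);
    return (b0 == 0xFF && b1 == 0xFE) || (b0 == 0xFE && b1 == 0xFF);
}
```

For the two-character text "Hi" stored as UTF-16LE (bytes `48 00 69 00`), the std::string holds four bytes, but anything that treats it as a C string sees only the first one.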

Moreover, pay attention to the size of the file. If you have a gigantic text file (e.g. 5 GB) and you are building a 32-bit Windows application, loading it whole cannot work: a 32-bit process on Windows has only 2 GB of user-mode address space by default, so the allocation fails. In such cases, reading the file content in smaller chunks (or using memory-mapped file techniques with smaller "views" on the file) may be a better alternative.

Mr.C64

Look at it this way: “What's the worst that could happen?”

Are you obtaining a file from the local user? And if they supply a file that's too big, perhaps their machine will thrash, or even kill your program with an out-of-memory error?

Do you expect the user to do that often enough to worry about?

Alternatively:

Are you obtaining the file from a network source or untrusted user? Would giving that user the ability to potentially thrash your system or kill your application constitute a risk?

BRPocock
  • In answer to your prompt, I found the following http://stackoverflow.com/questions/1986966/does-s0-point-to-contiguous-characters-in-a-stdstring which partially addresses my question. – Aaron Swan Sep 07 '16 at 20:42