
I want to load a map from a text file (if you can suggest any other way to load a map into an array, I'm open to anything new). What's written in the text file is something like this, but on a somewhat larger scale:

6 6 10 (Never mind what the number "10" is; the other two are the map size.)
1 1 1 1 1 1
1 0 2 0 0 1
1 0 0 0 2 1
1 2 2 0 0 1
1 0 0 0 0 1
1 1 1 1 1 1

Where 1 is border, 0 is empty, and 2 is wall. Now I want to read this text file, but I'm not sure which way would be best. What I have in mind so far is:

  1. Reading the whole text file at once into a stringstream, converting it to a string later via rdbuf(), and then splitting the string and putting it into the array.
  2. Reading it line by line via getline().
  3. Reading it number by number using the >> operator.

My question is: which of the mentioned ways (or any other way, if available) is better in terms of RAM use and speed? Note: I'd also like to know whether or not using rdbuf() is a good approach. I'd greatly appreciate a comparison of the different ways of splitting a string, for example splitting text into words on whitespace.

yukashima huksay
  • Have you had a look at scanf? [Using scanf() in C++ programs is faster than using cin?](http://stackoverflow.com/questions/1042110/using-scanf-in-c-programs-is-faster-than-using-cin) . The answer to that question has a comparison between istream and scanf. Also, you can turn off synchronization of istream by setting [std::ios_base::sync_with_stdio](http://en.cppreference.com/w/cpp/io/ios_base/sync_with_stdio) to false, although, if I'm not mistaken, istream only slows down when there is large input. Not sure about memory usage, though. – Incomputable Dec 25 '15 at 17:09
  • Have you searched the internet for duplicates such as "stackoverflow c++ read file matrix" or "stackoverflow read file array"? – Thomas Matthews Dec 25 '15 at 17:17
  • How large is 'a bit larger'? Megabytes, or could it be Gigabytes? And are 0 1 2 the only possible values? (i.e. fixed width rows?). – Danny_ds Dec 25 '15 at 17:17
  • BTW, there are two performance bottlenecks: 1) input (reading from a device) and 2) converting from the textual representation to the internal representation. Unless your data is in gigabytes, I suggest not optimizing, because any interaction with the user will waste any time you gained through optimizations. – Thomas Matthews Dec 25 '15 at 17:20
  • @Danny_ds only a bit! The map is at most 20 KB. – yukashima huksay Dec 25 '15 at 17:23
  • @ThomasMatthews Well, yeah, actually! I have a good CPU and an SSD. I just want to learn! There isn't much of a limit on resources :) But thanks for your advice. – yukashima huksay Dec 25 '15 at 17:25
  • @aran - Ah, then it won't matter that much how you read it. For really big data I was about to say that, in fact, you already have an array on disk, which you could access directly via memory mapping. But that would be a bit overkill here :) – Danny_ds Dec 25 '15 at 17:26

1 Answer


Where 1 is border, 0 is empty, and 2 is wall. Now I want to read this text file, but I'm not sure which way would be best. What I have in mind so far is:

You don't have enough data for any of the means you mentioned to make a significant impact on performance. In other words, concentrate on the correctness and robustness of your program, then come back and optimize the parts that are slow.

Reading the whole text file at once into a stringstream, converting it to a string later via rdbuf(), and then splitting the string and putting it into the array.

The best method for inputting data is to keep the input stream flowing. This usually means reading large chunks of data per transaction rather than making many small transactions. Memory is a lot faster to search and process than an input stream.

I suggest using istream::read before rdbuf. With either one, I recommend reading into a preallocated area of memory: either an array or, if you're using std::string, a string that has had a large space reserved when it was constructed. You don't want reallocations of the std::string data slowing your program down. A minimal sketch follows.
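For illustration, here is a minimal sketch of that approach, assuming the whole file fits comfortably in memory (the name loadWholeFile and the vector-of-vectors layout are my own choices, not something from your code):

```cpp
// Sketch: slurp the whole file into a preallocated std::string,
// then parse the numbers from memory with an istringstream.
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

std::vector<std::vector<int>> loadWholeFile(const std::string& path)
{
    std::ifstream in(path, std::ios::binary);
    in.seekg(0, std::ios::end);
    std::string text(static_cast<std::size_t>(in.tellg()), '\0'); // preallocated
    in.seekg(0, std::ios::beg);
    in.read(&text[0], static_cast<std::streamsize>(text.size()));

    std::istringstream iss(text);   // all further parsing happens in memory
    int rows = 0, cols = 0, extra = 0;
    iss >> rows >> cols >> extra;   // header line, e.g. "6 6 10"

    std::vector<std::vector<int>> map(rows, std::vector<int>(cols));
    for (int r = 0; r < rows; ++r)
        for (int c = 0; c < cols; ++c)
            iss >> map[r][c];
    return map;
}
```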

Reading it line by line via getline().

Since your data is line oriented, this could be beneficial: you read one row, then process that one row. It's a good technique to start with; a bit more complicated than the one below, but simpler than the previous method. See the sketch after this paragraph.
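A sketch of that line-oriented version, under the same assumptions about the file layout (the function name is mine):

```cpp
// Sketch: parse the header, then read one line per map row with
// getline() and parse each row through an istringstream.
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

std::vector<std::vector<int>> loadByLine(const std::string& path)
{
    std::ifstream in(path);
    std::string line;

    std::getline(in, line);          // header line, e.g. "6 6 10"
    std::istringstream header(line);
    int rows = 0, cols = 0, extra = 0;
    header >> rows >> cols >> extra;

    std::vector<std::vector<int>> map(rows, std::vector<int>(cols));
    for (int r = 0; r < rows && std::getline(in, line); ++r) {
        std::istringstream row(line);
        for (int c = 0; c < cols; ++c)
            row >> map[r][c];
    }
    return map;
}
```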

Reading it number by number using the >> operator.

IMO, this is the technique you should be using. It is simple and easy to get working, enabling you to move on to the remainder of your project. A sketch is below.
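A sketch (again with my own names); note how short it is, because operator>> already skips spaces and newlines alike:

```cpp
// Sketch: operator>> skips all whitespace, so the whole load
// reduces to a pair of nested loops.
#include <fstream>
#include <string>
#include <vector>

std::vector<std::vector<int>> loadWithExtraction(const std::string& path)
{
    std::ifstream in(path);
    int rows = 0, cols = 0, extra = 0;
    in >> rows >> cols >> extra;     // header line, e.g. "6 6 10"

    std::vector<std::vector<int>> map(rows, std::vector<int>(cols));
    for (int r = 0; r < rows; ++r)
        for (int c = 0; c < cols; ++c)
            in >> map[r][c];
    return map;
}
```

If any extraction fails, the stream's failbit is set, so a single `if (!in)` check after the loops tells you whether the whole read succeeded.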

Changing the Data Format

If you want to make the input faster, you can change the format of the data. Binary data, i.e. data that doesn't need translation, is the fastest format to read: it bypasses the conversion from textual form to internal representation, because the binary data is the internal representation.

One of the caveats of binary data is that it is hard to read and modify by hand. A sketch of a possible layout follows.
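For illustration, here is a sketch of one possible binary layout: two 32-bit integers (rows and cols) followed by one byte per cell. The layout is my own assumption, not a standard format, and raw binary like this is not portable between machines of different endianness:

```cpp
// Sketch: save and load the map in an assumed binary layout:
// int32 rows, int32 cols, then rows*cols bytes (one per cell).
// Not endian-portable; fine for local save files only.
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

void saveBinary(const std::string& path,
                const std::vector<std::vector<std::uint8_t>>& map)
{
    std::ofstream out(path, std::ios::binary);
    std::int32_t rows = static_cast<std::int32_t>(map.size());
    std::int32_t cols = rows ? static_cast<std::int32_t>(map[0].size()) : 0;
    out.write(reinterpret_cast<const char*>(&rows), sizeof rows);
    out.write(reinterpret_cast<const char*>(&cols), sizeof cols);
    for (const auto& row : map)
        out.write(reinterpret_cast<const char*>(row.data()), cols);
}

std::vector<std::vector<std::uint8_t>> loadBinary(const std::string& path)
{
    std::ifstream in(path, std::ios::binary);
    std::int32_t rows = 0, cols = 0;
    in.read(reinterpret_cast<char*>(&rows), sizeof rows);
    in.read(reinterpret_cast<char*>(&cols), sizeof cols);
    std::vector<std::vector<std::uint8_t>> map(
        rows, std::vector<std::uint8_t>(cols));
    for (auto& row : map)
        in.read(reinterpret_cast<char*>(row.data()), cols);
    return map;
}
```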

Optimizing

  1. Don't. Focus on finishing the project: correctly and robustly.
  2. Don't. Usually, the time you gain is wasted waiting for I/O or the user. Development time is costly; unnecessary optimization is a waste of development time and thus a waste of money.
  3. Profile your executable. Optimize the parts that occupy the most execution time.
  4. Reduce requirements / Features before changing code.
  5. Optimize the design or architecture before changing the code.
  6. Change compiler optimization settings before changing the code.
  7. Change data structures & alignment for cache optimization.
  8. Optimize I/O if your program is I/O bound.
  9. Reduce branches / jumps / changes in execution flow.
Thomas Matthews
  • In particular, this sounds like a maze-following program. If the input data is small, I/O will be trivial anyway. If the input data is large, I/O will be a trivial part of the overall execution time. Do it the simplest way possible (which is >>) first, and only *if necessary* worry about optimization. – Martin Bonner supports Monica Dec 25 '15 at 18:04