I have a small C++ codebase that reads commands from stdin, executes them, and writes the results to stdout. I use the wide streams wcin and wcout for this. My problem is that long input lines, of 4000+ characters, get cut off. I have tested this on both Windows and OS X, and the problem occurs on both.

I have created a minimal program that illustrates the problem:

#include <iostream>
#include <string>
#include <sstream>

using namespace std;

int main() 
{
    const size_t bufferSize = 2 * 4096;
    wchar_t lineBuffer[bufferSize] = {0};

    wcin.getline(lineBuffer, bufferSize); 
    wstring line(lineBuffer);

    wostringstream wos;
    wos << L", state of wcin, badbit: " << wcin.bad();
    wos << L", eof: " << wcin.eof();
    wos << L", failbit: " << wcin.fail();

    wcout << L"The input: " << line << wos.str() << endl; 

    return 0;
}

Note that eof, failbit and badbit all report no error when the problem occurs.

Code can also be found here with a test string in a comment: https://github.com/Discordia/large-std-input

I can partly work around this by setting the buffer size of wcin to 4096 (note that this is smaller than the input, although the getline buffer is larger than the input), by doing:

const size_t wcinBufferSize = 4096;
wchar_t wcinBuffer[wcinBufferSize] = {0};
wcin.rdbuf()->pubsetbuf(wcinBuffer, wcinBufferSize); 

But this only postpones the problem. If the input is larger, say 9000 characters (with the wcin.getline buffer raised to 4 * 4096), the problem appears again.

What is the best way of doing this if I do not know how large the input will grow? Should I not use getline?

Robert Sjödahl
  • Can you use `wstring` instead of using `wchar_t[]`? – Thomas Matthews Oct 24 '13 at 19:45
  • Do you mean "wcin >> line;" or in some other way? I tested that now after your comment and that will also get cut without the code for setting the wcin buffer size. (I should probably use a wstring instead of a wchar_t buffer though, but it does not solve the problem). – Robert Sjödahl Oct 24 '13 at 19:52
  • No, I'm talking about the `LineBuffer` and `wcinBuffer` variables. You may be overrunning the buffer. The `wstring` has its own allocation scheme. Try printing "sizeof(wchar_t)" for more information. – Thomas Matthews Oct 24 '13 at 19:56
  • Do you mean like this? http://www.cplusplus.com/reference/string/string/getline/ If so, that didn't work either. – Robert Sjödahl Oct 24 '13 at 20:03
  • No, if you allocate 32 bytes for an array and read in 32 wchars, you may read past the end of the buffer. The `getline` function may be counting characters, not bytes. Your array is measured in bytes. A wchar may be 2 bytes or more. – Thomas Matthews Oct 24 '13 at 20:19
  • Ok. sizeof(lineBuffer) gives 16384. If I make the bufferSize 16 * 4096, then sizeof(lineBuffer) gives 131072, and the problem is still there. – Robert Sjödahl Oct 24 '13 at 20:27
  • [Works for me](http://ideone.com/gvNaRz) – Igor Tandetnik Oct 24 '13 at 20:51
  • @Igor: Yes it looks like it does. For me it gets cut a bit before the qqqq:s start. I'm compiling and running from VS2012 on win7, what did you use? – Robert Sjödahl Oct 24 '13 at 21:05

1 Answer

With VS2012, I can see the input cut off when I paste the string into the console window. But it works if I save this long string to a file, then run the program with the input redirected from said file, as in test.exe < input.txt.

So it seems to be a limitation of the Windows console, not of the C++ streams implementation.

Igor Tandetnik
  • If it is just a limitation in the windows console, then I do not understand why it works when I make the wcin buffer bigger. – Robert Sjödahl Oct 24 '13 at 21:20
  • Why is the Windows console so irritating? Why don't the Microsoft guys work on it to make it better, like a Linux console? – sumanth232 Jun 09 '14 at 13:14
  • @sumanth232: https://stackoverflow.com/questions/18015137/linux-terminal-input-reading-user-input-from-terminal-truncating-lines-at-4095/18018473#18018473 ;-) – FooF May 23 '23 at 05:51