My program needs to read strings of up to 50k characters from stdin. The code is as follows:

#include <iostream>
#include <string>

std::string A;

int main(){
    std::cin >> A;  // reads one whitespace-delimited token
    std::cout << "String max: " << A.max_size() << std::endl;
    std::cout << "Size: " << A.size();
}

When I try to enter a 10k character long string I get the following output:

String max: 4611686018427387903
Size: 4095

According to Google both std::cin and std::string should have no problem handling 10k characters, but for some reason A gets truncated after 4095 characters. I entered the string by pasting it into the default Ubuntu terminal. Pasting it into Python3 in the same terminal works fine, which leads me to believe that it's not the terminal that's truncating it, but C++. I compiled with g++ program.cxx and I have 16 GB of RAM.

How can I enter large strings from stdin? Any help is appreciated.

P.S.: If you need a large string just paste this into Python: print("123"*5000)

Jan
  • 4095 is 2 ^ 12 minus 1. Suspicious. – Yakk - Adam Nevraumont Aug 12 '21 at 23:15
  • Looks like it is a hard limit imposed by your kernel: https://stackoverflow.com/questions/18015137/linux-terminal-input-reading-user-input-from-terminal-truncating-lines-at-4095 – NathanOliver Aug 12 '21 at 23:16
  • You can circumvent the limit by using pipes though. Tested on ubuntu and using pipes I could read 10k characters, with the tty limit tested and confirmed on normal stdin – Lala5th Aug 12 '21 at 23:19
  • @NathanOliver But that doesn't explain how Python could read 10k+ character strings on the same terminal. The author of the linked question couldn't get it to work in C nor in Python, which leads me to believe that we do not have the same exact problem. – Jan Aug 12 '21 at 23:23
  • @Yakk-AdamNevraumont True – Packa Aug 12 '21 at 23:25
  • 1
    @Jan How are you "pasting it into python3"? `python3 -c "print(len(input()))"` suffers from the exact same problem (it prints 4095 with the same long string) – Artyer Aug 12 '21 at 23:28

1 Answer

You're probably running into the 4096-character line-length limit that the Linux terminal imposes on input in canonical mode. If you enter a very long line (more than 4095 characters) into a Linux terminal, the excess is discarded until you enter a newline or EOF character to flush the terminal buffer.

There are a couple of ways you can work around this:

  • Insert an EOF/flush character (usually Ctrl-D) every so often in the input. Each such character flushes the terminal buffer, resetting the 4K limit and allowing a longer line to be entered. Be careful not to insert one immediately after a newline or another EOF character, as that causes an EOF on the input.
  • Put the terminal into non-canonical mode. The terminal then hands characters to the process as soon as it reads, without waiting for a complete line. Most processes read much faster than characters can be typed or pasted into the terminal, so you'll never come close to the 4K limit.
user4581301
Chris Dodd