
This is usually done with pipes. Pipes seem to work fine in the environments I've been testing (and getting a lot of work done in!) on Windows 10, specifically:

  • Git bash
  • MSYS2 bash

I have found that with a large file or binary stream, some of the installed tools handle the data accurately (cat largefile.JPG | wc -c). But whenever I write my own image processing programs in C++, whichever method I use to read stdin (old-school C cstdio functions, a plain C program, or C++ iostreams via cin), only a small fraction of the stream shows up before it ends. The length is deterministic: the same file always produces the same result.

The same code works properly on OS X and Linux, where the stdin stream has the correct length. That makes pipes a practical way on those platforms to pass data without hitting the disk. I've been honing my bash-fu for a decade now, so it comes pretty naturally.

Of course other methods must exist that I can leverage, but I can't really come up with something quickly that I can expect to rely on. What are some things I could try to troubleshoot this here? I really like the set of unix tools I can install with pacman inside MSYS2, including

g++.exe (Rev2, Built by MSYS2 project) 7.1.0
Copyright (C) 2017 Free Software Foundation, Inc.

But this is my one big stumbling block so far. My simplest program compiled with this compiler is unable to slurp up a useful amount of data from the standard input stream. Why is that? If it's some limitation of the operating system, or of the POSIX layer and all of that black magic, then why does wc work perfectly?

Steven Lu
  • Please show your simplest, compilable, runnable program that fails to read all of `stdin` and how you run it. – Mark Setchell Sep 15 '17 at 07:01
  • I used [this](https://stackoverflow.com/a/3495410/340947) code, and run it like this: `cat large.JPG | ./size`. It is compiled with `gcc size.c -o size`. – Steven Lu Sep 15 '17 at 07:06
  • Have you looked at https://github.com/borgbackup/borg/pull/2032 ? – zortacon Sep 15 '17 at 07:06
  • You need to read the stream in *binary mode*. In Windows byte value 26 (Ctrl Z) indicates EOF for text mode, at least if it appears alone on a line. That said, a memory mapped file appears more efficient than copying huge quantities of data around. – Cheers and hth. - Alf Sep 15 '17 at 07:08
  • I presume it works correctly when run as `./size large.JPG`? – Mark Setchell Sep 15 '17 at 07:11
  • @MarkSetchell Yes, it does. I need a pipe, because the actual input in the thing i'm trying to do is a PGM output from GraphicsMagick, which is used to perform a convolution on the image. Just hoping to not have to write the intermediate result to disk, nor to have to resort to hand-coding that first stage of processing... – Steven Lu Sep 15 '17 at 07:13
  • @Cheersandhth.-Alf I tried `freopen(NULL, "rb", stdin);` in the code, but it did not fix the problem. HOWEVER i can confirm that a `1A` is in the right position in the file (the position immediately after the length provided to me by that program). So thanks for the tidbit on Ctrl+Z. Never knew about that. – Steven Lu Sep 15 '17 at 07:20
  • Based on the comment of Cheers, here are some ideas on opening std::cin as binary: https://stackoverflow.com/questions/7587595/read-binary-data-from-stdcin – stefaanv Sep 15 '17 at 07:26
  • `_setmode(_fileno(stdin), _O_BINARY);` seems to be the trick. – Steven Lu Sep 15 '17 at 13:34
  • I think credit probably goes to Alf. Please make an answer so I can accept it... – Steven Lu Sep 15 '17 at 13:35

2 Answers


Not really an answer, just a tip for helping to work it out...

As you already have GraphicsMagick, you can very easily create files of arbitrary length and content for testing.

So, a 64kB file full of hex 27:

gm convert -depth 8 -size 64x1024 xc:"#272727" gray:- | wc -c

Or, a 32kB PGM file of zeroes:

gm convert -depth 8 -size 32x1024 xc:"#000" PGM:-

Or, a file with all the hex values between 0x0 and 0xff:

gm convert -depth 8 -size 1x256 gradient:black-white gray:- | xxd 

00000000: 0000 0102 0304 0506 0708 090a 0b0c 0d0e  ................
00000010: 0f10 1112 1314 1516 1718 191a 1b1c 1d1e  ................
00000020: 1f20 2122 2324 2526 2728 292a 2b2c 2d2e  . !"#$%&'()*+,-.
00000030: 2f30 3132 3334 3536 3738 393a 3b3c 3d3e  /0123456789:;<=>
00000040: 3f40 4142 4344 4546 4748 494a 4b4c 4d4e  ?@ABCDEFGHIJKLMN
00000050: 4f50 5152 5354 5556 5758 595a 5b5c 5d5e  OPQRSTUVWXYZ[\]^
00000060: 5f60 6162 6364 6566 6768 696a 6b6c 6d6e  _`abcdefghijklmn
00000070: 6f70 7172 7374 7576 7778 797a 7b7c 7d7e  opqrstuvwxyz{|}~
00000080: 7f80 8182 8384 8586 8788 898a 8b8c 8d8e  ................
00000090: 8f90 9192 9394 9596 9798 999a 9b9c 9d9e  ................
000000a0: 9fa0 a1a2 a3a4 a5a6 a7a8 a9aa abac adae  ................
000000b0: afb0 b1b2 b3b4 b5b6 b7b8 b9ba bbbc bdbe  ................
000000c0: bfc0 c1c2 c3c4 c5c6 c7c8 c9ca cbcc cdce  ................
000000d0: cfd0 d1d2 d3d4 d5d6 d7d8 d9da dbdc ddde  ................
000000e0: dfe0 e1e2 e3e4 e5e6 e7e8 e9ea ebec edee  ................
000000f0: eff0 f1f2 f3f4 f5f6 f7f8 f9fa fbfc fdfe  ................
Mark Setchell

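In Windows, a Ctrl+Z byte (hex 1A, decimal 26) signals EOF on a stream read in text mode, and I was unaware of this. Switching stdin to binary mode with `_setmode(_fileno(stdin), _O_BINARY);` did the trick.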

Credit to Alf for the comment with this answer. If you post an answer I will switch the accept.

Steven Lu