
I'm using Lua in a cmd window under Windows. I use "cat" (from UnxUtils) to feed a file to a Lua script. The script uses "io.read(1)" to read one byte at a time.

local b, n ;
n = -1 ;
b = true ;
while b do
  n = n + 1 ;
  b = io.read(1) ;
end ;
print( n, "bytes read" ) ;

When I feed the script a 333K .EXE file, it claims "24025 bytes read". Feed the same .EXE to "wc" (another UnxUtils), and wc correctly says 333008.

> cat "Firefox Installer.exe" | lua count.lua
24025   bytes read
cat: write error: Invalid argument
> cat "Firefox Installer.exe" | wc
   1408    8674  333008

Since I get the expected answer when I "cat | wc", I don't think there's anything wrong with the "cat" program, or with Windows' implementation of redirection.

I am not looking for advice on how to make the Lua script more efficient. I do not need advice on how to make the script read directly from a file (that works as expected). I am looking for a clue as to where to look for the reason I can't use Lua to write a filter (and be able to trust the results).

I checked whether a Ctrl-Z or Ctrl-D in the input file could explain the early cutoff, but both occur very early in the file, well before the point where reading stops.

I tried continuing to read after "io.read()" returned nil: the script saw more bytes, but still no more than 45K of the 333K input file.

  • Likely to be a Windows issue (see e.g. [this answer](https://stackoverflow.com/a/39339407/7185318)). Windows treats binary and text "streams" / files differently. I would assume that your program's stdin is a text stream by default; it isn't possible to change the mode of `stdin` to binary later on using plain Lua, so you'll need a library for that. Something like `lfs = require("lfs"); lfs.setmode(io.stdin, "binary")` might work (using the [LuaFileSystem](https://lunarmodules.github.io/luafilesystem/manual.html) library). You could also try to fix your script invocation to set the correct mode. – Luatic Jan 31 '23 at 21:21
  • @LMD That seems to be my problem. A Visual Studio C++ program doing "fread(&c,1,1,stdin)" also stops short. Adding "_setmode(_fileno(stdin),_O_BINARY)" results in the program reading the correct number of bytes. I don't understand why each program sees as much as it sees -- the Lua reads 24025 bytes, the C reads 1169 -- but this is almost certainly the problem. Post an answer and I'll pick it. – Jay Michael Feb 01 '23 at 03:30
  • There was a [post](http://lua-users.org/lists/lua-l/2022-02/msg00117.html) on the mailing list, but the requested change was not implemented because it is non-portable (OS-specific and compiler-specific: the solution is different for MinGW and Visual Studio). – ESkri Feb 01 '23 at 04:57
  • @LMD - `You could also try to fix your script invocation to set the correct mode` - How is it possible to reopen stdin as binary from the outside of the program? – ESkri Feb 01 '23 at 04:59
  • @ESkri as you can tell, I'm not using Windows, and I haven't tried it ;) but I would imagine that it should be possible to have a program which sets `stdin` to binary mode, then invokes the Lua script [like this](https://gist.github.com/appgurueu/5574544fbfaed0e732baad521c0f4fba). That way you could leave the Lua untouched. You'd then have to change the invocation to `cat "Firefox Installer.exe" | ./stdbin lua count.lua`. – Luatic Feb 01 '23 at 07:10

2 Answers


Copied from my comments:

Likely to be a Windows issue (see e.g. [this answer](https://stackoverflow.com/a/39339407/7185318)). Windows treats binary and text "streams" / files differently. I would assume that your program's stdin is a text stream by default; it isn't possible to change the mode of `stdin` to binary later on using plain Lua, so you'll need a library for that. Something like `lfs = require("lfs"); lfs.setmode(io.stdin, "binary")` might work (using the [LuaFileSystem](https://lunarmodules.github.io/luafilesystem/manual.html) library).

You could also try to fix your script invocation to set the correct mode, using a small wrapper program which changes stdin to binary mode before invoking your Lua script:

./stdbin.c:

#include <stdio.h>
#include <fcntl.h>    // _O_BINARY
#include <io.h>       // _setmode
#include <process.h>  // _execvp

int main(int argc, char** argv) {
    // argv[0] is this wrapper itself; argv[1] is the program to run
    if (argc < 2) {
        fprintf(stderr, "Arguments: <program> {args}\n");
        return 1;
    }

    // See  https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/setmode?redirectedfrom=MSDN&view=msvc-170
    if (_setmode(_fileno(stdin), _O_BINARY) == -1)
        perror("_setmode failed");

    // Run the requested program with the remaining arguments
    _execvp(argv[1], (const char* const*)(argv + 1));
    // _execvp only returns if there is an error
    perror("_execvp failed");
    return 1;
}

Note: Untested. Usage: ./stdbin lua count.lua.
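
For comparison, here is a minimal sketch of the kind of C test described in the comments: switch stdin to binary mode, then count bytes one at a time with `fread`. The helper names `set_stdin_binary` and `count_bytes` are made up for illustration; only `_setmode(_fileno(stdin), _O_BINARY)` comes from the discussion above. The Windows-specific call is guarded by `#ifdef _WIN32`, on the assumption that other platforms do not translate stdin at all.

```c
#include <stdio.h>
#include <stddef.h>

#ifdef _WIN32
#include <fcntl.h>  /* _O_BINARY */
#include <io.h>     /* _setmode, _fileno */
#endif

/* Hypothetical helper: put stdin into binary mode where the text/binary
 * distinction exists. Returns 0 on success, -1 on failure. On non-Windows
 * systems this is a no-op. */
int set_stdin_binary(void) {
#ifdef _WIN32
    return _setmode(_fileno(stdin), _O_BINARY) == -1 ? -1 : 0;
#else
    return 0;
#endif
}

/* Hypothetical helper: count bytes by reading one at a time until fread
 * reports no more data, mirroring the io.read(1) loop in the question. */
size_t count_bytes(FILE *f) {
    unsigned char c;
    size_t n = 0;
    while (fread(&c, 1, 1, f) == 1)
        n++;
    return n;
}

/* Example use from main():
 *     if (set_stdin_binary() != 0) perror("_setmode failed");
 *     printf("%lu bytes read\n", (unsigned long)count_bytes(stdin));
 */
```

If the count from such a program matches `wc` only after the mode switch, that points at text-mode translation of stdin rather than at `cat` or the pipe.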

Luatic
  • Wow! Windows has introduced `execvp`, it was absent 10 years ago. – ESkri Feb 01 '23 at 07:32
  • Hmm, I should probably use `_execvp` instead (https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/execvp?view=msvc-170). – Luatic Feb 01 '23 at 09:01

(This is an addition to LMD's answer)
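In LuaJIT no external libraries or executables are needed:

local ffi = require"ffi"
-- Declare the C runtime function so it can be called through the FFI
ffi.cdef"int _setmode(int,int)"
-- 0 is the file descriptor of stdin; 0x8000 is the value of _O_BINARY
ffi.C._setmode(0, 0x8000)
-- Now io.read() will return binary data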
ESkri
  • Is relying on `_O_BINARY` to be `0x8000` fine? – Luatic Feb 01 '23 at 09:02
  • LuaJIT does not have access to actual Windows SDK files to determine the value of this constant in the current Windows version, so the only way is relying on the constant known at the time the code was written :-) – ESkri Feb 01 '23 at 09:35