Detect non-empty STDIN in Clojure

Question

In Clojure, how do you detect whether standard input (*in*) is non-empty, in a non-blocking way and without reading from it?

At first, I thought calling the java.io.Reader#ready() method would do, but (.ready *in*) returns false even when standard input is provided.

How would you do it in Java? Then, you could just use Java interop. — Alan Thompson, Jan 10 '17 at 18:26
I've also looked for Java solutions, but alas, have found none. — Jindřich Mynarz, Jan 10 '17 at 18:35
What is the use-case for this? The simplest answer is to use a separate thread and just do blocking reads. — Alan Thompson, Jan 10 '17 at 19:05
The use case is to detect if the program was provided with some data on standard input. If that is the case, then the program reads the provided data, otherwise moves on. — Jindřich Mynarz, Jan 10 '17 at 19:10

Scott · Answer 1 · 2017-01-17T14:51:50.247

3

Is this what you are looking for? java.io.InputStream#available():

(defn -main [& args]
  (if (> (.available System/in) 0)
    (println "STDIN: " (slurp *in*))
    (println "No Input")))

$ echo "hello" | lein run
STDIN:  hello

$ lein run
No Input

Update: It does seem that checking STDIN with .available is subject to a race condition. An alternative is to wait a fixed timeout for STDIN to become available, and otherwise assume that no data is coming from STDIN.

Here is an example of using core.async to attempt to read the first byte from STDIN and append it to the rest of the STDIN or timeout.

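(ns stdin.core
  (:require
   [clojure.core.async :as async :refer [go >! timeout chan alt!!]])
  (:gen-class))

(defn -main [& args]
  (let [c (chan)]
    ;; Read the first character in the background; .read blocks until
    ;; input arrives or the stream is closed (in which case it returns -1).
    (go (>! c (.read *in*)))
    ;; Wait up to 500 ms; nil here means timeout or end of stream.
    (if-let [ch (alt!! (timeout 500) nil
                       c ([ch] (when-not (neg? ch) ch)))]
      (do
        ;; Push the character back so that slurp sees the complete input.
        (.unread *in* ch)
        (println (slurp *in*)))
      (println "No STDIN"))))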
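
The fixed-timeout idea can also be tried without consuming anything from the stream, by polling .available until a deadline. This is only a rough sketch: the name input-available? and its parameters are made up for illustration, and it is subject to the same race as any timeout-based check (a slow producer will still be missed).

```clojure
;; Hypothetical helper, not part of any library.
(defn input-available?
  "Polls `stream` until it reports available bytes or `timeout-ms` elapses.
  Returns true if bytes became available, false otherwise. Nothing is read."
  [^java.io.InputStream stream timeout-ms interval-ms]
  (let [deadline (+ (System/currentTimeMillis) timeout-ms)]
    (loop []
      (cond
        ;; Some bytes can be read without blocking.
        (pos? (.available stream)) true
        ;; Gave up waiting.
        (>= (System/currentTimeMillis) deadline) false
        ;; Sleep briefly, then check again.
        :else (do (Thread/sleep (long interval-ms))
                  (recur))))))

;; Possible use in -main (500 ms budget, checked every 10 ms):
;; (if (input-available? System/in 500 10)
;;   (println "STDIN: " (slurp *in*))
;;   (println "No Input"))
```

Because nothing is read, there is no need to unread a character afterwards, at the cost of sleeping for the whole timeout when no input is supplied.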

edited Jan 17 '17 at 14:51

answered Jan 11 '17 at 15:00

Scott


I tried this before and it seemed not to work. Your example works just like I wanted. I'll investigate it a bit more. – Jindřich Mynarz Jan 11 '17 at 19:42
Great, I see what your intent is now. I am going to delete my other answer. – Scott Jan 11 '17 at 20:26
`.available` seems to be non-deterministic. Running `lein run | lein run` can either output "STDIN: No Input" or "No Input". The same happens for `.ready`. Do you know what's happening here? – Jindřich Mynarz Jan 12 '17 at 15:30
I am not sure this is possible. See this answer for a good description of a similar problem http://stackoverflow.com/a/12384207/3438870 – Scott Jan 13 '17 at 15:00
To expand, I do see the same behavior as you. My guess is that there is a race condition between the OS adding data to stdin and the application checking for it. I did some more tests: if I check a few times with a 10 ms sleep between checks, there are cases where there is no data in stdin at first and then it becomes available. That goes back to your original problem: it isn't really deterministic. You could block for a set amount of time and, if nothing shows up on stdin, assume nothing ever will and move on. – Scott Jan 13 '17 at 15:06
Reading the Go solution, I wonder if detecting non-empty STDIN would be feasible if wrapped in a clojure.async channel. – Jindřich Mynarz Jan 13 '17 at 16:24
That's actually what I have been playing around with, using a combination of `alts!!`, a `timeout` channel, and a `go` block that is blocking. It isn't ideal because you are essentially putting a sleep in your application if no input is supplied. – Scott Jan 13 '17 at 16:43
Can you re-wrap the byte stream into a `Reader`? A minor point: one final closing parenthesis is missing. – Jindřich Mynarz Jan 16 '17 at 15:04
I think there are a lot of different ways you could do this. A slightly more condensed solution is to call `.unread` and put the byte read back on the stream (https://docs.oracle.com/javase/7/docs/api/java/io/PushbackReader.html#unread(int)); you can then just call `slurp *in*`. I updated the example with that strategy. – Scott Jan 17 '17 at 14:54
By calling `.unread`, yes, but it appears that with unread you have to supply what to _unread_. `reset` always throws an exception; from the Java docs: "The reset method of PushbackReader always throws an exception." You cannot simply reset the stream after it is read. – Scott Jan 17 '17 at 19:12
I'm still getting non-deterministic results for `lein run | lein run`. I only changed printing STDIN to `(println (str "STDIN: " (slurp *in*)))`, so that I can distinguish when STDIN is found to be non-empty. I ran `lein run | lein run` several times and got both `No STDIN` and `STDIN: No STDIN`. – Jindřich Mynarz Jan 17 '17 at 21:54
That is more a question of how `|` works. Both `lein run` processes run in parallel, with the left side of `|` sending its STDOUT to the STDIN of the right side. Both have a fixed timeout to wait for input, and with the processes executing in parallel it is a race as to which one finishes first. An example of the problem: if you run `(sleep 5 && echo "hi") | lein run`, this will always show "No STDIN" because echo is called after the wait for STDIN has finished. But if you run `(sleep .1 && echo "hi") | lein run`, you should always get the STDIN. – Scott Jan 18 '17 at 05:10
See http://stackoverflow.com/questions/9834086/what-is-a-simple-explanation-for-how-pipes-work-in-bash#comment31919026_9834118 and http://stackoverflow.com/a/9694841/3438870 for descriptions of the behavior you are seeing. So I believe that this is a partial answer to your question about detecting stdin but it does cause issues if you are chaining this strategy together. – Scott Jan 18 '17 at 05:13
I ran `(sleep 5 && echo "hi") | lein run` several times and I always got "STDIN: hi". I'd expect that any sleep longer than the timeout of 500 ms will cause STDIN not to be detected, but it doesn't seem to be the case. – Jindřich Mynarz Jan 18 '17 at 08:44
There is overhead from `lein run` as well: `time lein run` prints `No STDIN` and then `lein run 5.59s user 0.35s system 107% cpu 5.508 total` (on my machine). I think you might want to refine this question or ask a new one at this point. This may not be the solution you are looking for, for your use case. I think this should work fine if the start of the pipe is a producer of data like `cat`, `echo`, or `printf`. – Scott Jan 18 '17 at 14:23
In my case the initial producer is the same script, typically invoked via its uberjar or by `lein run`. At the end, I decided to ask the user to provide an explicit command-line flag that an input is piped, instead of trying to detect it automatically. – Jindřich Mynarz Jan 18 '17 at 17:39
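
For reference, the explicit-flag approach mentioned in the last comment could look roughly like the sketch below. The flag name `--stdin` and the function `read-input` are invented for illustration; the comment does not say what the flag was actually called.

```clojure
;; Hypothetical: `--stdin` is an invented flag name.
(defn read-input
  "Returns the contents of `rdr` when the `--stdin` flag is among `args`,
  otherwise nil. No attempt is made to detect input automatically."
  [args rdr]
  (when (some #{"--stdin"} args)
    (slurp rdr)))

(defn -main [& args]
  (if-let [input (read-input args *in*)]
    (println "STDIN:" input)
    (println "No input")))
```

With this, `echo "hello" | lein run -- --stdin` would read the piped data, while a plain `lein run` never touches standard input, so there is no timeout and no race.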

score 0 · Answer 2 · answered Jan 10 '17 at 18:45

Have you looked at PushbackReader? You can use it like:

Read a byte (blocking). This returns the char read, or -1 if the stream is closed.
When the read returns, you know a byte is ready.
If the byte is something you're not ready for, put it back.
If the stream is closed (-1 return value), exit.
Repeat.

https://docs.oracle.com/javase/8/docs/api/index.html?java/io/PushbackReader.html

If you need it to be non-blocking stick it into a future, a core.async channel, or similar.
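
As a minimal sketch of the `future` variant, assuming the reader is a java.io.PushbackReader (which `*in*` is by default): the name `input-within?` is made up for illustration.

```clojure
;; Hypothetical helper, not part of any library.
(defn input-within?
  "Tries to read one character from `rdr` within `timeout-ms`.
  Returns true (after pushing the character back) if one arrived,
  false on timeout or end of stream."
  [^java.io.PushbackReader rdr timeout-ms]
  (let [pending (future (.read rdr))                  ; blocking read on another thread
        c       (deref pending timeout-ms ::timeout)] ; wait at most timeout-ms
    (if (or (= c ::timeout) (neg? c))
      false
      (do (.unread rdr (int c))                       ; put the character back
          true))))

;; Possible use in -main:
;; (if (input-within? *in* 500)
;;   (println (slurp *in*))
;;   (println "No STDIN"))
```

One caveat with this design: on timeout the background read is still pending. It will swallow the first character if input shows up later, and the blocked thread may keep the JVM alive, so a real program would probably need an explicit exit.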


Alan Thompson


I tried that but found that you cannot unread a byte from STDIN. Moreover, this blocks until provided with input, so it would need to be combined with a timeout. – Jindřich Mynarz Jan 10 '17 at 18:58
