
Here is the sequence of commands I want to run:

echo <some_input> | <some_command> > temp.dat & pid=$!
sleep 5
kill -INT "$pid"

The above works perfectly when I run the commands one by one from a bash shell, and the contents of temp.dat are exactly what I want. But when I put the same commands in a bash script, temp.dat ends up empty.

Here is why I wrote the commands this way:

  1. <some_command> asks for input, which is why I'm piping <some_input> into it.
  2. I want the output of that command in a separate file, which is why I've redirected the output.
  3. I want to kill the command by sending it a SIGINT signal after some time.

I've tried running an interactive shell by writing `#!/bin/bash -i` as the first line of the shell script, but that didn't help.

Any alternate method to achieve the same results will be appreciated.

  • Update: <some_command> also invokes a Python script, but I don't think that will cause it to behave differently.
  • Update 2: the Python script was the sole cause of the different behavior.
sgalpha01
  • I can't think of any reason why this would work differently when executed interactively versus in a script. – Barmar Jan 04 '23 at 17:24
  • Does `pstree` show `<some_command>` running after you start the script? – Barmar Jan 04 '23 at 17:27
  • yes, it's showing. – sgalpha01 Jan 04 '23 at 17:34
  • Put `set -x` at the beginning of the script so you can see what it's doing when you run it. – Barmar Jan 04 '23 at 17:36
  • The other idea I have is that when you do it by hand you take a few seconds to type the commands, and that's allowing the command to finish writing to the file. Try increasing the sleep time. – Barmar Jan 04 '23 at 17:38
  • Replacing `echo some_input |` with `<<<"some_input"` would simplify things, ensuring that `pid` refers directly to `some_command` itself, not to a subshell running an `echo | some_command` pipeline. – Charles Duffy Jan 04 '23 at 17:40
  • @sgalpha01, also, if `some_command` is Python, make sure you have it configured for unbuffered output. You don't want your Python program to have written output _into a buffer_ but not have flushed it when the timeout hits. – Charles Duffy Jan 04 '23 at 17:41
  • @Barmar, I put `set -x` and this is the output: `+ pid=449568 + sleep 60 + echo <some_input> + <some_command> + kill -INT 449568`. I don't know why the first two commands are swapped for execution. – sgalpha01 Jan 04 '23 at 17:44
  • Pipeline components all happen in parallel, so ordering isn't defined. So what you have in your `set -x` logs is not unusual in that respect. That _does_ confirm, even if we didn't already know it, that you're storing the PID of a subshell, not the PID of some_command, so it's that subshell, not some_command, that you're sending the SIGINT to. – Charles Duffy Jan 04 '23 at 17:45
  • @CharlesDuffy, it's a binary file which call the python file internally. So, I can't change it. – sgalpha01 Jan 04 '23 at 17:46
  • @sgalpha01, oh, you can change it; just depends on how it was implemented to determine how clever you have to be. If it's calling Python through a PATH lookup, then you put a hook earlier in the PATH. If it's starting Python through a `system()` or `popen()` call instead of a direct execve and your `/bin/sh` is provided by bash, you can export a function that overrides whichever shell commands you need to replace; etc. In extreme cases we might get into tricks like LD_PRELOAD hooks. – Charles Duffy Jan 04 '23 at 17:48
  • @sgalpha01, that said, Python can be configured to have unbuffered output through setting environment variables, so none of those fancy techniques are likely to be necessary here. Search for `PYTHONUNBUFFERED`. – Charles Duffy Jan 04 '23 at 17:48
  • Also, it's generally a good idea before getting too rabbit-holed on any particular presumptive cause to validate assumptions -- `strace` your process (or use sysdig if strace is unsuitable due to performance, side effects, or anti-ptrace precautions) and make sure it really is successfully reading some_input off its stdin, f/e; and that it isn't blocking on anything obvious. – Charles Duffy Jan 04 '23 at 17:50
  • @CharlesDuffy, unfortunately, modifying the binary is out of the question. And `<some_command> <<<"<some_input>" > temp.dat & pid=$!` works from the bash shell, but not from a bash script. – sgalpha01 Jan 04 '23 at 17:57
  • I already suggested several ways to modify the binary's behavior without modifying the binary itself; that's what the techniques above (`LD_PRELOAD`, exported functions, environment variables meaningful to the Python interpreter started as a subprocess, etc) are. You don't need to change the binary itself to modify that binary's runtime behavior. – Charles Duffy Jan 04 '23 at 17:57
  • Anyhow -- `strace` (with `-f` to follow forks) really is the best next step to take here, so you can compare behavior of `some_command` between working and broken scenarios and figure out how the cases differ under-the-hood. Knowing that would, at minimum, provide the information needed to build a [mre] -- code someone who's not you can run themselves to see the problem and test proposed answers. Right now all we can do is throw out guesses because we can't inspect the system or test proposed fixes. – Charles Duffy Jan 04 '23 at 17:59
  • @CharlesDuffy, adding the line `export PYTHONUNBUFFERED=1` in the bash script worked. Thanks a lot! :) – sgalpha01 Jan 04 '23 at 18:12
  • Side note: `timeout -s INT 5 some_command <<< 'some_input' > temp.dat` might do something similar without explicit PID tracking. A subtle problem with `pid` in the question is that it references a _shell_ (the one that executes the pipeline), not the `some_command` binary. My hypothesis would be that if the binary had received the `SIGINT` signal directly, it might have been able to flush its buffers correctly, even without `PYTHONUNBUFFERED`, before exiting. But when the shell that runs the pipeline gets the `SIGINT` instead, then all sorts of premature pipe `close()`s may happen. – Andrej Podzimek Jan 04 '23 at 18:47
  • BTW, when you said it worked fine when you were running it interactively, was the `>temp.dat` on the command you tested that way? It's not the "from a script" part that changes whether buffering is on by default; instead, it's whether stdout goes to a TTY or not. If you were leaving off the redirection when running the command by hand... well, that explains the difference in behavior (and would have let us jump much more directly to a buffering problem if it had been disclosed in the initial question). – Charles Duffy Jan 04 '23 at 19:38
  • @CharlesDuffy, yes it was on the command. At the time of question, I didn't even know that it was calling a python script internally. Thanks to @Barmar for his suggestion to use `pstree`. – sgalpha01 Jan 04 '23 at 19:47
  • Note that this is not a good way to run a command with a time limit. See [Timeout a command in bash without unnecessary delay](https://stackoverflow.com/q/687948/4154375) and [BashFAQ/068 (How do I run a command, and have it abort (timeout) after N seconds?)](https://mywiki.wooledge.org/BashFAQ/068). In short, use the `timeout` program. – pjh Jan 06 '23 at 11:53
  • See [How to make output of any shell command unbuffered?](https://stackoverflow.com/q/3465619/4154375) for general-purpose ways to prevent buffering of output. – pjh Jan 06 '23 at 11:58
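The `timeout` suggestion from the comments can be sketched roughly as follows. This is only an illustration under assumptions: `some_command.sh` is a hypothetical stand-in for the real binary (it reads one line, writes it, then keeps running), the durations are arbitrary, and GNU `timeout` is assumed to be on the PATH.

```shell
#!/usr/bin/env bash
# Sketch only: some_command.sh is a hypothetical stand-in, not the OP's binary.
workdir=$(mktemp -d)

# Stand-in command: reads one line from stdin, writes it out, then keeps running.
cat > "$workdir/some_command.sh" <<'EOF'
read -r line
printf 'got: %s\n' "$line"
exec sleep 30
EOF

# Run in the foreground with a here-string as stdin; timeout sends SIGINT after 1 second.
# -k 2 (GNU timeout) follows up with SIGKILL 2 seconds later if the command is still alive.
timeout -s INT -k 2 1 bash "$workdir/some_command.sh" <<<"some_input" > "$workdir/temp.dat"

# Show what was captured before the command was interrupted.
cat "$workdir/temp.dat"
```

Because `timeout` delivers the signal to the command it started, no `pid=$!` bookkeeping is needed. Running it in the foreground also sidesteps a documented bash behavior: when job control is not active, as in a script, commands started with `&` ignore SIGINT.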

2 Answers


One likely cause here is that your Python process may not be flushing stdout within the allowed five seconds of runtime.

export PYTHONUNBUFFERED=1

...will cause content to be promptly written, rather than waiting for process exit / file close / amount of buffered content to reach a level sufficient to justify the overhead of a flush operation.
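A minimal sketch of the effect, under stated assumptions: `slow.py` is a hypothetical stand-in for the Python script the binary calls, `python3` and GNU `timeout` are assumed to be on the PATH, and the default SIGTERM from `timeout` is used here only to model a Python process that ends without getting a chance to flush.

```shell
#!/usr/bin/env bash
# Sketch only: slow.py is a hypothetical stand-in for the internally invoked Python script.
workdir=$(mktemp -d)

# Prints one line right away, then keeps running well past the time limit.
cat > "$workdir/slow.py" <<'EOF'
import time
print("early output")
time.sleep(30)
EOF

# stdout is a file, so Python block-buffers it; the line is still in the buffer
# when timeout terminates the process after 2 seconds.
timeout -k 1 2 python3 "$workdir/slow.py" > "$workdir/buffered.dat"

# Same run with PYTHONUNBUFFERED set: the line reaches the file as soon as it is printed.
PYTHONUNBUFFERED=1 timeout -k 1 2 python3 "$workdir/slow.py" > "$workdir/unbuffered.dat"

# Compare the sizes of the two captures.
wc -c "$workdir/buffered.dat" "$workdir/unbuffered.dat"
```

Setting the variable inline, as above, limits it to that one command; `export PYTHONUNBUFFERED=1` near the top of a script applies it to every child process, including a Python interpreter started indirectly by another binary.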

Charles Duffy

Will this work for you?

read -r -p "Input Data : " inputdata; echo "$inputdata" > temp.data; sleep 5; exit

Obviously,

#!/usr/bin/env bash

read -r -p "Input Data : " inputdata
echo "$inputdata" > temp.data
sleep 5

should work as a script.

Adapt it to suit :D

#!/usr/bin/env bash 

read -p "Input Data : " inputdata
<code you write eg echo $inputdata> > temp.data
sleep 5
BobMonk
  • Out of curiosity: I'm on a Mac, where zsh can play badly and the default bash is old and needs an update, etc. – BobMonk Jan 04 '23 at 19:18
  • The problem is that the OP is trying to use an arbitrary `some_command` they don't control, which runs for longer than 5 seconds but writes their desired content _within_ the first five seconds of runtime... but doesn't actually flush the buffer it wrote into until forced. This answer thus doesn't address the OP's problems at all. – Charles Duffy Jan 04 '23 at 19:35
  • Thank you for providing the extra information. As a soft skill, I'd advise measuring your tone in text; it could be deemed rude and/or offensive. – BobMonk Jan 05 '23 at 10:06