
I have a bash script (wiki_txt_script.sh) that takes a Wikipedia URL as an argument and extracts the text from the page using wget. When called with the -r option, it asks the user whether they want the text from any of the pages linked from within the text, and puts the chosen URLs into a temporary file, one URL per line. I've created a loop to recursively call the script on each URL in order to get the text from those pages as well:

while IFS= read -r line; do
  echo "$line"
  wiki_txt_script.sh -w "$line"
  echo "finished a recursive call"
done <temp_links.txt

However, this only seems to loop through once. It successfully gets the text from the first URL in temp_links.txt (and still performs the echo command in the loop after the recursive call), but the loop then ends without processing the remaining lines of temp_links.txt. Removing the recursive call from the loop causes it to loop through each line and echo the contents as expected. What is causing the loop to end early when a recursive call is present?

Edit: Charles' answer solved my problem. After using the script with /dev/null, however, each line of output to the terminal began at the column where the previous line of output had ended. This continued with other commands once the script was finished and appeared as so:

[Image: strange command line appearance]

This would simply go away once I started a new session. Any thoughts on why this occurred?

L Lansing

1 Answer


This has nothing whatsoever to do with recursion. If your code were recursing, you would have an endless loop, with the first line of temp_links.txt being written over and over.


The obvious reason for a while read loop to terminate early when a particular command is present is that the command consumes all of stdin, leaving nothing for the next read invocation.
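
The effect is easy to reproduce in isolation. The following is a minimal sketch, not the asker's script: `links_file` is a hypothetical stand-in for temp_links.txt, and `cat` stands in for any command inside the loop that reads stdin.

```shell
# Hypothetical stand-in for temp_links.txt, holding three lines.
links_file=$(mktemp)
printf '%s\n' url1 url2 url3 > "$links_file"

broken_count=0
while read -r line; do
  broken_count=$((broken_count + 1))
  # cat inherits the loop's stdin (the links file) and reads
  # everything that is left, so the next read sees end-of-file.
  cat > /dev/null
done < "$links_file"

echo "iterations: $broken_count"   # prints "iterations: 1"
rm -f "$links_file"
```

Only the first line is handled by the loop; the other two are swallowed by the command inside it.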

To avoid this, either redirect the stdin of the command that would otherwise cause this fault:

wiki_txt_script.sh -w "$line" </dev/null

...or use a non-stdin file descriptor for the read:

while IFS= read -r line <&3; do
  ...
done 3< temp_links.txt
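
Both approaches can be checked with a small self-contained sketch. The names here are assumptions for illustration: `links_file` is a hypothetical stand-in for temp_links.txt, and `cat` plays the part of a command that reads all of its stdin.

```shell
# Hypothetical stand-in for temp_links.txt, holding three lines.
links_file=$(mktemp)
printf '%s\n' url1 url2 url3 > "$links_file"

# Approach 1: give the inner command its own (empty) stdin.
fix1_count=0
while read -r line; do
  fix1_count=$((fix1_count + 1))
  cat > /dev/null < /dev/null
done < "$links_file"

# Approach 2: feed the loop on file descriptor 3. The inner command may
# drain stdin (here, a here-document) without affecting the read.
fix2_count=0
while read -r line <&3; do
  fix2_count=$((fix2_count + 1))
  cat > /dev/null
done 3< "$links_file" <<EOF
unrelated input on stdin
EOF

echo "fix 1: $fix1_count, fix 2: $fix2_count"   # prints "fix 1: 3, fix 2: 3"
rm -f "$links_file"
```

The second approach is the one to prefer when the inner command genuinely needs the original stdin, for example to prompt the user.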
Charles Duffy
  • Thanks for your help! Both solutions worked, but using /dev/null caused strange behavior in the terminal as shown in my edit of the question. Any thoughts on why this occurred? – L Lansing Jun 01 '18 at 15:37
  • The TTY's `onlcr` flag being set incorrectly can have that effect. I'd need to see a reproducer to figure out why it happened here -- maybe the inner script depended on having its stdin pointed to a real TTY to assert correct settings when done? But I'd really need to read the individual script causing the issue. – Charles Duffy Jun 01 '18 at 15:46
  • (`onlcr` is what tells the terminal to implicitly add a carriage return every time it sees a linefeed. DOS text files use both characters together, sending the two control signals separately; UNIX text files only have an LF and consider the CR implicit.) – Charles Duffy Jun 01 '18 at 15:47
  • Ok thanks again, seeing as the other method worked as well I'll use it instead – L Lansing Jun 01 '18 at 15:52
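
If the staircase output really was caused by `onlcr` being cleared (an assumption, since the cause was never confirmed in this thread), the flag can be restored without starting a new session. A minimal sketch, where `restore_newlines` is a hypothetical helper name:

```shell
# Hypothetical helper: re-enable LF -> CR+LF output translation
# on the terminal attached to stdin.
restore_newlines() {
  if [ -t 0 ]; then
    stty onlcr        # or: stty sane, to reset all terminal settings
  else
    echo "stdin is not a terminal; nothing to restore" >&2
    return 1
  fi
}
```

Running `restore_newlines` in the affected shell should bring output back to normal. `stty -a` lists the current flags, and a cleared flag shows up there as `-onlcr`.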