4

Is there any difference between these 2 lines?

for i in $(seq 1 10); do echo $i - `date`; sleep 1; done >> /tmp/output.txt

for i in $(seq 1 10); do echo $i - `date` >> /tmp/output.txt ; sleep 1; done

I ask because Robert told me that the first one performs the I/O operation only once, outside the for loop.

But when I run tail -f /tmp/output.txt, both behave exactly the same way.

Bast

2 Answers

3

They do the same if they succeed. However, there might be notable differences if they fail for whatever reason.

The first one:

for ...; do
   # things
done >> file

This opens the file once, before the loop starts, and every command in the loop inherits it as its standard output. Nothing waits for the loop to finish: data reaches the file whenever each command flushes its own output buffer.

Imagine something fails after iteration number 3: what has reached the file by then depends on when the commands flushed their output.
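To check when data actually reaches the file with the redirection outside the loop, here is a minimal sketch. The scratch files `out` and `seen` are assumptions made for this example; the loop records, on each iteration, how many lines are already in the output file.

```shell
#!/usr/bin/env bash
# Hypothetical scratch files; the names are assumptions for this sketch.
out=$(mktemp)
seen=$(mktemp)

for i in 1 2 3; do
    echo "line $i"
    # stdout points at $out here, so log the current line count of $out elsewhere.
    wc -l < "$out" | tr -d ' ' >> "$seen"
done >> "$out"

# One count per iteration, taken while the loop was still running.
cat "$seen"
```

With Bash's builtin echo, the counts should come out as 1, 2 and 3, meaning each line was already in the file before the loop ended. A command that buffers its own output could behave differently.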

The second one:

for ...; do
   # things >> file
done

This opens the file, appends to it, and closes it again on every iteration.

Imagine something fails after iteration number 3: you can be sure the output of the completed iterations has been stored properly in the file.

From How to redirect output from an infinite-loop program:

If your program is using the standard output functions (e.g. puts, printf and friends from stdio.h in C, cout << … in C++, print in many high-level languages), then its output is buffered: characters accumulate in a memory zone which is called a buffer; when there's too much data in the buffer, the contents of the buffer is printed (it's “flushed”) and the buffer becomes empty (ready to be filled again). If your program doesn't produce much output, it may not have filled its buffer yet.

Also, from the answer you link:

Placing the redirection operator outside the loop doubles the performance when writing 500000 lines (on my system).

This makes sense: opening, writing to, and closing the file on every iteration takes more time than opening it once and letting each command write to the already-open file. It is easier to write five lines at a time than one line every single time.
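A rough way to compare the two forms yourself is sketched below. The function names `write_inside` and `write_outside` and the line count of 1000 are assumptions chosen for this example, not taken from the linked answer.

```shell
#!/usr/bin/env bash
# Hypothetical helpers: $1 is the number of lines, $2 is the target file.

write_inside() {    # redirection inside the loop: the file is reopened every iteration
    for i in $(seq 1 "$1"); do
        echo "$i" >> "$2"
    done
}

write_outside() {   # redirection outside the loop: the file is opened once
    for i in $(seq 1 "$1"); do
        echo "$i"
    done >> "$2"
}

inside=$(mktemp)
outside=$(mktemp)

write_inside 1000 "$inside"
write_outside 1000 "$outside"

# Both forms should leave the same content behind.
cmp -s "$inside" "$outside" && echo "identical output"
```

In Bash, prefixing each call with the time keyword, for example time write_inside 500000 "$inside", gives a rough comparison. The absolute numbers depend on the system and the filesystem.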

fedorqui
  • Thanks. How can I know when Bash will decide to flush its buffer? Can I force it to flush only every 5 or 100 loops, for example? – Bast Dec 28 '15 at 13:52
  • @Bast this is quite a delicate matter and would require rewriting the loop itself. From [Force flushing of output to a file while bash script is still running](http://stackoverflow.com/q/1429951/1983854): _bash itself will never actually write any output to your log file. Instead, the commands it invokes as part of the script will each individually write output and flush whenever they feel like it. So your question is really how to force the commands within the bash script to flush, and that depends on what they are_. – fedorqui Dec 28 '15 at 13:59
0

There's another important difference that has not been mentioned: when >> is inside the loop, the file is opened for writing on every iteration. This can affect performance noticeably.

Also, if /tmp/output.txt gets deleted while the loop is running, echo ... >> /tmp/output.txt will re-create the file with new contents, while for ... done >> /tmp/output.txt will continue adding data to the same file.
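The deletion case can be reproduced with a short sketch. The directory and the file names `inside.txt` and `outside.txt` are assumptions for this example; each file is removed in the middle of its own loop.

```shell
#!/usr/bin/env bash
# Hypothetical scratch directory and file names for this sketch.
dir=$(mktemp -d)
a="$dir/inside.txt"
b="$dir/outside.txt"

# Redirection inside the loop: the file is reopened (and recreated) each time.
for i in 1 2 3; do
    echo "$i" >> "$a"
    if [ "$i" -eq 2 ]; then rm "$a"; fi
done

# Redirection outside the loop: the loop keeps writing to the file it opened at the start.
for i in 1 2 3; do
    echo "$i"
    if [ "$i" -eq 2 ]; then rm "$b"; fi
done >> "$b"

cat "$a"                                   # only what was written after the deletion
[ -e "$b" ] || echo "outside.txt no longer exists"
```

The first file is recreated by the third iteration and holds only that iteration's line. The second name is never recreated: the remaining output goes to the unlinked file and is discarded once the loop closes it.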

This is important to remember, especially when dealing with hard links or temporary files (in general, we unlink temporary files soon after creating them, to avoid leaving stale files behind if the Bash script exits unexpectedly).
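The unlink-early pattern for temporary files can be sketched as follows. The descriptor numbers 3 and 4 and the variable names are arbitrary choices for this example; the point is that a redirection placed outside the loop keeps working after the name is gone.

```shell
#!/usr/bin/env bash
tmp=$(mktemp)

# Open the file once for writing (fd 3) and once for reading (fd 4),
# then unlink it: no stale file is left behind if the script dies.
exec 3> "$tmp" 4< "$tmp"
rm "$tmp"

# The redirection outside the loop writes to the already-open descriptor.
for i in 1 2 3; do
    echo "item $i"
done >&3
exec 3>&-            # close the writer

content=$(cat <&4)   # read everything back through the reader
exec 4<&-            # close the reader; the data is released now

printf '%s\n' "$content"
```

Using echo "item $i" >> "$tmp" inside the loop would instead recreate the file by name, which defeats the purpose of unlinking it.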

Andrea Corbellini