I would leave comments, but there are just so many things which are wrong with this. Pardon me if this sounds harsh; this is a common enough misconception that I want to be terse and to the point rather than polite.
As a basic terminology fix, there is no threading here. There are two distinct models of concurrency and Bash only supports one of them, namely multiprocessing. Threading happens inside of a single process; but there is no way in Bash to manage the internals of other processes (and this would be quite problematic indeed, anyway). Bash can start and stop processes (not threads), and does that very well.
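To make the terminology concrete, here is a minimal sketch of Bash's process-based concurrency: `&` forks a separate child process, and the `wait` builtin blocks until the named children exit. (The `sleep` commands are just stand-ins for real work.)

```shell
sleep 0.1 &    # child process 1
pid1=$!        # $! holds the PID of the most recent background job
sleep 0.1 &    # child process 2
pid2=$!
wait "$pid1" "$pid2"   # no threads anywhere: these are distinct PIDs
echo "both children finished"
```

Everything Bash "runs in parallel" is a full child process with its own PID; there is no API for spawning a thread inside the current shell.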
But adding CPU concurrency in an effort to speed up tasks which are not CPU-bound is fundamentally flawed. I/O takes time because your disk is slow: your CPU sits idle for the vast majority of the time while your spinning disk (or even SSD) fills and empties DMA buffers at speeds which are glacial from the CPU's perspective.
In fact, adding more processes to compete for limited I/O capacity is likely to make things slower, not faster, because the I/O channel will be asked to do many things at once when maintaining locality would be better. Don't move the disk head between unrelated files, because you will only have to move it back a few milliseconds later. The same applies to an SSD, though with much less dramatic effect: streaming a contiguous region is more efficient than scattered random access.
Adding to this, your buggy reimplementation of `cat` is going to be horribly slow. Bash is notorious for being very inefficient in `while read` loops. (The main bug is the missing quoting, but there are corner cases with `read` you want to avoid, too.)
Moreover, you are opening the file, seeking to the end of it for appending, and closing it again on every iteration of the loop. You can avoid this by moving the redirection outside the loop:
while IFS= read -r line || [[ -n $line ]]; do
printf '%s\n' "$line"
done >>final.txt
But this still suffers from the inherent, excruciating slowness of `while read`. If you really want to combine these files, simply `cat` them all serially:
cat A.TXT B.TXT C.TXT >final.txt
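For completeness, here is the one-liner as a reproducible demonstration, using throwaway files in a temporary directory (the file contents here are made up):

```shell
cd "$(mktemp -d)"
printf 'alpha\n' >A.TXT
printf 'beta\n'  >B.TXT
printf 'gamma\n' >C.TXT
# One process, one pass over each file, sequential reads throughout
cat A.TXT B.TXT C.TXT >final.txt
cat final.txt
```

The kernel and `cat` handle buffering for you, in large blocks rather than line by line, which is exactly why this beats any shell loop.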
If I/O performance is really a concern, though, combining many text files into a single text file is probably a step in the wrong direction. For information you need to read more than once, loading it into a database is a common way to speed things up. Initializing and indexing the database adds some overhead up front, but that cost is quickly repaid when you can iterate over the fields and records much more quickly and conveniently than you could with a sequential file.
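As a hedged sketch of that last point, assuming the `sqlite3` command-line shell is installed (the table layout, file names, and data below are all made up for illustration):

```shell
cd "$(mktemp -d)"
printf 'foo,1\nbar,2\n' >data.csv
# Create a table and an index once, up front
sqlite3 records.db 'CREATE TABLE records(name TEXT, value INTEGER);'
sqlite3 records.db 'CREATE INDEX idx_name ON records(name);'
# .import in csv mode bulk-loads the file into the table
sqlite3 -cmd '.mode csv' records.db '.import data.csv records'
# Indexed lookups now avoid rescanning the whole file on every query
sqlite3 records.db "SELECT value FROM records WHERE name='foo';"
```

After the one-time import, repeated queries hit the index instead of rereading a flat file from the beginning each time.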