"Way back in the day", mid 1990s, I had to write a C program I called "readline
" to avoid the global vs local variable issue created by subshelling when one did a construct like:
    while read -r line
    do
        my_var=$(echo "$line" | cut -f 12 -d ":")
        if [ "$my_var" = "$target" ]; then
            found_target=1
        fi
    done < some_file
While updating this question to hopefully address some comments, I realized there's another issue I'm clueless about regarding this type of loop: how do you implement "we have found the target, we can quit reading now!" with this type of loop? I'd guess it'd be something involving:
    while [ -z "$found_target" ] &&
But I don't know how to finish the line! To "work correctly", the example would have to leave found_target and my_var set as they were in the loop, for the code following the loop to use.
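If I had to guess at how to finish it, maybe something like the following - though I honestly don't know whether this is right, or whether the usual answer is just a break inside the if:

    while [ -z "$found_target" ] && read -r line
    do
        my_var=$(echo "$line" | cut -f 12 -d ":")
        if [ "$my_var" = "$target" ]; then
            found_target=1    # next pass, the -z test fails, so reading stops
        fi
    done < some_file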
Note that the code example here is not code I'm working with today, for the simple reason that, "having been burned" by issues created by input redirection (the < file construct) into a while loop, I don't do that any more! In the "backstory" bit below, you might see how this idea got started, but it could all be based on a misunderstanding of Bash.
In short, it was observed that variables set in the loop while processing the read - such as found_target in this example - were lost when the loop exited. Someone who was supposed to be a Bash expert (back in the 1995-to-'97 era) told us - the team I was leading - that it was because the interior of the loop was put into a subshell. I was a database guy who'd done machine-language coding, etc, etc, etc, and didn't even think of Bash as a programming language. So, given the problem as it was handed to me, I just handed back to the team a readline program that allowed them to move beyond their difficulties.
My simple program takes just one or two arguments: you tell it the integer line number you want, and you either pass a file via stdin or point it at a file via a filespec. It wasn't as inefficient as one might think, thanks to operating-system caching. And it let these even less skilled programmers get on with their work.
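Just to give the flavor of it (the argument order, line number, and file name here are purely illustrative, not the program's exact interface), an invocation was along these lines:

    # fetch line 1042 of a file named on the command line...
    my_var=$(readline 1042 some_file)
    # ...or of whatever arrives on stdin
    my_var=$(readline 1042 < some_file)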
This program was very satisfactory, especially for large files - the larger the file, the bigger the win, since Bash isn't (or at least wasn't) particularly efficient at that kind of use, to say nothing of the subshell / global-variable issue. (Please see the backstory section below for the use case and WHY this made sense!)
Now, however, I'd like to revisit this issue for two reasons: 1) Bash and its attendant utilities have come a long way in the intervening two-plus decades, and 2) I'd like to provide a bit of software to someone without the dependency on my readline program. For that, the subshell issue is the real problem - that, and the fact that the people who will be working with it are, like the original people I wrote readline for, not really programmers. However, if there's an open-source version of my readline, that'd work just fine!
In addition to those motives: while I've come a long way in understanding Bash since then, it's still mostly a tertiary issue for me, and I know I'm still profoundly ignorant of large chunks of it. One thing I'm thinking could perhaps be "the right way" is a more intelligent use of functions. Back then I was ignorant of the ability to redirect into and out of a Bash function. And, frankly, while I now know "it's a thing," I've never actually used it yet.
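If I understand what I've read, it might look something like the sketch below - emphasis on "might", since I've never used the feature; the function name is made up and the loop body is just my example from above:

    find_target() {
        found_target=
        while read -r line
        do
            my_var=$(echo "$line" | cut -f 12 -d ":")
            if [ "$my_var" = "$target" ]; then
                found_target=1
                break    # quit reading once the target is found
            fi
        done
    }

    find_target < some_file    # redirect into the function at the call site
    # found_target and my_var should (I think) still be set out here

But whether that actually avoids the variable-scope problem I remember is exactly the kind of thing I'm hoping someone can confirm.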
Some Backstory
This is definitely more the kind of thing for the "Retro-Computing" community: Back in 1995 or 1996, when this "don't do that!" idea came about, Bash was used as part of a "glue layer" trying to join around seven systems or so designed by different teams for different aspects of Earth science. None of these systems was all that well designed, either, being done by Earth scientists whose passion was Earth, not computers. For most, their idea of a database was crude - typically huge lines of text in a big file - and all they wanted was to pick through a few possibly adjacent lines in the middle of what were, at the time, considered gigantic files. And to join, say, the atmospheric data with the ocean-surface data, the best that was practical was for some grad student or post-doc to write Bash code that took little bits of other code and brought them together.
For what it's worth, my goal was to get 'em to use relational database engines - and, indeed, modern PostgreSQL came from the same lab at the same time. However, the best I managed was to use the database as the meta-layer: knowing what data was in which systems, how to get to those systems, and what programs to call to actually do the scientific part of the data joins. Hope this digression gives some perspective on why!
Hey, if the whole issue of subshells is just plain wrong, please school me! I can be taught! Else, a suggestion on replacing my readline would be nice.