I'm trying to teach myself bash and decided to try some simple loops. I made a script that prints a numbered list of the files in a directory along with their character count using different loops, which I put into functions just so I can comment/uncomment the function calls instead of the entire loops, like so:

function filelist { find . -maxdepth 1 -type f | sort -bdf; }
filecount=$(filelist | wc -l)

function until_loop {
                    i=1; until [ $i -gt "$filecount" ]; do
                        <do stuff>
                    done; echo ""
                    }

function while_loop {
                    i=1; while [ $i -le "$filecount" ]; do
                        <do stuff>
                    done; echo ""
                    }

function while_read_loop {
                    i=1; echo $(filelist) > /tmp/filelist.txt
                    while read line; do
                       <do stuff>
                    done < /tmp/filelist.txt; echo ""
                    }

function for_loop_1 {
                    i=1; for i in $(seq 1 "$filecount"); do
                         <do stuff>
                    done; echo ""
                    }

function for_loop_2 {
                    for ((i=1;i<=filecount;i++)); do
                        <do stuff>
                    done; echo ""
                    }

echo "List generated from an \"Until\" loop:"; until_loop

echo "List generated from a \"While\" loop:"; while_loop

echo "List generated from a \"While read\" loop:"; while_read_loop

echo "List generated from a \"For\" loop (method 1):"; for_loop_1

echo "List generated from a \"For\" loop (method 2):"; for_loop_2

The contents of each loop are the same (except for specific parts such as not needing to increment the value of i in the body of for_loop_2).

All of these loops give the same output, except for for_loop_1, which throws a strange sed error and seems to be printing all of the file numbers as a single word.

Here's the complete loop:

i=1; for i in $(seq 1 "$filecount"); do
   linetext=$( filelist | sed -n ${i}p ); linetext=${linetext:2}
   charactercount="$(cat "$(filelist | sed -n "${i}p")" | wc -m)"
   printf "%02d" "$i" && printf ". The file "$linetext" has "$charactercount" characters.\n"
   i=$((i+1))
done; echo ""

And here are the outputs:

#Everyone else:

01. The file aa-shader-4.0-level2.slangp has 348 characters.
02. The file aa-shader-4.0.slangp has 186 characters.
03. The file advanced-aa.slangp has 318 characters.
04. The file fxaa.slangp has 98 characters.
05. The file reverse-aa.slangp has 106 characters.
06. The file smaa+linear.slangp has 600 characters.
07. The file smaa+sharpen.slangp has 996 characters.
08. The file smaa.slangp has 828 characters.

#for_loop_1

cat: 30: No such file or directory
01. The file aa-shader-4.0-level2.slangp has  characters.
cat: 23: No such file or directory
02. The file aa-shader-4.0.slangp has  characters.
cat: 21: No such file or directory
03. The file advanced-aa.slangp has  characters.
cat: 14: No such file or directory
04. The file fxaa.slangp has  characters.
cat: 20: No such file or directory
05. The file reverse-aa.slangp has  characters.
cat: 21: No such file or directory
06. The file smaa+linear.slangp has  characters.
cat: 22: No such file or directory
07. The file smaa+sharpen.slangp has  characters.
cat: 14: No such file or directory
08. The file smaa.slangp has  characters.
/home/user/Scripts/list_character_count.sh: line 48: 08: value too great for base (error token is "08")

#for_loop_1 with either IFS= or IFS="" set

sed: -e expression #1, char 3: unknown command: `
'
cat: 0: No such file or directory
sed: -e expression #1, char 3: unknown command: `
'
01
02
03
04
05
06
07
08. The file  has  characters.
/home/user/Scripts/list_character_count.sh: line 48: 01
02
03
04
05
06
07
08: syntax error in expression (error token is "02
03
04
05
06
07
08")

Line 44 is i=$((i+1)). There isn't a single `' or `(some newline character)' in the entire script. I don't know where cat is even getting those numbers. Honestly, I just don't know enough to even understand this simple error message, which is quite frustrating. I'm pretty sure the issue has to do with whitespace, but that's as far as I got.
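For the "value too great for base" message specifically, here is a minimal sketch of what seems to be going on. It assumes the counter holds a zero-padded value such as 08 at the point where the arithmetic runs; the names `next` and `padded` are made up for illustration. Inside `$(( ))` a leading 0 marks an octal number, and prefixing the value with `10#` forces base 10.

```bash
#!/usr/bin/env bash
i=08                              # zero-padded counter, as in the error message

# $((i+1)) would fail here: a leading 0 means octal, and 8 is not an octal digit.
next=$((10#$i + 1))               # 10# forces base 10, so next is 9
printf -v padded '%02d' "$next"   # pad only when formatting: padded is "09"
echo "$padded"
```

Keeping the counter unpadded and adding the zero only in the `printf` format avoids the problem altogether.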

I tried using sed '${i}!d' instead of sed -n "${i}p", I tried using grep instead, I tried setting IFS= and IFS="", but whatever I do, it always works in the other loops but never in for_loop_1. I read the bash documentation and searched here and on DuckDuckGo but didn't find an explanation. Could someone shed some light? Please and thank you, and sorry for my code which is probably very ugly and full of rookie mistakes.
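For comparison, here is a rough sketch of a single-pass version that needs neither `sed` nor a line count. It is not the script from this question: `list_counts` is a hypothetical helper name, it assumes a `find` and `sort` that support `-print0` and `-z` (GNU and BSD versions do), and it sorts with plain `sort -z` rather than `sort -bdf`.

```bash
#!/usr/bin/env bash
# list_counts is a hypothetical helper: prints a numbered list of the regular
# files in a directory (default: current directory) with their character counts.
list_counts() {
    local dir=${1:-.} i=0 file chars
    # NUL-delimited names survive spaces and newlines in filenames.
    while IFS= read -r -d '' file; do
        i=$((i+1))
        chars=$(( $(wc -m < "$file") ))   # $(( )) drops any padding wc adds
        printf '%02d. The file %s has %d characters.\n' "$i" "${file##*/}" "$chars"
    done < <(find "$dir" -maxdepth 1 -type f -print0 | sort -z)
}
```

Calling `list_counts .` prints one line per file in the current directory. The counter stays a plain integer and is zero-padded only by the `%02d` format.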

While I'm here, I have another, much simpler question: Why is it that i=$((i+1)) works but not i++ or i+=1? Using i++ in the first line of the second for loop works fine.
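A small sketch of the increment forms, as far as I understand them: `++` and `+=` are arithmetic operators only inside an arithmetic context such as `(( ))`, `$(( ))`, `let`, or the header of a C-style `for`. On a line by itself, `i++` is looked up as a command named `i++`, and a bare `i+=1` appends text. The variables `s` and `n` are only there for illustration.

```bash
#!/usr/bin/env bash
i=1
i=$((i+1))      # arithmetic expansion: i is 2
((i++))         # arithmetic command: i is 3
((i+=1))        # i is 4
let i++         # let builtin: i is 5

s=1
s+=1            # outside an arithmetic context += appends text: s is "11"

declare -i n=1
n+=1            # with the integer attribute += adds: n is 2

echo "i=$i s=$s n=$n"
```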

Calibre
  • When a number starts with a leading `0`, it's octal. It's fine to use `08` as a string; you can't use it as a number without explicitly specifying the base. – Charles Duffy Oct 19 '22 at 02:55
  • BTW, do note that each Stack Overflow question should be about _only exactly one_ narrow, specific technical problem, with a minimal reproducible example that constitutes the shortest possible code that reproduces that problem when run without changes. Part of being "reproducible" is that we need to be able to copy-and-paste your code to run it ourselves; obviously, that's impossible when it's full of `<do stuff>` placeholders. – Charles Duffy Oct 19 '22 at 02:57
  • Beyond that -- run `bash -x yourscript` to generate trace logs showing how the script is being executed. – Charles Duffy Oct 19 '22 at 02:57
  • As for the one complete code sample you show, observe the stated problem failing to reproduce at https://replit.com/@CharlesDuffy2/FondWorstJava#main.sh. I suspect, but can't prove (because -- as that link shows -- no working reproducer was provided) that you're running the code with a non-default value for `IFS`, thus causing `seq` to be misinterpreted. (`seq` separates its output with spaces; when you set `IFS=` or `IFS=''`, spaces no longer are honored for word-splitting so `seq` is treated as emitting just one big string). – Charles Duffy Oct 19 '22 at 03:03
  • BTW, as a random aside -- `function funcname {` is a holdout from pre-POSIX (which is to say, 1980s-era) ksh. For modern code, `funcname() {` is the standardized syntax; see the entry in the 3rd table at https://wiki.bash-hackers.org/scripting/obsolete – Charles Duffy Oct 19 '22 at 03:06
  • (egh, should have said "newlines" rather than "spaces" above in describing `seq` output, but... there's actually a relevant point there: one of the reasons `seq` is best avoided is that it's neither part of the shell nor POSIX-standardized, so its behavior is whatever your OS vendor says it is, if it even exists at all) – Charles Duffy Oct 19 '22 at 03:10
  • BTW, doing a loop and running `sed -n "${i}p"` over and over is slow. _Really_ slow. Use a `while read` loop instead -- that way you're reading the file just once, instead of reading it from the beginning for every single line. – Charles Duffy Oct 19 '22 at 03:13
  • ...not that you should be using `filelist.txt` at all in the first place. `while IFS= read -r -d '' filename; do echo "processing $filename"; done < <(find . -name '*.txt' -print0)` requires no temporary files at all. And inside that processing, you can do something like `while IFS= read -r line; do echo "processing line: $line"; done <"$filename"` to read individual lines. – Charles Duffy Oct 19 '22 at 03:15
  • @CharlesDuffy Thank you for the several pointers. I'm sorry if I didn't provide a minimal reproducible example, but I did try to. I used `<do stuff>` as a placeholder to avoid making an already large post larger, but then I remarked that the insides of each loop were the same and later in the post pasted the complete "for_loop_1", which was the one giving me trouble. Your pointers were spot-on, though. The malfunctioning loop now works, but suddenly the "while read" loop that was working fine isn't anymore. But I think that I can figure it out now. – Calibre Oct 19 '22 at 21:56
  • @CharlesDuffy I am a bit confused by your replies on one topic, though. In your fourth comment, you suggested that a non-default setting of `IFS` was causing the issue (which seems to be correct), and then in the eighth one you use `IFS=` as part of your example code. I didn't understand the documentation very clearly and don't actually know when and why to change the value of `IFS`. I understand why you said not to change it lest it break sed, but why did you resort to it in the other comment? – Calibre Oct 19 '22 at 21:57
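On the two uses of `IFS` discussed in the comments, a minimal sketch of the difference as I understand it: an assignment placed in front of a command, as in `IFS= read -r line`, applies to that one command only, whereas a standalone `IFS=` changes word splitting for everything that follows. The variable names are illustrative, and `printf '%s\n' 1 2 3` stands in for `seq 1 3` (same newline-separated output) so the sketch does not depend on `seq` being installed.

```bash
#!/usr/bin/env bash
# Default IFS: the unquoted command substitution is split into three words.
count_default=0
for n in $(printf '%s\n' 1 2 3); do count_default=$((count_default+1)); done

# Standalone IFS= : word splitting is off, so the loop sees one big word.
old_ifs=$IFS
IFS=
count_empty=0
for n in $(printf '%s\n' 1 2 3); do count_empty=$((count_empty+1)); done
IFS=$old_ifs

# Prefix form: IFS is empty only while read runs, which keeps the
# leading and trailing spaces of the line intact.
IFS= read -r line <<< "  padded line  "

# Word splitting afterwards is unaffected by the prefix assignment.
count_after=0
for n in $(printf '%s\n' 1 2 3); do count_after=$((count_after+1)); done

echo "$count_default $count_empty $count_after [$line]"
```

The single-word case matches the symptom shown in the question, where all the file numbers arrived as one multi-line string.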
