For a school assignment, I’m trying to loop through directories and subdirectories recursively to sum up the size of files. The issue I’m having is that the construct:

for f in ./* ./.*; do
  # summing logic here
done

gets stuck on f = ./. (the single-dot entry). It works fine stepping into each directory, but once it reaches a directory that it fully processes, after the last file, f gets set to ./. (the current directory). My logic checks whether f is a directory, finds that it is, and steps into f to process it, so it loops there forever.

I’ve tried including code to check whether the string f matches "./." or "./..", but it never evaluates to true. What mistake am I making?

MAIN QUESTION: Why is if [[ "$f" != "./." ]] || [[ "$f" != "./.." ]]; then not working, and what can I do to get the result I want? Additionally, if I try something like for f in ./* ./.*; do echo $f; done, I don’t see ./. and ./.. get printed out. How is f getting set to those values in my script?

I’ve seen answers to similar questions that involve the bash-builtin shopt, but I use zsh and the school’s test server uses csh. I’m really hoping for something platform agnostic.

Minor note: As the code is right now, the assignment is done. We are only required to sum the sizes of the files in the current working directory, excluding subdirectories. I was curious about making the script recursive and am only doing this part to satisfy my interest. Thanks for the assistance.

#!/bin/bash

total_size=0

get_file_size() {
    stat --printf="%s" "$1"
}

add_file_sizes() {
    for f in ./* ./.*; do
        echo "Currently processing: $f"
        if [ -d "$f" ] && [ "$1" == -r ]; then
            echo "$f is a directory"
            if [ "$f" !=  "./." ] || [ "$f" != "./.." ]; then
                echo "$f is not ./. or ./.."
                cd "$f"
                pwd
                add_file_sizes "-r"
                echo "$total_size"
                cd ../
            fi
        fi
        if [ ! -d "$f" ]; then
            echo "$f is not a directory"
            total_size=$((total_size + $(get_file_size "$f")))
            echo "$total_size"
        fi
    done
}

add_file_sizes $1

echo "$total_size"

Edit: Here’s some output:

Currently processing: list_size.sh
list_size.sh is not a directory
625
Currently processing: output.txt
output.txt is not a directory
759
Currently processing: test_dir
test_dir is a directory
test_dir is not ./. or ./..
/home/joe/dev/csc60/test_dir
Currently processing: file1
file1 is not a directory
759
Currently processing: file2
file2 is not a directory
759
Currently processing: test_subdir
test_subdir is a directory
test_subdir is not ./. or ./..
/home/joe/dev/csc60/test_dir/test_subdir
Currently processing: file3
file3 is not a directory
759
Currently processing: ./.
./. is a directory
./. is not ./. or ./..
/home/joe/dev/csc60/test_dir/test_subdir
Currently processing: file3
file3 is not a directory
759
Currently processing: ./.
./. is a directory
./. is not ./. or ./..
/home/joe/dev/csc60/test_dir/test_subdir
Currently processing: file3
file3 is not a directory
759
Currently processing: ./.
./. is a directory
./. is not ./. or ./..
/home/joe/dev/csc60/test_dir/test_subdir
Currently processing: file3
file3 is not a directory
759
Currently processing: ./.
./. is a directory
./. is not ./. or ./..
/home/joe/dev/csc60/test_dir/test_subdir
Currently processing: file3
file3 is not a directory
759
Currently processing: ./.
./. is a directory
./. is not ./. or ./..
/home/joe/dev/csc60/test_dir/test_subdir
Currently processing: file3
file3 is not a directory
759
Currently processing: ./.
./. is a directory
./. is not ./. or ./..
/home/joe/dev/csc60/test_dir/test_subdir
Currently processing: file3
file3 is not a directory
759

EDIT 2: Tweaked the initial for loop and generally improved script in response to a suggestion in an answer.

Output when I change for loop to for f in * .[!.]*:

Currently processing: list_size.sh
list_size.sh is not a directory
578
Currently processing: list_size_tweaked.sh
list_size_tweaked.sh is not a directory
1156
Currently processing: output_tweaked.txt
output_tweaked.txt is not a directory
1394
Currently processing: output.txt
output.txt is not a directory
1394
Currently processing: test_dir
test_dir is a directory
test_dir is not ./. or ./..
/home/joe/dev/csc60/test_dir
Currently processing: file1
file1 is not a directory
1394
Currently processing: file2
file2 is not a directory
1394
Currently processing: test_subdir
test_subdir is a directory
test_subdir is not ./. or ./..
/home/joe/dev/csc60/test_dir/test_subdir
Currently processing: file3
file3 is not a directory
1394
Currently processing: .[!.]*
.[!.]* is not a directory
1394
stat: cannot stat '.[!.]*': No such file or directory
./list_size_tweaked.sh: line 25: total_size + : syntax error: operand expected (error token is "+ ")
7670

This seems to happen because there are no dotfiles in the directory, so the glob doesn’t expand.

Joseph Morgan
  • `find` is the "correct" way – KamilCuk Feb 06 '20 at 23:16
  • Does this answer your question? [Bash for loop with wildcards and hidden files](https://stackoverflow.com/questions/2135770/bash-for-loop-with-wildcards-and-hidden-files) also [unix.stackexchange how to glob every hidden file except current and parent directory](https://unix.stackexchange.com/questions/1168/how-to-glob-every-hidden-file-except-current-and-parent-directory) and [askubuntu how should I glob for all hidden files](https://askubuntu.com/questions/829796/how-should-i-glob-for-all-hidden-files) etc. – KamilCuk Feb 06 '20 at 23:25
  • That was a really useful post when I was doing my initial research. `shopt` would almost certainly solve the problem I’m having, but `shopt` is a bash builtin and I use zsh, and my professor’s test machine uses csh. I could see if there is a csh equivalent, but ideally I’m looking for something platform agnostic. – Joseph Morgan Feb 06 '20 at 23:28
  • I think `.[!.]*` should work on any POSIX shell. [POSIX 2.13.3 Patterns Used for Filename Expansion](https://pubs.opengroup.org/onlinepubs/009695399/utilities/xcu_chap02.html#tag_02_13_03). The dot matches a dot, and `[!.]` matches any character but a dot. – KamilCuk Feb 06 '20 at 23:34
  • Additionally, just finished reading the manpage for `find`, could you give me a little bit more information on what you mean? – Joseph Morgan Feb 06 '20 at 23:35
  • I’ll try `for f in .[!.]*` and report back – Joseph Morgan Feb 06 '20 at 23:38

1 Answer

Do:

for f in * .[!.]*; do

I think it should work on any POSIX-compatible shell. The documentation can be found in POSIX Shell Command Language 2.13 Pattern Matching Notation. The . matches a literal dot, then [!.] is a bracket expression that matches any character except a dot, so the pattern effectively excludes the current directory (.) and the parent directory (..) from the match.
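To make the effect of the pattern concrete, here is a minimal sketch that runs the loop in a throwaway directory. The file names (visible.txt, .hidden, .hidden_dir) and the variable names are made up for the illustration and are not from the original script.

```shell
#!/bin/sh
# Throwaway directory with hypothetical entries: one plain file,
# one dotfile and one dot-directory.
demo_dir=$(mktemp -d)
touch "$demo_dir/visible.txt" "$demo_dir/.hidden"
mkdir "$demo_dir/.hidden_dir"

cd "$demo_dir" || exit 1
matched=""
# "*" picks up the non-hidden names; ".[!.]*" picks up names that start
# with a dot followed by a non-dot character, so "." and ".." never match.
for f in * .[!.]*; do
    matched="$matched $f"
done
echo "matched:$matched"

cd - >/dev/null || exit 1
rm -rf "$demo_dir"
```

One limitation worth knowing: .[!.]* also skips names that begin with two dots, such as ..foo. If such names matter, the pattern ..?* can be added to the loop to cover them.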

Notes:

  • Great script, good coding, keep it up!
  • Quote your variable expansions, especially if they are filenames. Don't write get_file_size $f, write get_file_size "$f". See: When to wrap quotes around a shell variable?
  • Don't use backticks `, their use is discouraged. Use $(...) everywhere instead. See: Obsolete and deprecated syntax (Bash Hackers Wiki).
  • Don't use function name(), it is a mix of two shell notations. Just use name() { .. } to define a function, which is POSIX compatible and will work everywhere.
  • Just get_file_size() { stat --printf="%s" "$1"; }. No need for variable and echo.
  • The [[ is not part of POSIX (bash, zsh and ksh support it as an extension), so for a portable script use [ instead. Remember to quote your variable expansions.
  • I think I would just use find . -type f -printf "%s\n" | awk '{ sum+=$1 } END{print sum}'
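On the main question about the || test, which the notes above do not spell out: a string can never be equal to both "./." and "./.." at once, so at least one of the two != comparisons is always true and the combined condition passes for every value of f. Joining the comparisons with && (or matching with case) gives the intended check. The sketch below compares the three variants; is_dot_entry is a hypothetical helper name, and the sample paths are made up.

```shell
#!/bin/sh
# is_dot_entry is a hypothetical helper, not part of the original script.
# It succeeds when the last path component is "." or "..".
is_dot_entry() {
    case "$1" in
        .|..|*/.|*/..) return 0 ;;
        *) return 1 ;;
    esac
}

for f in ./. ./.. ./test_dir ./.bashrc; do
    # Original test: passes for every value, because no string equals both names.
    if [ "$f" != "./." ] || [ "$f" != "./.." ]; then or_result=pass; else or_result=skip; fi
    # Corrected test: both inequalities must hold at the same time.
    if [ "$f" != "./." ] && [ "$f" != "./.." ]; then and_result=pass; else and_result=skip; fi
    # Alternative: pattern matching with case, which also covers bare "." and "..".
    if is_dot_entry "$f"; then case_result=skip; else case_result=pass; fi
    echo "$f: or=$or_result and=$and_result case=$case_result"
done
```

With the || version every line reports pass, including ./. and ./.., which is why the script kept descending into the current directory.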
KamilCuk
  • Thank you so much for all the tips! This is exactly what I’m looking for. I’m just learning bash, and it’s difficult to tell what’s deprecated/bash specific/bad practice, and some professors can be less than helpful in that regard. I’m getting an error message after changing my for loop to yours, which could be because of the errors you pointed out. I’ll update the script and edit my question with the results. – Joseph Morgan Feb 06 '20 at 23:52
  • `it’s difficult to tell what’s deprecated/bash specific/bad practice` - totally agree. – KamilCuk Feb 06 '20 at 23:54
  • Alright, updated the question, it seems like the regex(?) in the for loop isn’t expanding in bash or zsh. Am I mistaken or is `f` getting set literally to `.[!.]*`? – Joseph Morgan Feb 07 '20 at 00:10
  • What seems *bizarre* to me is that, if I don’t enable recursion with the -r argument, the for loop as it exists (with `./* ./.*`) doesn’t touch ./. or ./.. in the directory that the script is called from. – Joseph Morgan Feb 07 '20 at 00:14
  • Yes, there is a problem. The glob does not expand if it finds no files to match. So `echo does_not_exist*` will output `does_not_exist*` literally. For example, see `shopt -s nullglob`. So `.[!.]*` will expand to `.[!.]*` (literally) if there are no dotfiles. Instead of `if [ ! -d "$f" ]` do `if [ -e "$f" ]` or `if [ -f "$f" ]` - just check that the result really is a file. – KamilCuk Feb 07 '20 at 00:18
  • Ahh, that makes sense. So it seems like I have to figure out a way to check if a subdirectory has no dotfiles... – Joseph Morgan Feb 07 '20 at 00:19
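
The suggestion in the comments above (test that each loop value really exists, instead of detecting whether a directory has dotfiles) can be sketched as follows. This is only an illustration under assumptions: count_entries is a hypothetical helper name, the file names are made up, and the behaviour shown is that of bash or a POSIX sh. zsh with its default options reports an error for an unmatched pattern rather than leaving it as literal text.

```shell
#!/bin/sh
# count_entries is a hypothetical helper: it counts the real entries in a
# directory and skips any pattern left unexpanded because nothing matched.
# The body runs in a subshell, so the cd does not affect the caller.
count_entries() (
    cd "$1" || exit 1
    n=0
    for f in * .[!.]*; do
        # An unmatched glob stays as the literal pattern, which names no file.
        [ -e "$f" ] || [ -L "$f" ] || continue
        n=$((n + 1))
    done
    echo "$n"
)

demo_dir=$(mktemp -d)
touch "$demo_dir/file3"            # no dotfiles yet, so .[!.]* will not expand
no_dotfiles=$(count_entries "$demo_dir")
touch "$demo_dir/.hidden"          # now the second pattern has a match
with_dotfile=$(count_entries "$demo_dir")
echo "without dotfiles: $no_dotfiles, with one dotfile: $with_dotfile"
rm -rf "$demo_dir"
```

The same existence check placed at the top of the loop in add_file_sizes would skip the literal `.[!.]*` value before it ever reaches stat.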