How can I show a progress percentage for a for loop in a bash script? Below is an example script in which the for loop takes a long time because of the large amount of data. I would like to see what percentage of the loop has completed.

for domain in $(cat all-programs.txt); do
   cat 2020.json.gz | pigz -dc | grep ".$domain" | cut -d "," -f2 | cut -d ":" -f2 | cut -d '"' -f2 | sed '/^$domain/d' | sort -u | grep -w "$domain" 2>/dev/null >> all.txt; done
  • Does this answer your question? [How to add a progress bar to a shell script?](https://stackoverflow.com/questions/238073/how-to-add-a-progress-bar-to-a-shell-script) – Kassi Sep 12 '20 at 11:59
  • Off-topic: You can add a newline after each `|` (and indent) to avoid long lines. The `done` can be placed on a new line as well. – Walter A Sep 12 '20 at 13:16
  • Off-topic: The dot in your grep command matches any character. Demo: `printf "%s\n" "sample.txt" "wrong.xtxt" | grep ".txt"`. You might want to replace it with `[.]`. Demo: `printf "%s\n" "sample.txt" "wrong.xtxt" | grep "[.]txt"` – Walter A Sep 12 '20 at 13:26
  • Off-topic: Before putting too much effort into the progress bar, look at how to improve the performance of the loop. The different `cut` commands can be joined together. More important: for a large number of domains you will also be calling `pigz` very often. Post a new question with sample input (at least 4 domains) and output, asking how to improve the performance. – Walter A Sep 12 '20 at 13:42

1 Answer

The example you posted has a performance problem: every iteration decompresses the same file again. You may want to restructure the loop first, since the answer could be very different once the structure changes.
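
One way to restructure the loop, as a rough sketch: decompress once into a temporary file and let every iteration read that copy. `scan_domains` and its parameters are hypothetical names, `gzip -dc` stands in for `pigz -dc`, and the single `grep -F` is only a placeholder for the original filter pipeline.

```shell
#!/usr/bin/env bash
# Sketch only: scan_domains ARCHIVE DOMAIN_LIST OUTPUT (hypothetical helper).
scan_domains() {
    local archive=$1 domains=$2 out=$3 tmp domain
    tmp=$(mktemp) || return 1
    gzip -dc "$archive" > "$tmp"      # the expensive step now runs only once
    while IFS= read -r domain; do
        [ -n "$domain" ] || continue  # skip empty lines in the domain list
        # Placeholder filter: fixed-string match on ".<domain>".
        grep -F -- ".$domain" "$tmp" >> "$out"
    done < "$domains"
    rm -f "$tmp"
}
```

With this structure the cost per domain is one pass over the plain-text copy instead of one decompression plus one pass.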

However, for the general problem of showing a percentage, I recommend using awk.

Option 1: Show the for loop percentage

Calculate the percentage as the loop's current iteration / total iterations.

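ALL=($(cat all-programs.txt))   # array with one word per element
LEN=${#ALL[@]}
for ((i = 0; i < LEN; ++i)); do
    # ... process "${ALL[i]}" here ...
    # BEGIN-only awk program: it reads no input, so it cannot block on stdin.
    # cur is i + 1 so that the last iteration reports 100%.
    awk -v total="$LEN" -v cur="$((i + 1))" '
    function bar(x,   s, j) { s = ""; for (j = 0; j < x; j++) s = s "#"; return s }
    BEGIN {
        percent = int(cur / total * 100)
        printf "%s %s%%\r", bar(percent * .8), percent
    }'
done
echo    # finish the progress line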
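
The same current / total calculation can also be done in bash itself, without starting awk on every iteration. This is an alternative sketch, not the method used above; `progress` and its `width` argument are made-up names, and `printf -v` requires bash.

```shell
#!/usr/bin/env bash
# Hypothetical helper: progress CUR TOTAL [WIDTH]
# Prints "<bar> <percent>%" followed by a carriage return.
progress() {
    local cur=$1 total=$2 width=${3:-80} percent filled bar
    [ "$total" -gt 0 ] || return 0          # avoid division by zero
    percent=$(( cur * 100 / total ))
    filled=$(( percent * width / 100 ))
    printf -v bar '%*s' "$filled" ''        # a string of "filled" spaces
    printf '%s %d%%\r' "${bar// /#}" "$percent"
}
```

Inside the loop it would be called as `progress "$((i + 1))" "$LEN" >&2`, which keeps the bar out of any redirected output.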

Option 2: Show the per-file percentage

Calculate the percentage as bytes read so far / total file size, which shows the progress through very_big_file.json.

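for domain in $(cat all-programs.txt); do
    awk '
    function bar(x,   s, j) { s = ""; for (j = 0; j < x; j++) s = s "#"; return s }
    BEGIN {
        # The file size in bytes is the 5th field of the "ls -l" output.
        ("ls -l " ARGV[1]) | getline total
        split(total, array)
        total = array[5]
    }
    {
        cur += length($0) + 1    # +1 for the newline
        percent = int(cur / total * 100)
        # Write progress to stderr so it does not end up in the pipeline.
        printf "LINE %s:%s %s%%\r", NR, bar(percent * .8), percent > "/dev/stderr"
        print                    # pass the line on to grep
    }
    END { print "" > "/dev/stderr" }' very_big_file.json | grep ".$domain" | ...
done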

You can combine Option 1 and Option 2 as needed.
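
A possible combination, sketched under assumptions: the outer loop passes its own position as a label, and a helper streams the file while reporting the per-file percentage on stderr. `stream_with_progress` and the label format are hypothetical, and `wc -c` replaces the `ls -l` parsing to get the file size.

```shell
#!/usr/bin/env bash
# Hypothetical helper: stream_with_progress LABEL FILE
# Copies FILE to stdout and reports progress on stderr.
stream_with_progress() {
    local label=$1 file=$2 size
    size=$(( $(wc -c < "$file") ))          # file size in bytes
    awk -v label="$label" -v total="$size" '
    {
        cur += length($0) + 1               # +1 for the newline
        percent = (total > 0) ? int(cur / total * 100) : 100
        if (percent > 100) percent = 100    # file without a final newline
        printf "domain %s, file %d%%\r", label, percent > "/dev/stderr"
        print                               # pass the line through unchanged
    }
    END { print "" > "/dev/stderr" }' "$file"
}

# Usage sketch (i and LEN come from the Option 1 loop):
#   stream_with_progress "$((i + 1))/$LEN" very_big_file.json | grep ".$domain" | ...
```

Because the progress text goes to stderr, the commands after the pipe see only the file contents.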

James Yang