Please note that this is pseudocode to illustrate the problem. I used ls only as an example, and I'm aware that I shouldn't parse its output. The original script iterates through AWS S3 buckets.

Directory structure:

# tree
.
├── dir1
│   ├── d1_file1
│   ├── d1_file2
│
└── dir2
    ├── d2_file1
    ├── d2_file2

2 directories, 4 files

Desired outcome:

# bash tmp.sh
dir1 has 2 files(s)
dir1 files: d1_file1 d1_file2
dir2 has 2 files(s)
dir2 files: d2_file1 d2_file2
Warnings: 2

# Command grouping - the files are listed correctly, but the WARNING variable is lost because the pipe creates a subshell.

#!/bin/bash

WARNING=0

list=(
dir1
dir2
)

for dir in ${list[@]}; do
    ls -1 ./tmp/${dir} |
    {
    while read -a file; do
       files+=(${file})
    done
    echo "${dir} has ${#files[@]} files(s)"
    echo "${dir} files: ${files[@]}"
    if [ ${#files[@]} -gt 1 ]; then
        (( WARNING++ ))
    fi
}
done
echo "Warnings: ${WARNING}"

----- Output -----

dir1 has 2 files(s)
dir1 files: d1_file1 d1_file2
dir2 has 2 files(s)
dir2 files: d2_file1 d2_file2
Warnings: 0

# Process Substitution - the WARNING variable is preserved outside the loop, but the output is incorrect.

#!/bin/bash

WARNING=0

list=(
dir1
dir2
)

for dir in ${list[@]}; do
    while read -a file; do
       files+=(${file})
    done  < <(ls -1 ./tmp/${dir})
    echo "${dir} has ${#files[@]} files(s)"
    echo "${dir} files: ${files[@]}"
    if [ ${#files[@]} -gt 1 ]; then
        (( WARNING++ ))
    fi
done
echo "Warnings: ${WARNING}"

----- Output -----

dir1 has 2 files(s)
dir1 files: d1_file1 d1_file2
dir2 has 4 files(s)
dir2 files: d1_file1 d1_file2 d2_file1 d2_file2
Warnings: 2

# Here String (<<<) - both the output and the number of warnings are incorrect.

#!/bin/bash

WARNING=0

list=(
dir1
dir2
)

for dir in ${list[@]}; do
    while read -a file; do
       files+=(${file})
    done  <<<$(ls -1 ./tmp/${dir})
    echo "${dir} has ${#files[@]} files(s)"
    echo "${dir} files: ${files[@]}"
    if [ ${#files[@]} -gt 1 ]; then
        (( WARNING++ ))
    fi
done
echo "Warnings: ${WARNING}"

----- Output -----

dir1 has 1 files(s)
dir1 files: d1_file1
dir2 has 2 files(s)
dir2 files: d1_file1 d2_file1
Warnings: 1

Could you please advise which approach I should use to get the desired outcome, and which method is actually recommended in my case?

1 Answer

Reasons for not working how you want

Command grouping

As you said, the warning is lost because of the subshell, but so is the files array. Each directory therefore starts with an empty array, which only gives the appearance of the listing part working.
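The subshell effect can be reproduced in isolation. The following is a minimal sketch, assuming bash with its default settings (lastpipe not enabled); the counter name and the two-line input are made up for the demonstration.

```shell
#!/bin/bash
# Hypothetical demo: count lines inside a pipeline, then check the counter afterwards.
count=0

printf 'a\nb\n' | {
    while read -r line; do
        count=$((count + 1))
    done
    echo "inside the group: ${count}"     # prints 2
}

# The group ran in a subshell, so the parent's variable was never touched.
echo "after the pipeline: ${count}"       # prints 0
```

Anything assigned inside the braces, arrays included, disappears the same way once the pipeline ends.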

Process Substitution

Here the loop runs in the current shell, so the files array is not reset between iterations: each subsequent directory still contains the files from the previous ones.

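Here String (<<< is a here string, not a here document)

Only one file per directory is added. Judging by the output, the whole ls listing reaches read as a single line (likely because the $(...) is unquoted), so read -a splits it into the array file, and ${file} expands to just its first element. As with process substitution, the files array is never reset, so dir2 still holds d1_file1 as well.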
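The interaction between read -a and ${file} can be checked on its own. This is a small sketch, assuming the listing arrives as one space-separated line; the literal string below is a stand-in for that input, not the real ls output.

```shell
#!/bin/bash
# Stand-in for a listing that arrives as a single line (an assumption).
read -r -a file <<< "d1_file1 d1_file2"

echo "elements read: ${#file[@]}"      # 2
echo "\${file} expands to: ${file}"    # d1_file1, same as ${file[0]}

files=()
files+=(${file})                       # appends only the first element
echo "files now holds ${#files[@]} entry"
```

Using ${file[@]} instead of ${file} would append every element that read -a produced.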


Solution

You can use process substitution and reset the array at the end of each iteration, like this:

#!/bin/bash

WARNING=0

list=(
dir1
dir2
)

for dir in "${list[@]}"; do
    # read -r keeps each line intact; quoting preserves names containing spaces
    while read -r file; do
        files+=("${file}")
    done < <(ls -1 "tmp/${dir}")
    echo "${dir} has ${#files[@]} files(s)"
    echo "${dir} files: ${files[*]}"
    if [ "${#files[@]}" -gt 1 ]; then
        (( WARNING++ ))
    fi
    unset files   ## THIS LINE
done
echo "Warnings: ${WARNING}"

This assumes you don't need the files array after the loop and only want to print its contents.
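Another option, if bash 4 or later is available, is mapfile (also known as readarray), which clears the target array before filling it, so no explicit reset is needed. The sketch below is only an illustration: list_entries is a hypothetical function standing in for whatever command produces the listing (the real S3 call is not shown in the question).

```shell
#!/bin/bash
# list_entries is a hypothetical stand-in for the real listing command.
list_entries() {
    case "$1" in
        dir1) printf '%s\n' d1_file1 d1_file2 ;;
        dir2) printf '%s\n' d2_file1 d2_file2 ;;
    esac
}

WARNING=0
list=(dir1 dir2)

for dir in "${list[@]}"; do
    # mapfile empties "files" before reading, so nothing carries over.
    # Process substitution keeps the loop body in the current shell.
    mapfile -t files < <(list_entries "${dir}")
    echo "${dir} has ${#files[@]} files(s)"
    echo "${dir} files: ${files[*]}"
    if [ "${#files[@]}" -gt 1 ]; then
        WARNING=$((WARNING + 1))
    fi
done
echo "Warnings: ${WARNING}"
```

With this variant the array from the last iteration is still available after the loop, should it be needed.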

P.S. Don't parse ls ;)
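For the local-directory version of the example, a glob avoids ls entirely. This is a rough sketch under stated assumptions: it builds a throwaway copy of the example tree under a temporary directory (the root variable is illustrative), and it does not apply to the S3 case, where the listing comes from a command.

```shell
#!/bin/bash
# Build a throwaway copy of the example tree (paths are illustrative).
root=$(mktemp -d)
mkdir -p "${root}/dir1" "${root}/dir2"
touch "${root}/dir1/d1_file1" "${root}/dir1/d1_file2" \
      "${root}/dir2/d2_file1" "${root}/dir2/d2_file2"

shopt -s nullglob      # an empty directory yields an empty array
WARNING=0

for dir in dir1 dir2; do
    files=( "${root}/${dir}"/* )     # plain assignment replaces the old contents
    files=( "${files[@]##*/}" )      # strip the leading path from each entry
    echo "${dir} has ${#files[@]} files(s)"
    echo "${dir} files: ${files[*]}"
    if [ "${#files[@]}" -gt 1 ]; then
        WARNING=$((WARNING + 1))
    fi
done
echo "Warnings: ${WARNING}"

rm -rf "${root}"
```

Because the array is assigned rather than appended to, it is rebuilt on every iteration and file names with spaces or newlines stay intact.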
