I need to extract numbers from several files, add them into a running total, and print that total for each parent directory in the tree shown below:

├───Parent1
│   ├───20210824_2000
│   │   ├───200000_child1
│   │   │       report.md
│   │   │       log.log
│   │   │       file.xml
│   │   │       input.json
│   │   │       
│   │   ├───200030_child2
│   │   │       
│   │   ├───200034_child3
│   │   │       
│   │   ├───200039_child4
│   │   ...
│   ├───20210825_0800
│   │   ├───200000_child1
│   │   │       report.md
│   │   │       log.log
│   │   │       file.xml
│   │   │       input.json
│   │   │       
│   │   ├───200030_child2
│   │   │       
│   │   ├───200034_child3
│   │   │       
│   │   ├───200039_child4
│   │   ...
│   ...
├───Parent2
│   ├───20210824_2000
│   │   ├───200000_child1
│   │   │       report.md
│   │   │       log.log
│   │   │       file.xml
│   │   │       input.json
│   │   │       
│   │   ├───200030_child2
│   │   │       
│   │   ├───200034_child3
│   │   │       
│   │   ├───200039_child4
│   │   ...
│   ├───20210825_0800
│   │   ├───200000_child1
│   │   │       report.md
│   │   │       log.log
│   │   │       file.xml
│   │   │       input.json
│   │   │       
│   │   ├───200030_child2
│   │   │       
│   │   ├───200034_child3
│   │   │       
│   │   ├───200039_child4
│   │   ...
│   ...
...

I can't seem to get the grep output into a numeric variable. Each child folder has a timestamp in its name, so I sort them because I only want the latest files.

Here's what I have so far:

#!/bin/bash
find . -type d -iname 'parent*' | while read -r dir; do
    sum=0;
    find "$dir" -maxdepth 1 -type d | sort -r | head -1 | 
    (
        while read -r subdir; do
            count="$(find "$subdir" -type f -iname '*report.md' -exec grep -ohP '(?<=\*)\d+(?=\*+ number of things)' {} \+)" 
            sum=$((sum + count))
        done
        basename "$dir" "$sum"
    )    
done 

But this doesn't add count to sum; instead it just prints count to the console for each file.

Izak Joubert
  • [while-loop-subshell-dilemma-in-bash](https://stackoverflow.com/questions/13726764/while-loop-subshell-dilemma-in-bash) ? – tshiono Sep 15 '21 at 08:39
  • Wait... your Child directories have multiple directories in them? There's nothing in that image of your directory structure to suggest that (And why is it an image and not simple text? `tree(1)` is good for generating directory trees if you have it installed). Please update it to include examples of files you *don't* want included in the sum. – Shawn Sep 15 '21 at 08:48
  • @Shawn Is that better? – Izak Joubert Sep 15 '21 at 09:29
  • @anubhava Does the tree answer your question too? – Izak Joubert Sep 15 '21 at 09:29
  • @Lety What do I do in the while loop then? The grep? So: `while read -r subdir; do count="$(grep -ohP '(?<=\*)\d+(?=\*+ number of things)')" sum=$((sum + count)) done < < $(find "$subdir" -type f -iname '*report.md')` – Izak Joubert Sep 15 '21 at 13:10

1 Answer

The problem is the subshell: a while loop that reads from a pipe runs in its own subshell, so variables assigned inside it are not visible outside the loop. There is useful background in "Shell variables set inside while loop not visible outside of it".
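To see the effect in isolation, here is a minimal sketch, assuming bash (process substitution is not available in plain POSIX sh). The variable names `piped` and `redirected` are made up for the illustration:

```shell
#!/bin/bash
# Pipeline: the while loop runs in a subshell, so its changes are lost.
piped=0
printf '1\n2\n3\n' | while read -r n; do
    piped=$((piped + n))
done
echo "piped: $piped"

# Process substitution: the loop runs in the current shell.
redirected=0
while read -r n; do
    redirected=$((redirected + n))
done < <(printf '1\n2\n3\n')
echo "redirected: $redirected"
```

With bash's default options the first loop should leave `piped` untouched, while the second one keeps the total.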

The count variable can also cause trouble: find may return several files, so grep may print several lines, and a multi-line value is not a valid operand in an arithmetic expansion.
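As an illustration, here is a small sketch with a made-up two-line value standing in for grep's output (the numbers and the name `count` are hypothetical). Reading the lines one at a time keeps every operand a single integer:

```shell
#!/bin/bash
# Hypothetical grep output: one number per matching line.
count=$'12\n30'

sum=0
# $((sum + count)) would fail here because count holds two lines,
# so add each line separately instead.
while read -r n; do
    sum=$((sum + n))
done <<< "$count"
echo "sum: $sum"
```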

Try with this:

#!/bin/bash
find . -type d -iname 'parent*' | while read -r dir; do
    sum=0
    # Only the newest timestamped subdirectory (the names sort chronologically).
    while read -r subdir; do
        while read -r report; do
            # grep can print several matches; add them one at a time.
            while read -r count; do
                sum=$((sum + count))
            done < <(grep -ohP '(?<=\*)\d+(?=\*+ number of things)' "$report")
        done < <(find "$subdir" -type f -iname '*report.md')
    done < <(find "$dir" -mindepth 1 -maxdepth 1 -type d | sort -r | head -n 1)
    folder=$(basename "$dir")
    echo "$folder $sum"
done

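Pay attention to the space in `done < <(`: the first `<` is a redirection and `<(...)` is a process substitution, so writing them together as `<<(` is a syntax error.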

I have not tested this, sorry.

Lety