I need to check whether multiple files (about 30) exist in multiple directories. The files have different prefixes that match the name of the directory (e.g., sub1/sub1_file1.txt, sub2/sub2_file1.txt, sub3/sub3_file1.txt; sub1/sub1_file2.txt, sub2/sub2_file2.txt, etc.). I am using ls to achieve this and it works. However, my script only prints whether files are missing from a directory, not which ones. In particular, I want to know which files do NOT exist in which directories. Could anyone help me? I am using bash. What I have so far is:

for d in */ ; do 
cd "$d"
if ls *_file1.txt *_file2.txt *_file3.txt > /dev/null 2>&1; then
: # nothing to report
else
echo "$d" "files do not exist" >> missingFiles.txt
fi
cd ..
done

In missingFiles.txt currently I have (for example)

sub1/ files do not exist

And I would like

sub1/ sub1_file1.txt do not exist

Thanks so much.

Edit: example of directory

sub1
|_sub1_file1.txt
|_sub1_file2.txt
|_sub1_file3.txt
sub2
|_sub2_file1.txt
|_sub2_file2.txt
|_sub2_file3.txt
sub3
|_sub3_file1.txt
|_sub3_file2.txt

I need to check whether all files (file1, file2, file3) are present or not in all the 'sub' directories. In the example, the script should return that sub3_file3.txt is missing (or file3 in sub3). There are also other files in each directory that I am not interested in checking.

  • `I am using ls to achieve this` So do not use ls. Use `find` to find files. Does https://stackoverflow.com/questions/2937407/test-whether-a-glob-has-any-matches-in-bash answer your question? – KamilCuk Feb 20 '23 at 22:48
  • You say `sub1/` may contain `sub1/sub1_file1.txt` or `sub1/sub1_sub1_file2.txt` or whatever files which start with `sub1`. If `sub1/` contains no such files, that's it. How can we tell the missing file name as `sub1/ sub1_file1.txt do not exist`? – tshiono Feb 20 '23 at 23:29
  • Sorry I am not sure I understand your question? The files have different prefixes depending on the directories they are in, which is why I was using wildcards, but then they all have the same string (e.g., _file1.txt, _file2.txt etc). I just want to know if for example file1.txt is missing and where. – Elena Pozzi Feb 20 '23 at 23:44
  • Sorry, but I still don't get it. `The same string` but `_file1.txt, _file2.txt etc` sounds incoherent to me. Do you mean that all the files (about 30) have either `_file1.txt` or `_file2.txt` or something else? Can you describe some possible examples as directory trees? – tshiono Feb 21 '23 at 00:21
  • Sorry if I wasn't clear, I added an example of directory. – Elena Pozzi Feb 21 '23 at 01:38
  • `ls` doesn't actually find files in the first place. When you run `ls *.txt`, the shell replaces `*.txt` with a list of files having that extension before `ls` starts running at all. So you can just use shell globbing directly and skip ls entirely, and you end up with more efficient, less buggy code; `ls` is a tool to format lists of files for human readers, not a tool for use in scripts. See also [ParsingLs](https://mywiki.wooledge.org/ParsingLs) – Charles Duffy Feb 21 '23 at 02:04
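
The globbing approach described in the last comment can be sketched roughly as follows. This is only an illustration, not code from the thread: the function name `report_missing_by_glob` and its arguments (a root directory followed by filename suffixes) are hypothetical, and it assumes bash with the `nullglob` option so that a pattern with no matches expands to nothing.

```shell
#!/bin/bash
# Hypothetical helper (not from the thread): for every subdirectory of ROOT,
# report each SUFFIX for which no file matching *SUFFIX exists.
# The body is a subshell ( ... ) so that nullglob does not leak to the caller.
report_missing_by_glob() (
    root=$1
    shift
    shopt -s nullglob            # unmatched globs expand to nothing
    for d in "$root"/*/; do
        for suffix in "$@"; do
            matches=( "$d"*"$suffix" )   # the shell expands the glob; no ls needed
            (( ${#matches[@]} )) || printf '%s: no file matching *%s\n' "$d" "$suffix"
        done
    done
)
```

Called as `report_missing_by_glob . _file1.txt _file2.txt _file3.txt` from the parent of the `sub` directories, it prints one line per directory and missing suffix. Like the original `ls` call, it matches any prefix, so it does not check that the prefix equals the directory name.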

2 Answers

Does this Shellcheck-clean Bash code do what you want?

#! /bin/bash -p

for d in */; do
    for f in _file1 _file2 _file3; do
        path=$d${d%/}$f.txt
        [[ -e $path ]] || printf "'%q' does not exist\n" "$path"
    done
done >missingFiles.txt
pjh
  • Maybe use `%q` instead of assuming that single quotes at the front and the end are correct escaping for arbitrary names. (Or `%s` with `"${path@Q}"` and no manually-added quotes for cleaner formatting at the cost of requiring bash 5.0) – Charles Duffy Feb 21 '23 at 02:11
  • Thanks! I am getting `'_file1.txt' does not exist`, `'_file2.txt' does not exist`, ... and `echo $file` prints `_file1.txt`. – Elena Pozzi Feb 21 '23 at 04:26
  • @ElenaPozzi, I tested the code before posting it, so I don't understand why it is not working for you. Since the `$path` variable value includes `$d`, and `$d` has at least a `/` in it, the `does not exist` message makes no sense. Also, I don't understand the reference to `echo $file` because there is no `$file` in the code. Maybe check for a typo in your version of the code, or copy and paste the code above exactly as it is so we can be sure that we are both using the same code. – pjh Feb 21 '23 at 18:47
  • @ElenaPozzi, maybe run [Shellcheck](https://www.shellcheck.net/) on your version of the code to check for common problems. Also see [How can I debug a Bash script?](https://stackoverflow.com/q/951336/4154375) for information about how to debug Bash code. – pjh Feb 21 '23 at 18:50
  • Oh, you are right: while adapting the code I made a typo! It works perfectly. Amazing, thanks so much! – Elena Pozzi Feb 22 '23 at 04:17
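
For the roughly 30 files mentioned in the question, the same check could take the suffixes as arguments instead of hardcoding three of them. The following is a minimal sketch under that assumption, also applying the `%q` suggestion from the comments without manually added quotes; the function name `list_missing` and its parameters are hypothetical and not part of the answer above.

```shell
#!/bin/bash
# Hypothetical helper (not from the thread): ROOT is the parent directory,
# the remaining arguments are suffixes such as _file1.txt.
# Expected file name in each subdirectory: <dirname><suffix>.
list_missing() {
    local root=$1 d name suffix path
    shift
    for d in "$root"/*/; do
        [[ -d $d ]] || continue          # skip the literal pattern if nothing matched
        name=${d%/}                      # drop the trailing slash
        name=${name##*/}                 # keep only the directory's own name
        for suffix in "$@"; do
            path=$d$name$suffix
            # %q escapes unusual characters, so no manual quotes are added
            [[ -e $path ]] || printf '%q does not exist\n' "$path"
        done
    done
}
```

A call such as `list_missing . _file{1..30}.txt >missingFiles.txt` uses brace expansion to generate the suffixes `_file1.txt` through `_file30.txt`; this assumes the real file names follow that numbered pattern.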
$ tree
.
├── data
│   ├── sub1
│   │   ├── sub1_file1.txt
│   │   ├── sub1_file2.txt
│   │   └── sub1_file3.txt
│   ├── sub2
│   │   ├── sub2_file1.txt
│   │   ├── sub2_file2.txt
│   │   └── sub2_file3.txt
│   └── sub3
│       ├── sub3_file1.txt
│       └── sub3_file2.txt
└── script.sh

script.sh

#!/bin/bash

awk -F'[/_.]' '
    {
        dirs[$(NF-2)]; files[$(NF-1)]; map[$(NF-2)"-"$(NF-1)]
    } 
    END{
        for (d in dirs)
            for (f in files)
                if (!((d "-" f) in map)) printf "%s in %s is missing\n", f, d
    }
' < <(find ./data -type f -name "sub[1-3]_file[1-3].txt")

Output:

 $ ./script.sh
 file3 in sub3 is missing
ufopilot