
I am trying to write a shell script that loops through all the directories under a parent directory and skips those that contain an empty folder named "I_AM_Already_Processed" at leaf level.

The parent directory is passed to the script as an argument:

. selectiveIteration.sh /Employee

The structure under the parent directory is shown below (the Employee directory contains data organized by year -> month -> day -> hour):

/Employee/alerts/output/2014/10/08/HOURS/Actual_Files

The script should find out which directories have not been processed yet. For example:

Consider three hours of data for the date 10/08/2014:

1.  /USD/alerts/output/2014/10/08/2(hourly_directory)/Actual_file + 
     directory_with_name(I_AM_Already_Processed)
2.  /USD/alerts/output/2014/10/08/3(hourly_directory)/Actual_file + 
     directory_with_name(I_AM_Already_Processed)
3.  /USD/alerts/output/2014/10/08/4(hourly_directory)/Actual_file

In the example above, leaf directories 2 and 3 are already processed because they contain the folder named "I_AM_Already_Processed", whereas directory 4 has not been processed yet.

So the script should skip directories 2 and 3 but process directory 4 (print this directory in the output).

Research/work I did:

So far I am able to iterate through the directory structure and visit all folders and files from root to leaf level, but I am not sure how to check for a specific file and skip the directory if that file is present. (I got this far after referring to a few tutorials and older posts on StackOverflow.)

I am new to shell scripting and this is my first shell script, so please excuse me if this question is too basic. I am trying to learn.

Any suggestion is welcome. Thanks in advance.

user1188611

1 Answer


To check whether some_directory has already been processed, just do something like

find some_directory -type d -links 2 -name 'I_AM_Already_Processed'

This prints the path of the marker directory if some_directory has been processed, or nothing if it has not. Note that -links 2 tests whether the directory is a leaf (meaning it only has links from its parent and itself, and none from subdirectories). See this answer for more information.
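When the path of an hourly directory is already known, a plain directory test may be enough, and it does not depend on link counts (some filesystems do not follow the convention that a directory's link count is 2 plus the number of its subdirectories). A minimal sketch, where the function name is_processed is made up for illustration:

```bash
#!/bin/bash
# is_processed is a hypothetical helper name, not part of the original answer.
# It succeeds (exit status 0) when the marker directory sits directly inside "$1".
is_processed() {
  [[ -d "$1/I_AM_Already_Processed" ]]
}
```

It could then be used in a condition, for example: if is_processed "$dir"; then echo 'skip'; fi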

So in a script, you could do

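#!/bin/bash
# Directories to check (placeholders).
directory_list=(/dir1 /dir2)
for dir in "${directory_list[@]}"; do
  # Non-empty output means a leaf marker directory exists somewhere under $dir.
  if [[ -n $(find "$dir" -type d -links 2 -name 'I_AM_Already_Processed' -print -quit) ]]; then
    echo 'Has been processed'
  else
    echo 'Has not been processed'
  fi
done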
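To get closer to what the question asks for (walk everything under the parent directory and print only the leaf-level directories that lack the marker), something along these lines might work. This is a sketch under assumptions: the names MARKER and list_unprocessed are invented here, a "leaf-level" directory is taken to mean one with no subdirectories other than the marker, and -print0, -mindepth, -maxdepth and -quit assume GNU find.

```bash
#!/bin/bash
# MARKER and list_unprocessed are illustrative names, not from the original post.
MARKER='I_AM_Already_Processed'

# Print every leaf-level directory under "$1" that does not contain $MARKER.
list_unprocessed() {
  local parent=$1 dir sub
  # Visit every directory except the marker directories themselves.
  find "$parent" -type d ! -name "$MARKER" -print0 |
  while IFS= read -r -d '' dir; do
    # Marker present: this directory was already processed, skip it.
    [[ -d $dir/$MARKER ]] && continue
    # Any subdirectory present: not a leaf-level directory, skip it.
    sub=$(find "$dir" -mindepth 1 -maxdepth 1 -type d -print -quit)
    [[ -n $sub ]] && continue
    printf '%s\n' "$dir"
  done
}

# Run only when a parent directory is given, e.g. ./selectiveIteration.sh /Employee
if [[ $# -gt 0 ]]; then
  list_unprocessed "$1"
fi
```

With the layout from the question, this should skip hourly directories 2 and 3 and print only directory 4. Testing for the marker directly with -d avoids relying on link counts.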
Reinstate Monica Please