
I want to archive files dated before a given date. I need to achieve two things.

I have sample files like this

ABCSMOKLO2Y_pm24hr_20180410.000006_4.csv

11randomletters_6randomalphanumeric_YYYYMMDD.randomnumbers_randomnum.csv

I want to tar the files based on the date in the file name. For example, if I specify 10th April, the files for each date should go into a separate archive named after that date, like this:

20180410.tar.gz
20180409.tar.gz
20180408.tar.gz

If no file is available for a specific date, I don't want to create a tar file for that date.

function archive_old_files(){

d=$(date -d"$DAYS_TO_ARCHIVE days ago" +%s)
echo "$d"
for i in {1..30}; do
    dt="$(date -d@$((d - i * 86400)) +%Y%m%d)"
    fn="$dt.tar.gz"
    if [ "$dt" -lt "$d" ] || [ -f *"$dt"* ]; then
        tar -czf "$DEST_ARCHIVE"/$fn "$SRC_DIR_ARCHIVE"/*$dt*
    fi
done
}

Summary:

  • I need to specify the directory in the for loop. How can I achieve that?
  • Tar all csv files older than the given date, with a check for whether any files exist for each date.

If no file is available for a date, my script still creates an empty date.tar.gz.

user unknown
bosug
  • Passing a glob, as in `[ -f *"$dt"* ]`, isn't reliable -- if you have *two* files with the substring, it expands to `[ -f 20180414-1.txt 20180414-2.txt ]`, which isn't valid syntax. See [test whether a glob has any matches in bash](https://stackoverflow.com/questions/2937407/test-whether-a-glob-has-any-matches-in-bash). – Charles Duffy Apr 14 '18 at 21:10
  • Also, note that POSIX-standardized function declaration syntax is just `archive_old_files() {`, whereas legacy ksh function declaration syntax is `function archive_old_files {`; the amalgam used here is compatible with neither. See also http://wiki.bash-hackers.org/scripting/obsolete – Charles Duffy Apr 14 '18 at 21:11
  • Beyond that, though, it's just plain unclear what you mean. What would it look like to "specify the directory in the `for` loop"? Do you mean you want to iterate over directories, instead of iterate over numbers? – Charles Duffy Apr 14 '18 at 21:12

1 Answer

#!/bin/bash

SRC_DIR_ARCHIVE=dates
DAYS_TO_ARCHIVE=30
DEST_ARCHIVE=dates/archiv

d=$(date -d "$DAYS_TO_ARCHIVE days ago" +%Y%m%d)
echo "$d"
for i in {1..30}
do
  dt="$(date -d "$d -${i} days" +%Y%m%d)"
  fn="$dt.tar.gz"
  # Collect the files for this date; an unmatched glob stays as literal text.
  toarc=("$SRC_DIR_ARCHIVE"/???????????_??????_"${dt}".*_*.csv)
  # Only create the archive if the first array element is an existing file.
  if test -f "${toarc[0]}"
  then
    tar -czf "$DEST_ARCHIVE/$fn" "${toarc[@]}"
  else
    echo "no file, cmd: tar -czf $DEST_ARCHIVE/$fn ${toarc[*]}"
  fi
done

I restricted the file pattern a bit, as far as shell globbing allows and according to your description.

Then I collect the files in an array and check whether the first element, at index 0, is a file.

The file and directory names at the top are only there to match my test setup.

I didn't really understand why you convert the date into seconds. I also don't understand why there is a variable DAYS_TO_ARCHIVE while the loop uses {1..30}. Maybe it's because {1..$DAYS_TO_ARCHIVE} doesn't work (brace expansion happens before variable expansion), but

for i in $(seq 1 $DAYS_TO_ARCHIVE)

would work. I hope it helps you further.
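As an alternative to calling seq, bash's arithmetic for loop also accepts a variable as the upper bound. A minimal sketch, assuming bash rather than plain sh; the offsets array is only there to make the result visible:

```shell
#!/bin/bash
# Sketch: loop from 1 to a variable upper bound without an external seq call.
DAYS_TO_ARCHIVE=30   # same variable name as in the script above

offsets=()           # illustrative array, collects the loop values
for ((i = 1; i <= DAYS_TO_ARCHIVE; i++))
do
  offsets+=("$i")
done
echo "looped over ${#offsets[@]} day offsets"
```

Inside such a loop, $i can be used exactly as in the {1..30} version.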

If you do this for whole months, run by cron on the first of each month, I would consider using the date command to get YYYYMM and then, for 201802 for example (and likewise for every month), just generate

for dt in 201802{01..31}
do 
  ...
done 

provided that missing files for (nonexistent) dates are handled gracefully.
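The monthly variant could be sketched roughly as follows. The function name archive_month and its three arguments are made up for illustration, and the zero-padded {01..31} range assumes bash 4 or newer. Dates without files, including impossible ones such as 20180231, are simply skipped by the file check:

```shell
#!/bin/bash
# Sketch of a monthly run: create one archive per day that actually has files.
# archive_month and its arguments are illustrative names, not from the answer.
archive_month() {
  local src=$1 dest=$2 month=$3 day dt files   # month is given as YYYYMM
  mkdir -p "$dest"
  for day in {01..31}
  do
    dt="$month$day"
    files=("$src"/???????????_??????_"$dt".*_*.csv)
    # An unmatched glob stays literal, so -f fails and the date is skipped.
    if [ -f "${files[0]}" ]
    then
      tar -czf "$dest/$dt.tar.gz" "${files[@]}"
    fi
  done
}

# Called from cron on the first of a month, the previous month could be
# derived with GNU date, for example:
#   archive_month dates dates/archiv "$(date -d yesterday +%Y%m)"
```

Passing the month as an argument keeps the function easy to test with a fixed value such as 201802.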

Testing the array size of toarc with

 ${#toarc[@]}

doesn't work because, if no file is found, the array still contains the file pattern as literal text, so its size is 1.
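If a size test is preferred, one option is bash's nullglob shell option, which makes an unmatched pattern expand to nothing so the array is really empty. A small sketch; the temporary directory, the sample file, and the helper count_matches are invented for the demonstration:

```shell
#!/bin/bash
# Sketch: with nullglob set, the array size tells whether the glob matched.
# The directory, file name and helper below are made up for the demonstration.
demo_dir=$(mktemp -d)
touch "$demo_dir/ABCSMOKLO2Y_pm24hr_20180410.000006_4.csv"

shopt -s nullglob   # unmatched globs now expand to nothing

count_matches() {
  # $1 = directory, $2 = date as YYYYMMDD
  local files=("$1"/???????????_??????_"$2".*_*.csv)
  echo "${#files[@]}"
}

hits=$(count_matches "$demo_dir" 20180410)
misses=$(count_matches "$demo_dir" 20180411)
echo "20180410: $hits file(s), 20180411: $misses file(s)"

shopt -u nullglob   # restore the default for the rest of the script
rm -rf "$demo_dir"
```

Note that nullglob affects every glob in the script while it is set, which is why the sketch switches it off again afterwards.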

I hope this helps.

user unknown