Let's say I have these files in folder Test1
AAAA-12_21_2020.txt
AAAA-12_20_2020.txt
AAAA-12_19_2020.txt
BBB-12_21_2020.txt
BBB-12_20_2020.txt
BBB-12_19_2020.txt
I want below latest files to folder Test2
AAAA-12_21_2020.txt
BBB-12_21_2020.txt
Let's say I have these files in folder Test1
AAAA-12_21_2020.txt
AAAA-12_20_2020.txt
AAAA-12_19_2020.txt
BBB-12_21_2020.txt
BBB-12_20_2020.txt
BBB-12_19_2020.txt
I want below latest files to folder Test2
AAAA-12_21_2020.txt
BBB-12_21_2020.txt
This code would work:
ls $1 -U | sort | cut -f 1 -d "-" | uniq | while read -r prefix; do
ls $1/$prefix-* | sort -t '_' -k3,3V -k1,1V -k2,2V | head -n 1
done
We first iterate over every prefix in the directory specified as the first argument, which we get by sorting the list of files and deleting duplicates, before extracting everything before -
. Then we sort those filenames by three fields separated by the _
symbol using the -k
option of sort (primarily by years in the third field, then months in second and lastly days). We use version sort to be able to ignore the text around and interpret numbers correctly (as opposed to lexicographical sort).
I'm not sure whether this is the best way to do this, as I used only basic bash functions. Because of the date format and the fact that you have to differentiate prefixes, you have to parse the string fully, which is a job better suited for AWK or Perl.
Nonetheless, I would suggest using day-month-year or year-month-day format for machine-readable filenames.
Using awk
:
ls -1 Test1/ | awk -v src_dir="Test1" -v target_dir="Test2" -F '(-|_)' '{p=$4""$2""$3; if(!($1 in b) || b[$1] < p){a[$1]=$0}} END {for (i in a) {system ("mv "src_dir"/"a[i]" "target_dir"/")}}'