
I'm trying to write a script that detects the latest file in each group and adds a prefix to its original name.

ll $DIR
asset_10.0.0.1_2017.11.19 #latest
asset_10.0.0.1_2017.10.28
asset_10.0.0.2_2017.10.02 #latest
asset_10.0.0.2_2017.08.15
asset_10.1.0.1_2017.11.10 #latest
...

Two questions:

1) How do I find the latest file in each group?

2) How do I rename it by adding only a prefix?

I tried the following procedure, but it looks for the latest file in the entire directory, and doesn't keep the original name to add a prefix to it:

find $DIR -type f ! -name 'asset*' -print | sort -n | tail -n 1 | xargs -I '{}' cp -p '{}' $DIR...

What would be the best approach to achieve this? (keeping xargs if possible)

codeforester
faceless
  • [How to recursively find the latest modified file in a directory?](https://stackoverflow.com/q/4561895/3776858) – Cyrus Nov 19 '17 at 13:39

2 Answers


Selecting the latest entry in each group

You can use sort to select only the latest entry in each group:

find . -type f -print0 | sort -r -z | sort -t_ -k2,2 -u -z | xargs ...

First, sort all paths in reverse lexicographical order, so that the latest entry appears first within each group. Then sort on the group name only (the second field, -k2,2, when splitting on underscores via -t_) and keep one line per key with -u. That leaves only the first entry of each group, which is also the latest.

Note that this relies on the second sort keeping lines with equal keys in their input order, with -u emitting the first of them. GNU sort behaves this way because -u disables its last-resort whole-line comparison; adding -s requests that behavior explicitly. Also note we can't use uniq here, because uniq has no option for a custom field delimiter (its -f option only skips blank-separated fields).
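To see the two-pass sort in isolation, here is a minimal sketch that runs it on the sample names from the question. It uses newline-delimited input for readability, so it assumes names without newlines. LC_ALL=C and -s are additions that are not part of the pipeline above; they are there to make byte-wise ordering and tie-keeping explicit, and -s assumes GNU (or BSD) sort.

```shell
# Sample names from the question, deliberately shuffled.
names='asset_10.0.0.1_2017.10.28
asset_10.0.0.2_2017.08.15
asset_10.0.0.1_2017.11.19
asset_10.1.0.1_2017.11.10
asset_10.0.0.2_2017.10.02'

# Pass 1: reverse sort, so the newest date comes first within each group.
# Pass 2: sort on field 2 only (the group), keeping one line per key (-u).
#         -s (assumed available) keeps equal keys in their input order.
latest=$(printf '%s\n' "$names" |
  LC_ALL=C sort -r |
  LC_ALL=C sort -s -t_ -k2,2 -u)

printf '%s\n' "$latest"
```

This should print one name per group, each carrying the newest date of that group.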

Copying with prefix

To add a prefix to each filename found, we need to split each path that find produces into a directory part and a filename (basename), because the prefix must be added to the filename only. The xargs part above could look like:

... | xargs -0 -I '{}' sh -c 'd="${1%/*}"; f="${1##*/}"; cp -p "$d/$f" "$d/prefix_$f"' _ '{}'

Path splitting is done with shell parameter expansion, namely prefix (${1##*/}) and suffix (${1%/*}) substring removal.
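As a small illustration of those two expansions, the sketch below splits a single path. The path and the prefix_ string are hypothetical values chosen for the example.

```shell
p='./some dir/asset_10.0.0.1_2017.11.19'   # hypothetical path, with a space

d="${p%/*}"     # remove the shortest suffix matching "/*" -> directory part
f="${p##*/}"    # remove the longest prefix matching "*/"  -> filename part

printf '%s\n' "$d" "$f" "$d/prefix_$f"
```

If the path contained no slash at all, ${p%/*} would leave it unchanged; paths printed by find with a starting point such as . always contain one, so the split is safe in this pipeline.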


Note the use of NUL-terminated output (paths) in find (-print0 instead of -print), and the accompanying use of -z in sort and -0 in xargs. That way the complete pipeline will properly handle filenames (paths) with "special" characters like newlines and similar.
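Putting the pieces together, the following is a sketch of the whole pipeline run against a scratch directory, not a drop-in script. The latest_ prefix, the file filter (-type f -name 'asset_*'), LC_ALL=C and -s are choices made for this illustration, and it assumes GNU find, sort and xargs.

```shell
# Work in a scratch directory so nothing real is touched.
dir=$(mktemp -d)
touch "$dir/asset_10.0.0.1_2017.11.19" "$dir/asset_10.0.0.1_2017.10.28" \
      "$dir/asset_10.0.0.2_2017.10.02" "$dir/asset_10.0.0.2_2017.08.15" \
      "$dir/asset_10.1.0.1_2017.11.10"

(
  cd "$dir" || exit 1
  # -type f skips directories (including "." itself); -name limits the
  # input to the asset files, so existing copies are not picked up again.
  find . -type f -name 'asset_*' -print0 |
    LC_ALL=C sort -z -r |
    LC_ALL=C sort -z -s -t_ -k2,2 -u |
    xargs -0 -I '{}' sh -c \
      'd="${1%/*}"; f="${1##*/}"; cp -p "$d/$f" "$d/latest_$f"' _ '{}'
)

ls "$dir"
```

Because the field split is on underscores across the whole path, this sketch only holds when the directory part contains no underscore, which is why it runs find from inside the target directory.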

randomir

If you want to do this in bash alone, rather than using external tools like find and sort, you'll need to parse the "fields" in each filename.

Something like this might work:

declare -A o=()                         # declare an associative array (req bash 4)

for f in asset_*; do                    # step through the list of files,
  IFS=_ read -r -a a <<<"$f"            # split the filename on "_" into an array
  b="${a[0]}_${a[1]}"                   # define a "base" of the first two elements
  if [[ "${a[2]}" > "${o[$b]}" ]]; then # compare the date with the last value
    o[$b]="${a[2]}"                     # for this base and reassign if needed
  fi
done

for i in "${!o[@]}"; do                 # now that we're done, step through results
  printf "%s_%s\n" "$i" "${o[$i]}"      # and print them.
done

This doesn't actually sort anything; it just walks the list of files and keeps the highest-sorting date for each filename base.
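The loop above only prints the winners. A possible extension that also makes the prefixed copy is sketched below; it requires bash 4 or later for associative arrays, and the latest_ prefix, the scratch directory and the variable names are assumptions made for the example.

```shell
# Scratch directory with the sample files from the question.
dir=$(mktemp -d)
touch "$dir/asset_10.0.0.1_2017.11.19" "$dir/asset_10.0.0.1_2017.10.28" \
      "$dir/asset_10.0.0.2_2017.10.02" "$dir/asset_10.0.0.2_2017.08.15" \
      "$dir/asset_10.1.0.1_2017.11.10"
cd "$dir" || exit 1

declare -A newest=()                      # group base -> newest date seen

for f in asset_*; do
  IFS=_ read -r -a parts <<<"$f"          # asset / address / date
  base="${parts[0]}_${parts[1]}"
  # String comparison is enough because the dates are zero-padded YYYY.MM.DD.
  if [[ "${parts[2]}" > "${newest[$base]}" ]]; then
    newest[$base]="${parts[2]}"
  fi
done

for base in "${!newest[@]}"; do
  src="${base}_${newest[$base]}"
  cp -p -- "$src" "latest_$src"           # copy, keeping the original name
done

ls
```

The glob asset_* does not match the latest_ copies, so running the loop a second time would simply overwrite them.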

Charles Duffy
ghoti
  • magnificent! too heavy for this task, but you definitely saved me a future question, as this is smth i was trying to "cook up" for another task. thanks again – faceless Nov 19 '17 at 16:09