2

I have a set of files (the list is larger, 4x43 files with extensions I.sto, Q.sto, U.sto and V.sto):

probni45069Q.sto probni45080I.sto probni45080V.sto probni45083U.sto 
probni45069U.sto probni45080Q.sto probni45083I.sto probni45083V.sto
probni45069I.sto probni45069V.sto probni45080U.sto probni45083Q.sto

My goal is to rename them in the sorted order starting with number 1:

  1. probni45069I.sto probni1I.sto
  2. probni45080I.sto probni2I.sto
  3. probni45083I.sto probni3I.sto
  4. probni45069Q.sto probni1Q.sto
  5. probni45080Q.sto probni2Q.sto
  6. probni45083Q.sto probni3Q.sto
  7. probni45069U.sto probni1U.sto
  8. probni45080U.sto probni2U.sto
  9. probni45083U.sto probni3U.sto
  10. probni45069V.sto probni1V.sto
  11. probni45080V.sto probni2V.sto
  12. probni45083V.sto probni3V.sto

I have followed the instructions and guides from Renaming files in a folder to sequential numbers and created the following bash script:

model='probni'

for sto in I Q U V;
do
    i=1
    for j in $model*$sto.sto;
    do
        echo "$j" `printf $model%1d$sto.sto $i`
        mv "$j" `printf $model%1d$sto.sto $i` 2>/dev/null || true
        i=$((i + 1))
    done
done

This script works great when used only once. The problem arises when I run it multiple times on an already sorted set of files, or when I have a set of sorted files along with additional unsorted files (e.g. probni1I.sto + probni1346I.sto): I lose a certain number of files. When run repeatedly on my original 4x43 set of files, I eventually end up with a set of only 4x7 files.

My question is how to make this script idempotent for sorted files, or how to add new unsorted files to the sorted list without losing any files.

Community
  • The basic reason for your problems is your naming scheme: you cannot immediately see whether a file like `probni1346I.sto` is already a member of the sorted set or still needs to be processed. You could change the naming so that a sorted file looks different (e.g. probni1346I-sorted.sto) – Slizzered Jun 24 '15 at 23:49

3 Answers

2

The problem is that file names are sorted lexicographically, which causes problems with numbers because 1 2 1346 will be sorted as 1 1346 2 since the character 1 comes before 2. You need a smarter sort that handles numbers more intuitively.

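sort -V (version sort, a GNU extension) is just what the doctor ordered. Note that sort -n and sort -g do not help here: they only parse a number at the start of the line, and these names start with letters. Filter your file list through sort -V and 1346 won't cause you any more trouble.

for j in $(ls "$model"*"$sto".sto | sort -V); do
    ...
done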
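To see the ordering difference concretely, here is a small sketch. It assumes GNU sort (`-V` is a GNU extension), and the variable names `names`, `plain` and `versioned` are only illustrative.

```shell
# Three names in the question's scheme: fixed prefix, number, letter, suffix.
names='probni1I.sto
probni1346I.sto
probni2I.sto'

# Byte-wise comparison (LC_ALL=C): "1346I" sorts before "1I" because '3' < 'I'.
plain=$(printf '%s\n' "$names" | LC_ALL=C sort)

# Version sort compares the runs of digits as numbers: 1, 2, 1346.
versioned=$(printf '%s\n' "$names" | sort -V)

printf 'plain sort:\n%s\n\nversion sort:\n%s\n' "$plain" "$versioned"
```

With the plain ordering, the rename loop would move probni1346I.sto onto probni1I.sto before probni1I.sto itself has been handled, which is one way files get overwritten.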

Normally parsing the output of ls is a bad idea. If you want to do things really right you'll need to restructure things a bit.

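for sto in I Q U V; do
    # NUL-separated names, version-sorted so that 2 comes before 1346
    find . -maxdepth 1 -name "$model*$sto.sto" -print0 |
        sort -zV |
    {
        i=1
        while IFS= read -rd $'\0' old; do
            old=${old#./}           # find prints ./name; strip the prefix
            new=$model$i$sto.sto
            ((i++))                 # advance even if the file is already in place
            [[ $old == "$new" ]] && continue
            echo "$old" "$new"
            mv -n -- "$old" "$new"  # -n: never overwrite an existing file
        done
    }
done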

It's a heck of a lot of work, but this version handles weird file names much better. It uses NUL separators at all stages so file names with spaces and newlines won't gum up the works.

I don't expect that you have weird files like that, but hey, better safe than sorry. It's always good to try to do the right thing.
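Before running a renumbering pass on real data, it can help to try it in a scratch directory and run it twice. The following is a minimal sketch rather than a definitive implementation: it assumes GNU `sort -V` and `mv -n`, assumes names without newlines (true for this naming scheme), and `renumber` is a made-up helper name.

```shell
# renumber PREFIX: rename PREFIX<N><L>.sto to PREFIX1<L>.sto, PREFIX2<L>.sto, ...
renumber() {
    model=$1
    for sto in I Q U V; do
        printf '%s\n' "$model"*"$sto.sto" | sort -V | {
            i=1
            while IFS= read -r old; do
                [ -e "$old" ] || continue        # unmatched glob stays literal; skip it
                new=$model$i$sto.sto
                i=$((i + 1))                     # count files that are already in place too
                [ "$old" = "$new" ] && continue  # nothing to do for this file
                mv -n -- "$old" "$new"           # -n refuses to overwrite
            done
        }
    done
}

# Scratch directory with three numbered files and one new arrival.
demo=$(mktemp -d) && cd "$demo" || exit 1
for n in 1 2 3 1346; do echo "data $n" > "probni${n}I.sto"; done

renumber probni
renumber probni          # the second run should change nothing
printf '%s\n' probni*I.sto
cat probni4I.sto         # the new arrival, now number 4
```

Because the names are processed in ascending numeric order, the i-th file always carries a number of at least i, so its target name is either its own name or a name that is currently free.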

John Kugelman
  • The iteration using `find` looks basically good. Probably a bit over-engineered since the whole logic depends on files following a fixed pattern in their names - and no newlines. But I don't see how the logic to rename the files properly should work here. – hek2mgl Jun 25 '15 at 00:19
  • Oh, I missed the outer loop. You just need to add the `${old:start:len}` statements to disassemble `old` – hek2mgl Jun 25 '15 at 00:25
  • I get this error: `sort: unrecognized option '--file0-from'`. Sorry to bother you, but I'm really new to all this and I couldn't figure out which of the `sort` options should be there. @John Kugelman – Djordje Savic Jun 25 '15 at 16:28
  • Typo. I meant `--files0-from`. Does my first answer work? That's the one I'd recommend to be honest. For what you're doing you don't really need the complexity of the second answer. – John Kugelman Jun 25 '15 at 16:36
  • When I use the first answer on the original set of 4x43 files, it works. When I add a new unsorted 4x6 set, I get something like this at the end: `probni45141V.sto probni38V.sto probni45142V.sto probni39V.sto probni45143V.sto probni40V.sto probni45144V.sto probni41V.sto probni45145V.sto probni42V.sto probni45148V.sto probni43V.sto probni4V.sto probni44V.sto probni5V.sto probni45V.sto probni6V.sto probni46V.sto probni7V.sto probni47V.sto probni8V.sto probni48V.sto probni9V.sto probni49V.sto`, so this looks like some shifting, but in the end I get a set of 4x49. – Djordje Savic Jun 25 '15 at 17:03
  • When I add new files, I get this data loss again. I still don't understand why the sorted files are not simply copied onto themselves. – Djordje Savic Jun 25 '15 at 17:17
0

What about incrementing i in a loop as long as the destination file already exists?

# skip over indices whose destination file already exists
while [ -e "$model$i$sto.sto" ]
do
    i=$((i + 1))
done

Edit: I missed that your input and output files look the same. This would require you to either 1) change the naming scheme of one of them, 2) put them in different directories, or 3) start renaming only once you encounter a gap when counting up from 1.
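Option 2 combines naturally with the "skip indices that are already taken" idea. Here is a rough sketch under assumptions that are not in the question: already numbered files live in a `sorted` directory, new arrivals in an `incoming` directory, GNU `sort -V` and `mv -n` are available, and `add_new` is a hypothetical helper name.

```shell
# add_new PREFIX SRC DST: move SRC/PREFIX<N><L>.sto into DST under the next free index.
add_new() {
    model=$1 src=$2 dst=$3
    for sto in I Q U V; do
        i=1
        printf '%s\n' "$src/$model"*"$sto.sto" | sort -V | {
            while IFS= read -r old; do
                [ -e "$old" ] || continue     # unmatched glob stays literal; skip it
                # advance i until the destination name is free
                while [ -e "$dst/$model$i$sto.sto" ]; do
                    i=$((i + 1))
                done
                mv -n -- "$old" "$dst/$model$i$sto.sto"
            done
        }
    done
}

# Scratch setup: two numbered files, two new arrivals.
work=$(mktemp -d)
mkdir "$work/sorted" "$work/incoming"
echo "old 1" > "$work/sorted/probni1I.sto"
echo "old 2" > "$work/sorted/probni2I.sto"
echo "new 45141" > "$work/incoming/probni45141I.sto"
echo "new 45142" > "$work/incoming/probni45142I.sto"

add_new probni "$work/incoming" "$work/sorted"
ls "$work/sorted"
```

Since sources and destinations never share a directory, a numbered file can no longer be mistaken for an unprocessed one, and running the function again with an empty `incoming` directory does nothing.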

Christoph Sommer
0

Normally parsing the output of ls is considered bad practice. However, provided that the folder contains only files following the pattern shown, you can use the following shell script:

ls | sort -k1.12,1.12 -k1.7,1.11n | awk '{
    n=substr($0,7,5);    # five-digit number (not needed for the new name)
    l=substr($0,12,1);   # letter: I, Q, U or V
    c[l]++;              # per-letter counter
    system("mv -- " $0 " probni" c[l] l ".sto")
}'

sort -k1.12,1.12 -k1.7,1.11n sorts the files first by the letter at position 12 and then numerically by the number at positions 7 to 11. Each name is a single field, so both keys must refer to field 1. This works only if all file names have the same length, the number is always 5 characters long, and the prefix is always probni (6 characters long).

The awk script at the end of the pipe disassembles the file name using substr() and reassembles it following the desired pattern. system() is used to issue the mv command.
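A dry run makes it easy to check what such a pipeline would do before any file is touched. The sketch below feeds a few of the question's names through sort and awk and prints the mv commands instead of executing them. The explicit key ranges `-k1.12,1.12 -k1.7,1.11n` are my reading of what the sort is meant to do (letter first, then number), and `plan` is just an illustrative variable name.

```shell
# Print the planned renames instead of calling system().
plan=$(
    printf '%s\n' probni45080Q.sto probni45069I.sto probni45083I.sto probni45069Q.sto |
        sort -k1.12,1.12 -k1.7,1.11n |
        awk '{
            l = substr($0, 12, 1)    # letter at position 12
            c[l]++                   # per-letter counter
            print "mv", $0, "probni" c[l] l ".sto"
        }'
)
printf '%s\n' "$plan"
```

Once the printed plan looks right, `print` can be swapped back for `system()`.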

hek2mgl