
That title doesn't explain much but I couldn't summarize it quickly. Let's say I have files like this (all in the same directory)...

abc_foo_file1_morestuff.ext
abc_foo_file2_morestuff.ext
efg_goo_file1_morestuff.ext
jkl_zoo_file0_morestuff.ext
jkl_zoo_file1_morestuff.ext
jkl_zoo_file4_morestuff.ext
xyz_roo_file6_morestuff.ext

And I want them renamed to:

abc-1.ext
abc-2.ext
efg-1.ext
jkl-1.ext
jkl-2.ext
jkl-3.ext
xyz-1.ext

So basically some files in sets (abc, jkl, xyz) got removed and some got renamed to have a zero in them so they'd be listed first. But I want to resequence them to start at 1 and not have any gaps in the sequence.

I tagged this with Python because that's what I've attempted before, but if there's a simpler or cleaner approach, I'm all for it!

  • So you want to rename the files, keeping the first three characters, then adding a `-`, then a number (starting from 1, not right-aligned with zeroes), then the old extension? – Byte Commander Dec 02 '15 at 18:45
  • That's correct. No zero padding is needed in this instance, but I tried to write the question to be useful to others, so maybe somebody else finding this one day would be interested in that. – DrinkingBird Dec 02 '15 at 18:57
  • Is this name wrong?: `xyz-6.ext` I think it should be `xyz-1.ext` – Aacini Dec 02 '15 at 19:35
  • 1. what is more important: using the first 3 characters, or everything before the first `_`? 2. does the sort order of the input files matter? (so `jkl_zoo_file0_morestuff.ext` *must* become `jkl-1.ext`, for instance?) – aschipfl Dec 02 '15 at 19:56
  • @Aacini - you're correct. I updated the question, thanks. – DrinkingBird Dec 02 '15 at 19:58
  • @aschipfl - What matters for segment1 is everything before the first underscore. And yes, the sort order is very much what matters... whatever is the lowest number of the set should become #1. – DrinkingBird Dec 02 '15 at 20:00
  • Okay, so the sorting is a bit tricky, because `dir` sorts _strings_, not _numbers_: `1`, `2`, `10` come out of `dir` as `1`, `10`, `2`. So when implementing your script as a batch file, the sorting needs to be done programmatically. Is the portion containing the sort number always _after the 2nd `_`_, and is it always prefixed with the word `file`? – aschipfl Dec 02 '15 at 20:06
  • @aschipfl - yes the sorting number is always after the second underscore and always prefixed with the word file. (sorry for my delays everybody) – DrinkingBird Dec 02 '15 at 21:03
  • Thanks for clarification; there is one thing left which is not clear to me: supposing there are two files `abc_AAA_file8_*.ext` and `abc_BBB_file5_*.ext`, how should they be sorted? should the part `AAA` or `BBB` be regarded for sorting, and if yes, should it take precedence over the index `8` or `5`? – aschipfl Dec 03 '15 at 01:23
  • @aschipfl - what you're asking would not happen in my setup. the first segment is sort of an ID code, and the second is more like a human-readable ID code. so they would remain paired identically. – DrinkingBird Dec 03 '15 at 17:39

3 Answers

You can do this with a batch file:

@ECHO OFF
SETLOCAL EnableExtensions EnableDelayedExpansion

REM Process each EXT file in the specified directory.
FOR /F "usebackq tokens=* delims=" %%A IN (`DIR /B /S "C:\Path\To\Files\*.ext"`) DO (

    REM Extract the prefix.
    FOR /F "usebackq tokens=1 delims=_" %%X IN ('%%~nA') DO SET Prefix=%%X

    REM Running tally of each prefix.
    SET /A Count[!Prefix!] += 1

    REM Rename the file using the prefix.
    CALL SET NewName=!Prefix!-%%Count[!Prefix!]%%
    REN "%%A" "!NewName!%%~xA"
)

ENDLOCAL
Jason Faulkner

My general approach to this problem would be the following steps:

  • Get a list of all of the files you want renamed
  • Create a set of the starting sequences (the part before the first underscore)
  • For each sequence, collect the files that belong to it and, optionally, sort them
  • Go through that list and give each file its new name

So, in Python that would look like:

from os import listdir
from os.path import isfile, join, splitext
import shutil
import re

mypath = "./somewhere/"

# this function taken from an answer to http://stackoverflow.com/questions/4836710/does-python-have-a-built-in-function-for-string-natural-sort
def natural_sort_key(s, _nsre=re.compile('([0-9]+)')):
    return [int(text) if text.isdigit() else text.lower()
        for text in re.split(_nsre, s)]

# Get a list of all files in a directory
infiles = [f for f in listdir(mypath) if isfile(join(mypath, f))]

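# The starting sequence is everything before the first underscore
def prefix_of(name):
    return splitext(name)[0].split('_')[0]

# Create a set of starting sequences
sequences = {prefix_of(file) for file in infiles}

# Now go about changing the files
for seq in sorted(sequences):
    # compare the whole prefix, so that 'abc' does not also pick up 'abcd_...'
    files = sorted([f for f in infiles if prefix_of(f) == seq],
                   key=natural_sort_key) # sort the files, if you like
    for idx, file in enumerate(files):
        # splitext() keeps the leading dot of the extension
        new_name = seq + '-' + str(idx + 1) + splitext(file)[1]
        print(file, " -> ", new_name)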
        # copy or move
        #shutil.copy2(join(mypath,file), join(mypath,new_name))
        #shutil.move(join(mypath,file), join(mypath,new_name))
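
If the ordering should depend only on the number after the word `file`, as clarified in the comments on the question, a variant that pulls that number out with a regular expression might look like the sketch below. It is only an illustration: `plan_renames`, `PATTERN` and the `width` parameter are names made up here, and the file name layout is assumed to be exactly the one shown in the question. It builds a list of (old, new) pairs and does not touch the disk.

```python
import re
from collections import defaultdict
from os.path import splitext

# Assumed layout, taken from the question: <prefix>_<label>_file<N>_<rest>.<ext>
PATTERN = re.compile(r'^([^_]+)_[^_]*_file(\d+)_')

def plan_renames(names, width=0):
    """Return a list of (old_name, new_name) pairs; nothing is renamed here."""
    groups = defaultdict(list)
    for name in names:
        match = PATTERN.match(name)
        if match:  # names that do not fit the assumed layout are skipped
            prefix, number = match.group(1), int(match.group(2))
            groups[prefix].append((number, name))

    plan = []
    for prefix in sorted(groups):
        # Sort on the integer index, so file10 comes after file2.
        for idx, (_, name) in enumerate(sorted(groups[prefix]), start=1):
            # zfill(0) leaves the number unpadded; width=3 would give 001, 002, ...
            new_name = prefix + '-' + str(idx).zfill(width) + splitext(name)[1]
            plan.append((name, new_name))
    return plan

if __name__ == '__main__':
    sample = ['jkl_zoo_file4_morestuff.ext',
              'jkl_zoo_file0_morestuff.ext',
              'abc_foo_file2_morestuff.ext']
    for old, new in plan_renames(sample):
        print(old, ' -> ', new)
```

Each pair in the returned plan could then be passed to `shutil.move` as in the commented-out lines above, ideally after checking that no new name collides with an existing file.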
carmiac
  • This also has the advantage of keeping everything around if you wanted to do some other sort of transformation on the filename, starting sequence or iteration index. – carmiac Dec 02 '15 at 20:06
  • Updated to include natural sorting of filenames and splitting on the '_'. – carmiac Dec 02 '15 at 22:18
  • Also, made creating the list of starting sequences and the filterable dictionary much more pythonic. – carmiac Dec 02 '15 at 22:34
  • Thank you for this. I was able to mold this a little more since, of course, the requirement is now different from what I initially asked. And I'm sure it will change again several times before the day ends. Such is the life of an underling. But this works great and I really appreciate it. – DrinkingBird Dec 03 '15 at 17:42

Here is a pure batch-file solution which covers all of your requirements (see also the rem comments):

@echo off

setlocal EnableExtensions

rem definition of temporary file:
set "TEMPFILE=%TMP%\~AlphaNum.tmp"

rem first loop structure for listing the files with `dir`
rem and creating the temporary file for later sorting:
> "%TEMPFILE%" (
    for /F "eol=| tokens=1,2,3,* delims=_" %%I in ('
        dir /B /A:-D "*_*_file*_*.ext"
    ') do (
        rem store different parts of the file name into variables:
        setlocal DisableDelayedExpansion
        set "LINI=%%~I" & rem this is the very first part (prefix)
        set "LINL=%%~J" & rem this is the part to the left of `file*`
        set "LINM=%%~K" & rem this is the part `file*` (`*` is numeric)
        set "LINR=%%~L" & rem this is the part to the right of `file*`
        setlocal EnableDelayedExpansion
        rem extract the numeric part of `file*` and pad with leading `0`s;
        rem so the index is now of fixed width (12 figures at most here):
        set "LINN=000000000000!LINM:*file=!" & set "LINN=!LINN:~-12!"
        rem write file name with replaced fixed-width index and without
        rem word `file`, followed by `|`, followed by original file name,
        rem to the temporary file (the `echo` is redirected `>`):
        echo !LINI!_!LINL!_!LINN!_!LINR!^|!LINI!_!LINL!_!LINM!_!LINR!
        endlocal
        endlocal
    )
)

rem second loop structure for reading the temporary file,
rem sorting its lines with `sort`, generating new indexes
rem (sorting the previously built fixed-width indexes as text
rem results in the same order as sorting them as numbers) and
rem building the new file names (original first part + new index):
set "PREV="
for /F "eol=| tokens=2 delims=|" %%I in ('
    sort "%TEMPFILE%"
') do (
    setlocal DisableDelayedExpansion
    rem get the full original file name, and its extension:
    set "FILE=%%~I"
    set "FEXT=%%~xI"
    rem this loop iterates once only and extracts the part before the first `_`:
    for /F "eol=| tokens=1 delims=_" %%X in ("%%~I") do (
        set "CURR=%%~X"
    )
    setlocal EnableDelayedExpansion
    rem if the current prefix equals the previous one, increment the index;
    rem otherwise, reset the index to `1`:
    if /I "!CURR!"=="!PREV!" (
        set /A SNUM+=1
    ) else (
        set /A SNUM=1
    )
    rem remove `ECHO` from the following line to actually rename files:
    ECHO ren "!FILE!" "!CURR!-!SNUM!!FEXT!"
    rem this loop iterates once only and carries the values of
    rem some variables past the `setlocal`/`endlocal` barrier:
    for /F "tokens=1,* delims=|" %%X in ("!SNUM!|"!CURR!"") do (
        endlocal
        endlocal
        set "SNUM=%%X"
        set "PREV=%%~Y"
    )
)

rem remove `REM` from the following line to delete temporary file:
REM del "%TEMPFILE%"

endlocal

The toggling of EnableDelayedExpansion and DisableDelayedExpansion is required to make the script robust against any special characters that might occur in file names, like %, !, (, ), & and ^. Type set /? into the command prompt for brief information about delayed variable expansion.

This approach relies on the following assumptions:

  • the files match the pattern *_*_file*_*.ext;
  • none of the file name parts * contains a _ character itself;
  • the indexes after the word file are decimal numbers with up to 12 digits;
  • the part between the first and second _ takes precedence over the index after the word file with respect to the sort order; so for instance, supposing there are two files abc_AAA_file8_*.ext and abc_BBB_file5_*.ext, abc_AAA_file8_*.ext will be renamed to abc-1.ext and abc_BBB_file5_*.ext to abc-2.ext, because AAA comes before BBB; if this is not the desired behaviour, replace the echo command line in the first loop structure with this one:
    echo !LINI!_!LINN!_!LINL!_!LINR!^|!LINI!_!LINL!_!LINM!_!LINR!
  • the newly generated indexes are not padded with leading zeros.

With the sample files from your question, the temporary file contains the following lines:

abc_foo_000000000001_morestuff.ext|abc_foo_file1_morestuff.ext
abc_foo_000000000002_morestuff.ext|abc_foo_file2_morestuff.ext
efg_goo_000000000001_morestuff.ext|efg_goo_file1_morestuff.ext
jkl_zoo_000000000000_morestuff.ext|jkl_zoo_file0_morestuff.ext
jkl_zoo_000000000001_morestuff.ext|jkl_zoo_file1_morestuff.ext
jkl_zoo_000000000004_morestuff.ext|jkl_zoo_file4_morestuff.ext
xyz_roo_000000000006_morestuff.ext|xyz_roo_file6_morestuff.ext
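
To see why the fixed-width index matters, the padding step can be mimicked in a few lines of Python. This is only a sketch of the idea, not a port of the script: `padded_key` is a name invented here, the `file10` entry is a hypothetical extra file that is not in the question, and Python's string comparison is not identical to that of the Windows `sort` command (which, for instance, ignores case by default).

```python
def padded_key(name, width=12):
    """Mimic the left-hand side of a temporary-file line:
    drop the word 'file' and zero-pad the index to a fixed width."""
    prefix, label, index, rest = name.split('_', 3)
    number = index[len('file'):]  # 'file4' -> '4'
    return '_'.join([prefix, label, number.zfill(width), rest])

names = [
    'jkl_zoo_file10_morestuff.ext',  # hypothetical extra file, not in the question
    'jkl_zoo_file4_morestuff.ext',
    'jkl_zoo_file0_morestuff.ext',
]

# Plain text sort compares character by character, so 'file10' lands before 'file4'.
plain_order = sorted(names)

# Sorting on the padded key puts the indexes in numeric order: 0, 4, 10.
padded_order = sorted(names, key=padded_key)

if __name__ == '__main__':
    for name in padded_order:
        print(padded_key(name) + '|' + name)
```

With up to 12 digits, every index occupies the same number of characters, so comparing the keys as text gives the same order as comparing the indexes as numbers.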
aschipfl
  • I got this working with great ease, thank you! It will only let me select one accepted answer unfortunately and I have decided to go with the Python script since I was able to manipulate it fairly easily. Your reply has been really helpful for me in learning some useful batch programming though, thanks so much. – DrinkingBird Dec 03 '15 at 17:46
  • You're welcome! that's true, only one answer can be accepted; however, you can upvote multiple helpful answers... ;-) – aschipfl Dec 03 '15 at 21:34
  • No worries, I did! But my account is new so my votes apparently don't show up publicly until I have a higher rep score. – DrinkingBird Dec 04 '15 at 16:21