
I need a Windows batch file that creates a folder named after part of a file name (the part before an underscore) and moves every file whose name starts with the folder name into that folder.

I'm not familiar with Windows batch files. By googling and tinkering I arrived at a solution that works, except that I cannot split the file name at the underscore.

(Yes there are a few similar threads but nothing I could use to exactly answer my question)

FWIW my unsuccessful solution:

@ECHO OFF
setlocal enabledelayedexpansion
SETLOCAL

SET "sourcedir=C:\Development\test"
PUSHD %sourcedir%
FOR /f "tokens=1*" %%a IN (
 'dir /b /a-d "TTT*_*.*"'
 ) DO (  

 ECHO MD NEED FILE NAME BEFORE UNDERSCORE HERE
 ECHO MOVE "%%a" .\NEED FILE NAME BEFORE UNDERSCORE HERE\
)

(Ideally I'd remove the leading 'TTT' from files too but if necessary can create the files without this.)

Emu
  • Are you sure you need a Windows batch file? Your task involves file handling and string manipulation. A batch file can handle this, but string manipulation is quite difficult there. There are scripting languages like Perl or Python, although Perl is not very readable, and there is also PowerShell. – Dominique Sep 14 '18 at 06:39
  • @Dominique I am sorry, but what exactly do you mean by _"Perl is not very readable"_? Also, batch is more than capable of handling this type of manipulation without much effort. PowerShell is preferred, yes, as it is the new _"batch"_, but for now batch is not going anywhere. – Gerhard Sep 14 '18 at 07:13

3 Answers


Try this batch file code:

@echo off
setlocal EnableExtensions DisableDelayedExpansion
set "SourceDir=C:\Development\test"
set "DestDir=C:\Development\test"

for /F "eol=| delims=" %%A in ('dir /B /A-D-H "%SourceDir%\TTT*_*" 2^>nul') do (
    for /F "eol=| tokens=1 delims=_" %%B in ("%%~nA") do (
        md "%DestDir%\%%B" 2>nul
        move /Y "%SourceDir%\%%A" "%DestDir%\%%B\"
    )
)

endlocal

The first FOR starts a separate command process in the background with cmd.exe /C to execute the command line:

dir /B /A-D-H "C:\Development\test\TTT*_*" 2>nul

DIR searches in specified directory for

  • just non-hidden files because of /A-D-H (attribute not directory and not hidden)
  • matching the wildcard pattern TTT*_* which could be also just *_*
  • and outputs to handle STDOUT in bare format because of /B just the file names with file extension, but without file path.

DIR writes an error message to handle STDERR if the specified directory does not exist at all or no file matches the pattern. This message is suppressed by redirecting it with 2>nul to device NUL.

Read also the Microsoft documentation about Using Command Redirection Operators for an explanation of 2>nul. The redirection operator > must be escaped with the caret character ^ on the FOR command line so that it is interpreted as a literal character when the Windows command interpreter processes this command line before executing command FOR. FOR then executes the embedded dir command line, including the redirection, in the separate command process started in the background.

FOR captures everything written to STDOUT of started command process and processes the captured output line by line.

FOR ignores by default all empty lines (do not occur here) and all lines starting with a semicolon. A file name could begin with a semicolon. For that reason option eol=| is used to redefine end of line character to vertical bar which a file name can't contain, see Microsoft documentation Naming Files, Paths, and Namespaces. In this case on using TTT*_* as wildcard pattern it is not possible that a file name starts with a semicolon, but it would be possible on usage of *_* as wildcard pattern.
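The effect of eol can be checked without touching any files. The following is only a small sketch: the name ;notes_1.txt is a made-up sample, and for /F is given a literal string instead of the output of dir.

```batch
@echo off
setlocal EnableExtensions DisableDelayedExpansion

rem ;notes_1.txt is a hypothetical sample name, nothing is read from disk.
echo Default end of line character:
for /F "delims=" %%A in (";notes_1.txt") do echo   processed %%A

echo End of line character redefined to vertical bar:
for /F "eol=| delims=" %%A in (";notes_1.txt") do echo   processed %%A

endlocal
```

The first loop should output nothing after its heading because the string starts with a semicolon and is therefore ignored, while the second loop should output the name.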

FOR would split up also each line into substrings (tokens) using space/tab as delimiters and would assign just the first space/tab separated string to specified loop variable A. This splitting behavior is not wanted here as file names can contain one or more space characters. Therefore the option delims= is used to define an empty list of delimiters which disables line splitting completely and results in assigning entire file name with extension to loop variable A.

The inner FOR processes just the file name (without extension) as string. This time the file name is split up using the underscore as delimiter because of delims=_ with assigning just first underscore delimited string to loop variable B because of tokens=1. Well, tokens=1 is the default on using for /F and so this option string could be removed from code.

So the outer FOR assigns to A for example TTTxy_test & example!.txt and the inner FOR processes TTTxy_test & example! and assigns to B the string TTTxy.
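The two splitting steps can be tried in isolation with the example name from above. This is just a sketch: the environment variable Sample is introduced only for this demonstration, and the outer FOR processes a string instead of the output of dir.

```batch
@echo off
setlocal EnableExtensions DisableDelayedExpansion

rem Sample is a hypothetical variable holding the example file name.
set "Sample=TTTxy_test & example!.txt"

for /F "eol=| delims=" %%A in ("%Sample%") do (
    rem delims= keeps the entire name including the spaces in A.
    echo Outer loop variable A: "%%A"
    rem %%~nA is the name without extension, split here at the underscore.
    for /F "eol=| tokens=1 delims=_" %%B in ("%%~nA") do echo Inner loop variable B: "%%B"
)

endlocal
```

The exclamation mark survives because delayed expansion is disabled, and the ampersand is harmless because the strings are always enclosed in double quotes.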

The command MD creates a subdirectory in the set destination directory, for example with the name TTTxy. MD also outputs an error message if the directory already exists. This error message is suppressed by redirecting it to device NUL.

Then the file is moved from the source directory to the perhaps just created subdirectory of the destination directory. Because of /Y, an existing file with the same name in the target directory is overwritten.

The inner FOR loop can be optimized away if there is never a file name starting with an underscore and never a file name with two or more consecutive underscores after the first part of the file name.

@echo off
setlocal EnableExtensions DisableDelayedExpansion
set "SourceDir=C:\Development\test"
set "DestDir=C:\Development\test"

for /F "eol=| tokens=1* delims=_" %%A in ('dir /B /A-D-H "%SourceDir%\TTT*_*" 2^>nul') do (
    md "%DestDir%\%%A" 2>nul
    move /Y "%SourceDir%\%%A_%%B" "%DestDir%\%%A\"
)

endlocal

Option tokens=1* assigns the first underscore-delimited part of the file name to loop variable A and the rest of the file name, without further splitting on underscores, to loop variable B, which is the next loop variable according to the ASCII table.

But please take into account that the optimized version does not work for file names like the following ones. FOR skips delimiters at the beginning of a line and treats a sequence of delimiters as a single delimiter, so "%%A_%%B" is no longer identical to the real file name.

  • _TTTxy_test & example!.txt ... underscore at beginning (ignored by pattern), or
  • TTTxy__test & example!.txt ... more than one underscore after first part.
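The limitation of tokens=1* can be made visible with a few literal names. This is only a sketch using made-up sample names; it prints what %%A_%%B would be for each of them and moves nothing.

```batch
@echo off
setlocal EnableExtensions DisableDelayedExpansion

rem The three names are hypothetical samples, no file has to exist.
for %%N in ("TTTxy_test.txt" "TTTxy__test.txt" "_TTTxy_test.txt") do (
    for /F "eol=| tokens=1* delims=_" %%A in ("%%~N") do echo %%~N is rebuilt as %%A_%%B
)

endlocal
```

Only the first name should be rebuilt unchanged. The other two should also come out as TTTxy_test.txt, which is not the name of the file to move.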

The optimized version can be further optimized to a single command line:

@for /F "eol=| tokens=1* delims=_" %%A in ('dir /B /A-D-H "C:\Development\test\TTT*_*" 2^>nul') do @md "C:\Development\test\%%A" 2>nul & move /Y "C:\Development\test\%%A_%%B" "C:\Development\test\%%A\"

The non-optimized version could also be written as an even longer single command line:

@for /F "eol=| delims=" %%A in ('dir /B /A-D-H "C:\Development\test\TTT*_*" 2^>nul') do @for /F "eol=| tokens=1 delims=_" %%B in ("%%~nA") do @md "C:\Development\test\%%B" 2>nul & move /Y "C:\Development\test\%%A" "C:\Development\test\%%B\"

See also Single line with multiple commands using Windows batch file for an explanation of operator &.

To additionally remove TTT from the file name while moving the file, the first batch code is modified to use two additional commands, SET and CALL:

@echo off
setlocal EnableExtensions DisableDelayedExpansion
set "SourceDir=C:\Development\test"
set "DestDir=C:\Development\test"

for /F "eol=| delims=" %%A in ('dir /B /A-D-H "%SourceDir%\TTT*_*" 2^>nul') do (
    for /F "eol=| tokens=1 delims=_" %%B in ("%%~nA") do (
        md "%DestDir%\%%B" 2>nul
        set "FileName=%%A"
        call move /Y "%SourceDir%\%%A" "%DestDir%\%%B\%%FileName:~3%%"
    )
)

endlocal

The file name is assigned to an environment variable FileName. The value of this environment variable cannot be referenced with just %FileName% because the Windows command processor substitutes all environment variable references using percent signs in the entire command block, starting with the first ( and ending with the matching ), before FOR is executed at all. Delayed expansion is usually used in such cases, but here it would result in file names containing one or more exclamation marks not being processed correctly by the batch file.

The solution is using %% on both sides of the FileName environment variable reference instead of % and forcing a double parsing of the command line by using command CALL.
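The difference between the two kinds of reference can be seen with a small sketch. The initial value unset and the sample name TTTxy_test.txt are assumptions made only for this demonstration.

```batch
@echo off
setlocal EnableExtensions DisableDelayedExpansion

rem Hypothetical initial value to show what is expanded on parsing the block.
set "FileName=unset"

for %%A in ("TTTxy_test.txt") do (
    set "FileName=%%~A"
    rem %FileName% was already replaced before FOR started running.
    echo Immediate expansion: %FileName%
    rem CALL parses the line a second time, now with the new value.
    call echo Expansion via CALL: %%FileName:~3%%
)

endlocal
```

The first ECHO should still show unset, while the line executed with CALL should show xy_test.txt, the new value without its first three characters.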

For understanding the used commands and how they work, open a command prompt window, execute there the following commands, and read entirely all help pages displayed for each command very carefully.

  • call /?
  • dir /?
  • echo /?
  • endlocal /?
  • for /?
  • md /?
  • move /?
  • set /?
  • setlocal /?
Mofi
  • Brilliant answer thanks! Not only did the script work perfectly when first run - when I realised that my requirements needed to be slightly changed from the detailed explanation I was able to make the necessary changes myself! Thanks this answer is greatly appreciated! – Emu Sep 17 '18 at 02:06

It is really very simple:

@echo off
for /f "tokens=1* delims=_" %%i in ('dir /b /a-d "TTT*_*"') do (
    if not exist "%%i" mkdir "%%i"
    move "%%i_%%j" "%%i\%%j"
)

We split at the first _ into 2 tokens: %%i is everything before the _ and %%j is everything after it (tokens=1* keeps any further underscores in %%j). We simply create the folder (if it does not exist), then move the file into the new folder, keeping only the part of the name after the _.

So, for example, the file TTT123_File1.txt results in a folder called TTT123, and the file is placed in it, renamed to File1.txt.

Gerhard

You might consider using Tcl/Tk. Tcl/Tk is an open source scripting language. You can run it stand-alone or call it from a Windows batch file. You will need to install it first if you don't have it yet. The following Tcl script does what you want:

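cd "C:/Development/test"
# glob is a Tcl command that lists all files matching the pattern;
# -nocomplain returns an empty list instead of an error if nothing matches,
# and -types f skips directories
set files [glob -nocomplain -types f TTT*_*]
foreach f $files {
  # split f at the underscores: the first part is the folder name,
  # the remaining parts (rejoined with underscores) form the new file name
  set parts [split $f "_"]
  set dir [lindex $parts 0]
  set fnew [join [lrange $parts 1 end] "_"]
  if {![file exists $dir]} {
    file mkdir $dir
  }
  file rename $f [file join $dir $fnew]
}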

In my opinion, this is a very readable script, even if you don't know tcl. You can call this script from a batch file as:

tclsh script.tcl

if you have saved the script as script.tcl

HanT