
I want to create a .bat script that copies exactly one random file from each folder (recursively, including subfolders) while preserving the folder structure. I've tried the following code, which comes close to what I want, but it neither preserves the folder structure nor limits the copy to one file per folder.

@ECHO OFF

SETLOCAL EnableExtensions EnableDelayedExpansion
SET Destination=H:\Temp
SET FileFilter=*.ape
SET SubDirectories=/S

SET Source=%~dp1
SET FileList1Name=FileList1.%RANDOM%.txt
SET FileList1="%TEMP%\%FileList1Name%"
SET FileList2="%TEMP%\FileList2.%RANDOM%.txt"

ECHO Source: %Source%
IF /I {%SubDirectories%}=={/S} ECHO + Sub-Directories
IF NOT {"%FileFilter%"}=={""} ECHO File Filter: %FileFilter%
ECHO.
ECHO Destination: %Destination%
ECHO.
ECHO.
ECHO Building file list...

CD /D "%Source%"
DIR %FileFilter% /A:-D-H-S /B %SubDirectories% > %FileList1%

FOR /F "tokens=1,2,3 delims=:" %%A IN ('FIND /C ":" %FileList1%') DO SET TotalFiles=%%C
SET TotalFiles=%TotalFiles:~1%

ECHO The source has %TotalFiles% total files.
ECHO Enter the number of random files to copy to the destination.
SET /P FilesToCopy=
ECHO.

IF %TotalFiles% LSS %FilesToCopy% SET FilesToCopy=%TotalFiles%

SET Destination="%Destination%"
IF NOT EXIST %Destination% MKDIR %Destination%

SET ProgressTitle=Copying Random Files...

FOR /L %%A IN (1,1,%FilesToCopy%) DO (
    TITLE %ProgressTitle% %%A / %FilesToCopy%
    REM Pick a random file.
    SET /A RandomLine=!RANDOM! %% !TotalFiles!
    REM Go to the random file's line.
    SET Line=0
    FOR /F "usebackq tokens=*" %%F IN (%FileList1%) DO (
        IF !Line!==!RandomLine! (
            REM Found the line. Copy the file to the destination.
            XCOPY /V /Y "%%F" %Destination%
        ) ELSE (
            REM Not the random file, build the new list without this file included.
            ECHO %%F>> %FileList2%
        )
        SET /A Line=!Line! + 1
    )
    SET /A TotalFiles=!TotalFiles! - 1
    REM Update the master file list with the new list without the last file.
    DEL /F /Q %FileList1%
    RENAME %FileList2% %FileList1Name%
)

IF EXIST %FileList1% DEL /F /Q %FileList1%
IF EXIST %FileList2% DEL /F /Q %FileList2%

ENDLOCAL

The destination should be set in the .bat code like the code above. Can anybody please help me with this? Thanks in advance!

  • First of all, could you provide the code you tried so far? Also, copy to where? The folder the .bat is in? – Dennis van Gils Jan 10 '16 at 12:15
  • You may have a look at [this question](http://stackoverflow.com/questions/5552368/windows-batch-file-script-to-pick-random-files-from-a-folder-and-move-them-to-an) on how to select random files – Frank Jan 10 '16 at 12:18

3 Answers


Copying a directory tree structure (folders only) is trivial with XCOPY.

Selecting a random file from a given folder is not too difficult. First you need the count of files, using DIR /B to list them and FIND /C to count them. Then use the modulo operator to select a random number in the range. Finally use DIR /B to list them again, FINDSTR /N to number them, and another FINDSTR to select the Nth file.

Perhaps the trickiest bit is dealing with relative paths. FOR /R can walk a directory tree, but it provides a full absolute path, which is great for the source, but doesn't do any good when trying to specify the destination.

There are a few things you could do. You can get the string length of the root source path, and then use substring operations to derive the relative path. See "How do you get the string length in a batch file?" for methods to compute string length.
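A minimal sketch of the string-length approach, assuming hypothetical paths `C:\mySource` and `C:\mySource\sub\deeper` that are not part of the original answer. It uses a naive counting loop rather than one of the faster string-length methods, and it relies on delayed expansion, so it would mishandle paths containing `!`.

```batch
@echo off
setlocal EnableDelayedExpansion

rem Hypothetical paths, for illustration only
set "root=C:\mySource"
set "full=C:\mySource\sub\deeper"

rem Count the characters in root one at a time.
rem %len% is re-expanded on every pass because GOTO re-reads the line.
set "len=0"
:countLoop
if not "!root:~%len%,1!"=="" (
  set /a len+=1
  goto countLoop
)

rem Drop the first len characters of the full path to get the relative part
set "rel=!full:~%len%!"
echo Relative path: !rel!
```

The relative part keeps its leading backslash, so it can be appended directly to a destination root.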

Another option is to use FORFILES to walk the source tree and get relative paths directly, but it is extremely slow.
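As a rough illustration of the FORFILES option, the following sketch lists every subfolder of a hypothetical `C:\mySource` as a relative path. `@isdir` and `@relpath` are standard FORFILES variables; the source path is an assumption for the example.

```batch
@echo off
rem List all subfolders of the (hypothetical) source tree as relative paths.
rem Note: the /p path must not end with a backslash before the closing quote.
forfiles /p "C:\mySource" /s /c "cmd /c if @isdir==TRUE echo @relpath"
```

Each line comes back quoted and prefixed with `.\`, for example `".\sub"`. FORFILES launches a new `cmd` instance per item, which is consistent with the slowness mentioned above.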

But perhaps the simplest solution is to map unused drive letters to the root of your source and destination folders. This enables you to use the absolute paths directly (after removing the drive letter). This is the option I chose. The only negative aspect of this solution is you must know two unused drive letters for your system, so the script cannot be simply copied from one system to another. I suppose you could programmatically discover unused drive letters, but I didn't bother.
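One possible way to discover an unused drive letter is sketched below. It is not part of the tested solution, and it assumes that a letter is free when its root does not exist; that assumption is wrong for an empty optical drive or a disconnected network mapping, so treat the result as a candidate only. The variable name `free` is made up for this example.

```batch
@echo off
setlocal

rem Scan from Z downward and keep the first letter whose root does not exist.
rem IF DEFINED is evaluated at run time, so no delayed expansion is needed.
set "free="
for %%L in (Z Y X W V U T S R Q P O N M L K J I H G F E D) do (
  if not defined free if not exist "%%L:\" set "free=%%L:"
)

if defined free (echo Candidate drive letter: %free%) else echo No free letter found
```

Running the scan twice, skipping the first hit the second time, would give the two letters that the SUBST approach needs.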

Note: It is critical that the source tree does not contain the destination.

@echo off
setlocal

:: Define source and destination
set "source=c:\mySource"
set "destination=c:\test2\myDestination"

:: Replicate empty directory structure
xcopy /s /t /e /i "%source%" "%destination%"

:: Map unused drive letters to source and destination. Change letters as needed
subst y: "%source%"
subst z: "%destination%"

:: Walk the source tree, calling :processFolder for each directory.
for /r y:\ %%D in (.) do call :processFolder "%%~fD"

:: Cleanup and exit
subst y: /d
subst z: /d
exit /b


:processFolder
:: Count the files
for /f %%N in ('dir /a-d /b %1 2^>nul^|find /c /v ""') do set "cnt=%%N"

:: Nothing to do if folder is empty
if %cnt% equ 0 exit /b

:: Select a random number within the range
set /a N=%random% %% cnt + 1

:: copy the Nth file
for /f "delims=: tokens=2" %%F in (
  'dir /a-d /b %1^|findstr /n .^|findstr "^%N%:"'
) do copy "%%D\%%F" "z:%%~pnxD" >nul

exit /b

EDIT

I fixed an obscure bug in the above code. The original COPY line read as follows:

copy "%~1\%%F" "z:%~pnx1" >nul

That version fails if any of the folders within the source tree contain %D or %F in their name. This type of problem always exists within a FOR loop if you expand a variable with %var% or expand a :subroutine parameter with %1.

The problem is easily fixed by using %%D instead of %1. It is counter-intuitive, but FOR variables are global in scope as long as any FOR loop is currently active. The %%D is inaccessible throughout most of the :processFolder routine, but it is available within the FOR loops.
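The scoping behavior described above can be observed with a small standalone script. This is only a demonstration sketch; the label `:sub` and the value `outer` are made up for the example.

```batch
@echo off
rem %%D is defined by this loop, which is still active while :sub runs
for %%D in (outer) do call :sub
exit /b

:sub
rem Outside any FOR loop, %%D is not expanded here
echo Outside a loop: [%%D]
rem Inside any FOR loop, the caller's %%D becomes visible again
for %%X in (1) do echo Inside a loop: [%%D]
exit /b
```

The first ECHO should print the literal text `[%D]`, while the second should print `[outer]`, because the outer loop variable is only resolved while some FOR loop body is executing.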

dbenham
  • To deal with relative paths you could use `xcopy /L ".\*.*" "%TEMP%\"` because it lists relative paths in case a relative source path is provided; `/L` means list but don't copy; with `| find ".\"` you can remove the summary line `# file(s) copied`; see also [this post](http://stackoverflow.com/a/34667343/5047996) where I used this technique... – aschipfl Jan 10 '16 at 18:16
  • @aschipfl - At first I thought that was good idea, but it only seems to work with file paths. But I need the relative paths of all the folders only. – dbenham Jan 10 '16 at 19:06
  • Yes, it returns only files; but you can take advantage of it, because you only need to loop once through the output of `xcopy /L` rather than establishing additional loops for each subdirectory in the tree; I just did some performance tests with [my approach](http://stackoverflow.com/a/34716874/5047996) compared to yours and also [Aacini's](http://stackoverflow.com/a/34710456/5047996), and I found that mine "won" with a huge directory tree (with `copy` commands `echo`ed out); of course mine and Aacini's will fail for huge numbers of files per directory due to limited array size/environment space... – aschipfl Jan 11 '16 at 22:41

The "natural" way to process a directory tree is via a recursive subroutine; this method minimizes the problems inherent to this process. As I said in this post: "You may write a recursive algorithm in Batch that gives you exact control of what you do in every nested subdirectory". I took the code from this answer, which duplicates a tree, and slightly modified it to solve this problem.

@echo off
setlocal

set "Destination=H:\Temp"
set "FileFilter=*.ape"

rem Enter to source folder and process it
cd /D "%~dp1"
call :processFolder
goto :EOF


:processFolder
setlocal EnableDelayedExpansion

rem For each folder in this level
for /D %%a in (*) do (

   rem Enter into it, process it and go back to original
   cd "%%a"
   set "Destination=%Destination%\%%a"
   if not exist "!Destination!" md "!Destination!"

   rem Get the files in this folder and copy a random one
   set "n=0"
   for %%b in (%FileFilter%) do (
      set /A n+=1
      set "file[!n!]=%%b"
   )
   if !n! gtr 0 (
      set /A "rnd=!random! %% n + 1"
      for %%i in (!rnd!) do copy "!file[%%i]!" "!Destination!"
   )

   call :processFolder
   cd ..
)
exit /B
Aacini
  • I thought about writing a recursive routine, but decided it wasn't worth it. This solution always fails if any of the folder paths contain `!`, and might fail if a file name contains `!`. Of course this can be fixed. With my test tree with many files, your code took more than twice as long as mine. I used your recursive method combined with an extra CALL to eliminate the need for delayed expansion, and my method for randomly selecting a file, and it was 25% faster than my posted solution. – dbenham Jan 10 '16 at 22:40
  • @dbenham: Well, this seems to be related to the environment size, as usual. If the environment would be completely emptied (calling `find/findstr` via its whole path) I think the process would be even faster. I also think that using delayed expansion should be slightly faster than using the CALL trick. – Aacini Jan 11 '16 at 17:04

Here is another approach using xcopy /L to walk through all files in the source directory, which does not actually copy anything due to /L but returns paths relative to the source directory. For an explanation of the code, see the remarks:

@echo off
setlocal EnableExtensions DisableDelayedExpansion

rem Define source and destination directories here:
set "SOURCE=%~dp1"
set "DESTIN=H:\Temp"

rem Change to source directory:
cd /D "%SOURCE%"
rem Reset index number:
set /A "INDEX=0"
rem Walk through output of `xcopy /L`, which returns
rem all files in source directory as relative paths;
rem `find` filters out the summary line; `echo` appends one more line
rem with invalid path, just to process the last item as well:
for /F "delims=" %%F in ('
    2^> nul xcopy /L /S /I /Y "." "%TEMP%" ^
        ^| find ".\" ^
        ^& echo^(C:\^^^|\^^^|
') do (
    rem Store path to parent directory of current item:
    set "CURRPATH=%%~dpF"
    setlocal EnableDelayedExpansion
    if !INDEX! EQU 0 (
        rem First item, so build empty directory tree:
        xcopy /T /E /Y "." "%DESTIN%"
        endlocal
        rem Set index and first array element, holding
        rem all files present in the current directory:
        set /A "INDEX=1"
        set "ITEMS_1=%%F"
    ) else if "!CURRPATH!"=="!PREVPATH!" (
        rem Previous parent directory equals current one,
        rem so increment index and store current file:
        set /A "INDEX+=1"
        for %%I in (!INDEX!) do (
            endlocal
            set /A "INDEX=%%I"
            set "ITEMS_%%I=%%F"
        )
    ) else (
        rem Current parent directory is not the previous one,
        rem so generate random number from 1 to recent index
        rem to select a file in the previous parent directory,
        rem perform copying task, then reset index and store
        rem the parent directory of the current (next) item:
        set /A "INDEX=!RANDOM!%%!INDEX!+1"
        for %%I in (!INDEX!) do (
            xcopy /Y "!ITEMS_%%I!" "%DESTIN%\!ITEMS_%%I!"
            endlocal
            set /A "INDEX=1"
            set "ITEMS_1=%%F"
        )
    )
    rem Store path to parent directory of previous item:
    set "PREVPATH=%%~dpF"
)
endlocal
exit /B

For this approach the destination directory can also be located within the source directory tree.

aschipfl