
I have a massive number of files in one directory that I need to validate.

The problem is that File Explorer takes too long to load the file list, and my whole computer slows down.

So I wrote the following code to group the files by moving a fixed number of them (%limit%, which will be 700) into numbered folders (%DirN%):

for /f "tokens=1-2 delims=:" %%a in ('dir /b /a-d ^|findstr /n /v ".bat .cmd .txt"') do if %%a lss %limit% robocopy "%cd%" "%cd%\%DirN%" "%%b" /mov >nul

This code worked as designed, but it revealed another problem: speed.

Since the files occupy 20 GB of disk space, moving them this way seems to take forever.

Is there a faster way to move the files?

P.S. I've tried the move and xcopy commands but did not see much difference.


Since there was a request for context, I attach full code:

@echo off
pushd %~dp0

set DirN=-1
:Check_DirN
set LeftOver=
for /f "tokens=*" %%a in ('dir /b /a-d ^|findstr /v ".bat .cmd .txt"') do (set LeftOver=%%a)
if "%LeftOver%"=="" goto Done

set /a DirN+=1
if exist "%cd%\%DirN%" goto Check_DirN

:Create
md %DirN%

:Move
cls
echo Moving files to Directory %DirN%...
set /a limit=700+2
for /f "tokens=1-2 delims=:" %%a in ('dir /b /a-d ^|findstr /n /v ".bat .cmd .txt"') do if %%a lss %limit% robocopy "%cd%" "%cd%\%DirN%" "%%b" /mov >nul
goto Check_DirN

exit
:Done
del list.txt>nul 2>&1
echo Task Done!
pause>nul

Comments

  1. I used set /a to adjust %limit%, which would otherwise be off because of findstr /n /v.
  2. This script will be saved as a .bat file and placed in the folder containing the files to sort.

Example environment (minimized):

A parent folder contains 1,500 documents and subfolders named 0, 2 and 4. The script is placed in the parent folder and executed there.


Script requirements:

  1. Create a numbered directory, starting from 0, only if that directory doesn't already exist.
  2. Move 700 files to the newly created directory. The files are moved even if fewer than 700 remain.
  3. Repeat steps 1 and 2 until no files remain in the parent directory.

Example result of script execution:

The parent folder contains the script and subfolders named 0, 1, 2, 3, 4 and 5. Subfolders 1 and 3 hold 700 documents each, and subfolder 5 holds 100. Subfolders 0, 2 and 4 are unchanged.

2 Answers


I am providing this as an alternative to Magoo's answer. I have kept your initial RoboCopy command and, because that is itself an external command, removed the dependency on the external FindStr, which will hopefully offset some of the speed difference.

@Echo Off
If /I Not "%__CD__%"=="%~dp0" PushD "%~dp0" 2>Nul||Exit/B
SetLocal EnableDelayedExpansion
Set "DirN=-1"

:Check_DirN
Set/A "DirN+=1"
If Exist "%DirN%" GoTo Check_DirN
Set "limit=700"
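Rem This mask selects .bat, .cmd and .txt files; swap in the pattern for the files you actually need to move.
For %%A In (*.bat *.cmd *.txt) Do If /I Not "%%~nxA"=="%~nx0" (
    Rem Create the target only when there is a file to move, and never count the script itself.
    If Not Exist "%DirN%" MD "%DirN%"
    RoboCopy . "%DirN%" "%%A" /MOV 1>NUL
    Set/A "limit-=1"
    Rem Leq stops after exactly 700 moves; Lss would let a 701st file through.
    If !limit! Leq 0 GoTo Check_DirN
)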
Echo(Task Done!
Timeout -1 1>Nul
Compo
  • Thanks for suggesting a new method! Although it works a little differently from what I was expecting, that was fixed with a minimal change. – A Cat Named Tiger Jan 06 '17 at 01:40
  • As an additional question, would it be better to run this script in parallel? I found that the script itself does not use much memory. – A Cat Named Tiger Jan 06 '17 at 01:46
  • I'm not sure how each process would separate their files from each other. _Whilst you can't move a file that has already been moved, the count may be affected_. There may be a way to do it by having each process look at a different extension, but I wouldn't personally see any benefit in trying it. – Compo Jan 06 '17 at 09:16
  • @Compo indeed having each process look at a different extension is not really the best option. @ACatNamedTiger you could instead save the output of `dir /b *.bat *.cmd *.txt` in a file and have each process look at a chunk of the file (with the `skip` option of the [`for /f`](http://ss64.com/nt/for_f.html) for example) – J.Baoby Jan 06 '17 at 13:30
@echo off
pushd %~dp0

set DirN=-1
:Check_DirN
set /a DirN+=1
if exist "%cd%\%DirN%" goto Check_DirN
md %DirN%
set /a limit=700
for /f "tokens=1* delims=:" %%a in ('dir /b /a-d ^|findstr /v ".bat .cmd .txt" ^|findstr /n "." ') do (
 if %%a gtr %limit% goto Check_DirN
 set /a limit=0
 echo(move "%%b" "%cd%\%DirN%\"
)
if %limit% neq 0 rd %DirN%
echo Task Done!
pause>nul

The required MOVE commands are merely ECHOed for testing purposes. After you've verified that the commands are correct, change ECHO(MOVE to MOVE to actually move the files. Append >nul to suppress report messages (e.g. 1 file moved).

First, create the new directory. No need to involve the for in this.

Next, establish your limit, then select and number your directory list and move each file in turn to the destination.

When the file limit is reached, go back and create a new destination directory.

Note that the absence of delayed expansion is exploited here. %limit% is replaced by its value when the for block is parsed, and the variable is then set to 0 if any file is moved. Since the parser has already substituted the original value, this does not affect the comparison inside the loop, but it can be detected after the loop has finished to mean "a file was moved".

If no file was moved, then limit will remain non-zero on exiting the for loop, hence the newly-created directory is empty and can be deleted.
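
To see the parse-time substitution in isolation, here is a minimal sketch that is not part of the script above. The list 1 2 3 4 5 and the label :after are stand-ins chosen only for illustration; the list plays the role of the numbered file list.

```batch
@echo off
setlocal
set "limit=3"
for %%n in (1 2 3 4 5) do (
    rem The limit reference in the next line was replaced by 3 when this
    rem whole block was parsed, so the test keeps comparing against 3.
    if %%n gtr %limit% goto :after
    rem Changing the variable now only matters once the block has ended.
    set "limit=0"
    echo processing item %%n
)
:after
rem This line is parsed after the loop, so it sees the updated value.
echo limit is now %limit%
```

Run from a .bat file, this should process items 1 to 3, leave the loop at item 4, and then report that limit is 0, which is the same "a file was moved" signal the script relies on.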

I'd suggest you try this with a smaller limit on a dummy test directory to ensure it works. Should be substantially faster.
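
One way to set up such a dummy directory is sketched below. The folder name movetest, the .dat extension and the count of 25 are arbitrary assumptions; pick whatever suits your test.

```batch
@echo off
rem Hypothetical scratch folder under the temp directory; any empty folder will do.
set "TestDir=%TEMP%\movetest"
md "%TestDir%" 2>nul
rem Create 25 empty files named file1.dat through file25.dat.
for /l %%i in (1,1,25) do type nul > "%TestDir%\file%%i.dat"
echo Created 25 empty files in "%TestDir%"
```

Copy the script into that folder, lower the limit to something like 10, and check that the files end up in folders of 10, 10 and 5.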

[edit: create the directory only after checking that it doesn't already exist]

[edit 2: cascade two findstrs, the first to exclude the extensions and the second to number the lines. A single findstr numbers lines according to their position in the dir output, so excluded files would be allotted a number without being transferred, and the count moved would fall short of the limit.

Note that since the files are not actually moved while echo(move is in place, the same list of files will be repeated over and over. Once the move is invoked, files that have already been moved obviously won't be found in a later iteration.]

Magoo
  • Yes - the directory-creation should occur only **after** it's been found to be absent. You probably created a few directories - code fixed... – Magoo Jan 05 '17 at 22:27
  • I guess my explanation wasn't clear. What I intended to do was, for example, to put 700 files into each folder named 0 and 1, and 100 remaining files into folder 2 if I had 1,500 files in my parent folder. – A Cat Named Tiger Jan 05 '17 at 22:37