2
@echo off
setlocal EnableDelayedExpansion
set /a N=0
for /f "tokens=* delims=" %%g in ('dir !FOLDERPATH! /b') do (
setlocal DisableDelayedExpansion
  set "item=%%g"
endlocal
  set /a N+=1
REM next line loses exclamation marks. replacing %%g with %%item%% gives error: not defined $$ variable
  call set "$$%%N%%=%%g"
  )
set "ind=%N%"
REM View the results
echo !ind!
for /f "tokens=* delims=" %%i in ('set $$') do echo %%i
pause
EXIT /b

I read many solutions to my problem (stackoverflow.com/questions/3262287; stackoverflow.com/questions/28682268; stackoverflow.com/questions/29869394; stackoverflow.com/questions/3262287) and am still baffled. Any help appreciated

I have a folder with mp3 filenames containing exclamation marks (and percentage signs and ampersands). The above subroutine is supposed to fill an array=$$ with these filenames. It gets called 2000 times, each time for a different folder.

I want to use EnableDelayedExpansion as the master setlocal (twice as fast). In my entire batch program this is only line (set "item=%%g") where I need DisableDelayedExpansion. I need to know how to efficiently pass this variable (item) across the DisableDelayedExpansion endlocal boundary (and into the $$ set). Alternatively I guess I could fill the $$set within the Disabled environment, and then I need to pass the set across the boundary.

I'm looking for speed. I can use Disabled for the entire subroutine, but this doubles my processing time (for very few exclamation marks and percentage signs).


Based on the responses I see it might be good to include the entire batch script (simplified). This script takes 28 minutes to run with my database on my puny machine. It handles exclamation marks, percentage signs, ampersands and anything else you can throw at it.

@echo off
chcp 1254>nul
setlocal DisableDelayedExpansion
set /p COUNT=Select Desired Number of Random Episodes per Album:
for /d %%f in (H:\itunes\Podcasts\*) do (
  set buffer="%%f"
  set /a ind = 0
  call:Set$$Variables
setlocal EnableDelayedExpansion
  if !COUNT! LEQ !ind! ( set DCOUNT=!COUNT! ) ELSE set DCOUNT=!ind!
  for /l %%g in (1, 1, !DCOUNT!) do (
    call:GenerateUniqueRandomNumber
    for %%N in (!num!) do echo !$$%%N!>>"%USERPROFILE%\Desktop\RandomEpisodes.m3u8"
    )
endlocal
  )
pause
Exit /b

:Set$$Variables
set /a N = 0
for /f "tokens=* delims=" %%g in ('dir %buffer% /b') do (
  set "item=%%g"
  set /a N+=1
  call set "$$%%N%%=%%item%%"
  )
set "ind=%N%"
EXIT /b

:GenerateUniqueRandomNumber
:nextone
set /a "num = (((!random! & 1) * 1073741824) + (!random! * 32768) + !random!) %% !ind! + 1"
for %%N in (!num!) do (
  if !RN%%N!==1 (
  goto:nextone
  )
  set "RN%%N=1"
)
EXIT /b

A simple change to the following and it runs in 13 minutes. But it doesn't handle exclamation marks (bangs=!).

@echo off
chcp 1254>nul
setlocal EnableDelayedExpansion
set /p COUNT=Select Desired Number of Random Episodes per Album:
for /d %%f in (H:\itunes\Podcasts\*) do (\
setlocal
  set buffer="%%f"
  set /a ind = 0
  call:Set$$Variables
  if !COUNT! LEQ !ind! ( set DCOUNT=!COUNT! ) ELSE set DCOUNT=!ind!
  for /l %%g in (1, 1, !DCOUNT!) do (
    call:GenerateUniqueRandomNumber
    for %%N in (!num!) do echo !$$%%N!>>"%USERPROFILE%\Desktop\RandomEpisodes.m3u8"
    )
endlocal
  )
pause
Exit /b

:Set$$Variables
set /a N = 0
for /f "tokens=* delims=" %%g in ('dir %buffer% /b') do (
  set "item=%%g"
  set /a N+=1
  call set "$$%%N%%=%%item%%"
  )
set "ind=%N%"
EXIT /b

:GenerateUniqueRandomNumber
:nextone
set /a "num = (((!random! & 1) * 1073741824) + (!random! * 32768) + !random!) %% !ind! + 1"
for %%N in (!num!) do (
  if !RN%%N!==1 (
  goto:nextone
  )
  set "RN%%N=1"
)
EXIT /b

More than double the processing time to handle a very few filenames containing exclamation marks. The resource - time hog is the subroutine Set$$Variables. So my hope is someone out there can find a happy middle ground between these two times: 13 minutes vs. 28 minutes. It seems to me there must be a way to handle the dozen or so affected files in the 15 minute difference.

RKO
  • 161
  • 10
  • Does any of your filenames cotain `!`? – jeb Sep 23 '16 at 14:33
  • Your setlocal disabledDelayedExpansion/endlocal block will not set any variable, as it is destroyed after the endlocal – jeb Sep 23 '16 at 14:48
  • I suppose your code is normally a function or do you write all your code sequentially top down? – jeb Sep 23 '16 at 15:00
  • 1
    Two comments about your code/comments: 1- Why you say that "DisableDelayedExpansion ... kills processing speed"? The same could be said of EnableDelayedExpansion (I don't understand your rationale). 2. The apostrophe is `'`, the exclamation-mark or bang is `!`. – Aacini Sep 23 '16 at 16:40
  • I have a set of 2000 folders filled with files ... and I fill the $$ set with the contents of each folder sequentially (emptying out between each one) ... I run the batch with disabled as the primary (and only) setlocal and it takes 28 minutes. If I ignore apostrophes and percentage signs and use enabledelayedexpansion it takes 13 minutes. So I'm looking for a happy medium (since there are very few apost & percent signs) – RKO Sep 23 '16 at 17:39
  • Apostrophes `'` are not affected in batch-files, independend of the delayed expansion mode. Do you can show a examples of filenames which will be destroyed? – jeb Sep 23 '16 at 21:59
  • Sorry, I edited the original post replacing the word apostrophes with exclamation marks (bang!) ... I have no problem with apostrophes. – RKO Sep 23 '16 at 23:35
  • @Aacini - I think RKO is comparing delayed expansion `!var!` against `CALL %%var%%` – dbenham Sep 24 '16 at 05:24

6 Answers6

3

Best if you truly understand what is slowing down your script.

The following all contribute to unacceptable performance:

  • Excessive CALLs in a loop
  • Randomly selecting a number until you get one that has not been selected yet
  • Redirection in append mode for each file - best to redirect only once

Enabling and disabling delayed expansion normally does not contribute much to bad performance (unless you have a massively large environment space)

My code below uses the FINDSTR technique to quickly build the array, without needing delayed expansion.

I then can enable delayed expansion just once for each folder. I never have to pass any values across the endlocal "barrier"

I guarantee that each random operation selects an available file by building a list of possible numbers, fixed width of 4 with leading spaces. The list must fit in a single variable with max length of ~8190 bytes, so this solution supports up to ~2040 files per folder. Each random number specifies which position to take from the list, and then the value is extracted and the count decremented.

I enclose the entire outer loop in an extra set of parentheses so that I only need to redirect once.

I'm pretty sure this code will be significantly faster than even your 2nd code that does not support ! etc.

benham1.bat

@echo off
chcp 1254>nul
setlocal DisableDelayedExpansion
set /p "maxCnt=Select Desired Number of Random Episodes per Album:"

pushd "H:\itunes\Podcasts"
>"%USERPROFILE%\Desktop\RandomEpisodes.m3u8" (
  for /d %%F in (*) do (
    pushd "%%F"
    set "fileCnt=0"
    for /f "delims=: tokens=1,2" %%A in ('dir /b /a-d 2^>nul^|findstr /n "^"') do (
      set "$$%%A=%%B"
      set "fileCnt=%%A"
    )
    setlocal enableDelayedExpansion
    set "nums="
    for /l %%N in (1 1 !fileCnt!) do (
      set "n=   %%N"
      set "nums=!nums!!n:~-4!"
    )
    if !fileCnt! lss !maxCnt! (set "cnt=!fileCnt!") else set "cnt=!maxCnt!"
    for /l %%N in (1 1 !cnt!) do (
      set /a "pos=(!random!%%fileCnt)*4, next=pos+4, fileCnt-=1"
      for /f "tokens=1,2" %%A in ("!pos! !next!") do (
        for /f %%N in ("!nums:~%%A,4!") do echo !$$%%N!
        set "nums=!nums:~0,%%A!!nums:~%%B!"
      )
    )
    endlocal
    popd
  )
)
popd

Update and solution

It really bothered me that this code behaved so poorly in RKO's hands (see his comment). So I did some tests of my own to figure out what is happening.

I ran the code against 500 folders with 100 files in each folder, and I asked for an output of 50 files for each folder. I modified the loop to print out the name of each folder to stderr so I could monitor progress. As expected, each folder was processed very quickly at the beginning. But as the program progressed, each folder became slower than the previous one, in a non-linear fashion. By the time it reached the end, each folder was painfully slow.

It took ~3 minutes to process 500 folders. I then doubled the number of folders, and the time exploded to ~18 minutes. Very nasty.

Way back when, a group of us at DosTips tried to investigate how cmd.exe manages the environment space, and how large environments impact performance. See Why does SET performance degrade as environment size grows?

We determined that ENDLOCAL does not actually release allocated memory, which causes SET performance to degrade as the number of SET operations accumulates. This problem has lots of SET operations, so it makes sense that it becomes slow.

But RKO has code with lots of inefficiencies that is performing better than mine. In the absence of environment size issues, his code should be much slower. So somehow his code must not be accumulating memory like mine. So I went on a quest to isolate the memory allocation for each folder from all the rest.

My first attempt was to put all the memory allocation for a single folder within a new cmd.exe process. And it worked! Processing 500 folders now took ~1 minute, and doubling to 1000 folders basically doubled the time to ~2 minutes!

benham2.bat

@echo off
if "%~1" equ ":processFolder" (
  pushd "%folder%"
  set "fileCnt=0"
  for /f "delims=: tokens=1,2" %%A in ('dir /b /a-d 2^>nul^|findstr /n "^"') do (
    set "$$%%A=%%B"
    set "fileCnt=%%A"
  )
  setlocal enableDelayedExpansion
  set "nums="
  for /l %%N in (1 1 !fileCnt!) do (
    set "n=   %%N"
    set "nums=!nums!!n:~-4!"
  )
  if !fileCnt! lss !maxCnt! (set "cnt=!fileCnt!") else set "cnt=!maxCnt!"
  for /l %%N in (1 1 !cnt!) do (
    set /a "pos=(!random!%%fileCnt)*4, next=pos+4, fileCnt-=1"
    for /f "tokens=1,2" %%A in ("!pos! !next!") do (
      for /f %%N in ("!nums:~%%A,4!") do echo !$$%%N!
      set "nums=!nums:~0,%%A!!nums:~%%B!"
    )
  )
  popd
  exit
)

setlocal DisableDelayedExpansion
set /p "maxCnt=Select Desired Number of Random Episodes per Album:"

pushd "H:\itunes\Podcasts"
>"%USERPROFILE%\Desktop\RandomEpisodes.m3u8" (
  for /d %%F in (*) do (
    set "folder=%%F"
    cmd /v:off /c ^""%~f0" :processFolder^"
  )
)
popd
exit /b

But then I realized that I didn't have to restart my console for each test of my original benham1 code - when the batch script terminated, the memory seemed to have reset because the next run would start out just as fast as the prior one.

So I thought, why not simply CALL a :subroutine instead of initiating a new cmd.exe. This worked about the same, just a little bit better!

benham3.bat

@echo off
setlocal DisableDelayedExpansion
set /p "maxCnt=Select Desired Number of Random Episodes per Album:"

pushd "H:\itunes\Podcasts"
>"%USERPROFILE%\Desktop\RandomEpisodes.m3u8" (
  for /d %%F in (*) do (
    set "folder=%%F"
    call :go
  )
)
popd
exit /b

:go
setlocal
cd %folder%
set "fileCnt=0"
for /f "delims=: tokens=1,2" %%A in ('dir /b /a-d 2^>nul^|findstr /n "^"') do (
  set "$$%%A=%%B"
  set "fileCnt=%%A"
)
setlocal enableDelayedExpansion
set "nums="
for /l %%N in (1 1 !fileCnt!) do (
  set "n=   %%N"
  set "nums=!nums!!n:~-4!"
)
if !fileCnt! lss !maxCnt! (set "cnt=!fileCnt!") else set "cnt=!maxCnt!"
for /l %%N in (1 1 !cnt!) do (
  set /a "pos=(!random!%%fileCnt)*4, next=pos+4, fileCnt-=1"
  for /f "tokens=1,2" %%A in ("!pos! !next!") do (
    for /f %%N in ("!nums:~%%A,4!") do echo !$$%%N!
    set "nums=!nums:~0,%%A!!nums:~%%B!"
  )
)
exit /b

Another Update

I substituted Aacini's superior Array based method for guaranteeing each random operation selects a unique file name in place of my string based method. It yields slightly better performance:

benham-aacini.bat

@echo off
setlocal DisableDelayedExpansion
set /p "maxCnt=Select Desired Number of Random Episodes per Album:"

pushd "H:\itunes\Podcasts"    
>"%USERPROFILE%\Desktop\RandomEpisodes.m3u8" (
  for /d %%F in (*) do (
    set "folder=%%F"
    call :go
  )
)
popd
exit /b

:go
setlocal
cd "%folder%"
set "fileCnt=0"
for /f "delims=: tokens=1,2" %%A in ('dir /b /a-d 2^>nul^|findstr /n "^"') do (
  set "$$%%A=%%B"
  set /a "fileCnt=%%A, RN%%A=%%A"
)
setlocal enableDelayedExpansion
if !fileCnt! lss !maxCnt! (set end=1) else set /a "end=fileCnt-maxCnt+1"
for /l %%N in (!fileCnt! -1 !end!) do (
  set /a "ran=!random!%%%%N+1"
  set /a "num=RN!ran!, RN!ran!=RN%%N
  for %%N in (!num!) do echo !cd!\!$$%%N!
)
exit /b

Here is a summary of the timings of each version:

folders  | benham1  benham2   benham3   benham-aacini
---------+-------------------------------------------
    500  |    2:49     0:53      0:43       0:41
   1000  |   17:48     1:56      1:44       1:34

So our original thinking at DosTips that the environment never shrinks is wrong. But I haven't had time to fully test and determine exactly when it shrinks.

Regardless, I think I finally have a version that is truly faster than your current 13 minute code :-)

Yet Another Update (Assume few collisions)

Multiple random selections from the complete set of files might result in duplicates. My code assumes that the number of requested files might be a large portion of a large list, in which case you can expect to get many duplicates.

So all of my previous solutions did some extra bookkeeping to guarantee that each random operation results in a unique file.

But RKO seems to typically select just a few files from each folder. So the chance of collision is small. His code just randomly selects a file, and then if the file has been selected before, it loops back and tries again until it finds a new file. But since the chance of collision is small, the retry rarely happens. This method has significantly less bookkeeping. The result is that his random selection algorithm is faster as long as the number requested remains small.

So I have adopted my code to use a slightly modified version of RKO's selection method. I expect this to be the fastest yet for a small request counts.

benham4.bat

@echo off
setlocal DisableDelayedExpansion
set /p "maxCnt=Select Desired Number of Random Episodes per Album:"

pushd "H:\itunes\Podcasts"    
>"%USERPROFILE%\Desktop\RandomEpisodes.m3u8" (
  for /d %%F in (*) do (
    set "folder=%%F"
    call :selectRandom
  )
)
popd
exit /b

:selectRandom
setlocal
cd "%folder%"
set "fileCnt=0"
for /f "delims=: tokens=1,2" %%A in ('dir /b /a-d 2^>nul^|findstr /n "^"') do (
  set "$$%%A=%%B"
  set "fileCnt=%%A"
)
setlocal enableDelayedExpansion
if !fileCnt! lss !maxCnt! (set /a end=fileCnt) else set /a end=maxCnt
for /l %%N in (1 1 !end!) do (
  set /a "N=!random!%%fileCnt+1"
  if not defined $$!N! call :tryAgain
  for %%N in (!N!) do echo !folder!\!$$%%N!
  set "$$!N!="
)
exit /b

:tryAgain
set /a "N=!random!%%fileCnt+1"
if not defined $$!N! goto :tryAgain
exit /b
dbenham
  • 127,446
  • 28
  • 251
  • 390
  • 28 Minutes. Thanks for this. There's lots of great learning material in the code you shared. I'll enjoy deciphering it. But it's not faster. My thought is to use my 13 minute version and to pull out for special processing any having exclamation marks ... any hints or tips in that regard? – RKO Sep 24 '16 at 15:27
  • @RKO - See my updated answer. I think I solved my performance problem, which should give a solution much faster than what you currently use. – dbenham Sep 25 '16 at 04:44
  • Thanks. I'll check it out, but won't be able to get back to you until tomorrow. I've managed to come up with a solution (using my EnableDelayed version) that's tentatively clocking in at 10 minutes. More testing needed. – RKO Sep 26 '16 at 02:18
  • @RKO - I'm curious what you get with my latest code. I predict it will take less than 5 minutes. I won't be shocked if it is around 3 minutes. – dbenham Sep 26 '16 at 02:28
  • Sorry I'm unpracticed at using call while passing variables (cd %1) ... and your code only prints out the files contained in the originating directory (where the .bat resides) ... can you post the probably very trivial correction to your code? – RKO Sep 26 '16 at 12:39
  • @RKO - I've fixed all 4 of my scripts to make them work independently of your current directory. It just required a PUSHD to the correct location, and then the outer FOR /D loop simply iterates all folders in the current directory. – dbenham Sep 27 '16 at 04:00
  • 11 minutes! Impressive. It doesn't handle directories containing percent % ... but that's an easy fix. And I bet if I can incorporate my approach to generating random numbers then I can get this code down to 9 minutes. Thank you very much for all the time and effort you put into this answer. Very generous of you. – RKO Sep 27 '16 at 13:35
  • I've fixed the `%` issue in all my code. Regarding your random number generation, it all depends on the ratio of selected files / total files as well as the total number of files. If your folder has a lot of files, and you request a lot of files, then chances are you will get many collisions and spend a significant time waiting to randomly select an available file. In this case aacini and my methods will be faster. But if you routinely request a small number of files, then the chance of collision is low, and your method will be faster. – dbenham Sep 27 '16 at 14:15
  • @RKO - I've provided yet another version, this time with your random selection method. This should beat out your 10 min version. – dbenham Sep 27 '16 at 19:23
  • We both worked out the same solution simultaneously, and came up with the same result ... it knocks processing time down to 8 minutes (sometimes) ... and 9 minutes mostly. So this is the winner (at least for my purposes). I really don't like writing anything to external files (my 10 minute solution) so I'm voting for this as the winner. – RKO Sep 27 '16 at 21:26
  • @RKO - Don't forget to accept your favorite answer if you feel it fully answers your question and/or solves your problem. – dbenham Sep 27 '16 at 22:04
1

There isn't any good way to transer many variables out of the scope (over the endlocal barrier).

But when you transfer one by one than it works.

@echo off
setlocal
set "folderPath=C:\temp"
call :func
set $$
exit /b

:func
setlocal EnableDelayedExpansion
FOR /F "delims=" %%F in ("!FOLDERPATH!") DO (
    endlocal
    setlocal DisableDelayedExpansion
    set /a Counter=0
    for /f "tokens=* delims=" %%g in ('dir %%F /b') do (        
        set "item=%%g"
        setlocal EnableDelayedExpansion
        FOR /F "tokens=1,*" %%C in ("!Counter! !item!") DO (

            endlocal
            endlocal
            set "$$%%C=%%D"
            setlocal DisableDelayedExpansion
            set /a Counter=%%C + 1
        )
    )
)

EXIT /b

This solution assumes, that none of your filenames contain a bang ! or that you call your function from a disabled delayed expansion context.

Another solution
When you need a bullet proof variant, then you could replace the endlocal/set "$$%%C=%%D" with the macroReturn technic.

When you can live with a temporary file you could also use the set /p technic.

:collectFiles
dir /b !FOLDERPATH! > temp.$$$
FOR /F %%C in ('type temp.$$$ ^| find /v /c ""') DO (
    echo count %%C
    < temp.$$$ (
        for /L %%n in (0 1 %%C) DO (
            set /p $$%%n=
        )
    )
)
exit /b
Community
  • 1
  • 1
jeb
  • 78,592
  • 17
  • 171
  • 225
  • My original code works great (and is verrrry fast) if I don't worry about bangs. But I do. I don't think I can live with temporary files. macroReturn technic may work, but it's length and complexity left me doubtful it could meet my need for speed (I didn't even try it). – RKO Sep 23 '16 at 16:27
  • "over the endlocal _barrier_" **`:/`** – Aacini Sep 23 '16 at 16:45
1

This method create the array in a disabled delayed expansion environment in a very fast way:

@echo off
setlocal DisableDelayedExpansion

for /f "tokens=1* delims=:" %%g in ('dir %FOLDERPATH% /b ^| findstr /N "^"') do (
   set "$$%%g=%%h"
   set "ind=%%g"
)

REM View the results
echo %ind%
set $$

REM View the results using DelayedExpansion
setlocal EnableDelayedExpansion
for /L %%i in (1,1,%ind%) do echo %%i- !$$%%i!

pause
EXIT /b

You may also transfer the entire array to the environment of the caller program in a very simple way:

for /F "delims=" %%a in ('set $$') do (
   endlocal
   set "%%a"
)

In this method the endlocal command is executed several times, but it just works the first time when it is matched with the initial setlocal of the function. If this method could release other previous environments, then just add a simple test:

set _FLAG_=1
for /F "delims=" %%a in ('set $$') do (
   if defined _FLAG_ endlocal
   set "%%a"
)

EDIT: Complete example code added

I wrote the code below after the complete example code was posted by the OP; I used the same commands and variable names from the original code. This solution use the method I originally posted here to create the array in a disabled delayed expansion environment (that generate the indices of the elements via findstr /N "^" command), and then extract the elements of the array in random order using a very efficient method that I already used at this answer. I also inserted a couple modifications that increase the efficiency, like avoid call commands and change the append redirection >> (that is executed one time for each output line) by a standard redirection > (that is executed just once). The resulting program should run much faster than the original OP's code.

@echo off
chcp 1254>nul
setlocal DisableDelayedExpansion
set /p COUNT=Select Desired Number of Random Episodes per Album:

(for /d %%f in (H:\itunes\Podcasts\*) do (
   for /f "tokens=1* delims=:" %%g in ('dir "%%f" /b /A-D 2^>NUL ^| findstr /N "^"') do (
      set "$$%%g=%%h"
      set "ind=%%g"
      set "RN%%g=%%g"
   )
   setlocal EnableDelayedExpansion
   if %COUNT% LEQ !ind! (set /A DCOUNT=ind-COUNT+1) ELSE set DCOUNT=1
   for /l %%g in (!ind!, -1, !DCOUNT!) do (
      set /A "ran=(!random!*%%g)/32768+1"
      set /A "num=RN!ran!, RN!ran!=RN%%g"
      for %%N in (!num!) do echo !$$%%N!
   )
   endlocal
)) > "%USERPROFILE%\Desktop\RandomEpisodes.m3u8"

pause
Exit /b

2ND EDIT

After read the explanation of user dbenham about a problem with the environment release in our original methods, I introduced the same modification suggested by him in my code in order to fix the problem. It is expected that both codes now run faster than the original OP's code...

@echo off
chcp 1254>nul
setlocal DisableDelayedExpansion
(
for /F "delims==" %%a in ('set') do set "%%a="
set "ComSpec=%ComSpec%"
set "USERPROFILE=%USERPROFILE%"
)
set /p COUNT=Select Desired Number of Random Episodes per Album:
(for /d %%f in (H:\itunes\Podcasts\*) do call :Sub "%%f"
) > "%USERPROFILE%\Desktop\RandomEpisodes.m3u8"
pause
Exit /b

:Sub
setlocal DisableDelayedExpansion
   cd %1
   for /f "tokens=1* delims=:" %%g in ('dir /b /A-D 2^>NUL ^| findstr /N "^"') do (
      set "$%%g=%%h"
      set /A "ind=%%g, N%%g=%%g"
   )
   setlocal EnableDelayedExpansion
   if %COUNT% LEQ %ind% (set /A DCOUNT=ind-COUNT+1) ELSE set DCOUNT=1
   for /l %%g in (%ind%, -1, %DCOUNT%) do (
      set /A "ran=!random!%%%%g+1"
      set /A "num=N!ran!, N!ran!=N%%g"
      for %%N in (!num!) do echo %CD%\!$%%N!
   )
exit /B

3RD EDIT

The method to generate unique random numbers used by dbenham and me is a general-purpose method that efficiently manage most situations; however, it seems that such method is not best suited for this particular problem. The new code below is an attempt to write a solution for this problem that run in the fastest possible way.

Mod: A small bug have been fixed.

@echo off
chcp 1254>nul
setlocal DisableDelayedExpansion
set "findstr=C:\Windows\System32\findstr.exe"
for /F "delims=" %%a in ('where findstr 2^>NUL') do set "findstr=%%a"
(
for /F "delims==" %%a in ('set') do set "%%a="
set "ComSpec=%ComSpec%"
set "USERPROFILE=%USERPROFILE%"
set "findstr=%findstr%"
)
set /p COUNT=Select Desired Number of Random Episodes per Album:
(for /d %%f in (H:\itunes\Podcasts\*) do call :Sub "%%f"
) > "%USERPROFILE%\Desktop\RandomEpisodes.m3u8"
pause
Exit /b

:Sub
setlocal DisableDelayedExpansion
   cd %1
   for /f "tokens=1* delims=:" %%g in ('dir /b /A-D 2^>NUL ^| "%findstr%" /N "^"') do (
      set "$%%g=%%h"
      set "ind=%%g"
   )
   setlocal EnableDelayedExpansion
   if %COUNT% LEQ %ind% (set "DCOUNT=%COUNT%") ELSE set "DCOUNT=%ind%"
   for /l %%g in (1, 1, %DCOUNT%) do (
      set /A "num=!random!%%%%g+1"
      if not defined $!num! call :nextNum
      for %%N in (!num!) do echo %CD%\!$%%N!& set "$%%N="
   )
exit /B

:nextNum
   set /A num=num%%ind+1
   if not defined $%num% goto nextNum
exit /B
Community
  • 1
  • 1
Aacini
  • 65,180
  • 12
  • 72
  • 108
  • slower than the original code i posted (after you change it to disabledelayedexpansion) ... and my original code takes twice as long as using enableddelayedexpansion and ignoring the apostrophes. So my thinking is there's something in-between the speed of disabled and enabled ... hence I want to be verrrry judicious about the use of disableddelayedexpansion. – RKO Sep 23 '16 at 17:29
  • Nice idea, but the `set "%%a"` will still destroy the `!` when it jumps to an enabled delayed expansion context – jeb Sep 23 '16 at 22:05
  • thanks for the code, but I had to shut it down after 1 hour 20 minutes ... it was only about 3/4 done. too slow. did I mention some of the folders can contain 1500 files? others can contain just 2. – RKO Sep 24 '16 at 17:08
  • @Aacini - Our old friend [Why does SET performance degrade as environment size grows?](http://www.dostips.com/forum/viewtopic.php?f=3&t=2597&start=30) has reared its ugly head. Putting the body of the outer loop in a CALLed subroutine, with enclosing SETLOCAL/ENDLOCAL, solved the issue for me. See [my answer](http://stackoverflow.com/a/39672776/1012053). – dbenham Sep 25 '16 at 15:44
  • @Aacini - I updated my answer with CALL to substitute your array method in place of my string method for managing available numbers, and achieved slightly better performance. – dbenham Sep 25 '16 at 18:19
  • @dbenham: Thanks for the advice! I modified my code accordingly to your explanation. I have not reviewed your second modification yet (about the CALL)... – Aacini Sep 25 '16 at 18:38
  • @dbenham: Ok. It seems that the number movements of `set /A` command run faster than the substring extraction and assignment of the string method... – Aacini Sep 25 '16 at 18:57
  • Running the code shown in 2ND EDIT, I get the error "the system cannot find the path specified" .. any hints? – RKO Sep 27 '16 at 00:38
  • Ops! There was a small bug: the line `set "USERPROFILE=%USERPROFILE%"` must be inserted below `set "ComSpec=%ComSpec%"`. I already did that in the code above... – Aacini Sep 27 '16 at 02:44
  • Didn't you tested the method of the 2ND EDIT yet? It is possible that it run slightly faster than last dbenham's method... I posted a new solution under the 3RD EDIT that I hope be even faster. I invite you to test it and post the result... – Aacini Sep 27 '16 at 16:56
  • Excuse me. The volunteers that participate in this site for free try to do our best to give good solutions. In the case of this question the core problem was an efficiency one, and I posted a couple solutions trying to solve your problem; however, you had not reported the results. May I ask you to run my last code and post the result? Thanks... – Aacini Sep 29 '16 at 19:02
  • I'm so sorry. There were so many threads going on that I missed your last update. There's a small bug somewhere in your code that I'm looking for and as soon as I find it I'll run the test and get back to you. – RKO Sep 29 '16 at 19:03
  • Couldn't get 3RD EDIT to work ... the error message goes by fast for me to be sure, but it seems to be 'findstr is not a recognized as an internal or ... program ... ' In any regard 3RD EDIT fails at the for /f "tokens=1* delims=:" %%g in ('dir /b /A-D 2^>NUL ^| findstr /N "^"') line. Recall this .bat is NOT on the same drive as the h:\itunes\podcasts. It's on the g: drive. Any thoughts? – RKO Sep 29 '16 at 19:30
  • Ok. I fixed a small bug in the last code, just copy it again and test it. Please, post the results... – Aacini Oct 04 '16 at 23:41
  • So? What happen with the results? – Aacini Oct 15 '16 at 17:16
1

10 Minutes! It's a modified version of the 13 Minute EnabledDelayedExpansion version shown in the original post. I'm waiting for a slight coding fix from dbenham to see if his is an even faster solution.

Here's the critical piece of coding:

for /f "tokens=* delims=" %%g in ('dir "!buffer!" /b') do (
  setlocal DisableDelayedExpansion
  for /f "tokens=1* delims=!" %%m in ("%%g") do if not "%%m"=="%%g" (
    set "item=%%g"
    call set BangFile=%%item:^&=¬%%
    call set BangFile=%%Bangfile:!=^^^^!%%
    call echo %%BangFile%%>"G:\BangFilename.txt"
)
  endlocal

Yes it's ugly. I don't like writing to temporary files, but could find no other way. And again there's probably only a few dozen filenames containing bangs (!) in the entire 120K collection, so very few read-writes. I had to use the above code twice: once for the directory names and once for the filenames.

The full code is here:

@echo off
chcp 1254>nul
setlocal EnableDelayedExpansion
IF EXIST "%USERPROFILE%\Desktop\RandomEpisodes.m3u8" del "%USERPROFILE%\Desktop\RandomEpisodes.m3u8"
set /p COUNT=Select Desired Number of Random Episodes per Album:
call:timestart
for /d %%f in (G:\itunes\Podcasts\* H:\itunes\Podcasts\*) do (

setlocal DisableDelayedExpansion
  if exist g:\BangDirectory.txt del g:\BangDirectory.txt
  for /f "tokens=1* delims=!" %%d in ("%%f") do if not "%%d"=="%%f" (
    set "BangDir=%%f"
    call set BangDir=%%BangDir:^&=¬%%
    call set BangDir=%%BangDir:!=^^^^!%%
    call echo %%BangDir%%>g:\BangDirectory.txt
  )
endlocal

setlocal
  set "buffer=%%f"
  set directory=%%~nf
  call:timecalc

  call:Set$$Variables

  if !COUNT! LEQ !ind! ( set DCOUNT=!COUNT! ) ELSE set DCOUNT=!ind!
  for /l %%g in (1, 1, !DCOUNT!) do (
    call:GenerateUniqueRandomNumber

    for %%N in (!num!) do echo !$$%%N!>>"%USERPROFILE%\Desktop\RandomEpisodes.m3u8"
    )
endlocal
  )
pause
Exit /b

:Set$$Variables
set /a cnt = 0
  if exist g:\BangDirectory.txt for /f "usebackq delims=" %%t in ("G:\BangDirectory.txt") do (
  set "buffer=%%t"
  set "buffer=!buffer:¬=&!"
  )
for /f "tokens=* delims=" %%g in ('dir "!buffer!" /b') do (
  setlocal DisableDelayedExpansion
  if exist g:\BangFilename.txt del g:\BangFilename.txt
  for /f "tokens=1* delims=!" %%m in ("%%g") do if not "%%m"=="%%g" (
    set "item=%%g"
    call set BangFile=%%item:^&=¬%%
    call set BangFile=%%Bangfile:!=^^^^!%%
    call echo %%BangFile%%>"G:\BangFilename.txt"
)
  endlocal

  if exist g:\BangFilename.txt for /f "usebackq tokens=* delims=" %%p in ("G:\BangFilename.txt") do (
  set Filename=%%p
  set "Filename=!Filename:¬=&!"
  set $$!cnt!=!buffer!\!Filename!
  )
  if not exist g:\BangFilename.txt set "$$!cnt!=!buffer!\%%g"
  set /a cnt+=1
)
set "ind=!cnt!"
EXIT /b

:GenerateUniqueRandomNumber
:nextone
set /a "num = (((!random! & 1) * 1073741824) + (!random! * 32768) + !random!) %% !ind!"
for %%N in (!num!) do (
  if !RN%%N!==1 (
  goto:nextone
  )
  set "RN%%N=1"
)
EXIT /b

:timestart
for /F "tokens=1-4 delims=:.," %%a in ("%time%") do (
   set /A "start=(((%%a*60)+1%%b %% 100)*60+1%%c %% 100)*100+1%%d %% 100")
exit /b

:timecalc
REM Get end time:
for /F "tokens=1-4 delims=:.," %%a in ("%time%") do (
   set /A "end=(((%%a*60)+1%%b %% 100)*60+1%%c %% 100)*100+1%%d %% 100"
)
REM Get elapsed time:
set /A elapsed=end-start
REM Show elapsed time:
set /A hh=elapsed/(60*60*100), rest=elapsed%%(60*60*100), mm=rest/(60*100), rest%%=60*100, ss=rest/100, cc=rest%%100
if %mm% lss 10 set mm=0%mm%
if %ss% lss 10 set ss=0%ss%
set "TimeElapsed=    Time elapsed (mm:ss) %mm%:%ss%"
title %timeelapsed%    WIP: "%directory%"
exit /b

Note on testing. I tested all the versions with virus protection off and with no other running programs. I selected 5 as the Desired Number of Random Episodes. I use an Evo N410c running Win XP. I tried to be as even as possible for each version of the script, but I noticed the 13 minute run can sometimes be as fast as 10 minutes (my guess is XP is creating & keeping some indices on-the-fly which affect runtime).

The 120K item library is contained on an external harddrive connected by USB. The library has about 300 directories with hundreds to thousands of items. It has another 1400 directories with between a few and a few dozen items. This is a live library, but I included an additional testing directory with filenames having every combination and permutation of !, & and %.

Philosophic note. Several times I've had to do 'stuff' with this 120K item library and in the end it has always been the case that avoidance of DisableDelayedExpansion is the best rule. I typically do everything I can with EnabledDelayedExpansion and then take care of any exceptions (bangs) as exceptions.

Any recommendations to reduce the ugliness of this solution (but not its speed) are very welcome.

Postscript--------------------

I incorporated Aacini's and dbenham's random number routine into my 10 minute version. Their coding looked to be much more elegant than my original coding. I deleted my GenerateUniqueRandomNumber subroutine and incorporated the following:

   if !ind! lss !Count! (set end=1) else set /a "end=ind-Count+1"
   for /l %%N in (!ind! -1 !end!) do (
     set /a "ran=!random!%%%%N+1"
     set /a "num=RN!ran!, RN!ran!=RN%%N
     for %%N in (!num!) do echo !$$%%N!>>"%USERPROFILE%\Desktop\RandomEpisodes.m3u8"
   )

This change added 2 minutes to processing time (increased run-time from 10 minutes to 12 minutes). Sometimes elegant just ain't as fast as plain ugly. I'm stickin' with my original.

RKO
  • 161
  • 10
  • Nicely done. But I think my last update with a low request count of 5 should beat this. – dbenham Sep 27 '16 at 19:27
  • Also - good idea to use a temp file. You might be surprised how often a temp file beats out purely memory based batch solutions. – dbenham Sep 27 '16 at 19:31
0
...
setlocal DisableDelayedExpansion
endlocal&set "item=%%g"
...

should work.

Magoo
  • 77,302
  • 8
  • 62
  • 84
0
@ECHO OFF
SETLOCAL DISABLEDELAYEDEXPANSION
FOR /f "tokens=1*delims=:" %%a IN (
 'dir /b /ad c:\106x^|findstr /n /r "."') DO (
 SET "thing%%a=%%b"
)
SET thing
GOTO :EOF

This should establish your array. c:\106x is just a directory where I have some strange directory-names.


My directory c:\106x:

!dir!
%a
%silly%
-
1 & 2
a silly dirname with & and % and ! an things
exe
flags
nasm32working
nasmxpts
no test file(s)
some test file(s)
stl
wadug
with spaces
with'apostrophes'test

Result of running above code

thing1=!dir!
thing10=nasmxpts
thing11=no test file(s)
thing12=some test file(s)
thing13=stl
thing14=wadug
thing15=with spaces
thing16=with'apostrophes'test
thing2=%a
thing3=%silly%
thing4=-
thing5=1 & 2
thing6=a silly dirname with & and % and ! an things
thing7=exe
thing8=flags
thing9=nasm32working

works for me! - and only executes the setlocal once.

Magoo
  • 77,302
  • 8
  • 62
  • 84
  • Thanks, but I'm trying to avoid over extension of DisableDelayedExpansion ... which kills processing speed. Again, I have working code, but I need to make it faster ... and I notice that if I can tolerate a few missing filenames (those containing apostrophes) I can cut processing time in half. Hence the original post was quite specific about not using DisableDelayedExpansion for the entire subroutine. – RKO Sep 23 '16 at 16:25
  • Of course it works. So does my original code if you change it to disabledelayed expansion. But they are both s0ooo slow. The difference is a factor of 2 versus enabledelayed expansion (takes twice as long) – RKO Sep 23 '16 at 17:28