I have an 8 core CPU with 8GB of RAM, and I'm creating a batch file to automate 7-zip CLI in exhausting most parameters and variables to compress the same set of files with the ultimate goal of finding the strongest combination of parameters and variables that result in the smallest archive size possible.
This is very time consuming by nature especially when the set of files to be processed is in gigabytes. I need a way not only to automate but to speed up this whole process.
7-zip works with different compression algorithms, some are single-threaded only, and some are multi-threaded, some do not require much amount of memory, and some require huge amounts of it and could even surpass the 8GB barrier. I've already successfully created an automated batch that works in sequence which exclude combinations requiring more than 8GB of memory.
I've split the different compression algorithms in several batches to simplify the whole process. For example, compression in PPMd as a 7z archive uses 1-thread and up to 1024MB. This is my current batch:
@echo off
echo mem=1m 2m 3m 4m 6m 8m 12m 16m 24m 32m 48m 64m 96m 128m 192m 256m 384m 512m 768m 1024m
echo o=2 3 4 5 6 7 8 10 12 14 16 20 24 28 32
echo s=off 1m 2m 4m 8m 16m 32m 64m 128m 256m 512m 1g 2g 4g 8g 16g 32g 64g on
echo x=1 3 5 7 9
for %%x IN (9) DO for %%d IN (1024m 768m 512m 384m 256m 192m 128m 96m 64m 48m 32m 24m 16m 12m 8m 6m 4m 3m 2m 1m) DO for %%w IN (32 28 24 20 16 14 12 10 8 7 6 5 4 3 2) DO for %%s IN (on) DO 7z.exe a teste.resultado\%%xx.ppmd.%%dd.%%ww.%%ss.7z .\teste.original\* -mx=%%x -m0=PPMd:mem=%%d:o=%%w -ms=%%s
exit
x
, s
, o
and mem
are parameters, and what's after each of them are the variables which 7z.exe will work with. x
and s
in this case are of no concern, they mean compression strength and solid block size for the archive.
That batch will work fine, but is limited to running only 1 instance of 7z.exe at a time and now I'm looking for a way to make it run more 7z.exe instances in parallel but without exceeding 8GB of RAM or 8 threads at once, whichever comes first, before proceeding to do the next ones in the sequence.
How can I improve this? I have some ideas but I don't know how to make them work in a batch. I was thinking of 2 other variables that won't interact with the 7z processes but would control when the next 7z instance would start. One variable would keep track of how many threads are currently in use and another would track how much memory are in use. Could that work?
Edit: Sorry, I need to add details, I'm new to this posting style. Following this answer - https://stackoverflow.com/a/19481253/2896127 - I mentioned 8 batches were created and that 7z.PPMd batch was one of them. Maybe listing all the batches and how 7z deals with the parameters will give a better insight on the whole issue. I'll start with the simple ones:
- 7z.PPMd - 1 fully utilized thread and dictionary dependant 32m-1055m memory usage per instance.
- 7z.BZip2 - 8 fully utilized threads and fixed 109m memory usage per instance.
- zip.Bzip2 - 8 partially utilized threads and fixed 336m memory usage per instance.
- zip.Deflate - 8 partially utilized threads and fixed 260m memory usage per instance.
- zip.PPMd - 8 partially utilized threads and dictionary dependant 280m-2320m memory usage per instance.
What I mean with partially utilized threads is that, while I assign 8 threads to be used by each 7.exe instance, the algorithm can do variable CPU usage at a randomly fashion, out of my control, unpredictable, but the limitation is set there - no more than 8 threads. In the case of 8 fully utilized threads, it means that on my 8 core CPU, each instance is utilizing 100% of CPU.
The most complex ones - 7z.LZMA, 7z.LZMA2, zip.LZMA - will need to be explained in detail but I am running short on time now. I'll be back to edit the LZMA part whenever I have more free time.
Thanks again.
EDIT: Adding in LZMA part.
7z.LZMA - each instance is n-threaded, ranging from 1 to 2:
- 1 fully utilized thread, dictionary dependant, 64k to 512m:
- 64k dictionary uses 32m memory
- ...
- 512m dictionary uses 5407m memory
- excluded range: 768m to 1024m (above the limit of 8192m memory available)
- 2 partially utilized threads, dictionary dependant, 64k to 512m:
- 64k dictionary uses 38m memory
- ...
- 512m dictionary uses 5413m memory
- excluded range: 768m to 1024m (above the limit of 8192m memory available)
- 1 fully utilized thread, dictionary dependant, 64k to 512m:
7z.LZMA2 - each instance is n-threaded, ranging from 1 to 8:
- 1 fully utilized thread, dictionary dependant, 64k to 512m:
- 64k dictionary uses 32m memory
- ...
- 512m dictionary uses 5407m memory
- excluded range: 768m to 1024m (above the limit of 8192m memory available)
- 2 or 3 partially utilized threads, dictionary dependant, 64k to 512m:
- 64k dictionary uses 38m memory
- ...
- 512m dictionary uses 5413m memory
- excluded range: 768m to 1024m (above the limit of 8192m memory available)
- 4 or 5 partially utilized threads, dictionary dependant, 64k to 256m:
- 64k dictionary uses 51m memory
- ...
- 256m dictionary uses 5677m memory
- excluded range: 384m to 1024m (above the limit of 8192m memory available)
- 6 or 7 partially utilized threads, dictionary dependant, 64k to 192m:
- 64k dictionary uses 62m memory
- ...
- 192m dictionary uses 6965m memory
- excluded range: 256m to 1024m (above the limit of 8192m memory available)
- 8 partially utilized threads, dictionary dependant, 64k to 128m:
- 64k dictionary uses 72m memory
- ...
- 128m dictionary uses 6717m memory
- excluded range: 192m to 1024m (above the limit of 8192m memory available)
- 1 fully utilized thread, dictionary dependant, 64k to 512m:
zip.LZMA - each instance is n-threaded, ranging from 1 to 8:
- 1 fully utilized thread, dictionary dependant, 64k to 512m:
- 64k dictionary uses 3m memory
- ...
- 512m dictionary uses 5378m memory
- excluded range: 768m to 1024m (above the limit of 8192m memory available)
- 2 or 3 partially utilized threads, dictionary dependant, 64k to 512m:
- 64k dictionary uses 9m memory
- ...
- 512m dictionary uses 5384m memory
- excluded range: 768m to 1024m (above the limit of 8192m memory available)
- 4 or 5 partially utilized threads, dictionary dependant, 64k to 256m:
- 64k dictionary uses 82m memory
- ...
- 256m dictionary uses 5456m memory
- excluded range: 384m to 1024m (above the limit of 8192m memory available)
- 6 or 7 partially utilized threads, dictionary dependant, 64k to 256m:
- 64k dictionary uses 123m memory
- ...
- 256m dictionary uses 8184m (very close to the limit though, I may consider excluding it)
- excluded range: 384m to 1024m (above the limit of 8192m memory available)
- 8 partially utilized threads, dictionary dependant, 64k to 128m:
- 64k dictionary uses 164m memory
- ...
- 128m dictionary uses 5536m memory
- excluded range: 192m to 1024m (above the limit of 8192m memory available)
- 1 fully utilized thread, dictionary dependant, 64k to 512m:
I'm trying to understand the behaviour of the commands with nul in them. I don't quite understand what's happening during that part, what those symbols ^ > ^&1 "" are meant to say.
2>nul del %lock%!nextProc!
%= Redirect the lock handle to the lock file. The CMD process will =%
%= maintain an exclusive lock on the lock file until the process ends. =%
start /b "" cmd /c %lockHandle%^>"%lock%!nextProc!" 2^>^&1 !cpu%%N! !cmd!
)
set "launch="
Then later on, at the :wait code:
) 9>>"%lock%%%N"
) 2>nul
if %endCount% lss %startCount% (
1>nul 2>nul ping /n 2 ::1
goto :wait
)
2>nul del %lock%*
EDIT 2 (29-10-2013): Adding the current point of the situation.
After trial and error research, complemented with step by step notes of what's happening, I was able to understand the behaviour above. I simplified the line with start command to this:
start /b /low cmd /c !cmd!>"%lock%!nextProc!"
Though it works, I still don't understand the meaning of 1^>"filename" 2^>^&1 'command'
. I know it is related to writing text in the filename what would otherwise be displayed to me. In this case, it would show all of 7z.exe text but written in the file. Until 7z.exe instance finishes its job, nothing is written in the file, but the file already exists, yet at the same time doesn't exist. When 7z.exe actually finishes, the file is finalized and this time it exists for the next part of the script.
Now I can understand the processing behaviour of the suggested script and I'm complementing it with something of my own - I am trying to implement all batches into "one batch do it all" script. In the simplified version, this is it:
echo 8 threads - maxproc=1
for %%x IN (9) DO for %%t IN (8) DO for %%d IN (900k) DO for %%s IN (on) DO 7z.exe a teste.resultado\%%xx.bzip2.%%tt.%%dd.%%ss.7z .\teste.original\* -mx=%%x -ms=%%s -m0=BZip2:d=%%d:mt=%%t
for %%x IN (9) DO for %%t IN (8) DO for %%d IN (900k) DO 7z.exe a teste.resultado\%%xx.bzip2.%%tt.%%dd.zip .\teste.original\* -mx=%%x -mm=BZip2:d=%%d -mmt=%%t
for %%x IN (9) DO for %%t IN (8) DO for %%w IN (257 256 192 128 96 64 48 32 24 16 12 8) DO 7z.exe a teste.resultado\%%xx.deflate64.%%tt.%%ww.zip .\teste.original\* -mx=%%x -mm=deflate64:fb=%%w -mmt=%%t
for %%x IN (9) DO for %%t IN (8) DO for %%w IN (258 256 192 128 96 64 48 32 24 16 12 8) DO 7z.exe a teste.resultado\%%xx.deflate.%%tt.%%ww.zip .\teste.original\* -mx=%%x -mm=deflate:fb=%%w -mmt=%%t
for %%x IN (9) DO for %%t IN (8) DO for %%d IN (256m 128m 64m 32m 16m 8m 4m 2m 1m) DO for %%w IN (16 15 14 13 12 11 10 9 8 7 6 5 4 3 2) DO 7z.exe a teste.resultado\%%xx.ppmd.%%tt.%%dd.%%ww.zip .\teste.original\* -mx=%%x -mm=PPMd:mem=%%d:o=%%w -mmt=%%t
echo 4 threads - maxproc=2
for %%x IN (9) DO for %%t IN (4) DO for %%d IN (256m) DO for %%w IN (273 256 192 128 96 64 48 32 24 16 12 8) DO for %%s IN (on) DO 7z.exe a teste.resultado\%%xx.lzma2.%%tt.%%dd.%%ww.%%ss.7z .\teste.original\* -mx=%%x -ms=%%s -m0=lzma2:d=%%d:fb=%%w -mmt=%%t
echo 2 threads - maxproc=4
for %%x IN (9) DO for %%t IN (2) DO for %%d IN (512m) DO for %%w IN (273 256 192 128 96 64 48 32 24 16 12 8) DO for %%s IN (on) DO 7z.exe a teste.resultado\%%xx.lzma.%%tt.%%dd.%%ww.%%ss.7z .\teste.original\* -mx=%%x -ms=%%s -m0=LZMA:d=%%d:fb=%%w -mmt=%%t
for %%x IN (9) DO for %%t IN (2) DO for %%d IN (512m) DO for %%w IN (273 256 192 128 96 64 48 32 24 16 12 8) DO for %%s IN (on) DO 7z.exe a teste.resultado\%%xx.lzma2.%%tt.%%dd.%%ww.%%ss.7z .\teste.original\* -mx=%%x -ms=%%s -m0=lzma2:d=%%d:fb=%%w -mmt=%%t
for %%x IN (9) DO for %%t IN (2) DO for %%d IN (512m) DO for %%w IN (273 256 192 128 96 64 48 32 24 16 12 8) DO 7z.exe a teste.resultado\%%xx.lzma.%%tt.%%dd.%%ww.zip .\teste.original\* -mx=%%x -mm=lzma:d=%%d:fb=%%w -mmt=%%t
echo 1 threads - maxproc=8
for %%x IN (9) DO for %%t IN (1) DO for %%d IN (512m) DO for %%w IN (273 256 192 128 96 64 48 32 24 16 12 8) DO for %%s IN (on) DO 7z.exe a teste.resultado\%%xx.lzma.%%tt.%%dd.%%ww.%%ss.7z .\teste.original\* -mx=%%x -ms=%%s -m0=LZMA:d=%%d:fb=%%w -mmt=%%t
for %%x IN (9) DO for %%t IN (1) DO for %%d IN (512m) DO for %%w IN (273 256 192 128 96 64 48 32 24 16 12 8) DO for %%s IN (on) DO 7z.exe a teste.resultado\%%xx.lzma2.%%tt.%%dd.%%ww.%%ss.7z .\teste.original\* -mx=%%x -ms=%%s -m0=lzma2:d=%%d:fb=%%w -mmt=%%t
for %%x IN (9) DO for %%d IN (1024m 768m 512m 384m 256m 192m 128m 96m 64m 48m 32m 24m 16m 12m 8m 6m 4m 3m 2m 1m) DO for %%w IN (32 28 24 20 16 14 12 10 8 7 6 5 4 3 2) DO for %%s IN (on) DO 7z.exe a teste.resultado\%%xx.ppmd.%%dd.%%ww.%%ss.7z .\teste.original\* -mx=%%x -m0=PPMd:mem=%%d:o=%%w -ms=%%s
for %%x IN (9) DO for %%t IN (1) DO for %%d IN (512m) DO for %%w IN (273 256 192 128 96 64 48 32 24 16 12 8) DO 7z.exe a teste.resultado\%%xx.lzma.%%tt.%%dd.%%ww.zip .\teste.original\* -mx=%%x -mm=lzma:d=%%d:fb=%%w -mmt=%%t
In short, I want to process all that in the most efficient manner possible. Doing it by deciding how many processes can run at a time would be a way, but then there's also the memory required for each process, so that the sum of all required memory by those processes won't exceed 8192 MB. I got this part working.
@echo off
setlocal enableDelayedExpansion
set "maxMem=8192"
set "maxThreads=8"
:cycle1
set "cycleCount=4"
set "cycleThreads=1"
set "maxProc="
set /a "maxProc=maxThreads/cycleThreads"
set "cycleFor1=for %%x IN (9) DO for %%t IN (1) DO for %%d IN (512m) DO for %%w IN (273 256 192 128 96 64 48 32 24 16 12 8) DO for %%s IN (on) DO ("
set "cycleFor2=for %%x IN (9) DO for %%t IN (1) DO for %%d IN (512m) DO for %%w IN (273 256 192 128 96 64 48 32 24 16 12 8) DO for %%s IN (on) DO ("
set "cycleFor3=for %%x IN (9) DO for %%d IN (1024m 768m 512m 384m 256m 192m 128m 96m 64m 48m 32m 24m 16m 12m 8m 6m 4m 3m 2m 1m) DO for %%w IN (32 28 24 20 16 14 12 10 8 7 6 5 4 3 2) DO for %%s IN (on) DO ("
set "cycleFor4=for %%x IN (9) DO for %%t IN (1) DO for %%d IN (512m) DO for %%w IN (273 256 192 128 96 64 48 32 24 16 12 8) DO ("
set "cycleCmd1=7z.exe a teste.resultado\%%xx.lzma.%%tt.%%dd.%%ww.%%ss.7z .\teste.original\* -mx=%%x -ms=%%s -m0=LZMA:d=%%d:fb=%%w -mmt=%%t"
set "cycleCmd2=7z.exe a teste.resultado\%%xx.lzma2.%%tt.%%dd.%%ww.%%ss.7z .\teste.original\* -mx=%%x -ms=%%s -m0=lzma2:d=%%d:fb=%%w -mmt=%%t"
set "cycleCmd3=7z.exe a teste.resultado\%%xx.ppmd.%%dd.%%ww.%%ss.7z .\teste.original\* -mx=%%x -m0=PPMd:mem=%%d:o=%%w -ms=%%s"
set "cycleCmd4=7z.exe a teste.resultado\%%xx.lzma.%%tt.%%dd.%%ww.zip .\teste.original\* -mx=%%x -mm=lzma:d=%%d:fb=%%w -mmt=%%t"
set "tempMem1=5407"
set "tempMem2=5407"
set "tempMem3=1055"
set "tempMem4=5378"
rem set "tempMem1=5407"
rem set "tempMem2=5407"
rem set "tempMem3=1055 799 543 415 287 223 159 127 95 79 63 55 47 43 39 37 35 34 33 32"
rem set "tempMem4=5378"
set "memSum=0"
if not defined memRem set "memRem=!maxMem!"
for /l %%N in (1 1 %cycleCount%) DO (set "tempProc%%N=")
for /l %%N in (1 1 %cycleCount%) DO (
set memRem
set /a "tempProc%%N=%memRem%/tempMem%%N"
set /a "memSum+=tempMem%%N"
set /a "memRem-=tempMem%%N"
set /a "maxProc=!tempProc%%N!"
call :executeCycle
set /a "memRem+=tempMem%%N"
set /a "memSum-=tempMem%%N"
set /a "maxProc-=!tempProc%%!
)
goto :fim
:executeCycle
set "lock=lock_%random%_"
set /a "startCount=0, endCount=0"
for /l %%N in (1 1 %maxProc%) DO set "endProc%%N="
set launch=1
for %%x IN (9) DO for %%t IN (1) DO for %%d IN (512m) DO for %%w IN (273 256 192 128 96 64 48 32 24 16 12 8) DO for %%s IN (on) DO (
set "cmd=7z.exe a teste.resultado\%%xx.lzma.%%tt.%%dd.%%ww.%%ss.7z .\teste.original\* -mx=%%x -ms=%%s -m0=LZMA:d=%%d:fb=%%w -mmt=%%t"
if !startCount! lss %maxProc% (
set /a "startCount+=1, nextProc=startCount"
) else (
call :wait
)
set cmd!nextProc!=!cmd!
echo !time! - proc!nextProc!: starting !cmd!
2>nul del %lock%!nextProc!
start /b /low cmd /c !cmd!>"%lock%!nextProc!"
)
set "launch="
:wait
for /l %%N in (1 1 %startCount%) do (
if not defined endProc%%N if exist "%lock%%%N" (
echo !time! - proc%%N: finished !cmd%%N!
if defined launch (
set nextProc=%%N
exit /b
)
set /a "endCount+=1, endProc%%N=1"
) 9>>"%lock%%%N"
) 2>nul
if %endCount% lss %startCount% (
1>nul 2>nul ping /n 2 ::1
goto :wait
)
2>nul del %lock%*
echo ===
echo Thats all folks!
exit /b
:fim
pause
I have trouble with cycleFor1
and cycleCmd1
located in :cycle1
part - they should be replacing the for
line and the first cmd
variable inside the :executeCycle
, to make it work as I intend to. How do I do that?
The other issue I have is about tempMem3
. I have logged all memory required when the command cycleCmd3
would be running. It is dictionary dependant. tempMem3 and cycleCmd3 are related like this:
for %%d IN (1024m 768m 512m 384m 256m 192m 128m 96m 64m 48m 32m 24m 16m 12m 8m 6m 4m 3m 2m 1m) DO
set "tempMem3=1055 799 543 415 287 223 159 127 95 79 63 55 47 43 39 37 35 34 33 32"
So 1024m would use 1055, 768m would use 799, and so on till 1m using 32. I don't know how to translate that into the script.
Any help is appreciated.