
*Script updated below to reflect the suggested changes.

This is a CMD script, but my main question is about PowerShell (I would definitely take CMD advice too, although that part is working). It uses a PowerShell one-liner to "extract" embedded files from itself. When you drag and drop files onto the script, it embeds them into itself; double-click the script to extract the embedded files.

The extract command is slow. I just updated it: it used to rely on temp files, but I was able to get both variables I needed by running the same regex twice and reading a different group each time, once for the first variable and again for the second. Is there a way to get both without searching through the entire file for each one?

(UPDATED 7/13/23) Here is the full script BAG.cmd

@ECHO OFF & SET N=0
>nul 2>&1 REG ADD HKCU\Software\Classes\.Admin\shell\runas\command /f /ve /d "CMD /x /d /r SET \"f0=%%2\"& call \"%%2\" %%3"& SET _= %*
>nul 2>&1 FLTMC|| IF "%f0%" NEQ "%~f0" (CD.>"%temp%\runas.Admin" & START "%~n0" /high "%temp%\runas.Admin" "%~f0" "%_:"=""%" & EXIT /b)
>nul 2>&1 REG DELETE HKCU\Software\Classes\.Admin\ /f
>nul 2>&1 DEL %temp%\runas.Admin /f
FOR /F "usebackq skip=2 tokens=3-4" %%i IN (`REG QUERY "HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion" /v ProductName 2^>nul`) DO SET "VER=%%i %%j"
IF NOT "%VER%"=="Windows 10" ECHO. & ECHO UNSUPPORTED SYSTEM DETECTED! & ECHO. & PAUSE & EXIT
IF [%1]==[] (CALL :EMPTYBAG) ELSE (ECHO FILLING BAG... & ECHO. & FOR %%i IN (%*) DO (CALL :FILLBAG %%i))
IF %N%==1 (ECHO. & PAUSE & EXIT) ELSE (EXIT)
:FILLBAG
IF EXIST %1\* ECHO [Folder - Ignored] - %~nx1 - Folders are not supported! & SET N=1 & EXIT /b
IF %~z1 LSS 1 ECHO [File   - Ignored] - %~nx1 - Empty files are not supported! & SET N=1 & EXIT /b
IF %~z1 GEQ 750000000 ECHO [File   - Ignored] - %~nx1 - Not Added! The file is too large! & SET N=1 & EXIT /b
SET /A Size=(%~z0 + %~z1) * (130 / 100)
IF %Size% GEQ 980000000 ECHO [File   - Ignored] - %~nx1 - Not Added! There is not enough room in the BAG! & SET N=1 & EXIT /b
ECHO.>>"%~f0" & POWERSHELL -nop -c "Add-Content '%~f0' "^""::%~nx1::"^"" -NoNewline; [Convert]::ToBase64String([IO.File]::ReadAllBytes("^""%~1"^"")) | Add-Content "^""%~f0"^"" -NoNewline; Add-Content '%~f0' "^""::%~nx1::"^"" -NoNewline" & DEL /F "%~1">nul
EXIT /b
:EMPTYBAG
IF %~z0 LSS 2205 ECHO The BAG is already empty, drag-and-drop something onto the BAG to put it inside. ;^) & ECHO. & PAUSE & EXIT /b
IF %~z0 GEQ 80000000 (ECHO EMPTYING BAG... ^(This may take a while^)) ELSE (ECHO EMPTYING BAG...)
POWERSHELL -nop -c "$file=Get-Content '%~f0'; $match=[regex]::Matches($file,'::([^^:]+)::(.+?)::\1::') | Foreach-Object {$name=$_.Groups[1].Value; $fname=$name; while(Test-Path -Path "^""%~dp0$fname"^"") { $n++; $fname="^""($n)$name"^"" }; $data=$_.Groups[2].Value; [IO.File]::WriteAllBytes("^""$fname"^"", [Convert]::FromBase64String($data))}; (Get-Content '%~f0' -TotalCount 22) | Set-Content '%~f0'">nul
EXIT /b

The original line that took such a long time was the extraction command (updated above):

POWERSHELL -nop -c $file=Get-Content '%~f0'; $name=[regex]::Match^($file,'\:\:^([^^^\:]+^)\:\:^(.+?^)\:\:\1\:\:'^).Groups[1].Value.Replace^('::',''^).Trim^(^); $data=[regex]::Match^($file,'\:\:^([^^^\:]+^)\:\:^(.+?^)\:\:\1\:\:'^).Groups[2].Value.Replace^('::',''^).Trim^(^); [IO.File]::WriteAllBytes(\"$name\", [Convert]::FromBase64String($data)); ^(get-content '%~f0' -totalcount 26^) ^| set-content '%~dp0emptyBAG.cmd' >nul

Embedding (converting to Base64) is very quick, but extracting (converting back to the original file) takes a while.

I've tested files up to 1 GB with this script: embedding takes only about 30 seconds, but extracting takes 7-8 minutes ;P

A good size for a test case would probably be ~40 MB: large enough to see the delay without it taking too long.

I think I am rolling through the entire file four times during the extraction command: 1) to load the file into a variable with Get-Content, 2) to scan the variable with the first regex to set $name, 3) to scan it again with the second regex to set $data, and 4) to write $data out to a file named $name.

I don't know how to make it more efficient =/

**EDIT - The script is working properly now (updated above). Thank you to all who helped.

  • I haven't looked closely, but as a general pointer: Unless you need _line-by-line_ processing, use `Get-Content -Raw`, which greatly speeds up processing. – mklement0 Jul 10 '23 at 22:55
  • I don’t think I understand your whole scenario, but at the very least you can do something like this to improve performance by only parsing the file contents once with the regex… ```$m = [regex]::Match^($file,'\:\:^([^^^\:]+^)\:\:^(.+?^)\:\:\1\:\:'^); $name = $m.Groups[1]… ; $data = $m.Groups[2]…``` – mclayton Jul 10 '23 at 22:56
  • Regex can be slow and uses lots of memory. Use Event Viewer and monitor the memory usage while running. Slow time is probably because you are using 100% memory. Reading one line at a time using StreamReader will reduce memory usage and will reduce runtime. – jdweng Jul 10 '23 at 23:12
  • Your batch file is almost impossible to read with its lack of line returns and parentheses, and its random insertions of `>nul 2>&1` and `exit`s. This, however, is crazy ridiculous: ```POWERSHELL -nop -c $file=Get-Content '%~f0'; $name=[regex]::Match^($file,'\:\:^([^^^\:]+^)\:\:^(.+?^)\:\:\1\:\:'^).Groups[1].Value.Replace^('::',''^).Trim^(^); $data=[regex]::Match^($file,'\:\:^([^^^\:]+^)\:\:^(.+?^)\:\:\1\:\:'^).Groups[2].Value.Replace^('::',''^).Trim^(^); [IO.File]::WriteAllBytes(\"$name\", [Convert]::FromBase64String($data)); ^(get-content '%~f0' -totalcount 26^) ^| set-content '%~dp0emptyBAG.cmd' >nul```. – Compo Jul 11 '23 at 11:11
  • @mklement0 Thank you for this info. For this script the regex needs line-by-line input because it looks at the beginning of each line for a tag starting with ::, so unfortunately I can't use that to speed this particular task up. I tried -Raw and it won't find the regex (it must be at the start of a line only). – PlayLORD-SysOp Jul 11 '23 at 11:17
  • @Compo yeah, it's just escaped PowerShell with regex; they can read it, and have provided the help I needed ;P – PlayLORD-SysOp Jul 11 '23 at 11:18
  • @mclayton this seems to GREATLY speed up the extraction process on larger files. Using the regex only once and then filtering the results by group was much better than getting them one at a time. This is exactly what I needed. I believe it's going as fast as it can now. Thank you! – PlayLORD-SysOp Jul 11 '23 at 11:21
  • @Compo ALSO as a note every single command has a purpose, like >nul 2>&1 which hides output + errors of whatever command is run afterwards when used at the beginning of a line.. and any exits are deliberate not random, CMD is the easy part (usually lol). But if you know of any ways to further reduce/speed up the script I'm all ears. – PlayLORD-SysOp Jul 11 '23 at 11:32
  • Whereabouts does the matched text appear? If, for example, it's in the first few lines you can perhaps use ```Get-Content ... | Select-Object -First $n``` to only process those lines when looking for ```$name``` and ```$data``` rather than parse the entire file with a regex. Can you post a representative example of the part of the file they appear in (anonymized if necessary)? – mclayton Jul 11 '23 at 12:37
  • I still don't understand your entire script, btw. It looks like it's some sort of arcane self-modifying batch file that writes data values back into the *.cmd itself for the next execution. I'm pretty sure this could all be a lot simpler using just a powershell script, but that's a whole separate issue... – mclayton Jul 11 '23 at 12:39
  • @PlayLORD-SysOp, you can prepend inline option `m` to a regex so as to make `^` and `$` match the start and end of _each line_; e.g. ``"one`ntwo`nthree" -match '(?m)^two$'`` – mklement0 Jul 11 '23 at 13:37
  • @PlayLORD-SysOp, I know what it is supposed to be. When I said it was ridiculous, I said so because of unnecessary character escaping. When you pass commands between parsers like that, doublequoting is your friend. i.e. ```powershell.exe -NoProfile -Command "AllArgsEnclosedHere" 1>NUL```. _Your main escaping concern would then really only be other, (nested), doublequotes_. – Compo Jul 11 '23 at 13:51
  • @PlayLORD-SysOp, so you're happy to have two occasions where `EXIT /b` is followed by another `EXIT /b`. Also why not add the `nul` device redirection to the `CALL` command, for the entire script. You also seem to be using strange options with `cmd.exe`. I could live with `/x`, despite the fact it should be `/e` or `/e:on` on any system new enough to run the rest of your script, but I'm fairly certain that `/r` is not correct. Additionally the `/q` option is not required for any of your `DEL` commands. Why use `IF NOT %~z1 GEQ 1` rather than `IF %~z1 LSS 1`? – Compo Jul 11 '23 at 17:00
  • @Compo, `/r` is a legacy alias for `/c` – mklement0 Jul 11 '23 at 17:02
  • ...but it is even less well documented, and has no relevance to a script using the non legacy commands contained within it! Like I said initially, the entire thing is poorly put together, as if the intention is to make it as difficult to read and understand as possible. – Compo Jul 11 '23 at 17:06
  • No, it's just to be efficient @Compo - certain tasks are moved onto a single line, like echoing text to a file. It's not made to be hard to read, although I understand where you're coming from. Other than the PowerShell portion it only consists of a few if-then statements. I did remove the extra exit /b's; I put them there out of habit. Also, this script runs fine on the latest update of Win 11. – PlayLORD-SysOp Jul 11 '23 at 17:14
  • @PlayLORD-SysOp, it's more efficient to redirect at the `CALL` command, and then specify alternate stdout redirections within the script, if needed. Placing multiple commands on a line separated by ampersands is no more efficient. And not using full paths and extensions for your executables is less efficient than using them too. – Compo Jul 11 '23 at 17:20
  • @mklement0 is there a way to have this regex keep going after it finds a match, then run the output command for each match it finds? I know maybe not on a single line, but I'm not sure how the match works. (This whole thing is a self-assigned homework assignment, for lack of a better term, so I'm just trying to learn. I've been picking it back up in my spare time just to toy with, so if it's a pain to show me something like this, no worries, and thanks for everything.) – PlayLORD-SysOp Jul 11 '23 at 21:21
  • @PlayLORD-SysOp, if you replace `[regex]::Match()` with `[regex]::Matches()`, you'll get _all_ matches (which you'll have to loop over). – mklement0 Jul 11 '23 at 21:45
  • @mclayton it's called BAG, and it acts like a literal bag: you put things inside it, or empty it out. It's just a thing to help me learn and have fun with. Thanks for the help. It embeds the file into itself (appended after the last line) as a single line, with the tag at the start and end of the line. When it is extracted, the regex finds that line, parses the name and data, then streams the data out. It takes way longer to rebuild from Base64 to file than from file to Base64, which is why I was curious, but I think it's as good as I can get. I'm wondering about the regex continuing so I can do multiple extracts? – PlayLORD-SysOp Jul 11 '23 at 21:49
  • @mklement0 can you give a short example of how I would assign vars with a loop? It doesn't need to match my script, just grabbing group 1 and group 2 of the first match, then moving to the next and repeating, if it's not a pain. I saw the -AllMatches option and was about to try that, but this is a one-letter change. – PlayLORD-SysOp Jul 11 '23 at 21:53
  • @mklement0 I was hoping to get something like this working but it doesnt.. POWERSHELL -nop -c "$file=Get-Content '%~f0'; $match=[regex]::Matches($file,'\:\:([^\:]+)\:\:(.+?)\:\:\1\:\:') | Foreach-Object {$name=$match.Groups[1].Value; $data=$match.Groups[2].Value; [IO.File]::WriteAllBytes("""$name""", [Convert]::FromBase64String($data))}; (Get-Content '%~f0' -TotalCount 23) | Set-Content '%~dp0emptyBAG.cmd'">nul – PlayLORD-SysOp Jul 11 '23 at 22:17
  • @mklement0 I got it! This is what I mean in that I am learning from this. POWERSHELL -nop -c "$file=Get-Content '%~f0'; $match=[regex]::Matches($file,'\:\:([^\:]+)\:\:(.+?)\:\:\1\:\:') | Foreach-Object {$name=$_.Groups[1].Value; $data=$_.Groups[2].Value; [IO.File]::WriteAllBytes("""$name""", [Convert]::FromBase64String($data))}; (Get-Content '%~f0' -TotalCount 23) | Set-Content '%~dp0emptyBAG.cmd'">nul That scrolls through all matches and extracts ;P – PlayLORD-SysOp Jul 11 '23 at 22:20
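
The `Get-Content -Raw` and `(?m)` pointers in the comments above can be tried on a small scale. This is only a sketch on a hypothetical three-line temp file (`raw-demo.txt`), not on the BAG script itself. It shows that plain `Get-Content` yields an array of lines, that converting that array to a string joins the lines with spaces (so line starts are no longer visible to a regex), and that `-Raw` keeps the newlines so `(?m)^` can anchor at each line start.

```powershell
# Hypothetical sample file, used only for this demonstration.
$path = Join-Path ([IO.Path]::GetTempPath()) 'raw-demo.txt'
Set-Content -Path $path -Value 'one', 'two', 'three'

$lines = Get-Content -Path $path         # array with one string per line
$text  = Get-Content -Path $path -Raw    # single string, newlines preserved

# Converting the array to a string joins its elements with spaces.
$joined = [string] $lines

# (?m) lets ^ and $ match at line boundaries; \r? tolerates CRLF line endings.
$anchored = [regex]::IsMatch($text, '(?m)^two\r?$')

"Lines: $($lines.Count); joined: '$joined'; anchored match: $anchored"
```

Reading with `-Raw` also avoids building one string object per line, which is where most of the speed-up on large files would be expected to come from.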

1 Answer

The original line I was working on was

POWERSHELL -nop -c $file=Get-Content '%~f0'; $name=[regex]::Match^($file,'\:\:^([^^^\:]+^)\:\:^(.+?^)\:\:\1\:\:'^).Groups[1].Value.Replace^('::',''^).Trim^(^); $data=[regex]::Match^($file,'\:\:^([^^^\:]+^)\:\:^(.+?^)\:\:\1\:\:'^).Groups[2].Value.Replace^('::',''^).Trim^(^); [IO.File]::WriteAllBytes(\"$name\", [Convert]::FromBase64String($data)); ^(get-content '%~f0' -totalcount 26^) ^| set-content '%~dp0emptyBAG.cmd' >nul

There were unnecessary escape characters: by using """ in place of \" inside the PowerShell command, I could get the same effect and enclose the entire command in a single pair of ". That allowed me to remove the redundant ^ characters, leaving me with this:

POWERSHELL -nop -c "$file=Get-Content '%~f0'; $name=[regex]::Match($file,'\:\:([^^\:]+)\:\:(.+?)\:\:\1\:\:').Groups[1].Value.Replace('::','').Trim(); $data=[regex]::Match($file,'\:\:([^^\:]+)\:\:(.+?)\:\:\1\:\:').Groups[2].Value.Replace('::','').Trim(); [IO.File]::WriteAllBytes("""$name""", [Convert]::FromBase64String($data)); (get-content '%~f0' -totalcount 26) | set-content '%~dp0emptyBAG.cmd'" >nul

With the line cleaned up and easier to work with, the regex was changed from running twice to running once, with the result stored in a variable and both groups read from it:

POWERSHELL -nop -c "$file=Get-Content '%~f0'; $match=[regex]::Match($file,'\:\:([^^\:]+)\:\:(.+?)\:\:\1\:\:'); $name = $match.Groups[1].Value.Replace('::','').Trim(); $data = $match.Groups[2].Value.Replace('::','').Trim(); [IO.File]::WriteAllBytes("""$name""", [Convert]::FromBase64String($data)); (get-content '%~f0' -totalcount 26) | set-content '%~dp0emptyBAG.cmd'" >nul

But what would happen if the file contained multiple matches? At this point the command would stop after the first one. To process all of them with the same command, Match was switched to Matches and a Foreach-Object loop cycles through each match, using $_ as the current object. (I also removed the Trim calls because the input data is cleaner now, and changed the output file name and the -TotalCount value because the script now has fewer lines.)

POWERSHELL -nop -c "$file=Get-Content '%~f0'; $match=[regex]::Matches($file,'\:\:([^^\:]+)\:\:(.+?)\:\:\1\:\:') | Foreach-Object {$name=$_.Groups[1].Value; $data=$_.Groups[2].Value; [IO.File]::WriteAllBytes("""$name""", [Convert]::FromBase64String($data))}; (Get-Content '%~f0' -TotalCount 17) | Set-Content '%~f0'">nul
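
For readability, the same `Matches` loop can be written as a multi-line script. The following is a sketch under assumptions, not the line used in BAG.cmd: it builds a hypothetical stand-in file (`bag.txt`) in a fresh temp folder with two payload lines, reads it with `-Raw`, and writes each extracted file next to it using a full path.

```powershell
# Hypothetical stand-in for the BAG file: one script line plus two payload lines.
$dir = Join-Path ([IO.Path]::GetTempPath()) ('bag-' + [guid]::NewGuid().ToString('N'))
New-Item -ItemType Directory -Path $dir | Out-Null
$bag = Join-Path $dir 'bag.txt'

# Small helper (illustrative only) that Base64-encodes an ASCII string.
$encode = { param($s) [Convert]::ToBase64String([Text.Encoding]::ASCII.GetBytes($s)) }
Set-Content -Path $bag -Value @(
    '@ECHO OFF'
    ('::a.txt::' + (& $encode 'first') + '::a.txt::')
    ('::b.txt::' + (& $encode 'second') + '::b.txt::')
)

# One pass over the text; every match carries both the name and the data.
$text = Get-Content -Path $bag -Raw
$extracted = foreach ($m in [regex]::Matches($text, '::([^:]+)::(.+?)::\1::')) {
    $name   = $m.Groups[1].Value
    $data   = $m.Groups[2].Value
    $target = Join-Path $dir $name
    [IO.File]::WriteAllBytes($target, [Convert]::FromBase64String($data))
    $target   # emit the path so the caller can see what was written
}
```

After it runs, `$extracted` holds the full paths of the files that were written.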

EDIT: I was able to remove more escape characters. I didn't need the backslashes before each : in the regex, but the ^ in the regex did still require one (^^).

POWERSHELL -nop -c "$file=Get-Content '%~f0'; $match=[regex]::Matches($file,'::([^^:]+)::(.+?)::\1::') | Foreach-Object {$name=$_.Groups[1].Value; $fname=$name; while(Test-Path -Path """%~dp0$fname""") { $n++; $fname="""($n)$name""" }; $data=$_.Groups[2].Value; [IO.File]::WriteAllBytes("""$fname""", [Convert]::FromBase64String($data))}; (Get-Content '%~f0' -TotalCount 22) | Set-Content '%~f0'">nul
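
The duplicate protection in that line can also be sketched as a small function. `Get-FreeName` is a hypothetical helper name, not part of the script. Compared with the one-liner, this sketch restarts the counter for every file and returns a full path, so the existence check and the write refer to the same location.

```powershell
# Hypothetical helper: returns a full path in $Directory that does not exist yet,
# prefixing the name with (1), (2), ... in the same style as the one-liner.
function Get-FreeName {
    param([string] $Directory, [string] $Name)
    $candidate = $Name
    $n = 0   # counter starts over for every file name
    while (Test-Path -LiteralPath (Join-Path $Directory $candidate)) {
        $n++
        $candidate = "($n)$Name"
    }
    Join-Path $Directory $candidate
}

# Demonstration in a fresh temp folder (paths are illustrative only).
$dir = Join-Path ([IO.Path]::GetTempPath()) ('bag-' + [guid]::NewGuid().ToString('N'))
New-Item -ItemType Directory -Path $dir | Out-Null

$first = Get-FreeName -Directory $dir -Name 'a.txt'
[IO.File]::WriteAllBytes($first, [byte[]](1, 2, 3))
$second = Get-FreeName -Directory $dir -Name 'a.txt'   # a.txt now exists
```

Passing a full path to `[IO.File]::WriteAllBytes` matters because .NET resolves relative paths against the process working directory, which is not necessarily the folder the script lives in.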

Thank you all for your help!

  • Thanks for posting an answer; regarding quoting: When calling PowerShell's _CLI_ (`powershell.exe` for _Windows PowerShell_, `pwsh` for _PowerShell (Core) 7+_) from the outside, using (possibly implied) `-Command` / `-c`, you need to _escape_ `"` chars. you want passed through as part of the command: `\"` works in principle, but can break when calling from `cmd.exe`. In that case, use `"^""` (sic) with `powershell.exe`, and `""` with `pwsh.exe`, inside overall `"..."` quoting. See [this answer](https://stackoverflow.com/a/49060341/45375) for details. – mklement0 Jul 12 '23 at 22:20
  • Also, `:` does _not_ require escaping. – mklement0 Jul 12 '23 at 22:22
  • `^` chars. outside of (what `cmd.exe` thinks is inside) `"..."` are _stripped_ before the target command sees it, so that, for instance, `[^\:]` turns into `[\:]` – mklement0 Jul 12 '23 at 22:27
  • Does that not signify to match at the start (^) of a line containing : for the regex? Oh wow, I guess that wasn't working... I need two for it to pass. – PlayLORD-SysOp Jul 12 '23 at 22:30
  • Funny you mentioned the ^ not being needed for escaping, and I added it to the embed command right after ;P I do need to escape those as CMD sees them, AND I wanted to avoid extra :: around the script anyway, even though it should only be looking at the start of a line for the patterns. I know what you were saying was unrelated because they weren't there when you posted that. But thank you for catching the missing escape char too! There are no offending matches in the script itself, so I wouldn't have caught that unless I got bored and stared at it ;P – PlayLORD-SysOp Jul 12 '23 at 22:39
  • @mklement0 I know it's not supposed to, but the regex won't work without the current escapes. I've tried removing them. I don't think I can reduce it any further. – PlayLORD-SysOp Jul 12 '23 at 22:46
  • I haven't looked at the specifics of your command, but here are general pointers: `^` is `cmd.exe`'s escape character in _unquoted_ tokens, but is preserved as-is inside `"..."`. If you nest ``\"...\"`` inside `"..."`, `cmd.exe` sees what's inside the latter as _unquoted_ - hence the recommendation to use `"^""` (Windows PowerShell) / `""` (PowerShell 7+). A `^` that _reaches PowerShell_ has no special meaning _to PowerShell itself_. – mklement0 Jul 13 '23 at 00:54
  • In PowerShell, inside of a quoted string used _as a regex_, the meaning of `^` is context-dependent: as part of `[^...]` it _negates_ the character set or range represented by `...`. Otherwise, it is an _anchor_ (assertion) for the _start_ of the input string (by default), or - with inline regex option `(?m)` - for the start of a _line_. – mklement0 Jul 13 '23 at 00:55
  • What would the altered regex look like to add start-of-line only? Just prepend it with that? Or harder? I also added duplicate protection to the script in that PowerShell line, with a while loop that makes sure the file doesn't exist before naming it (so no data can be lost when duplicates are dropped in and would overwrite each other during extraction, or when files with the same names already exist in the folder before extraction). PowerShell is great. CMD still has some use with things like extracting a filename from a path with %~nF etc. ;P I had to prepend the counter to the names, (n)filename.txt instead of filename(n).txt. – PlayLORD-SysOp Jul 13 '23 at 01:04
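
Regarding the last question in this thread, a sketch of what prepending the option could look like, assuming the text is held as a single string with newlines (as `Get-Content -Raw` would give) and using made-up sample lines rather than the real script. It compares the unanchored pattern with the same pattern prefixed by `(?m)^`.

```powershell
# Made-up sample text: a comment line that happens to contain a tag pair mid-line,
# followed by a real payload line that starts with the tag.
$payload = [Convert]::ToBase64String([Text.Encoding]::ASCII.GetBytes('hello'))
$text = @(
    'REM a comment mentioning ::fake::AAAA::fake:: mid-line'
    ('::note.txt::' + $payload + '::note.txt::')
) -join "`n"

$pattern = '::([^:]+)::(.+?)::\1::'

# Unanchored: matches anywhere, including the tag pair inside the comment.
$loose = [regex]::Matches($text, $pattern)

# (?m)^ prepended: the opening tag must sit at the start of a line.
$anchored = [regex]::Matches($text, '(?m)^' + $pattern)

$looseNames    = @($loose    | ForEach-Object { $_.Groups[1].Value })
$anchoredNames = @($anchored | ForEach-Object { $_.Groups[1].Value })
```

Here `[^:]` is the negated character class and the `^` after `(?m)` is the line-start anchor, the two meanings of `^` described in the comments above.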