
Some colleagues and I have been trying to figure out exactly why this excerpt of a script runs so much faster in ISE than in the shell.

For context, the entire script (which compares AD hashes to a list of known compromised hashes) runs in ISE in about 30 minutes with the expected results. However, when invoked remotely or run locally from the shell, it takes up to 10 days in some cases.

We've found that this bit of code in a function is where things go wonky. I'm not certain, but I suspect the cause is the use of System.IO.StreamReader, specifically the calls to the ReadLine() method.

$fsHashDictionary = New-Object IO.FileStream $HashDictionary,'Open','Read','Read'
$frHashDictionary = New-Object System.IO.StreamReader($fsHashDictionary)

# Read the dictionary one line at a time; each line has the form HASH:FREQUENCY
while ($null -ne ($lineHashDictionary = $frHashDictionary.ReadLine())) {

    if ($htADNTHashes.ContainsKey($lineHashDictionary.Split(":")[0].ToUpper()))
    {
        $foFoundObject = [PSCustomObject]@{
            User = $htADNTHashes[$lineHashDictionary.Split(":")[0].ToUpper()]
            Frequency = $lineHashDictionary.Split(":")[1]
            Hash = $lineHashDictionary.Split(":")[0].ToUpper()
        }
        $mrMatchedResults += $foFoundObject
    }
}
  • Not enough information to tell. What _exactly_ does the file at `$HashDictionary` contain? How did you populate `$htADNTHashes`? What type of object is `$mrMatchedResults`? Please [post a complete example](https://stackoverflow.com/help/minimal-reproducible-example) – Mathias R. Jessen Jun 18 '21 at 20:24
  • Is `Write-Progress` being used at any point in your code? – Santiago Squarzon Jun 18 '21 at 20:27
  • As an aside: performance will generally improve if you use `$mrMatchedResults = while (...) { ... }`, i.e. if you simply output the `[pscustomobject]` instances in the loop and let PowerShell collect them in an array. "Extending" arrays with `+=` requires recreating the array every time. – mklement0 Jun 18 '21 at 20:30
  • $mrMatchedResults is an array for containing the results of the loop. – basht0p Jun 18 '21 at 20:54
  • $HashDictionary is the contents of "https://downloads.pwnedpasswords.com/passwords/pwned-passwords-ntlm-ordered-by-hash-v7.7z" $htADNTHashes is the content of a text file. Password hashes that were pulled using Get-ADReplAccount from DSInternals. Write-Progress is nowhere. – basht0p Jun 18 '21 at 21:02
  • Check this interesting video on the slow performance of += with arrays in PowerShell, versus something meant to hold a shifting collection of items, the arraylist. https://www.youtube.com/watch?v=Yp_m5T_kyJU&t=1227s – FoxDeploy Jun 18 '21 at 21:30
  • As commented by @IInspectable on my answer, there is much support to prove my memory-related suggestion. Can you add more details (to the question) to confirm that you are using the same system for both tests? What is the health (memory/disk usage) of the system and the script? How did you come to the conclusion that the highlighted script excerpt is related to the performance issue? Can you supply a [mcve]? What PowerShell version are you using? (Although there is no *direct* relation with the performance, note the minor difference in default [apartment state](https://stackoverflow.com/a/16073022/1701026) for v2.0.) – iRon Jun 20 '21 at 15:18

2 Answers


It's probably because ISE uses the WPF framework and benefits from hardware acceleration, whereas a PowerShell console does not.

alexzelaya
  • Where can I read about this? Quite interested in your answer. – Santiago Squarzon Jun 19 '21 at 00:35
  • `Windows PowerShell ISE is built on the Windows Presentation Foundation (WPF). If the graphical elements of Windows PowerShell ISE do not render correctly on your system, you might resolve the problem by adding or adjusting the "Disable WPF Hardware acceleration" graphics rendering settings on your system. For more information, see Graphics Rendering Registry Settings.` https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_windows_powershell_ise?view=powershell-5.1 – alexzelaya Jun 19 '21 at 00:59
  • Yeah, I was expecting an article regarding the performance advantages of PowerShell ISE over the PowerShell CLI... – Santiago Squarzon Jun 19 '21 at 01:04
  • I think you've clearly measured the performance advantages yourself. This is kind of a first so it might be worth it documenting your findings for the community. – alexzelaya Jun 19 '21 at 01:08
  • Yeah, In my experience, I have not actually, which is why I was asking you since you posted this as an answer. – Santiago Squarzon Jun 19 '21 at 01:20
  • Oh! My bad, I thought you owned the question. It's a given that access to the GPU mem will deliver higher performance. I just had not ever seen such a difference when running ISE vs PS console. It just seemed obvious there was a performance benefit of some kind. – alexzelaya Jun 19 '21 at 01:26
  • [These are my findings](https://i.imgur.com/IX3aI9V.png) when looping through files and folders comparing ISE vs CLI (CLI being on the right side) I know this is not the same as to what OP is doing but you can clearly see here that CLI performs much better than ISE in this example. Code can be found [here](https://github.com/santysq/Linear-Loops-vs-ThreadJob-vs-Runspace) – Santiago Squarzon Jun 19 '21 at 01:31
  • I have to do some checking, but it seems that the file objects being loaded and manipulated benefit from HW acceleration. Your script, although it doesn't line up with this scenario, is interesting and I will definitely check it out. – alexzelaya Jun 19 '21 at 01:44
  • 1
    @Santiago, any good articles you can recommend on runspaces? – Abraham Zinala Jun 19 '21 at 01:51
  • @AbrahamZinala Sorry, I don't have any articles to recommend. I got the basics from reading [this awesome answer from Mathias](https://stackoverflow.com/questions/41796959/why-powershell-workflow-is-significantly-slower-than-non-workflow-script-for-xml/41797153#41797153). I got lucky that I could always use the ThreadJob module. – Santiago Squarzon Jun 19 '21 at 01:53
  • WPF can also data bind, it's not just rendering. – alexzelaya Jun 19 '21 at 07:45
  • 1
    There is no GUI. And there's no data binding in the question. What are you talking about? – IInspectable Jun 19 '21 at 17:01

AFAIK, there isn't anything that can explain a "script runs hundreds of times faster in ISE than in the shell". I therefore suspect that differences in available memory between the two sessions are causing your script to run into performance issues.
Keep in mind that custom PowerShell objects are pretty heavy. To give you an idea of how much memory they consume, try something like this:

$memBefore = (Get-Process -Id $pid).WS
$foFoundObject = [PSCustomObject]@{
    User = $htADNTHashes[$lineHashDictionary.Split(":")[0].ToUpper()]
    Frequency = $lineHashDictionary.Split(":")[1]
    Hash = $lineHashDictionary.Split(":")[0].ToUpper()
}
$memAfter = (Get-Process -Id $pid).WS
$memAfter - $memBefore
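
The footprint of a single object can easily be lost in working-set noise, so here is a rough sketch of one way to get a steadier figure: create many objects and average the change in managed memory. The object count and the property values are made-up placeholders, not data from the question, and the result is only an approximation.

```powershell
# Sketch only: $count and the property values are arbitrary placeholders.
$count = 100000

# Force a collection first so the baseline is as stable as possible.
$before = [GC]::GetTotalMemory($true)

# Keep the objects referenced so they cannot be collected before the second reading.
$objects = foreach ($i in 1..$count) {
    [PSCustomObject]@{
        User      = "user$i"
        Frequency = "$i"
        Hash      = 'ABCDEF0123456789ABCDEF0123456789'
    }
}

$after = [GC]::GetTotalMemory($true)

# Approximate managed bytes per object.
$bytesPerObject = ($after - $before) / $count
'{0:N0} bytes per object (approx.)' -f $bytesPerObject
```

This measures the managed heap rather than the process working set, so the number is not directly comparable to the `WS` difference above.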

Add to that the fact that arrays (such as $mrMatchedResults) have a fixed size, so the array is rebuilt every time you use the increase assignment operator (+=), and the PowerShell session might be running out of physical memory, causing Windows to constantly swap memory pages.
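
As a minimal sketch of the alternative, the loop below outputs each match and lets PowerShell collect the results once, instead of rebuilding the array with += on every hit. The variable names follow the question, but the hash table contents and the three sample lines are hypothetical stand-ins for the AD data and the dictionary file.

```powershell
# Hypothetical stand-in data; in the question these come from AD and the dictionary file.
$htADNTHashes = @{ 'AAAA' = 'alice'; 'CCCC' = 'carol' }
$lines        = 'aaaa:12', 'bbbb:7', 'cccc:3'

# Assign the output of the whole loop; no += inside the loop body.
$mrMatchedResults = foreach ($line in $lines) {
    # Split once per line instead of once per property.
    $parts = $line.Split(':')
    $hash  = $parts[0].ToUpper()

    if ($htADNTHashes.ContainsKey($hash)) {
        # Emitting the object adds it to the collected output.
        [PSCustomObject]@{
            User      = $htADNTHashes[$hash]
            Frequency = $parts[1]
            Hash      = $hash
        }
    }
}
```

The same pattern works with the `while` loop from the question: put the assignment in front of `while` and drop the `+=` line.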

.NET classes like [System.IO.StreamReader] are definitely a lot faster than PowerShell cmdlets (such as Get-Content), but that doesn't mean that you have to put everything into memory. Meaning, instead of accumulating the results in $mrMatchedResults (which keeps every match in memory), stream each object to the next cmdlet.

Especially for your main object, try to respect the PowerShell pipeline. As recommended in "Why should I avoid using the increase assignment operator (+=) to create a collection?", you had better not assign the output at all, but pass the pipeline output directly to the next cmdlet (and eventually release it to its destination, e.g. the display, AD, or disk) to free up memory.

And if you do use .NET classes (such as the StreamReader class), make sure that you dispose of the object as shown in the PowerShell scripting performance considerations article; otherwise your function might leak even more memory than required.
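
Putting the last two points together, here is a rough sketch of a function that streams each match to the pipeline and disposes of the reader in a `finally` block. The function name `Find-CompromisedHash`, its parameters, and the demo file contents are hypothetical and not part of the original script; the sketch assumes the HASH:FREQUENCY line format used in the question.

```powershell
# Hypothetical helper: emits one object per match instead of building a collection.
function Find-CompromisedHash {
    param(
        [Parameter(Mandatory)] [string]    $Path,
        [Parameter(Mandatory)] [hashtable] $KnownHashes
    )

    $reader = New-Object System.IO.StreamReader($Path)
    try {
        while ($null -ne ($line = $reader.ReadLine())) {
            $parts = $line.Split(':')
            $hash  = $parts[0].ToUpper()
            if ($KnownHashes.ContainsKey($hash)) {
                # Written straight to the pipeline; nothing is kept in the function.
                [PSCustomObject]@{
                    User      = $KnownHashes[$hash]
                    Frequency = $parts[1]
                    Hash      = $hash
                }
            }
        }
    }
    finally {
        # Runs even if the loop throws or the pipeline is stopped early.
        $reader.Dispose()
    }
}

# Demo with made-up data: matches go to disk without being stored in a variable.
$demoFile   = [System.IO.Path]::GetTempFileName()
$demoHashes = @{ 'AAAA' = 'alice'; 'CCCC' = 'carol' }
Set-Content -Path $demoFile -Value 'aaaa:12', 'bbbb:7', 'cccc:3'

Find-CompromisedHash -Path $demoFile -KnownHashes $demoHashes |
    Export-Csv -Path "$demoFile.csv" -NoTypeInformation
```

A full path is passed to the reader because .NET resolves relative paths against the process working directory, which can differ from the current PowerShell location.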

The performance of a complete (PowerShell) solution is supposed to be better than the sum of its parts. In other words, don't focus too much on a single function when it comes to performance issues; instead, look at your whole solution. The PowerShell pipeline gives you the opportunity to, for example, load objects from AD and process them almost simultaneously, using just a little more memory than each single object.

iRon
  • None of that explains the performance difference when executed in ISE and the command prompt. – IInspectable Jun 20 '21 at 06:23
  • @IInspectable, that is correct. What I am trying to say here is that it probably concerns an incorrect conclusion. I don't think there is anything that can explain a "*Script runs **hundreds of times** faster in ISE than in the shell*". I suspect there are minor memory differences between the two sessions (possibly worsened by memory leaks from previous tryouts) that are probably causing this issue. – iRon Jun 20 '21 at 06:43
  • That doesn't make much sense. If there is a memory leak then the CLR's compacting heap implementation will eventually move the leaked memory into the same region that ultimately gets paged out to disk and is never touched again. Performance implications? None. – IInspectable Jun 20 '21 at 06:52