
I have a PowerShell script that parses each row in a file, reformats it, and writes the new string to an output file. It works fine with an input file of a few hundred lines. However, I ultimately need to run it against a file with a few million lines; I have been waiting for hours and it still hasn't finished. Following this post, I think I need to put Write-Output outside of the loop, but I've been unsuccessful so far.

This is my current code:

Foreach ($line in Get-Content $logFile) {

    $arr = $line.Split()

    $port1 = $arr[9].Split(":")

    $port2 = $arr[11].Split(":")

    $connstring = '|' + $port1[0] + "|" + $port1[1] + "|" + $port2[0] + "|" + $port2[1] + "|" + $arr[4] + "|"

    Write-Output $connstring | Out-File "C:\logging\output\logout.txt" -Append 
}

An example of an input string is:

06/14-04:40:11.371923  [**] [1:4:0] other [**] [Priority: 0] {TCP} 67.202.196.92:80 -> 192.168.1.105:55043

And I need to reformat it to this:

|67.202.196.92|80|192.168.1.105|55043|other|

Any help is much appreciated!

yodish

3 Answers


Piping Get-Content (here with -ReadCount 1, which is also the default) into ForEach-Object streams the file one line at a time, rather than reading the entire file into memory first as the foreach statement does. I suspect that moving the write operation outside of your loop will also be faster, since calling Out-File -Append inside the loop reopens the output file on every iteration. Fewer variables and steps inside your loop will probably help too.

Assuming the element at index 4 after the split doesn't contain a colon (you didn't supply an example of your file), something like this should do the trick:

Get-Content $logFile -ReadCount 1 | % {
    '|' + (($_.Split()[9, 11, 4] -replace ':', '|') -join '|') + '|' 
} | Out-File "C:\logging\output\logout.txt"
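
As a quick check, the same expression can be applied to the example line from the question. This is only a sketch: `$line` and `$result` are names introduced here for illustration, and it assumes the sample keeps the two spaces after the timestamp, which is why the address fields land at indexes 9 and 11.

```powershell
# Example input line copied from the question (note the double space after the timestamp).
$line = '06/14-04:40:11.371923  [**] [1:4:0] other [**] [Priority: 0] {TCP} 67.202.196.92:80 -> 192.168.1.105:55043'

# Split on whitespace, pick source, destination and label, then turn ':' into '|'.
$result = '|' + (($line.Split()[9, 11, 4] -replace ':', '|') -join '|') + '|'

$result   # |67.202.196.92|80|192.168.1.105|55043|other|
```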
Dave Sexton
  • Thank you Dave; so far, my initial testing with your code seems a little faster but unfortunately not by much. I've updated my original post with an example of an input string and the reformatted string. Perhaps my code logic needs tweaked? – yodish Jul 01 '17 at 23:27
  • Update, this code was substantially faster. Thank you Dave! – yodish Jul 02 '17 at 00:14

It might help to remove the string addition from your string construction and use an expandable string instead:

$connstring = "|$($port1[0])|$($port1[1])|$($port2[0])|$($port2[1])|$($arr[4])|"

Try using Measure-Command to test with sample data sets.
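
A minimal sketch of such a comparison is below. The sample size of 10,000 lines and the variable names (`$sample`, `$lines`, `$concat`, `$interp`) are assumptions made for illustration, and the timings will vary by machine and PowerShell version, so treat the result as a rough indication only.

```powershell
# Hypothetical test data: the example line from the question, repeated 10,000 times.
$sample = '06/14-04:40:11.371923  [**] [1:4:0] other [**] [Priority: 0] {TCP} 67.202.196.92:80 -> 192.168.1.105:55043'
$lines  = @($sample) * 10000

# Time the original approach: string concatenation with '+'.
$concat = Measure-Command {
    foreach ($line in $lines) {
        $arr   = $line.Split()
        $port1 = $arr[9].Split(':')
        $port2 = $arr[11].Split(':')
        $null  = '|' + $port1[0] + '|' + $port1[1] + '|' + $port2[0] + '|' + $port2[1] + '|' + $arr[4] + '|'
    }
}

# Time the variant that builds the string with an expandable string.
$interp = Measure-Command {
    foreach ($line in $lines) {
        $arr   = $line.Split()
        $port1 = $arr[9].Split(':')
        $port2 = $arr[11].Split(':')
        $null  = "|$($port1[0])|$($port1[1])|$($port2[0])|$($port2[1])|$($arr[4])|"
    }
}

# Measure-Command returns a TimeSpan; report both in milliseconds.
'{0,-14} {1,10:N1} ms' -f 'Concatenation', $concat.TotalMilliseconds
'{0,-14} {1,10:N1} ms' -f 'Interpolation', $interp.TotalMilliseconds
```

Both loops discard the result with `$null =` so that only the string construction is timed, not the file output.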

Maximilian Burszley

Try something like this:

$test="06/14-04:40:11.371923  [**] [1:4:0] other [**] [Priority: 0] {TCP} 67.202.196.92:80 -> 192.168.1.105:55043"

$template=@"
{Row:06/14-04:40:11.371923  [**] [1:4:0] {Text:other} [**] [Priority: 0] \{TCP\} {IPIN:67.202.196.92}:{PORTIN:80} -> {IPOUT:192.168.1.105}:{PORTOUT:55043}}
"@

$test| ConvertFrom-String -TemplateContent $template |%{"|{0}|{1}|{2}|{3}|{4}|" -f $_.Row.IPIN, $_.Row.PORTIN, $_.Row.IPOUT , $_.Row.PORTOUT , $_.Row.Text }

But you could export directly to CSV like this:

$template=@"
{Row:06/14-04:40:11.371923  [**] [1:4:0] {Text:other} [**] [Priority: 0] \{TCP\} {IPIN:67.202.196.92}:{PORTIN:80} -> {IPOUT:192.168.1.105}:{PORTOUT:55043}}
"@

Get-Content $logFile | ConvertFrom-String -TemplateContent $template | % {
    [pscustomobject]@{
        IPIN    = $_.Row.IPIN
        PORTIN  = $_.Row.PORTIN
        IPOUT   = $_.Row.IPOUT
        PORTOUT = $_.Row.PORTOUT
        Text    = $_.Row.Text
    }
} | Export-Csv "C:\logging\output\logout.csv" -Append -NoTypeInformation
Esperento57
  • In the second code block, the last line is an alternative foreach and should IMO be commented out/deleted. Otherwise +1; I think `ConvertFrom-String` with a template is heavily underestimated. – Jul 02 '17 at 17:08