
In PowerShell, how can I read, as fast as possible, the last line (or all of the lines) containing a specific string in a huge text file (about 200,000 lines / 30 MB)? I'm using:

get-content myfile.txt | select-string -pattern "my_string" -encoding ASCII | select -last 1

But it's very slow (about 16-18 seconds). I ran tests without the last pipe, "select -last 1", and it takes the same time.

Is there a faster way to get the last occurrence (or all occurrences) of a specific string in a huge file?

Perhaps that's simply the time it takes... Or is there any possibility to read the file faster from the end, since I only want the last occurrence? Thanks

SA345
  • The reason there was no change whether you piped to "Select -last 1" or not is that the whole file has to be processed to know which is "last". – Kevin Buchan Jan 23 '14 at 14:17
  • You may need to use .NET to have some performance there: [Start reading massive text file from the end](http://stackoverflow.com/questions/13621225/start-reading-massive-text-file-from-the-end). – Victor Zakharov Jan 23 '14 at 16:11

5 Answers


Try this:

get-content myfile.txt -ReadCount 1000 |
 foreach { $_ -match "my_string" }

That will read your file in chunks of 1000 records at a time and find the matches in each chunk. This gives you better performance because you aren't wasting a lot of CPU time on memory management, since there are only 1000 lines at a time in the pipeline.
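
If you only need the last occurrence, you can keep the chunked read and still take the last match at the end; a minimal sketch, reusing the file name and string from the question:

# filter each 1000-line chunk, then keep only the last matching line
get-content myfile.txt -ReadCount 1000 |
 foreach { $_ -match "my_string" } |
 select -last 1

select -last 1 still has to wait until the whole file has been read, but only a single matching line is kept at the end of the pipeline.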

mjolinor

Have you tried:

gc myfile.txt | % { if($_ -match "my_string") {write-host $_}}

Or, you can create a "grep"-like function:

function grep($f,$s) {
    gc $f | % {if($_ -match $s){write-host $_}}
}

Then you can just issue: grep myfile.txt "my_string"
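
One caveat: write-host only prints to the console, so the output can't be piped any further. A sketch of a variant that sends the matching lines down the pipeline instead, so something like select -last 1 from the question still works:

function grep($f,$s) {
    # emitting $_ (instead of using write-host) writes the matching line to the pipeline
    gc $f | % { if ($_ -match $s) { $_ } }
}

grep myfile.txt "my_string" | select -last 1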

Robbie P

# read the file line by line with a .NET StreamReader and collect the matching lines
$reader = New-Object System.IO.StreamReader("myfile.txt")

$lines = @()

if ($reader -ne $null) {
    while (!$reader.EndOfStream) {
        $line = $reader.ReadLine()
        if ($line.Contains("my_string")) {
            $lines += $line
        }
    }
    # release the file handle when done
    $reader.Close()
}

$lines | Select-Object -Last 1
riabovil

Have you tried using [System.IO.File]::ReadAllLines()? This method is more "raw" than the PowerShell-esque approach, since it plugs directly into the Microsoft .NET Framework types.

$Lines = [System.IO.File]::ReadAllLines("myfile.txt");
[Regex]::Matches($Lines, 'my_string_pattern');
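
If you want the whole matching lines (and in particular the last one) rather than the regex Match objects, a sketch along the same lines, assuming the file and string from the question:

$Lines = [System.IO.File]::ReadAllLines("myfile.txt")
# -match against an array returns the elements (whole lines) that match the pattern
$Lines -match "my_string" | Select-Object -Last 1

Note that ReadAllLines resolves a relative path against the .NET current directory, which is not always the same as the current PowerShell location, so a full path is safer here.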
  • May be slow for big files, or even crash due to out of memory exception. – Victor Zakharov Jan 23 '14 at 16:08
  • this user specifically said he's using large files, why would you post a solution that "can crash if used with large files"? – Chad Baxter Nov 08 '15 at 22:32
  • If I want the regex to give the whole line where the pattern matches, how do I do that? [Regex]::Matches($line, 'Database:') gives where it matches Database:, but it should give the database name as well. – deepti Feb 14 '18 at 05:28

I wanted to extract the lines that contained "failed" and also write those lines to a new file; I'll add the full command for this:

get-content log.txt -ReadCount 1000 |
  foreach { $_ -match "failed" } | Out-File C:\failes.txt
Avram Virgil