1

I have a large file (250 GB) in which I need to search for a string. Once I find it, I need to copy everything from that line through the end of the file. Example file:

Bird
Lion
Tiger
Jaguar
Frog
Snake

Result would be:

Jaguar
Frog
Snake

I am new to PowerShell and have tried the following, but it only finds the string Jaguar and prints it; I need the lines that follow as well.

Get-Content -Path "C:\Dump\test1.txt" |
Select-String 'Jaguar' |
Set-Content -Path "C:\Dump\test2.txt"
Santiago Squarzon
maday15

2 Answers

5

Since you say your file is really large (and the resulting file is possibly large too), I would use a switch statement and a StreamWriter:

$writer  = [System.IO.StreamWriter]::new('C:\Dump\test2.txt')
$foundMarker = $false
switch -Regex -File 'C:\Dump\test1.txt' {
    '\bJaguar\b' { $foundMarker = $true; $writer.WriteLine($_) }
    default { if ($foundMarker) { $writer.WriteLine($_) } }
}
# clean up
$writer.Flush()
$writer.Dispose()

The \b word-boundary anchors surrounding the keyword Jaguar make it a whole-word search.

P.S. If you need the keyword to be matched case-sensitively, add the -CaseSensitive parameter to the switch statement: switch -Regex -CaseSensitive -File 'C:\Dump\test1.txt' {...}
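As a hedged sketch of the same switch/StreamWriter idea, the loop can be wrapped in try/finally so the writer is closed even if reading fails partway through. The function name Copy-FromMarker and its parameter names are made up for illustration, and -Destination is assumed to be an absolute path, since .NET does not resolve relative paths against PowerShell's current location.

```powershell
# Hypothetical helper; the name and parameters are illustrative only.
function Copy-FromMarker {
    param (
        [string] $Path,         # file to read
        [string] $Pattern,      # regex that marks the first line to copy
        [string] $Destination   # assumed to be an absolute path
    )
    $writer = [System.IO.StreamWriter]::new($Destination)
    $foundMarker = $false
    try {
        switch -Regex -CaseSensitive -File $Path {
            # The first matching line flips the flag and is written itself.
            $Pattern { $foundMarker = $true; $writer.WriteLine($_) }
            # Every other line is written only once the marker has been seen.
            default  { if ($foundMarker) { $writer.WriteLine($_) } }
        }
    }
    finally {
        # Dispose() flushes buffered output and closes the file.
        $writer.Dispose()
    }
}
```

For the sample file in the question, Copy-FromMarker -Path 'C:\Dump\test1.txt' -Pattern '\bJaguar\b' -Destination 'C:\Dump\test2.txt' should write Jaguar, Frog and Snake. Drop -CaseSensitive if a case-insensitive match is wanted.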

Theo
  • Nicely done. Note that calling `.Dispose()` (or `.Close()`) _implicitly_ flushes, so I don't think you need an explicit `.Flush()` call. – mklement0 Aug 01 '21 at 21:17
0

I created a simple, understandable function that you can use with large files:

function Get-Content-Since-Equals-To-File(){
    param (
        [string] $Path,        
        [string] $LineText,
        [string] $PathNewFile
    )
    $writer  = [System.IO.StreamWriter]::new($PathNewFile)
    $continue=0
    foreach($line in [System.IO.File]::ReadLines($Path))
    {    
        if($line.Equals($LineText)){$continue=1}
        if( $continue -eq 1){
            #Add-Content -Path $PathNewFile -Value $line #According to  mklement0 using Add-Content is really slow
            $writer.WriteLine($line);
        }
    }
    $writer.Dispose();    
}

You can then invoke the function by passing the source file path, the text of the line from which you want to start copying, and the new file path:

Get-Content-Since-Equals-To-File -Path ./1.txt "Jaguar" -PathNewFile './newFile.txt'

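The call above produces a file with the desired result (note that I'm using relative paths as an example; in day-to-day use you should pass absolute paths, because .NET methods resolve relative paths against the process's working directory, which can differ from PowerShell's current location):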

Get-Content ./newFile.txt
Jaguar
Frog
Snake

This function is based on "Read file line by line in PowerShell". Because it reads the file line by line, you can use it with large files.

If you don't need an exact match, you can adapt the function to use other conditions.

Thanks to @mklement0 for the improvement over Add-Content; I updated the code to use a StreamWriter.
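As one possible adaptation (a sketch, not the original function), the exact-line comparison can be swapped for a case-sensitive substring test. The name Copy-LinesFromFirstMatch and its parameters are hypothetical. Both paths are resolved to full paths first, so relative PowerShell paths should also work with the .NET calls.

```powershell
# Hypothetical variant; the name and parameters are illustrative only.
function Copy-LinesFromFirstMatch {
    param (
        [Parameter(Mandatory)] [string] $Path,
        [Parameter(Mandatory)] [string] $Text,
        [Parameter(Mandatory)] [string] $Destination
    )
    # Convert-Path requires the path to exist, which is fine for the input file.
    $source = Convert-Path -LiteralPath $Path
    # The output file may not exist yet, so resolve it without an existence check.
    $target = $ExecutionContext.SessionState.Path.GetUnresolvedProviderPathFromPSPath($Destination)

    $writer  = [System.IO.StreamWriter]::new($target)
    $copying = $false
    try {
        foreach ($line in [System.IO.File]::ReadLines($source)) {
            # Start copying at the first line that contains the text.
            if (-not $copying -and $line.Contains($Text)) { $copying = $true }
            if ($copying) { $writer.WriteLine($line) }
        }
    }
    finally {
        $writer.Dispose()
    }
}
```

Because String.Contains compares ordinally, 'jaguar' would not match 'Jaguar' here; replace the test with $line -match or $line -like if looser matching is needed.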

  • Thanks for updating. You can solve the relative-path problem by replacing `[System.IO.StreamWriter]::new($PathNewFile)` with `[System.IO.StreamWriter]::new((Convert-Path -LiteralPath $PathNewFile))`. Also, it's better to use `[bool]` values to represent Boolean values: `$continue = $false` / `$continue = $true` and `if ($continue) ...`. – mklement0 Aug 01 '21 at 23:14