
Is there a way to get the line index of the last occurrence in a file, rather than the line itself?

$delegate = [Func[string,bool]] {$args[0] -match $myString}
$lastCheckIn  = [Linq.Enumerable]::Last([System.IO.File]::ReadLines($myFile), $delegate)

I thought of using [Linq.Enumerable]::Count but couldn't find a method to return a sequence from the start to $lastCheckIn.

Marisol
    It's not quite clear what you're trying to find/count... When you say "the index of the last occurrence", do you mean the _line_ index, the _character_ position or the _byte_ offset? Or something else? – Mathias R. Jessen Jan 27 '22 at 20:19
    Yes, I’m looking for the line index, thank you – Marisol Jan 27 '22 at 21:15

3 Answers


Limiting the approach to Linq, you can try the following.

$lines = 'aaa','bbb','ccc','ddd','aaa','bbb','ccc','ddd'
$searchPattern = 'c+'

$selectDelegate = [Func[object, int32, object]] { @{line=$args[0]; index=$args[1] } }
$whereDelegate = [Func[object,bool]] { $args[0].line -match $searchPattern }

$objects = [Linq.Enumerable]::Select($lines, $selectDelegate)
$lastObject = [Linq.Enumerable]::Last($objects, $whereDelegate)
$lastObject.index
# result = 6 (zero-based index)

# Without Linq
$lines | Select-String $searchPattern | Select-Object -Last 1 -ExpandProperty LineNumber
# result = 7 (one-based index)

This first builds a collection of line/index pairs, takes the last pair whose line matches, and then extracts its index. I've also included a non-LINQ PowerShell equivalent.
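The non-LINQ variant can also read the file directly instead of an in-memory array. A minimal sketch, assuming the question's $myFile and $myString variables; the temp file and its contents are made up for the demo, and LineNumber is converted to a zero-based index to match the LINQ result:

```powershell
# Hypothetical sample file, created only so the example can be run as-is.
$myFile = Join-Path ([System.IO.Path]::GetTempPath()) 'last-match-demo.txt'
Set-Content -Path $myFile -Value 'aaa','bbb','ccc','ddd','aaa','bbb','ccc','ddd'
$myString = 'c+'

# Select-String reports one-based line numbers; keep only the last match.
$lastLineNumber = Select-String -Path $myFile -Pattern $myString |
    Select-Object -Last 1 -ExpandProperty LineNumber

# Convert to a zero-based index; use -1 when nothing matched.
$lastIndex = if ($null -ne $lastLineNumber) { $lastLineNumber - 1 } else { -1 }
$lastIndex
# result = 6 (zero-based index)
```

Unlike [Linq.Enumerable]::Last(), which throws when no element matches, this returns -1 in that case.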

T N

I believe this should be faster and easier to implement, maintain and understand:

$pattern = 'yourpatternhere'
$content = [System.IO.File]::ReadAllLines('path/to/file.ext')
$tail = $content.Count

while($tail--)
{
    if($content[$tail] -match $pattern)
    {
        "$pattern was found at line index (zero-based): $tail"
        break
    }
}
Santiago Squarzon

T N's helpful answer shows an effective LINQ solution as well as a PowerShell-idiomatic alternative.

There are two alternatives for improved performance:

  • If it is acceptable to load the entire file into memory first, use [Array]::FindLastIndex() (Get-Content -ReadCount 0 is in essence the PowerShell equivalent of [System.IO.File]::ReadAllLines()):
[Array]::FindLastIndex(
  (Get-Content -ReadCount 0 $myFile), # -ReadCount 0 returns all lines as single array
  [Predicate[string]] { $args[0] -match $myString }
)
  • An optimized LINQ solution, with lazy enumeration:
$i = $index = -1
$null = [Linq.Enumerable]::LastOrDefault(
  [IO.File]::ReadLines($myFile),
  [Func[string, bool]] { 
     ++$script:i; 
     if ($args[0] -match $myString) { $script:index = $script:i; return $true } 
  }
)
$index # outputs the zero-based index of the last match, or -1 if none was found

Caveat: This approach to finding the index only works as intended with lazy enumerables as input, as only they necessitate forward enumeration until the very last element.

By contrast, list-like enumerables (those that implement the System.Collections.IList interface or its generic counterpart) are enumerated backwards, from the end of the list, for optimized performance, so the counter in the predicate would no longer correspond to the line index.

Similarly, if you expect the last matching line to be close(r) to the end of a large file, you'll need to lazily read the file backwards for best performance, for which there is no standard .NET API. Doing so in a way that handles variable-width character encodings such as UTF-8 is nontrivial, however; see this answer: https://stackoverflow.com/a/452945/45375
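If a forward scan is acceptable, the same lazy, low-memory behavior can be had with a plain foreach loop, which does not depend on how Last() or LastOrDefault() choose to enumerate their input. A minimal sketch; Get-LastMatchIndex is a hypothetical helper name, not a built-in command, and Convert-Path is used on the assumption that a relative path may be passed (.NET methods do not see PowerShell's current location):

```powershell
function Get-LastMatchIndex {
    param(
        [Parameter(Mandatory)] [string] $Path,
        [Parameter(Mandatory)] [string] $Pattern
    )
    # Resolve to a full path so that .NET reads the intended file.
    $fullPath = Convert-Path -LiteralPath $Path

    $i = -1       # zero-based index of the line currently being examined
    $index = -1   # zero-based index of the last matching line seen so far

    # ReadLines() streams the file, so only one line is held in memory at a time.
    foreach ($line in [System.IO.File]::ReadLines($fullPath)) {
        ++$i
        if ($line -match $Pattern) { $index = $i }
    }
    $index  # -1 if no line matched
}
```

For example, Get-LastMatchIndex -Path $myFile -Pattern $myString would output the zero-based index of the last matching line. The whole file is still read once from start to end, so this does not help when the match is known to be near the end.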

mklement0
  • Thank you, it is a 1GB file so I was looking for the least memory and CPU consuming option – Marisol Jan 30 '22 at 08:43
  • I see, @Marisol; then the optimized LINQ solution could work for you. Ultimately, however, if you expect the last matching line to be close(r) to the _end_ of the large file, you'll need to read the file _backwards_ for best performance. Unfortunately, doing so in a way that handles variable-width character encodings such as UTF-8 is nontrivial - see [this answer](https://stackoverflow.com/a/452945/45375). – mklement0 Jan 30 '22 at 15:28