$search="<table id="
$linenumber= Get-Content ".\145039.html" | select-string $search | Select-Object LineNumber

$search="</table>"
$linenumber2= Get-Content ".\145039.html" | select-string $search | Select-Object LineNumber
#$linenumber2


# the list of line numbers to fetch
$linesToFetch = $linenumber[2]..$linenumber2[2]
$currentLine  = 1
$result  = switch -File ".\145039.html" {
    default { if ($linesToFetch -contains $currentLine++) { $_ }}
}

# write to file and also display on screen by using -PassThru
$result | Set-Content -Path ".\excerpt.html" -PassThru



Cannot convert the "@{LineNumber=6189}" value of type "Selected.Microsoft.PowerShell.Commands.MatchInfo" to type "System.Int32".
At line:10 char:1
+ $linesToFetch = $linenumber[2]..$linenumber2[2]
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidArgument: (:) [], RuntimeException
    + FullyQualifiedErrorId : ConvertToFinalInvalidCastException

$linenumber and $linenumber2 return values like the one below, but I just need the number itself, not the column header.

LineNumber
----------
      6015

Also, the final version needs to loop through all the HTML files in a directory, not just one static file.

Sorry, there is probably a better way to do this, but I'm not sure how.

Thanks in advance!

I did a lot of googling but could not find the right solution.


Updated code:
    $search1="disconnect-status"
    $linenumber1= Get-Content ".\145039.html" | select-string $search1 | Select-Object -ExpandProperty LineNumber

    $search2="</table>"
    $linenumber2= Get-Content ".\145039.html" | select-string $search2 | Select-Object -ExpandProperty LineNumber

    # the list of line numbers to fetch
    $linesToFetch = $linenumber1[3]..$linenumber2[1]
    $currentLine  = 1
    $result  = switch -File ".\145039.html" {
        default { if ($linesToFetch -contains $currentLine++) { $_ }}
    }

    # write to file and also display on screen by using -PassThru
    $result | Set-Content -Path ".\excerpt.html" -PassThru
_____________________________________________________________

Thank you, @mklement0.

This now works for one file at a time. Now I need it to select text from all the HTML files in the directory.

SteveMize
  • You should use the "dot notation" to get the desired property of the object `$LineNumber.LineNumber` or `$LineNumber2.LineNumber` – Olaf Jan 09 '23 at 22:33

1 Answer

  • Your immediate problem is that you need to change
    Select-Object LineNumber (which, due to positional parameter binding, is equivalent to
    Select-Object -Property LineNumber) to
    Select-Object -ExpandProperty LineNumber.

    • That is, you must use Select-Object's -ExpandProperty parameter in order to only get the values of the input objects' .LineNumber properties - see this post.
  • That said, your approach can be optimized in a number of ways, allowing you to make do with only a switch statement:

$output = $false; $openTagCount = 0
$result = 
  switch -Regex -File .\145039.html { # -Regex: match lines that *contain* the patterns (the default is whole-line equality)
    '<table id=' {
      if (++$openTagCount -eq 3) { $output = $true } # 3rd block found
      continue
    } 
    '</table>' {
      if ($output) { break } # end of 3rd block -> exit
      continue
    }
    default {
      if ($output) { $_ } # inside 3rd block -> output line
    }
  }

Note: This extracts the lines inside the third <table> element that has an id attribute, as implied by your original solution attempt; the solution you later edited into the question works differently.
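
To make the -Property vs. -ExpandProperty distinction from the top of this answer concrete, here is a minimal sketch. The sample file name and its contents are made up purely for illustration; only Select-String and Select-Object come from the question.

```powershell
# Build a small throwaway sample file (hypothetical content, for illustration only).
$sampleFile = Join-Path ([System.IO.Path]::GetTempPath()) 'expandproperty-demo.html'
Set-Content -LiteralPath $sampleFile -Value @(
  '<html>'
  '<table id="a">'
  '</table>'
  '<table id="b">'
  '</table>'
  '</html>'
)

$matchInfo = Get-Content -LiteralPath $sampleFile | Select-String '<table id='

# -Property (what a positional argument binds to) creates a new wrapper object
# per match, each with a single .LineNumber property.
$wrapped = $matchInfo | Select-Object -Property LineNumber

# -ExpandProperty returns the bare [int] values themselves.
$numbers = $matchInfo | Select-Object -ExpandProperty LineNumber

# Plain integers can serve as range endpoints; the wrapper objects cannot,
# which is what triggers the "Cannot convert ... to type System.Int32" error.
$range = $numbers[0]..$numbers[1]

# Clean up the sample file.
Remove-Item -LiteralPath $sampleFile
```

Accessing `$wrapped.LineNumber` (the "dot notation" suggested in the comments on the question) also yields the plain numbers.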


Taking a step back:

  • It looks like your input is HTML, so you're usually better off using HTML parsing to handle your input:

    • In Windows PowerShell you may be able to use Invoke-WebRequest relying on the Internet Explorer engine if present (it isn't anymore by default in recent Windows versions).

    • In recent versions of Windows and in PowerShell (Core) (v6+), you'll either need New-Object -Com HTMLFile - see this answer - or a third-party solution such as the PowerHTML module that wraps the HTML Agility Pack - see this answer.

mklement0
  • Thanks! I edited my code with -ExpandProperty and that worked. I tried your code but can't get it to work. I need to get only two tables from the HTML files located in a directory. I did try to look for an HTML parser but got stuck. My updated code works, but now I need it to go through many files and put all the results in one file. – SteveMize Jan 10 '23 at 16:42
  • Glad to hear it, @SteveMize; I totally missed that you're parsing HTML, not XML - please see my update re HTML parsers. – mklement0 Jan 10 '23 at 17:03
  • @SteveMize, as for my code: I got confused about what your original code is trying to do; please see my update, which, however, still contradicts your most recent comment: in your original code you're using `[2]`, implying that you want the lines of the _3rd_ table, which is what my code now does. What is the actual intent? As for processing multiple files: once you get the single `switch` statement working, you can simply call it in a loop. – mklement0 Jan 10 '23 at 17:05
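
As a rough sketch of the "call it in a loop" suggestion, the switch statement can be wrapped in a function and run once per file. The function names `Get-NthTableLine` and `Export-TableExcerpt` and their parameters are hypothetical and not part of the original answer. The sketch uses `switch -Regex` so that the patterns match lines that contain them, and it assumes the tables are not nested.

```powershell
# Hypothetical helper: emits the lines inside the Nth <table id=...> element of one file.
function Get-NthTableLine {
  param(
    [Parameter(Mandatory)] [string] $Path,
    [int] $TableNumber = 3   # assumption: 3rd table, as implied by the [2] index in the question
  )
  $output = $false; $openTagCount = 0
  switch -Regex -File $Path {
    '<table id=' {
      if (++$openTagCount -eq $TableNumber) { $output = $true } # target block found
      continue
    }
    '</table>' {
      if ($output) { break } # end of target block -> stop reading this file
      continue
    }
    default {
      if ($output) { $_ } # inside target block -> output line
    }
  }
}

# Hypothetical wrapper: processes every *.html file in a directory and
# writes all extracted lines to a single output file.
function Export-TableExcerpt {
  param(
    [Parameter(Mandatory)] [string] $Directory,
    [Parameter(Mandatory)] [string] $OutFile,
    [int] $TableNumber = 3
  )
  # Skip a file with the output file's name, in case it lives in the same directory.
  $outName = Split-Path -Leaf $OutFile
  # Collect everything first, so the output file is only written after all inputs are read.
  $allLines = Get-ChildItem -LiteralPath $Directory -Filter *.html -File |
    Where-Object Name -ne $outName |
    ForEach-Object { Get-NthTableLine -Path $_.FullName -TableNumber $TableNumber }
  $allLines | Set-Content -LiteralPath $OutFile
}
```

A call such as `Export-TableExcerpt -Directory . -OutFile .\excerpt.html` would then cover the single-directory case from the question. If two tables per file are needed, one option is to call `Get-NthTableLine` twice with different `-TableNumber` values.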