
I am trying to write a script that downloads information from a website. I can download the information, but I cannot get the filtering to work. I have a series of values that I want skipped, stored in $TakeOut, but the comparison -eq $TakeOut in my if statement does not recognize them, so I have to write a separate line for each value.

What I am wondering is whether there is a way to compare against a variable instead, since over time there will be a considerable number of values to skip.

This works but is not practical in the long run.

if ($R.innerText -eq "Home") {Continue}

Something like this would be preferable.

if ($R.innerText -eq $TakeOut) {Continue}

Here is a sample of my code.

#List of values to skip
$TakeOut = @()
$TakeOut = (
"Help",
"Home",
"News",
"Sports",
"Terms of use",
"Travel",
"Video",
"Weather"
)

#Retrieve website information
$Results = ((Invoke-WebRequest -Uri "https://www.msn.com/en-ca/").Links)

#Filter and format to new table of values
$objects = @()
foreach($R in $Results) {
   if ($R.innerText -eq $TakeOut) {Continue}
   $objects += New-Object -Type PSObject -Prop @{'InnerText'= $R.InnerText;'href'=$R.href;'Title'=$R.href.split('/')[4]}
}

#output to file
$objects  | ConvertTo-HTML -As Table -Fragment | Out-String >> $list_F
Aderbal Farias
Woody

1 Answer

You cannot meaningfully use an array as the RHS of an -eq operation when the LHS is a single string: the array is implicitly stringified into one space-separated string, so the comparison never matches an individual value.

PowerShell has operators -contains and -in to test membership of a value in an array (using -eq on a per-element basis - see this answer for background); therefore:

 if ($R.innerText -in $TakeOut) {Continue}
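
To make the difference concrete, here is a minimal sketch that compares the operators on hard-coded sample values. The variable names $text, $viaEq, $viaIn, $viaContains, $lowerIn and $lowerCin are made up for illustration, and the skip list is a shortened stand-in for the one in the question.

```powershell
# Hypothetical sample data; not taken from the live page.
$TakeOut = 'Help', 'Home', 'News'
$text    = 'Home'

# Scalar LHS, array RHS: the array is stringified to 'Help Home News',
# so this compares 'Home' with that whole string.
$viaEq = $text -eq $TakeOut            # $false

# Membership tests: the same check written in either direction.
$viaIn       = $text -in $TakeOut      # $true
$viaContains = $TakeOut -contains $text # $true

# Like -eq, -in is case-insensitive by default; -cin is the case-sensitive variant.
$lowerIn  = 'home' -in  $TakeOut       # $true
$lowerCin = 'home' -cin $TakeOut       # $false
```

Note that membership is tested with whole-value equality, so a link text with stray leading or trailing whitespace would not match unless it is trimmed first.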

Generally, your code can be streamlined (PSv3+ syntax):

$TakeOut = 
    "Help",
    "Home",
    "News",
    "Sports",
    "Terms of use",
    "Travel",
    "Video",
    "Weather"

#Retrieve website information
$Results = (Invoke-WebRequest -Uri "https://www.msn.com/en-ca/").Links

#Filter and format to new table of values
$objects = foreach($R in $Results) {
   if ($R.innerText -in $TakeOut) {Continue}
   [pscustomobject] @{
      InnerText = $R.InnerText
      href = $R.href
      Title = $R.href.split('/')[4]
   }
}

#output to file
$objects | ConvertTo-HTML -As Table -Fragment >> $list_F
  • Note the absence of @(...), which is never needed for array literals.

  • Building an array in a loop with += is slow (and verbose); simply use the foreach statement as an expression, which returns the loop body's outputs as an array.

  • [pscustomobject] @{ ... } is PSv3+ syntactic sugar for constructing custom objects; in addition to being faster than a New-Object call, it has the added advantage of preserving property order.
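
The points above can be tried without a network call. The following is a sketch only: $links is a hypothetical stand-in for the .Links collection, the example.org URLs are invented, and $names is a helper variable added for the check at the end.

```powershell
# Hypothetical stand-in for (Invoke-WebRequest ...).Links; the URLs are made up.
$links = @(
    [pscustomobject] @{ innerText = 'Home';    href = 'https://example.org/en-ca/home/start' }
    [pscustomobject] @{ innerText = 'Story A'; href = 'https://example.org/en-ca/news/story-a' }
    [pscustomobject] @{ innerText = 'Story B'; href = 'https://example.org/en-ca/sports/story-b' }
)

$TakeOut = 'Help', 'Home'

# foreach used as an expression: whatever the loop body outputs is collected.
$objects = foreach ($R in $links) {
    if ($R.innerText -in $TakeOut) { continue }   # skip unwanted entries
    [pscustomobject] @{
        InnerText = $R.innerText
        href      = $R.href
        Title     = $R.href.Split('/')[4]         # fifth '/'-separated segment
    }
}

# Property names come back in the order they were written in the literal.
$names = $objects[0].psobject.Properties.Name    # InnerText, href, Title
```

One caveat with the expression form: if the loop emits exactly one object, $objects holds that object rather than a one-element array, and if it emits nothing, $objects is empty. Wrapping the statement in @(...) guarantees an array in both cases.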

You could write the whole thing as a single pipeline:

#Retrieve website information
(Invoke-WebRequest -Uri "https://www.msn.com/en-ca/").Links | ForEach-Object {
   #Filter and format to new table of values
   if ($_.innerText -in $TakeOut) {return}
   [pscustomobject] @{
      InnerText = $_.InnerText
      href = $_.href
      Title = $_.href.split('/')[4]
   }
} | ConvertTo-HTML -As Table -Fragment >> $list_F

Note the need to use return instead of continue to move on to the next input.
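
The reason is that the script block passed to ForEach-Object is not a loop statement, so continue has no enclosing loop of its own to act on and can end up stopping far more than the current item. Here is a small sketch of the return pattern on hard-coded strings; the sample values and the variables $kept and $keptViaWhere are made up for illustration.

```powershell
$TakeOut = 'Help', 'Home'

# return ends the script block for the current input object only;
# the pipeline then moves on to the next one.
$kept = 'Home', 'Story A', 'Help', 'Story B' | ForEach-Object {
    if ($_ -in $TakeOut) { return }
    $_
}

# When the block only filters, Where-Object with -notin expresses the same thing.
$keptViaWhere = 'Home', 'Story A', 'Help', 'Story B' |
    Where-Object { $_ -notin $TakeOut }
```

Both variables end up holding 'Story A' and 'Story B'. The Where-Object form is arguably easier to read when no reshaping of the objects is needed afterwards.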

mklement0