I am iterating through a CSV file of URLs and using Invoke-WebRequest to get back the innerHTML and href values for links that match specified criteria. However, this only works for some URLs; for others it only works if I add the -UseBasicParsing parameter, which doesn't give me the property access and filtering capabilities I need.
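For context, here is a minimal sketch of the loop I'm running (the CSV path and the Url column name are placeholders, not my real file):

$urls = Import-Csv -Path 'urls.csv'   # placeholder path

foreach ($row in $urls) {
    # One request per URL taken from the CSV
    $currentRequest = Invoke-WebRequest -Uri $row.Url

    # Keep only the links whose text matches the criteria for this run
    $currentRequest.Links |
        Select-Object innerHTML, href |
        Where-Object innerHTML -like '*SEO*'
}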
A common denominator is that the URLs that don't work all use a www subdomain. However, a couple of them are still reachable without the www prefix and still don't work, and I'm not sure the subdomain should matter anyway, since other www URLs do work.
As mentioned above, I have tried adding -UseBasicParsing, which does allow a connection but restricts the data I have access to. I have also compared the HTTP headers of the URLs to try to understand the differences, but I'm still unsure what the issue is.
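For reference, this is roughly how I inspected the basic-parsing output and the response headers (the variable name is just for illustration):

# With -UseBasicParsing the request completes, but the link objects expose fewer properties
$basic = Invoke-WebRequest -Uri 'https://www.redevolution.com/what-is-seo' -UseBasicParsing
$basic.Links | Get-Member

# Response headers, which I compared against those of a working URL
$basic.Headers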
The following works correctly and returns the innerHTML text and href for each link on the page:
$currentRequest = Invoke-WebRequest -Uri 'https://moz.com/learn/seo/what-is-seo'
$currentRequest | Get-Member
$currentRequest = $currentRequest.Links |
    Select-Object innerHTML, href |
    Where-Object innerHTML -like '*SEO*'
$currentRequest
Using exactly the same code with the following URL, the console just freezes until the script is exited:
https://www.redevolution.com/what-is-seo
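For anyone wanting to reproduce it, the failing call is sketched below; the -TimeoutSec there is only a hypothetical guard for testing, not part of my original script:

# Same pipeline against the non-working URL; -TimeoutSec 30 is just an added safety net
$currentRequest = Invoke-WebRequest -Uri 'https://www.redevolution.com/what-is-seo' -TimeoutSec 30
$currentRequest.Links |
    Select-Object innerHTML, href |
    Where-Object innerHTML -like '*SEO*'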
When I run the script with the working URL, I get a pair of values for each link, as shown below:
innerHTML : Recommended SEO Companies
href : https://moz.com/community/recommended
With the non-working URL, as mentioned above, the command line just stays at a blinking cursor.
This is just one example, and I need to query other data as well, so it would be great to understand how I can run Invoke-WebRequest consistently without issues.
Many thanks!!
Mike