
This is my first Stack Overflow post!

I'm trying to check whether a website is fully loaded through PowerShell. I discovered that the Invoke-WebRequest and Select-String cmdlets are what I need to check for a word on the site (which only appears after the website is fully loaded). Then, if the word is found, I want to return a value like "true" to break out of a loop. An example can be seen in Answer 1 on this Example solution

However, if I use this solution, I get the entire HTML code as output, which I don't want. Does anyone know how to avoid getting the entire HTML script, and how to return the word as a "true" value? As an example, I want to return the sentence "No products found." from this Website to check whether it was fully loaded.

This is a code example I currently have. The try/catch example would become an if/else statement that could break me out of the loop after finding the sentence "No products found".

Do you guys have any idea how to solve this?

try {
    $Response = Invoke-WebRequest -Uri 'https://pwa-woo.wpmobilepack.com/#/'
    $Response.InputFields | Where-Object {
        $_.name -like "*No products found.*"
    }

    # break out of the loop
    Write-Host "Case True and break the loop"
}
catch {
    Write-Host "Case False didn't work"
}

(The solution shouldn't create a file)
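For the "return true instead of the whole HTML" part of the question, a minimal sketch of one possible approach: the `-match` operator returns `$true`/`$false` instead of printing the page, so it can drive a loop directly. This assumes the sentence is present in the initial HTML response, which (as the comments below discuss) is not the case for pages that render their content with JavaScript; the retry interval is illustrative.

```powershell
# Poll the page until the target sentence appears in the raw HTML.
# NOTE: -match tests $Response.Content and returns a boolean; nothing
# is written to the pipeline, so the full HTML is never printed.
$found = $false
while (-not $found) {
    try {
        $Response = Invoke-WebRequest -Uri 'https://pwa-woo.wpmobilepack.com/#/'
        if ($Response.Content -match 'No products found\.') {
            $found = $true   # sentence present: treat the page as loaded
            Write-Host "Case True, breaking the loop"
        }
    }
    catch {
        Write-Host "Request failed, retrying..."
    }
    if (-not $found) { Start-Sleep -Seconds 2 }  # illustrative retry delay
}
```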

  • Check [this question](https://stackoverflow.com/questions/22510779/can-powershell-wait-until-ie-is-dom-ready), it might help. – rpm192 Jun 08 '21 at 12:53
  • That already looks interesting! Thanks for the incredibly fast answer! This might help – Niklas Pesthy Jun 08 '21 at 12:58
  • @rpm192 in this case the [# Element ID to check for in DOM $elementID = "systemmessage"] can be a " – Niklas Pesthy Jun 08 '21 at 13:09
  • @rpm192 The problem is I use Microsoft Edge. The example is using Internet Explorer. I don't think it's working for Microsoft Edge – Niklas Pesthy Jun 08 '21 at 13:18
  • You should be able to use Edge. The only issue I see is that the text / element that you want to check for does not seem to be in the DOM of the website. Is it using React or something different to get a list of products? – rpm192 Jun 08 '21 at 13:32
  • @rpm192 The example website and element are sadly not the site I need. I can't share the exact webpage since it's from my company. How exactly do I check for the DOM element? If I inspect the specific element, I see a tab called "DOM breakpoint" in Microsoft Edge. Is that the DOM info I need? – Niklas Pesthy Jun 08 '21 at 13:37
  • I'm trying a new method now. Instead of using "website scraping", I'll try to measure the download connection to then make the code somewhat dynamic. – Niklas Pesthy Jun 09 '21 at 08:11
  • Thanks for the input guys. I have a working code now – Niklas Pesthy Jun 16 '21 at 11:08

1 Answer


Another way you could do this is to use Edge Dev Tools to see the order of requests performed on the site.

When I open dev tools, go to the Network tab on that address, and search for 'No products found', I see that main.js makes an XHR request to a URL and, when that request returns nothing, displays that message:

[![using Edge DevTools Network tab to search for the string "no products found"][1]][1]

If this gives a response, it renders a grid of items; if not, it displays "No products found."

Here's the URL it checks: https://pwathemes.com/demo-api/wp-json/pwacommercepro/products/?page=1&featured=1&order=asc&orderby=title&per_page=20

So an easier check becomes this, querying that URL directly for products. If there are any, then "No products found" will not be displayed:

$url = 'https://pwathemes.com/demo-api/wp-json/pwacommercepro/products/?page=1&featured=1&order=asc&orderby=title&per_page=20'
try {
    $result = Invoke-RestMethod $url -ErrorAction Stop
}
catch {
    Write-Warning "Could not load products!"
}

if ($null -ne $result){
    "results..."
    $result
}
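To fold the check above into the loop the question asks about, here is one possible sketch. The `$maxAttempts` limit and the sleep interval are illustrative, not part of the API:

```powershell
$url = 'https://pwathemes.com/demo-api/wp-json/pwacommercepro/products/?page=1&featured=1&order=asc&orderby=title&per_page=20'
$maxAttempts = 5   # illustrative retry limit, adjust as needed
for ($i = 1; $i -le $maxAttempts; $i++) {
    try {
        $result = Invoke-RestMethod $url -ErrorAction Stop
        if ($result) {
            # products came back, so the page would not show "No products found"
            Write-Host "Products found on attempt $i"
            break
        }
    }
    catch {
        Write-Warning "Attempt ${i}: could not load products"
    }
    Start-Sleep -Seconds 2
}
```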

Why use this approach?

Loading a page and searching it for strings is also called web scraping.

Most modern pages today load asynchronously, meaning they give a very quick response to a base page so the browser can start showing results, then populate placeholders with useful info. To do this, they load scripts.

We can monitor a page's loading to see what the scripts are doing and go directly to the underlying APIs to make something somewhat less fragile.

There's a trade-off for either method. The app publisher (if it isn't us), could change the underlying API at any time with no promise or notice, so it could break our scripts.

Meanwhile, they could change the way they render content on a page also at any time.

It's really apples and oranges, but going to the API is often easier than page scraping, and APIs change less often than front-end code.

  [1]: https://i.stack.imgur.com/R54JW.png

FoxDeploy
  • Too bad I can't share the actual site that I need to check, since it's company related. I feel like you would be the kind of person who has the perfect solution in under 3 minutes. I'll try to do the exact thing you did for the example site. Kind regards :) – Niklas Pesthy Jun 08 '21 at 13:42
  • Here are two articles that might help. One outlines web scraping, the other shows you how to find APIs. Web scraping - https://www.foxdeploy.com/blog/extracting-and-monitoring-web-content-with-powershell.html, API hunting - https://www.foxdeploy.com/blog/faster-web-cmdlet-design-with-chrome-65.html – FoxDeploy Jun 08 '21 at 13:46
  • Thanks again for your input! ^^ I found the term "web scraping" as well, but I didn't know exactly what it was. Your articles will help me understand this topic a little better. Thanks again =) – Niklas Pesthy Jun 08 '21 at 13:48
  • I'm trying a new method now. Instead of using "website scraping", I'll try to measure the download connection to then make the code somewhat dynamic. – Niklas Pesthy Jun 09 '21 at 08:11
  • I have a code that works now through scraping. – Niklas Pesthy Jun 16 '21 at 11:07