I want to create a simple app that executes JavaScript commands in the Chrome Console on a specific page and returns the output.

Namely, I want to get all accessible links from the current page. I can do it by running the following command in Chrome Console:

urls = $$('a'); for (url in urls) console.log(urls[url].href);

It will return a set of links as output, which I'd like to be able to process in my application.

I can run it manually from Chrome Console, but I want to automate this task because I have a lot of links to work with.

The pseudocode is something like the following:

function runCommandOnSite(command, site) { ... }

function main() {
  let site = "facebook.com";
  let command = "urls = $$('a'); for (url in urls) console.log(urls[url].href)";
  let result_links = runCommandOnSite(command, site);
  console.log(result_links);
}

Note: any programming language that can be run on a Linux desktop is acceptable.

punund

1 Answer

It sounds like you want to scrape a web page and collect all the URLs on it. For a problem like this, searching for web crawler (or web scraper) examples in your preferred language is a good starting point.

Below are some examples that scrape the set of URLs from a given web page. You will probably have to filter the output, so experiment with it and see what you get.

Python 3 - Beautiful Soup 4

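import ssl
from urllib.parse import urljoin
from urllib.request import urlopen

from bs4 import BeautifulSoup

# Default SSL context for HTTPS URLs (verifies certificates)
gcontext = ssl.create_default_context()

# You can give any URL here. This example uses the Stack Overflow homepage
url = 'https://stackoverflow.com'
data = urlopen(url, context=gcontext).read()

page = BeautifulSoup(data, 'html.parser')

# href=True skips <a> tags that have no href attribute
for link in page.find_all('a', href=True):
    # Resolve relative links against the page URL
    print(urljoin(url, link['href']))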
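
As for the filtering mentioned earlier, here is a minimal sketch using only the Python standard library. The helper name `clean_links` and the sample values are made up for illustration, and it assumes you only care about http(s) links: it resolves relative links, drops `#fragment` parts, skips things like `mailto:` and `javascript:` links, and removes duplicates while keeping the original order.

```python
from urllib.parse import urldefrag, urljoin, urlparse


def clean_links(base_url, hrefs):
    """Return absolute http(s) URLs from raw href values, de-duplicated in order.

    Hypothetical helper for illustration; adjust the rules to your needs.
    """
    seen = set()
    result = []
    for href in hrefs:
        if not href:  # skips None and empty strings
            continue
        # Resolve relative links against the page URL and drop the "#fragment"
        absolute, _fragment = urldefrag(urljoin(base_url, href.strip()))
        # Keep only web links; this drops mailto:, javascript:, tel: and so on
        if urlparse(absolute).scheme not in ("http", "https"):
            continue
        if absolute not in seen:
            seen.add(absolute)
            result.append(absolute)
    return result


if __name__ == "__main__":
    # Sample href values as they might come out of a scraped page
    sample = ["/questions", None, "mailto:someone@example.com",
              "https://example.com/a", "/questions"]
    for link in clean_links("https://stackoverflow.com", sample):
        print(link)
```

With the Beautiful Soup example, you could pass it the page URL and `[a.get('href') for a in page.find_all('a')]`.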

Java - JSoup

Have a look at this example.

Node JS - Cheerio

Have a look at this example.

Use Selenium Web Drivers - For Most Programming Languages

I will not go into detail here, because the topic is broad and beyond the scope of this answer.
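
Without covering Selenium in general, the specific command from the question could be sketched roughly as follows. This is only a sketch under some assumptions: the `selenium` package is installed, Chrome is available locally, and the page does not require a login. The name `run_command_on_site` mirrors the pseudocode in the question, and `COLLECT_LINKS_JS` is a made-up constant. Note that `$$` is a helper that exists only in the DevTools console, so the script uses `document.querySelectorAll` instead.

```python
# JavaScript evaluated inside the page; `a.href` is already an absolute URL.
# The console-only helper $$ is not available here, so querySelectorAll is used.
COLLECT_LINKS_JS = "return Array.from(document.querySelectorAll('a'), a => a.href);"


def run_command_on_site(driver, command, site):
    """Load `site` in the browser behind `driver` and return the script's result."""
    driver.get(site)
    return driver.execute_script(command)


if __name__ == "__main__":
    # Assumption: `pip install selenium` and a local Chrome installation
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")  # run without a visible window
    driver = webdriver.Chrome(options=options)
    try:
        links = run_command_on_site(driver, COLLECT_LINKS_JS, "https://stackoverflow.com")
        for link in links:
            print(link)
    finally:
        driver.quit()
```

Because the script runs in a real browser, links that the page adds through JavaScript after loading are included as well, which the plain HTML parsers above would miss.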

Keet Sugathadasa