-4

I want to retrieve the page source of an external webpage, for example "http://google.com", in order to extract all the picture URLs inside it. Note: the page is not from my own website. I hope the idea is clear.

Can this be achieved using only my own JS code, without jQuery? If yes, how?

Note: Because I want it to be as lightweight as possible, I don't wish to use another site's API or a library. I hope this can be achieved using only JS that I write myself.

<p id="demo">here is your downloadable picture link:  </p>

<script>
function analyseURL() {
    var url1 = "http://google.com";

    var HtmlPageSource = ""; // TODO: get the page source of url1 (the step I need help with)

    var finalPic = HtmlPageSource; // TODO: filter HtmlPageSource down to the picture URLs

    document.getElementById("demo").innerHTML = finalPic;
}
</script>

Martijn Pieters
alita
  • Without AJAX? Without JS APIs? No, that's impossible... – FZs Sep 20 '19 at 04:33
  • What do you mean by JS API anyway? – alita Sep 20 '19 at 04:41
  • Where do you expect to run this code, in a browser or via Node? – Phil Sep 20 '19 at 04:45
  • @EricSia A JS API is a (possibly platform-dependent) feature that you can access directly from your code but that is not part of the ECMAScript standard. Its code is not necessarily written in JS, but that part is completely abstracted away. So there are APIs that you can't create an 'APIless' polyfill for. I think you don't want to use *libraries* (e.g. jQuery), but you do want to use APIs (even `document` is part of an API: the DOM...) – FZs Sep 20 '19 at 04:47
  • @Phil In a browser, as I want it to work on mobile as well – alita Sep 20 '19 at 04:48
  • What you're asking is not possible. There are [very few avenues](https://stackoverflow.com/questions/17148357/including-external-html-file-to-another-html-file) for including remote content without AJAX and even if you could, same-origin policy would prevent you from reading it. On the flip side, an AJAX solution is only possible if the remote site allows it via [CORS](https://stackoverflow.com/questions/14681292/same-origin-policy-and-cors-cross-origin-resource-sharing) – Phil Sep 20 '19 at 05:06

1 Answer

3

You honestly might be better off using a library like Selenium. It's pretty simple with Python if you're at all familiar with the language (although I think it's available in JS as well). This way you can have Selenium start an instance of a browser, scrape the page for whatever elements you want, then download whatever images or write whatever text you want to files.

Either that or look into using the 'curl' command.

Since JavaScript is used for a lot of malicious web crawling, odds are you're going to have a bad time trying to get something to work with JavaScript without any kind of AJAX or API. See CORS and the same-origin policy.
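For reference, here is a minimal sketch of what a browser-only attempt could look like using nothing but built-in APIs (`fetch` and `URL`). It is not a way around the restriction: in a browser, `fetchImageUrls` only resolves when the remote site sends CORS headers that permit your origin, and otherwise the promise rejects. The function names `extractImageUrls` and `fetchImageUrls` are hypothetical, the `demo` element id is borrowed from the question, and the regex is a simplification that will miss some markup (for example images set via CSS or `srcset`).

```javascript
// Pull the src attribute out of every <img> tag in an HTML string and
// resolve it against the page URL. A regex keeps the sketch dependency-free;
// it is a simplification, not a full HTML parser.
function extractImageUrls(html, baseUrl) {
  var urls = [];
  // Matches src="...", src='...' or an unquoted src value inside an <img> tag.
  var imgTag = /<img\b[^>]*?\ssrc\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s>]+))/gi;
  var match;
  while ((match = imgTag.exec(html)) !== null) {
    var src = match[1] || match[2] || match[3];
    if (!src) continue; // empty src attribute
    try {
      urls.push(new URL(src, baseUrl).href); // resolve relative paths
    } catch (e) {
      // Skip values that cannot be resolved to a URL.
    }
  }
  return urls;
}

// Fetch a page and return a promise for its image URLs.
// Assumption: in a browser this only works if pageUrl is same-origin or the
// remote server allows the request via CORS; otherwise the promise rejects.
function fetchImageUrls(pageUrl) {
  return fetch(pageUrl)
    .then(function (response) {
      if (!response.ok) throw new Error("HTTP " + response.status);
      return response.text();
    })
    .then(function (html) {
      return extractImageUrls(html, pageUrl);
    });
}

// Browser-only usage, mirroring the question's analyseURL().
function analyseURL() {
  var demo = document.getElementById("demo");
  fetchImageUrls("http://google.com")
    .then(function (urls) {
      demo.textContent = urls.join("\n");
    })
    .catch(function (err) {
      // This branch is the expected outcome for most third-party sites.
      demo.textContent = "Could not read the page: " + err.message;
    });
}
```

The same `extractImageUrls` function can be reused outside the browser (for example in a script run from your own machine), where the same-origin policy does not apply.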

Steve Whitmore
  • I'm familiar with Python, but unfortunately my web host doesn't support it. And if I used Python, mixing it in would feel pointless; I might as well write the entire thing in Python. I hope to achieve this using only HTML + JS. As I mentioned, I don't wish to use another library, because this is just a quick, single task. – alita Sep 20 '19 at 04:54
  • You could run the Python script on your machine. I have a few cron jobs set up on my desktop that run at regular intervals for web crawling/scraping tasks. Another option would be to get SSH access to your hosting environment and come up with a simple bash script to run using curl. Sorry, I know you're shooting for HTML + JS only, but like others have said, it's not really going to be possible. – Steve Whitmore Sep 20 '19 at 05:17
  • I use mobile more often than a PC, which is why I want it done as a web page. I'll continue researching to see if XMLHttpRequest can help at all. – alita Sep 20 '19 at 10:37
  • @EricSia You cannot do this using code that runs in a browser. The Same-Origin policy will not allow it. –  Sep 20 '19 at 14:15