I'm trying to scrape a web page using Classic ASP. The reason is that I have an ASP file that I want to include on two domains, and I'd rather not maintain two copies.

I'm new to web scraping and am having difficulty finding a beginner-level ("Dummies") tutorial on how to do it with Classic ASP (not my preference, but it's what I'm stuck with). I don't need anything fancy: I just want to grab the entire source of the page from here.asp and output it on myotherpage.asp.

A little help, in the form of either code or tutorials, would be appreciated.

Kara
testing123
  • As far as I know, "web scraping" means capturing the output of a website. You can't do this with Classic ASP, because it uses server-side code that you will never see with an HTTP request; you need access to the server. If both of your domains are hosted on the same server, there are various things you could do to let them share the same files. – John Mar 10 '17 at 02:07
  • @John that's what the [`WinHttpRequest` object](http://stackoverflow.com/a/37462944/692942) is for, just because it's server-side doesn't mean the server can't act like a client. – user692942 Mar 10 '17 at 09:33
  • @Lankymart If all the OP needs is the output from `here.asp` then yes, the other server can pull it. The way I read it was that he wanted server 2 to be able to read server side code from a file on server 1. The question isn't very clear in this respect. – John Mar 10 '17 at 14:37
  • @John The term "Web Scraping" to me always suggests the client-side render, but honestly the question is not clear and shows no attempt at a problem anyway. – user692942 Mar 10 '17 at 14:41

1 Answer

To retrieve the HTML source from a URL in Classic ASP you can use code like this:

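<%
' Create a server-side HTTP client object
Dim obj
Set obj = CreateObject("MSXML2.ServerXMLHTTP")
' The third argument (False) makes the request synchronous
obj.Open "GET", "http://www.example.com/page.html", False
obj.Send
' Write the fetched HTML into this page's output
Response.Write obj.ResponseText
Set obj = Nothing
%>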

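In this example, obj.ResponseText holds the HTML that the remote server returned for the URL. For an .asp page, that is the rendered output, not the server-side source code.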
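
A slightly more defensive variant could set explicit timeouts and check the HTTP status before writing anything. The sketch below is only an illustration: the helper name FetchPage, the placeholder URL, and the timeout values are assumptions, not part of the original answer.

```vbscript
<%
' FetchPage is a hypothetical helper name used for illustration.
' Returns the response body as a string, or "" if the request fails.
Function FetchPage(url)
    Dim http
    FetchPage = ""
    Set http = Server.CreateObject("MSXML2.ServerXMLHTTP")

    ' Timeouts in milliseconds: resolve, connect, send, receive.
    ' The values are illustrative; setTimeouts is called before Open.
    http.setTimeouts 5000, 5000, 10000, 30000

    On Error Resume Next          ' Send raises a runtime error on timeout
    http.Open "GET", url, False   ' False = synchronous request
    http.Send
    If Err.Number = 0 Then
        If http.Status = 200 Then
            FetchPage = http.responseText
        End If
    End If
    On Error GoTo 0

    Set http = Nothing
End Function

' Placeholder URL; replace it with the address of here.asp on the other domain.
Response.Write FetchPage("http://www.example.com/here.asp")
%>
```

Returning an empty string on failure keeps the including page from showing a raw script error; whether that is the right fallback depends on how myotherpage.asp should behave when the source page is unreachable.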

user692942
johna
  • msxml3.dll error '80072ee2' The operation timed out... obj.Send "" – WilliamK Nov 05 '21 at 01:11
  • You may need to increase one of the timeouts; refer to https://learn.microsoft.com/en-us/previous-versions/windows/desktop/ms760403(v=vs.85) – johna Nov 06 '21 at 02:05
  • No change after adding `obj.SetTimeouts 600000, 600000, 15000, 15000`. I'm trying to get the response from a URL like `http://localhost:8000/?file=C:\IN\word.doc&out=C:\OUT`, which should return "Completed" after the file is processed. The only hint is an error about timing out. It works OK with PHP and curl, but I'd prefer to use ASP. – WilliamK Nov 07 '21 at 21:55