
I have 1000+ URLs that I want to scrape to retrieve the title value from the HTML. After trying different approaches, I ended up using iMacros scripts, which I knew nothing about; nonetheless, I managed to put a script together after reading the guides.

My script works perfectly but has one problem: while fetching the URLs' titles, if it encounters an HTTP error (e.g. a dead link or a forbidden page), it crashes with an error message like this one:

Error -1350: Error loading page. Http status 403. Line 4: URL GOTO=http://url.com

Instead of crashing when the script encounters these errors, I would like it to simply skip the URL and continue running. How can I modify my script to do this? Here is my script:

VERSION BUILD=9002379
TAB T=1
TAB CLOSEALLOTHERS
URL GOTO=http://google.com/
ADD !EXTRACT {{!URLCURRENT}}
TAG POS=1 TYPE=TITLE ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=d:/ FILE=links.txt
SET !EXTRACT_TEST_POPUP NO

Output:

http://google.com/,Google

I would also like to replace the comma after the URL in the output with a semicolon.


1 Answer

At the critical points where you don't want the macro to stop on a failure, add:

SET !ERRORIGNORE YES

If at some point you want to revert to stopping on errors, use:

SET !ERRORIGNORE NO

You can use these two commands as many times as you like, even toggling them on and off every other line.
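
For example, here is a sketch of the script from the question with error handling enabled (lines starting with ' are iMacros comments; note that SET !EXTRACT_TEST_POPUP is moved up so it takes effect before the extraction runs):

VERSION BUILD=9002379
' skip pages that fail to load (403, dead links, etc.) instead of aborting the macro
SET !ERRORIGNORE YES
' suppress the extraction confirmation popup; set this before the TAG command runs
SET !EXTRACT_TEST_POPUP NO
TAB T=1
TAB CLOSEALLOTHERS
URL GOTO=http://google.com/
ADD !EXTRACT {{!URLCURRENT}}
TAG POS=1 TYPE=TITLE ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=d:/ FILE=links.txt

With !ERRORIGNORE set to YES, a failed URL GOTO no longer aborts the run; the TAG command on such a page typically extracts #EANF# (Extraction Anchor Not Found), so you can filter those rows out of links.txt afterwards.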
