2

I want to create an application that logs into a website and fetches the HTML source of the inner pages (a bot that logs in and collects HTML data), kind of like a web crawler, I guess.

I can accomplish this using Selenium 2, but I am forced to create a new WebDriver, which opens a browser window and then executes the commands.

Is there a way to avoid opening the browser window and just fetch the data I want, even though I need to send a .click() command on the login page before I can get to the HTML source I want?

I read this: Is it possible to hide the browser in Selenium RC? But it uses Selenium RC, which I think is an older technology that was replaced by Selenium 2.

Alternatively, could you recommend a technology apart from Selenium that I could use to accomplish this?

Thanks for your time in reading my question :)

HyrionX
  • Selenium does the job of fetching the data; the only downside is that it has to visually open the window, and I think that slows it down because it has to render the page. I just want the contents so my app can parse them and store the data in a DB. I don't need to see them :) – HyrionX Feb 23 '12 at 20:59

3 Answers

3

If you are looking for something within WebDriver (Selenium 2), it ships with HtmlUnitDriver, which doesn't launch a browser. Quoting from http://code.google.com/p/selenium/wiki/GettingStarted: "This is a pure Java driver that runs entirely in-memory. Because of this, you won't see a new browser window open." An example is available there too.

niharika_neo
  • I do remember reading that there was another driver. I'm pretty sure the one you are mentioning is it. I will look into it. – HyrionX Feb 26 '12 at 19:14
0

If you are working on Unix/Linux systems, you can also use Xvfb, the X virtual framebuffer. It is an X11 server that performs all graphical operations in memory without displaying anything on screen. I believe it should be faster than a normal X11 server.

You can use it with the following commands (on Red Hat compatible systems):

> sudo yum install Xvfb
> Xvfb :10 -screen 0 1024x768x24 &
> DISPLAY=:10 firefox http://www.google.com &
> DISPLAY=:10 import -window root www.google.com.png

The last command takes a screenshot of the whole virtual screen (import is part of ImageMagick).

To run Selenium with Xvfb, simply export the DISPLAY variable:

> export DISPLAY=:10

Now, when you run Selenium from the same terminal, it will use the virtual X11 server.
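
If you would rather not export DISPLAY for the whole terminal, the same steps can be wrapped in a small script. This is only a rough sketch in Python: the helper names (env_for_display, xvfb_command, run_under_xvfb) are made up for illustration, the one-second pause is a crude assumption about how long Xvfb takes to start, and the command you pass in is whatever launches your Selenium script.

```python
import os
import shutil
import subprocess
import time


def env_for_display(display=":10"):
    """Copy the current environment and point DISPLAY at the virtual server."""
    env = dict(os.environ)
    env["DISPLAY"] = display
    return env


def xvfb_command(display=":10", screen="1024x768x24"):
    """Build the same Xvfb invocation as in the commands above."""
    return ["Xvfb", display, "-screen", "0", screen]


def run_under_xvfb(command, display=":10"):
    """Start Xvfb, run `command` against it, then stop Xvfb again."""
    if shutil.which("Xvfb") is None:
        raise RuntimeError("Xvfb is not installed or not on PATH")
    server = subprocess.Popen(xvfb_command(display))
    try:
        # Assumption: one second is enough for Xvfb to accept connections.
        time.sleep(1)
        # Only the child process sees the changed DISPLAY value.
        return subprocess.run(command, env=env_for_display(display)).returncode
    finally:
        server.terminate()
        server.wait()
```

Calling run_under_xvfb(["firefox", "http://www.google.com"]) would then be the scripted equivalent of the firefox line above, with the virtual server shut down once the browser exits.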

HTH.

Shumin

Shumin Guo
0

No, but there are plenty of browser emulators around that can (possibly) do the job, for example WWW-Mechanize, SimpleTest, or Mechanize. It depends on which language you'd rather program in.
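
To give a rough idea of what such an emulator does under the hood, here is a minimal sketch using only Python's standard library rather than any of the tools named above. It assumes the login is a plain HTML form POST that sets a session cookie; the function names, URLs, and form field names are hypothetical, and it will not work on sites whose login depends on JavaScript.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def make_session():
    """Return an opener that remembers cookies between requests."""
    jar = http.cookiejar.CookieJar()
    return urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))


def login_and_fetch(opener, login_url, form_fields, page_url):
    """POST the login form, then GET an inner page with the same cookies."""
    body = urllib.parse.urlencode(form_fields).encode("utf-8")
    # Passing a body makes this a POST; it stands in for the .click() on
    # the login button. A failed login that answers with 4xx raises HTTPError.
    with opener.open(login_url, data=body, timeout=30) as response:
        response.read()
    # The opener now carries the session cookie, so inner pages are reachable.
    with opener.open(page_url, timeout=30) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


# Hypothetical usage; replace the URLs and field names with the real ones:
# html = login_and_fetch(
#     make_session(),
#     "http://example.com/login",
#     {"username": "me", "password": "secret"},
#     "http://example.com/members/page1",
# )
```

The field names have to match the name attributes of the inputs in the site's login form, which you can read from the page source. No window is opened and nothing is rendered, so you get back just the HTML string to parse and store.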

troelskn
  • Selenium does the job; the only downside is that it has to visually open the window, and I think that slows it down because it has to render the page. I just want the contents so my app can parse them and store the data in a DB. I don't need to see them :) – HyrionX Feb 23 '12 at 20:58
  • Yes, the projects I linked to are much more efficient than automating a real browser, as Selenium does. – troelskn Feb 25 '12 at 09:56