
I have a crawler written in Python that uses Selenium WebDriver. I would like to start it on a cluster and leave it running for about 10 days. The problem is:

I do not have an X display!!!!

I have done some searching and reading. Normally this would be solved by using Xvfb and pyvirtualdisplay. Xvfb is not installed on the cluster, though, which brings up another problem:

I do not have admin access on the cluster!!!!

Although I can install pyvirtualdisplay in a Python virtualenv, I cannot run

sudo apt-get install xvfb

I don't own a personal desktop. Any suggestions?

Patrick the Cat

1 Answer


You can connect PhantomJS to Selenium.

It needs no X display at all: http://phantomjs.org/

Connect it to your Selenium Grid hub like this:

java -jar selenium-server-standalone-2.33.0.jar -role hub &

# a bit flaky: if the Selenium hub isn't up yet, phantomjs quietly exits
sleep 5
phantomjs --webdriver=4001 --webdriver-selenium-grid-hub=http://127.0.0.1:4444 &

To add more PhantomJS nodes, start them on additional ports:

phantomjs --webdriver=4002 --webdriver-selenium-grid-hub=http://127.0.0.1:4444 &
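
Since the crawler is already in Python, the same node start-up could be scripted rather than typed by hand. This is only a minimal sketch: the helper names `phantomjs_node_commands` and `start_nodes` are made up for illustration, and it assumes `phantomjs` is on your PATH and the hub from the commands above is already listening on port 4444. The two flags are the ones shown above.

```python
import subprocess


def phantomjs_node_commands(hub_url, first_port, count):
    """Build one phantomjs command line per grid node, on consecutive ports.

    Hypothetical helper: only the two flags used in the answer are emitted.
    """
    return [
        [
            "phantomjs",
            "--webdriver={}".format(first_port + i),
            "--webdriver-selenium-grid-hub={}".format(hub_url),
        ]
        for i in range(count)
    ]


def start_nodes(commands):
    """Launch each node in the background and return the Popen handles."""
    return [subprocess.Popen(cmd) for cmd in commands]


if __name__ == "__main__":
    cmds = phantomjs_node_commands("http://127.0.0.1:4444", 4001, 2)
    for cmd in cmds:
        print(" ".join(cmd))
    # Uncomment once the hub is running and phantomjs is on PATH:
    # processes = start_nodes(cmds)
```

Keeping the Popen handles around lets a long-running job check `poll()` on each node and restart any that quietly went away.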
KeepCalmAndCarryOn
  • Forgive me if I'm asking something stupid. I simply don't have selenium-server-standalone-x.xx.x.jar. I import selenium.webdriver and do everything I want in Python. Can I still use this JavaScript lib? – Patrick the Cat Oct 09 '13 at 00:09
  • I think the answer here is what you need http://stackoverflow.com/questions/13287490/is-there-a-way-to-use-phantomjs-in-python - specifically the answer by @Pykler – KeepCalmAndCarryOn Oct 09 '13 at 00:17