
I am trying to use Selenium to run a script that scrapes data from an "infinite" scroll Instagram page for research purposes. I use Google Colaboratory, which has no browser installed because it operates like a server.

Here is my script:

import time
from selenium import webdriver
from bs4 import BeautifulSoup as bs

browser = webdriver.Firefox()
browser.get("https://www.instagram.com/dario_nardella/?hl=it")

# Scroll to the bottom of the page and return the resulting page height
scroll_js = "window.scrollTo(0, document.body.scrollHeight); return document.body.scrollHeight;"
lenOfPage = browser.execute_script(scroll_js)
match = False
while not match:
    lastCount = lenOfPage
    time.sleep(3)  # give the next batch of posts time to load
    lenOfPage = browser.execute_script(scroll_js)
    if lastCount == lenOfPage:  # height unchanged: nothing more was loaded
        match = True
source_data = browser.page_source
bs_data = bs(source_data, "html.parser")

and I get this error:

WebDriverException: Message: 'geckodriver' executable needs to be in PATH. 

To solve the problem, I tried downloading geckodriver with these shell commands:

!wget https://github.com/mozilla/geckodriver/releases/download/v0.11.1/geckodriver-v0.11.1-linux64.tar.gz
!tar -xvzf geckodriver-v0.11.1-linux64.tar.gz
!rm geckodriver-v0.11.1-linux64.tar.gz
!chmod +x geckodriver

but I get the same error. Thanks a lot for any solution.

I followed @Macio's solution, but now I have another problem, this time with permissions, maybe caused by Colaboratory:

browser = webdriver.Firefox(executable_path="/path/to/geckodriver")

I don't know why. These are the file permissions:

-rwxrwxr-x 1 1000 1000 4087499 Oct 10  2016 geckodriver*
-rw-r--r-- 1 root root       0 Oct 24 10:20 geckodriver.log
Marco Scarselli
  • You need to download geckodriver and put it in the same place as your Firefox driver for Selenium, so that it is picked up automatically from your system PATH. You can also save it somewhere else, but in that case webdriver.Firefox() has to point to it. For example, I have my geckodriver.exe in C:\Python27\selenium\webdriver – Carlo 1585 Oct 24 '18 at 10:47
  • Are you sure this will work on Google Colab? – Hamed Baziyad Jan 23 '19 at 14:01

1 Answer

First of all, why are you using such an old geckodriver, from 10 Oct 2016?

Try it this way:

browser = webdriver.Firefox(executable_path="/path/to/geckodriver")
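
Since the question reports a permission problem, it may help to check the binary before handing it to Selenium. The following is only a sketch: `resolve_driver` and `start_firefox` are hypothetical helper names, the `./geckodriver` location in the usage note is an assumption, and `executable_path` is the Selenium 3 keyword (newer Selenium releases expect a `Service` object instead). It also assumes Firefox itself is installed on the machine.

```python
import os


def resolve_driver(path):
    """Return the absolute path to a driver binary, or raise if it is unusable."""
    full = os.path.abspath(path)
    if not os.path.isfile(full):
        raise FileNotFoundError("no such file: " + full)
    if not os.access(full, os.X_OK):
        # The current user is not allowed to execute this file
        raise PermissionError("not executable: " + full)
    return full


def start_firefox(driver_path):
    """Start Firefox with an explicit geckodriver path (Selenium 3 keyword)."""
    # Imported here so that resolve_driver can be used without Selenium installed
    from selenium import webdriver
    return webdriver.Firefox(executable_path=resolve_driver(driver_path))
```

Usage would look like `browser = start_firefox("./geckodriver")`. If `resolve_driver` raises, the problem is with the file itself rather than with Selenium.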

Or add the directory that contains geckodriver to the PATH environment variable:

export PATH="$PATH:/path/to/directory/containing/geckodriver"
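
In a notebook such as Colab, a line starting with `!` runs in a temporary subshell, so an `export` issued that way is gone by the time the Python code runs. A rough sketch of the same idea done from Python instead, assuming geckodriver was extracted into the current working directory (`add_to_path` is a hypothetical helper name):

```python
import os
import shutil


def add_to_path(directory):
    """Append a directory to PATH for the current Python process."""
    directory = os.path.abspath(directory)
    parts = [p for p in os.environ.get("PATH", "").split(os.pathsep) if p]
    if directory not in parts:
        parts.append(directory)
        os.environ["PATH"] = os.pathsep.join(parts)
    return directory


# Assumption: the geckodriver binary sits in the current directory
add_to_path(".")
# Prints the full path if geckodriver can now be found, otherwise None
print(shutil.which("geckodriver"))
```

Note that PATH entries are directories, so the value to add is the folder that holds geckodriver, not the path of the binary itself.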
Macio