Issue deploying script to AWS Lambda

Question

The issue I have is I am trying to run a script which uses Selenium and specifically webdriver.

driver = webdriver.Firefox(executable_path='numpy-test/geckodriver', options=options, service_log_path ='/dev/null')

My issue is that the function requires geckodriver in order to run. Geckodriver can be found in the zip file I have uploaded to AWS but I have no idea as to how to get the function to access it on AWS. Locally its not an issue as it is in my directory and so everything runs.

I get the following error message when running the function via serverless:

{ "errorMessage": "Message: 'geckodriver' executable needs to be in PATH. \n", "errorType": "WebDriverException", "stackTrace": [ [ "/var/task/handler.py", 66, "main", "print(TatamiClearanceScrape())" ], [ "/var/task/handler.py", 28, "TatamiClearanceScrape", "driver = webdriver.Firefox(executable_path='numpy-test/geckodriver', options=options, service_log_path ='/dev/null')" ], [ "/var/task/selenium/webdriver/firefox/webdriver.py", 164, "init", "self.service.start()" ], [ "/var/task/selenium/webdriver/common/service.py", 83, "start", "os.path.basename(self.path), self.start_error_message)" ] ] }

Error --------------------------------------------------

Invoked function failed

Any help would be appreciated.

EDIT:

def TatamiClearanceScrape():
    options = Options()
    options.add_argument('--headless')

    page_link = 'https://www.tatamifightwear.com/collections/clearance'
    # this is the url that we've already determined is safe and legal to scrape from.
    page_response = requests.get(page_link, timeout=5)
    # here, we fetch the content from the url, using the requests library
    page_content = BeautifulSoup(page_response.content, "html.parser")

    driver = webdriver.Firefox(executable_path='numpy-test/geckodriver', options=options, service_log_path ='/dev/null')
    driver.get('https://www.tatamifightwear.com/collections/clearance')

    labtnx = driver.find_element_by_css_selector('a.btn.close')
    labtnx.click()
    time.sleep(10)
    labtn = driver.find_element_by_css_selector('div.padding')
    labtn.click()
    time.sleep(5)
    # wait(driver, 50).until(lambda x: len(driver.find_elements_by_css_selector("div.detailscontainer")) > 30)
    html = driver.page_source
    page_content = BeautifulSoup(html)
    # we use the html parser to parse the url content and store it in a variable.
    textContent = []

    tags = page_content.findAll("a", class_="product-title")

    product_title = page_content.findAll(attrs={'class': "product-title"})  # allocates all product titles from site

    old_price = page_content.findAll(attrs={'class': "old-price"})

    new_price = page_content.findAll(attrs={'class': "special-price"})

    products = []
    for i in range(len(product_title) - 2):
        #  groups all products together in list of dictionaries, with name, old price and new price
        object = {"Product Name": product_title[i].get_text(strip=True),
                  "Old Price:": old_price[i].get_text(strip=True),
                  "New Price": new_price[i].get_text(), 'date': str(datetime.datetime.now())
                  }
        products.append(object)



    return products

Can you post the lambda function code that tries to launch selenium? — bwest, Jan 20 '19 at 19:09
Have you tried passing an absolute path as the `executable_path` argument? — Milan Cermak, Jan 20 '19 at 19:25
I am now getting an error "errorMessage": "Message: 'task' executable may have wrong permissions. — FreeLand, Jan 20 '19 at 20:23
Take a look at this answer https://stackoverflow.com/a/42535334/401096 — bwest, Jan 20 '19 at 20:44
Possible duplicate of [Can't run binary from within python aws lambda function](https://stackoverflow.com/questions/41651134/cant-run-binary-from-within-python-aws-lambda-function) — bwest, Jan 20 '19 at 20:44

score 1 · Answer 1 · answered Jan 21 '19 at 10:19

You might want to have a look at AWS Lambda Layers for this. With Layers you Lambda can you can use libraries without needing to include them in your deployment package for you functions. Layers do so that you do not have to upload dependencies on every change of your code, you just create an additional layer with all required packages.

Read here for more details on AWS Lambda Layers

Issue deploying script to AWS Lambda

1 Answers1