The issue

I have a web scraper running in AWS Lambda, but in a few weeks AWS Lambda will stop supporting Ruby 2.5. I built my scraper last year using this tutorial.

I need to find a version of ChromeDriver and headless Chrome that is compatible with Ruby 2.7, but I don't know exactly where to start.

I have looked at ChromeDriver's downloads portal, but I don't see any indication there that ChromeDriver will work with Ruby 2.7, or with any other specific version of Ruby for that matter.

The code I have works by accessing the ChromeDriver binary and starting it inside a specific folder.

I downloaded the specific binaries I am using by running these commands:

# serverless chrome
wget https://github.com/adieuadieu/serverless-chrome/releases/download/v1.0.0-37/stable-headless-chromium-amazonlinux-2017-03.zip
unzip stable-headless-chromium-amazonlinux-2017-03.zip -d bin/
rm stable-headless-chromium-amazonlinux-2017-03.zip

# chromedriver
wget https://chromedriver.storage.googleapis.com/2.37/chromedriver_linux64.zip
unzip chromedriver_linux64.zip -d bin/
rm chromedriver_linux64.zip
Alvaro Alday
  • Did you solve this problem? I have the same problem. When I migrated from Ruby 2.5 to 2.7, I got an error like "Selenium::WebDriver::Error::WebDriverError: unable to connect to chromedriver http://127.0.0.1:9515". My serverless-chrome and chromedriver versions are the same as yours. – Hoonki Jun 10 '21 at 05:21
  • @Hoonki I have found a way to make it work. You need to change your Lambda from using a .zip file to using a Docker image stored in ECR. I'll write a proper answer to my question once I am done doing just that. – Alvaro Alday Jun 10 '21 at 19:13
  • Thanks. But what does using a Docker image mean? Do we still need to use the Ruby 2.5 environment? – Hoonki Jun 14 '21 at 02:19
  • @Hoonki You can create your own custom Docker image from scratch to use in Lambda. You only need to follow a guide like this one: https://docs.aws.amazon.com/lambda/latest/dg/images-create.html I have been working on this question for a while now. I'll answer my own question and create a tutorial on https://dev.to/ but I think you should be able to figure this one out on your own. Or you can wait until I give an update. – Alvaro Alday Jun 16 '21 at 00:51
  • @AlvaroAlday I set it up in Docker using the ECR Lambda ruby2.7 image but still get the same error. Curious to see what your Dockerfile looks like. – HarrisJT Jun 16 '21 at 19:53
  • Hey guys. I've tried headless-chrome version https://github.com/adieuadieu/serverless-chrome/releases/tag/v1.0.0-57 for Amazon Linux 2. The OS of the Ruby 2.7 AWS Lambda environment is Amazon Linux 2. But it doesn't work; I got the error "unable to connect chromedriver 127.0.0.1 ~~". Did you find a solution? – Hoonki Jun 30 '21 at 06:43
  • I updated the post and included the Dockerfile that I used. @HarrisJT – Alvaro Alday Jul 01 '21 at 19:38

1 Answer

Solution

I found the solution to this problem. The Ruby 2.7 runtime that Lambda offers by default runs on top of Amazon Linux 2, which lacks many important libraries and dependencies. Unfortunately, there's nothing you can do to change that.

However, Amazon offers you the ability to run your code in a custom Docker image that can be up to 10 GB in size.

I fixed this problem by creating my own image using the following Dockerfile:

FROM public.ecr.aws/lambda/ruby:2.7

# Install dependencies needed to run MySQL & Chrome

RUN yum -y install libX11
RUN yum -y install dejavu-sans-fonts
RUN yum -y install procps
RUN yum -y install mysql-devel
RUN yum -y install tree
RUN mkdir /var/task/lib
RUN cp /usr/lib64/mysql/libmysqlclient.so.18 /var/task/lib
RUN gem install bundler
RUN yum -y install wget
RUN yum -y groupinstall 'Development Tools'

# Ruby Gems

ADD Gemfile ${LAMBDA_TASK_ROOT}/
ADD Gemfile.lock ${LAMBDA_TASK_ROOT}/
RUN bundle config set path 'vendor/bundle' && \
    bundle install

# Install chromedriver & chromium

RUN mkdir ${LAMBDA_TASK_ROOT}/bin

# Chromium
RUN wget https://github.com/adieuadieu/serverless-chrome/releases/download/v1.0.0-37/stable-headless-chromium-amazonlinux-2017-03.zip
RUN unzip stable-headless-chromium-amazonlinux-2017-03.zip -d ${LAMBDA_TASK_ROOT}/bin/
RUN rm stable-headless-chromium-amazonlinux-2017-03.zip

# Chromedriver

RUN wget https://chromedriver.storage.googleapis.com/2.37/chromedriver_linux64.zip
RUN unzip chromedriver_linux64.zip -d ${LAMBDA_TASK_ROOT}/bin/
RUN rm chromedriver_linux64.zip

# Copy function code

COPY app.rb ${LAMBDA_TASK_ROOT}

WORKDIR ${LAMBDA_TASK_ROOT}

RUN tree
RUN ls ${LAMBDA_TASK_ROOT}/bin
# Set the CMD to your handler (could also be done as a parameter override outside of the Dockerfile)
CMD [ "app.handle" ]
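
For reference, here is a minimal sketch of what an app.rb matching the CMD above could look like. It is not the original scraper code. It assumes the selenium-webdriver gem is in the Gemfile, that the Chromium zip unpacks to a binary named headless-chromium, and that the event carries a 'url' key; the helper names chrome_paths and chrome_arguments are made up for illustration. The exact way to set the driver path depends on the selenium-webdriver version.

```ruby
# app.rb (hypothetical sketch; only the handler name `app.handle` comes from the Dockerfile)

# The Dockerfile unpacks both binaries into ${LAMBDA_TASK_ROOT}/bin.
TASK_ROOT = ENV.fetch('LAMBDA_TASK_ROOT', '/var/task')
BIN_DIR = File.join(TASK_ROOT, 'bin')

# Paths to the two binaries. The file name 'headless-chromium' is an assumption
# about the contents of the serverless-chrome zip.
def chrome_paths(bin_dir = BIN_DIR)
  {
    chrome: File.join(bin_dir, 'headless-chromium'),
    driver: File.join(bin_dir, 'chromedriver')
  }
end

# Chromium switches commonly used in containers. /tmp is used for the profile
# because it is the writable location in Lambda.
def chrome_arguments
  %w[
    --headless
    --no-sandbox
    --disable-gpu
    --single-process
    --disable-dev-shm-usage
    --user-data-dir=/tmp/user-data
  ]
end

# Lambda calls this method because of CMD [ "app.handle" ].
def handle(event:, context:)
  require 'selenium-webdriver' # loaded lazily so the helpers above work without the gem

  paths = chrome_paths
  # Assumption: this setter exists in the selenium-webdriver version you use.
  Selenium::WebDriver::Chrome::Service.driver_path = paths[:driver]

  options = Selenium::WebDriver::Chrome::Options.new
  options.binary = paths[:chrome]
  chrome_arguments.each { |arg| options.add_argument(arg) }

  driver = Selenium::WebDriver.for(:chrome, options: options)
  begin
    driver.get(event.fetch('url')) # 'url' is a hypothetical event key
    { title: driver.title }
  ensure
    driver.quit
  end
end
```

Keeping the paths and switches in small helpers makes it easy to print them from the handler when debugging a Chrome startup failure.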

Notes

  1. If your code was previously deployed using a zip file, you will have to either destroy the previous function or create a second function with the updated code. It all comes down to how you want to handle deployment.
  2. It is possible to automate the deployment process using the Serverless Framework.
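
If Chrome still fails to start inside the image (for example with the "unable to connect to chromedriver" error mentioned in the comments), one possible check is to ask ldd which shared libraries a binary cannot resolve. The sketch below assumes ldd is available in the image; missing_libraries and check_binary are hypothetical helper names, not part of any library.

```ruby
require 'open3'

# Returns the names of shared libraries that `ldd` reports as "not found".
def missing_libraries(ldd_output)
  ldd_output.each_line
            .select { |line| line.include?('=> not found') }
            .map { |line| line.split('=>').first.strip }
end

# Runs `ldd` on a binary and returns its unresolved libraries.
# Raises if `ldd` itself fails (for example, when the file is not a dynamic executable).
def check_binary(path)
  output, status = Open3.capture2e('ldd', path)
  raise "ldd failed for #{path}: #{output}" unless status.success?

  missing_libraries(output)
end
```

For example, calling check_binary('/var/task/bin/headless-chromium') from a temporary handler and logging the result would show which packages still need a yum install line in the Dockerfile.
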
Alvaro Alday