I have successfully run Python code locally to access an RDS MySQL instance in AWS by importing the mysql.connector package.

brew unlink mysql
brew install mysql-connector-c
sed -i -e 's/libs="$libs -l "/libs="$libs -lmysqlclient -lssl -lcrypto"/g' /usr/local/Cellar/mysql/8.0.21/bin/mysql_config
C_INCLUDE_PATH=/Users/myself/OneDrive/ASEME/libs LDFLAGS=-L/usr/local/opt/openssl/lib  pip3 install MYSQL-python
brew unlink mysql-connector-c
brew link --overwrite mysql

However, I now need to move the code to AWS Glue, and I don't know how to configure the environment when all I can do is upload a library as a .zip file to an S3 bucket.

I zipped the src folder of the Python connector (https://dev.mysql.com/downloads/connector/python/?os=26), but when I run this job:

import mysql.connector
import sys
import boto3
import os

ENDPOINT="yyy"
PORT="3306"
USR="admin"
REGION="zzz"
DBNAME="xxx"
os.environ['LIBMYSQL_ENABLE_CLEARTEXT_PLUGIN'] = '1'

#gets the credentials from .aws/credentials
session = boto3.Session(profile_name='default')
client = boto3.client('rds')

I get this error:

  File "/tmp/runscript.py", line 123, in <module>
    runpy.run_path(temp_file_path, run_name='__main__')
  File "/usr/local/lib/python3.6/runpy.py", line 263, in run_path
    pkg_name=pkg_name, script_name=fname)
  File "/usr/local/lib/python3.6/runpy.py", line 96, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "/usr/local/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/tmp/glue-python-scripts-be10tih2/filescontrol.py", line 1, in <module>
ModuleNotFoundError: No module named 'mysql'
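
One thing worth checking with this error is the layout of the uploaded archive. When a .zip is placed on the import path, Python resolves `import mysql.connector` relative to the archive root, so `mysql/connector/__init__.py` has to sit at the top level and not under a parent folder. The following is a minimal diagnostic sketch, not a fix confirmed for Glue; the function name `exposes_package` and the file name in the usage note are hypothetical.

```python
import zipfile


def exposes_package(zip_path, package="mysql/connector"):
    """Return True if `package` has an __init__.py at the root of the archive.

    `zip_path` may be a file name or a file-like object.
    `package` uses forward slashes, as zip member names do.
    """
    wanted = f"{package}/__init__.py"
    with zipfile.ZipFile(zip_path) as archive:
        # namelist() returns member paths relative to the archive root
        return wanted in archive.namelist()
```

Calling `exposes_package("connector.zip")` on the local copy of the archive before uploading it shows whether the package is importable from the zip root; a result of `False` means the `mysql` folder is missing or nested under another directory.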

How can I run this code? If the problem is that a MySQL installation is needed apart from the Python connector, then I don't think Glue can handle code that connects to MySQL through a Python library.

user2728349
  • Are you planning to run this code as a Python shell job or as a Glue PySpark job? – Prabhakar Reddy Jul 28 '20 at 11:50
  • I can do it in a Python shell job, but I could also use PySpark. I tried the Python shell first because, as far as I can see, only Python shell jobs support .zip libraries that are not pure Python. The Python shell is also cheaper, so I split the whole ETL into Python and PySpark steps. If it's not possible in Python, I will move to PySpark. – user2728349 Jul 28 '20 at 14:05
  • As you are running in a Python shell job, can you try using easy_install as in https://stackoverflow.com/a/54852126/4326922 ? – Prabhakar Reddy Aug 04 '20 at 08:10
  • https://pypi.org/project/mysql-connector-python/#files Download the wheel file from the link above and add it to your S3 bucket, then copy the URI of this file and add it to the Python library path of your Glue job. – Shruti Kulkarni Feb 07 '23 at 18:08
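
The wheel suggestion in the last comment could be scripted roughly as below. This is a sketch under assumptions, not a tested setup: the job name, role ARN, and S3 URIs are placeholders, and the helper names `python_shell_job_definition` and `create_job` are hypothetical. It relies on the Glue special parameter `--extra-py-files`, which Python shell jobs accept for .whl and .egg files, and on boto3's `create_job` call for the Glue client.

```python
def python_shell_job_definition(name, role_arn, script_uri, wheel_uri):
    """Build the keyword arguments for a Glue Python shell job.

    `wheel_uri` is the S3 URI of the uploaded mysql-connector-python wheel.
    """
    return {
        "Name": name,
        "Role": role_arn,
        "Command": {
            "Name": "pythonshell",        # Python shell job, not a Spark job
            "ScriptLocation": script_uri,
            "PythonVersion": "3",
        },
        # Glue makes the listed library files available to the job at start-up
        "DefaultArguments": {"--extra-py-files": wheel_uri},
    }


def create_job(definition, region_name):
    """Create the job in AWS; needs boto3 and valid credentials."""
    import boto3  # imported here so building the definition works offline

    glue = boto3.client("glue", region_name=region_name)
    return glue.create_job(**definition)
```

The definition is built separately from the AWS call so it can be inspected or printed before anything is created. The same `--extra-py-files` value can also be entered by hand in the Python library path field of the job in the console.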
