I am working on an AWS Glue ETL script using the dynamic frame
glue abstraction and writing code in python.
I created a JDBC connection resource named sap-lpr-connection
in the glue data catalog and would like to use it to retrieve the connection options from the code.
As per this link (and other sources), I should be using a "custom.jdbc"
connection_type
to access the connection resource I created.
This is what my code looks like:
from pyspark.context import SparkContext
from awsglue.context import GlueContext
# DATABASE
database = 'sap_lpr'
table = 'bsim'
# GLUE CONTEXT
glue_context = GlueContext(SparkContext.getOrCreate())
# CONNECTION OPTIONS
connection_options = {
"connectionName": f"{database.replace('_', '-')}-connection",
"dbTable": table
}
# READ DATA
dyf = glue_context.create_dynamic_frame.from_options(
connection_type="custom.jdbc",
connection_options=connection_options
)
But when I run the code I get this error:
An error occurred while calling o81.getSource. Glue ETL Marketplace: Can not retrieve required field CONNECTOR_TYPE.
I know an alternative would be to specify a "jdbc"
connection_type
and pass the various connection options such as jdbc URL, username and password, but I prefer to retrieve that information from the glue connection resource I created on purpose for this.
Also, I would really like to stick to the glue_context
API as opposed to the standard spark API.
Any idea what I might be doing wrong?