4

I was using Password/Username in my aws glue conenctions and now I switched to Secret Manager.

Now I get this error when I run my etl job :

An error occurred while calling o89.getCatalogSource. None.get

Even tho the connections and crawlers works :

The Connection Image. (I added the connection to the job details)

The Crawlers Image.

This example of the etl job that used to work before :

import sys
from awsglue.transforms import *
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.job import Job

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
sc = SparkContext()
glueContext = GlueContext(sc)
spark = glueContext.spark_session
job = Job(glueContext)
job.init(args["JOB_NAME"], args)

# Script generated for node PostgreSQL
PostgreSQL_node1663615620851 = glueContext.create_dynamic_frame.from_catalog(
    database="pg-db",
    table_name="postgres_schema_table",
    transformation_ctx="PostgreSQL_node1663615620851",
)

this what I see as erros in the logs :

2022-09-19 19:28:19,322 ERROR [main] glue.ProcessLauncher (Logging.scala:logError(73)): Error from Python:Traceback (most recent call last):
  File "/tmp/FC 2 job.py", line 19, in <module>
    transformation_ctx="PostgreSQL_node1663615620851",
  File "/opt/amazon/lib/python3.6/site-packages/awsglue/dynamicframe.py", line 629, in from_catalog
    return self._glue_context.create_dynamic_frame_from_catalog(db, table_name, redshift_tmp_dir, transformation_ctx, push_down_predicate, additional_options, catalog_id, **kwargs)
  File "/opt/amazon/lib/python3.6/site-packages/awsglue/context.py", line 186, in create_dynamic_frame_from_catalog
    makeOptions(self._sc, additional_options), catalog_id),
  File "/opt/amazon/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py", line 1305, in __call__
    answer, self.gateway_client, self.target_id, self.name)
  File "/opt/amazon/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 111, in deco
    return f(*a, **kw)
  File "/opt/amazon/spark/python/lib/py4j-0.10.9-src.zip/py4j/protocol.py", line 328, in get_return_value
    format(target_id, ".", name), value)
py4j.protocol.Py4JJavaError: An error occurred while calling o89.getCatalogSource.
: java.util.NoSuchElementException: None.get
    at scala.None$.get(Option.scala:349)
    at scala.None$.get(Option.scala:347)
    at com.amazonaws.services.glue.util.DataCatalogWrapper.$anonfun$getJDBCConf$1(DataCatalogWrapper.scala:208)
    at scala.util.Try$.apply(Try.scala:209)
    at com.amazonaws.services.glue.util.DataCatalogWrapper.getJDBCConf(DataCatalogWrapper.scala:199)
    at com.amazonaws.services.glue.GlueContext.getGlueNativeJDBCSource(GlueContext.scala:485)
    at com.amazonaws.services.glue.GlueContext.getCatalogSource(GlueContext.scala:320)
    at com.amazonaws.services.glue.GlueContext.getCatalogSource(GlueContext.scala:185)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
    at py4j.Gateway.invoke(Gateway.java:282)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.GatewayConnection.run(GatewayConnection.java:238)
    at java.lang.Thread.run(Thread.java:750)

and also this :

2022-09-19 19:28:19,348 ERROR [main] glueexceptionanalysis.GlueExceptionAnalysisListener (Logging.scala:logError(9)): [Glue Exception Analysis] {
    "Event": "GlueETLJobExceptionEvent",
    "Timestamp": 1663615699344,
    "Failure Reason": "Traceback (most recent call last):\n  File \"/tmp/FC 2 job.py\", line 19, in <module>\n    transformation_ctx=\"PostgreSQL_node1663615620851\",\n  File \"/opt/amazon/lib/python3.6/site-packages/awsglue/dynamicframe.py\", line 629, in from_catalog\n    return self._glue_context.create_dynamic_frame_from_catalog(db, table_name, redshift_tmp_dir, transformation_ctx, push_down_predicate, additional_options, catalog_id, **kwargs)\n  File \"/opt/amazon/lib/python3.6/site-packages/awsglue/context.py\", line 186, in create_dynamic_frame_from_catalog\n    makeOptions(self._sc, additional_options), catalog_id),\n  File \"/opt/amazon/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py\", line 1305, in __call__\n    answer, self.gateway_client, self.target_id, self.name)\n  File \"/opt/amazon/spark/python/lib/pyspark.zip/pyspark/sql/utils.py\", line 111, in deco\n    return f(*a, **kw)\n  File \"/opt/amazon/spark/python/lib/py4j-0.10.9-src.zip/py4j/protocol.py\", line 328, in get_return_value\n    format(target_id, \".\", name), value)\npy4j.protocol.Py4JJavaError: An error occurred while calling o89.getCatalogSource.\n: java.util.NoSuchElementException: None.get\n\tat scala.None$.get(Option.scala:349)\n\tat scala.None$.get(Option.scala:347)\n\tat com.amazonaws.services.glue.util.DataCatalogWrapper.$anonfun$getJDBCConf$1(DataCatalogWrapper.scala:208)\n\tat scala.util.Try$.apply(Try.scala:209)\n\tat com.amazonaws.services.glue.util.DataCatalogWrapper.getJDBCConf(DataCatalogWrapper.scala:199)\n\tat com.amazonaws.services.glue.GlueContext.getGlueNativeJDBCSource(GlueContext.scala:485)\n\tat com.amazonaws.services.glue.GlueContext.getCatalogSource(GlueContext.scala:320)\n\tat com.amazonaws.services.glue.GlueContext.getCatalogSource(GlueContext.scala:185)\n\tat sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)\n\tat sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)\n\tat sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\n\tat java.lang.reflect.Method.invoke(Method.java:498)\n\tat py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)\n\tat py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)\n\tat py4j.Gateway.invoke(Gateway.java:282)\n\tat py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)\n\tat py4j.commands.CallCommand.execute(CallCommand.java:79)\n\tat py4j.GatewayConnection.run(GatewayConnection.java:238)\n\tat java.lang.Thread.run(Thread.java:750)\n",
    "Stack Trace": [
        {
            "Declaring Class": "get_return_value",
            "Method Name": "format(target_id, \".\", name), value)",
            "File Name": "/opt/amazon/spark/python/lib/py4j-0.10.9-src.zip/py4j/protocol.py",
            "Line Number": 328
        },
        {
            "Declaring Class": "deco",
            "Method Name": "return f(*a, **kw)",
            "File Name": "/opt/amazon/spark/python/lib/pyspark.zip/pyspark/sql/utils.py",
            "Line Number": 111
        },
        {
            "Declaring Class": "__call__",
            "Method Name": "answer, self.gateway_client, self.target_id, self.name)",
            "File Name": "/opt/amazon/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py",
            "Line Number": 1305
        },
        {
            "Declaring Class": "create_dynamic_frame_from_catalog",
            "Method Name": "makeOptions(self._sc, additional_options), catalog_id),",
            "File Name": "/opt/amazon/lib/python3.6/site-packages/awsglue/context.py",
            "Line Number": 186
        },
        {
            "Declaring Class": "from_catalog",
            "Method Name": "return self._glue_context.create_dynamic_frame_from_catalog(db, table_name, redshift_tmp_dir, transformation_ctx, push_down_predicate, additional_options, catalog_id, **kwargs)",
            "File Name": "/opt/amazon/lib/python3.6/site-packages/awsglue/dynamicframe.py",
            "Line Number": 629
        },
        {
            "Declaring Class": "<module>",
            "Method Name": "transformation_ctx=\"PostgreSQL_node1663615620851\",",
            "File Name": "/tmp/FC 2 job.py",
            "Line Number": 19
        }
    ],
    "Last Executed Line number": 19,
    "script": "FC 2 job.py"
}

2 Answers2

4

I ran into the same issue. I asked about it on AWS re:Post and was told that the reason why this exception gets thrown is that getCatalogSource and getCatalogSink do not yet support connecting with Secrets Manager. The workaround is to either use boto3 to retrieve credentials from Secrets Manager or connect with username/password.

kahopfer
  • 41
  • 2
  • can you paste the code? I am having this problem where I have a catalog in account B and want to access it from account A and need a jdbc connection. Credentials are stored in secrets manager and with boto I can list tables and database but from gluecontext I have this error. However if I try to create spark connection directly with credentials this will be bad because I dont have the vpc setup in place – Eduardo EPF Apr 24 '23 at 07:41
  • I did not end up using boto3 to solve my problem, however AWS provides code examples of how to interact with Secrets Manager using boto3 [here](https://docs.aws.amazon.com/code-library/latest/ug/python_3_secrets-manager_code_examples.html). Hopefully this helps. – kahopfer Apr 24 '23 at 14:44
0

I have faced a similar issue. In my case it was not able to find the table at specified location. It looks to be same, Try checking the entities you have provided like db-name, table name etc. Should work !!

Mradul Yd
  • 71
  • 8