Trying to submit a sqoop job to dataproc to export data from a postgres database following the article: https://medium.com/google-cloud/migrate-oracle-data-to-bigquery-using-dataproc-and-sqoop-cd3863adde7b
It is erroring out with: org.postgresql.util.PSQLException: SSL error: java.security.cert.CertPathValidatorException: Trust anchor for certification path not found.
This is the command I am trying to submit (variables have been appropriately set):
gcloud dataproc jobs submit hadoop --cluster=sqoop-cluster --region=us-central1 --class=org.apache.sqoop.Sqoop --jars=$libs -- import -Dmapreduce.job.user.classpath.first=true -Dorg.apache.sqoop.splitter.all
ow_text_splitter=true --connect=$JDBC_STR --username=xxx --password=xxxx--driver=org.postgresql.Driver --target-dir=$STAGING_BUCKET/$TABLE --table=$SCHEMA.$TABLE --enclosed-by
'\"' --escaped-by \" --fields-terminated-by '|' --null-string '' --null-non-string '' --as-textfile
The postgres jdbc connection string is as follows (omitting ssl=true throws hba_conf not found):
JDBC_STR=jdbc:postgresql://xxxxx:5432/YYYY?ssl=true
The detailed error:
Job [63fb49544a1141f89f9a12960cc18e18] submitted.
Waiting for job output...
/usr/lib/hadoop/libexec//hadoop-functions.sh: line 2400: HADOOP_COM.GOOGLE.CLOUD.HADOOP.SERVICES.AGENT.JOB.SHIM.HADOOPRUNCLASSSHIM_USER: invalid variable name
/usr/lib/hadoop/libexec//hadoop-functions.sh: line 2365: HADOOP_COM.GOOGLE.CLOUD.HADOOP.SERVICES.AGENT.JOB.SHIM.HADOOPRUNCLASSSHIM_USER: invalid variable name
/usr/lib/hadoop/libexec//hadoop-functions.sh: line 2460: HADOOP_COM.GOOGLE.CLOUD.HADOOP.SERVICES.AGENT.JOB.SHIM.HADOOPRUNCLASSSHIM_OPTS: invalid variable name
2021-10-14 21:48:33,931 WARN tool.SqoopTool: $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
2021-10-14 21:48:34,128 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
2021-10-14 21:48:34,156 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
2021-10-14 21:48:34,176 WARN sqoop.ConnFactory: $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
2021-10-14 21:48:34,203 WARN sqoop.ConnFactory: Parameter --driver is set to an explicit driver however appropriate connection manager is not being set (via --connection-manager). Sqoop is going to fall back to org.apache.sqoop.manager.Gene
ricJdbcManager. Please specify explicitly which connection manager should be used next time.
2021-10-14 21:48:34,217 INFO manager.SqlManager: Using default fetchSize of 1000
2021-10-14 21:48:34,217 INFO tool.CodeGenTool: Beginning code generation
2021-10-14 21:48:34,504 ERROR manager.SqlManager: Error executing statement: org.postgresql.util.PSQLException: SSL error: java.security.cert.CertPathValidatorException: Trust anchor for certification path not found.
org.postgresql.util.PSQLException: SSL error: java.security.cert.CertPathValidatorException: Trust anchor for certification path not found.
at org.postgresql.ssl.MakeSSL.convert(MakeSSL.java:64)
Any help is appreciated.
Thanks!