0

I get "connection refused error" when I try to write the results of a Dataframe to an RDS (MySQL). I am using PySpark 3 on EMR cluster v6.x (1 master node, 1 slave node). The table does not exist yet. But the data base exist.

spark-submit --jars s3://{some s3 folder}/mysql-connector-java-8.0.25.jar s3://{some s3 folder}/pyspark_script.py

The part of the script that writes to mysql is here (after testing, its the only part of the script that delivers error/is not working): * I have changed the name of my db, user, and password here below

df.write\
    .mode("overwrite")\
    .format("jdbc")\
    .option("url","jdbc:mysql://localhost:3306/{my database name}?useSSL=false")\
    .option("driver","com.mysql.cj.jdbc.Driver")\
    .option("dbtable","mydb_table")\
    .option("user","myuser")\
    .option("password","mypassword")\
    .save()

This is the error I get: It is about connection refused. I have already given the EMR Role access RDS and its data!

Traceback (most recent call last):
  File "/mnt/tmp/spark-93919f38-ea4d-44d6-be7d-0416be972753/pyspark_script.py", line 57, in <module>
    .option("password","assignment")\
  File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/sql/readwriter.py", line 1107, in save
  File "/usr/lib/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py", line 1305, in __call__
  File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 111, in deco
  File "/usr/lib/spark/python/lib/py4j-0.10.9-src.zip/py4j/protocol.py", line 328, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o163.save.
: com.mysql.cj.jdbc.exceptions.CommunicationsException: Communications link failure

The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
    at com.mysql.cj.jdbc.exceptions.SQLError.createCommunicationsException(SQLError.java:174)
    at com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:64)
    at com.mysql.cj.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:833)
    at com.mysql.cj.jdbc.ConnectionImpl.<init>(ConnectionImpl.java:453)
    at com.mysql.cj.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:246)
    at com.mysql.cj.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:198)
    at org.apache.spark.sql.execution.datasources.jdbc.connection.BasicConnectionProvider.getConnection(BasicConnectionProvider.scala:49)
    at org.apache.spark.sql.execution.datasources.jdbc.connection.ConnectionProvider$.create(ConnectionProvider.scala:68)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.$anonfun$createConnectionFactory$1(JdbcUtils.scala:62)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:48)
    at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:46)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:90)
    at org.apache.spark.sql.execution.SparkPlan.$anonfun$execute$1(SparkPlan.scala:194)
    at org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:232)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
    at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:229)
    at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:190)
    at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:134)
    at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:133)
    at org.apache.spark.sql.DataFrameWriter.$anonfun$runCommand$1(DataFrameWriter.scala:989)
    at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:107)
    at org.apache.spark.sql.execution.SQLExecution$.withTracker(SQLExecution.scala:232)
    at org.apache.spark.sql.execution.SQLExecution$.executeQuery$1(SQLExecution.scala:110)
    at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:135)
    at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:107)
    at org.apache.spark.sql.execution.SQLExecution$.withTracker(SQLExecution.scala:232)
    at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:135)
    at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:253)
    at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:134)
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
    at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:68)
    at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:989)
    at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:438)
    at org.apache.spark.sql.DataFrameWriter.saveInternal(DataFrameWriter.scala:415)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:301)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
    at py4j.Gateway.invoke(Gateway.java:282)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.GatewayConnection.run(GatewayConnection.java:238)
    at java.lang.Thread.run(Thread.java:748)
Caused by: com.mysql.cj.exceptions.CJCommunicationsException: Communications link failure

The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at com.mysql.cj.exceptions.ExceptionFactory.createException(ExceptionFactory.java:61)
    at com.mysql.cj.exceptions.ExceptionFactory.createException(ExceptionFactory.java:105)
    at com.mysql.cj.exceptions.ExceptionFactory.createException(ExceptionFactory.java:151)
    at com.mysql.cj.exceptions.ExceptionFactory.createCommunicationsException(ExceptionFactory.java:167)
    at com.mysql.cj.protocol.a.NativeSocketConnection.connect(NativeSocketConnection.java:89)
    at com.mysql.cj.NativeSession.connect(NativeSession.java:144)
    at com.mysql.cj.jdbc.ConnectionImpl.connectOneTryOnly(ConnectionImpl.java:953)
    at com.mysql.cj.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:823)
    ... 45 more
Caused by: java.net.ConnectException: Connection refused (Connection refused)
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:607)
    at com.mysql.cj.protocol.StandardSocketFactory.connect(StandardSocketFactory.java:155)
    at com.mysql.cj.protocol.a.NativeSocketConnection.connect(NativeSocketConnection.java:63)
    ... 48 more

21/12/19 11:40:04 INFO SparkContext: Invoking stop() from shutdown hook
21/12/19 11:40:04 INFO AbstractConnector: Stopped Spark@74d96709{HTTP/1.1, (http/1.1)}{0.0.0.0:4040}
21/12/19 11:40:04 INFO SparkUI: Stopped Spark web UI at http://{ip}.eu-central-1.compute.internal:4040
21/12/19 11:40:04 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
21/12/19 11:40:04 INFO MemoryStore: MemoryStore cleared
21/12/19 11:40:04 INFO BlockManager: BlockManager stopped
21/12/19 11:40:04 INFO BlockManagerMaster: BlockManagerMaster stopped
21/12/19 11:40:04 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
21/12/19 11:40:04 INFO SparkContext: Successfully stopped SparkContext
21/12/19 11:40:04 INFO ShutdownHookManager: Shutdown hook called
21/12/19 11:40:04 INFO ShutdownHookManager: Deleting directory /mnt/tmp/spark-fd1b8e7c-7b4c-424d-a451-743a6e075fbd
21/12/19 11:40:04 INFO ShutdownHookManager: Deleting directory /mnt/tmp/spark-93919f38-ea4d-44d6-be7d-0416be972753
21/12/19 11:40:04 INFO ShutdownHookManager: Deleting directory /mnt/tmp/spark-fd1b8e7c-7b4c-424d-a451-743a6e075fbd/pyspark-40fbaaf5-2e34-44ba-875f-88308084546d
QBits
  • 121
  • 1
  • 11

1 Answers1

0

Try this thread: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure

EMR with localhost RDS? seems odd, did you missed set the IP correctly ?

Benny Elgazar
  • 243
  • 2
  • 9