I have a Postgres database on an EC2 machine. Using PySpark on a cluster setup, I am trying to write to the Postgres DB, but the write fails.
The Postgres instance has a database my_db containing a table events.
My PySpark code is:
df.write.format("jdbc") \
    .option("url", "jdbc:postgresql://ec2-xxxxx.compute-1.amazonaws.com:543x/my_db") \
    .option("dbtable", "events") \
    .option("user", "xxx") \
    .option("password", "xxx") \
    .option("driver", "org.postgresql.Driver") \
    .mode("append") \
    .save()
When executing, I receive this error:
py4j.protocol.Py4JJavaError: An error occurred while calling o69.save. : org.postgresql.util.PSQLException: ERROR: relation "events" already exists
It seems that Spark tries to create a new table when I execute spark-submit, even though the table already exists and the write mode is append. How can I solve this error?