I have a PostgreSQL database running on an EC2 instance. I am trying to write to it from PySpark running on a cluster, but the write fails.

The PostgreSQL server has a database named my_db, which contains a table named events.

My PySpark code is:

df.write.format("jdbc") \
    .option("url", "jdbc:postgresql://ec2-xxxxx.compute-1.amazonaws.com:543x/my_db") \
    .option("dbtable", "events") \
    .option("user", "xxx") \
    .option("password", "xxx") \
    .option("driver", "org.postgresql.Driver") \
    .mode("append") \
    .save()

When executing I receive this error:

py4j.protocol.Py4JJavaError: An error occurred while calling o69.save. : org.postgresql.util.PSQLException: ERROR: relation "events" already exists

It seems that Spark tries to create a new table when I run spark-submit, even though the write mode is append. How can I solve this error?
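One thing that may be worth checking, offered as a guess rather than a confirmed diagnosis: as far as I understand, in append mode Spark first probes whether the target table exists by running a small query against it, and if that probe fails for any reason (for example, the user lacks SELECT on the table, or the name resolves to a different schema) it treats the table as missing and issues CREATE TABLE. The sketch below builds the same writer options with a schema-qualified table name and produces a comparable probe query that can be run in psql as the same user. The helper names (build_jdbc_options, probe_query, append_to_postgres) and the "public" schema are assumptions made for illustration, not part of the original setup.

```python
def build_jdbc_options(host, port, database, table, user, password, schema="public"):
    """Return JDBC writer options with a schema-qualified table name.

    The default schema "public" is an assumption; use the schema that
    actually holds the table.
    """
    return {
        "url": f"jdbc:postgresql://{host}:{port}/{database}",
        "dbtable": f"{schema}.{table}",
        "user": user,
        "password": password,
        "driver": "org.postgresql.Driver",
    }


def probe_query(options):
    """Return a query comparable to an existence probe for the target table.

    Running this in psql as the same user shows whether the table is
    visible and readable for that user.
    """
    return f"SELECT 1 FROM {options['dbtable']} LIMIT 1"


def append_to_postgres(df, options):
    """Append a DataFrame to the table described by the options dict."""
    df.write.format("jdbc").options(**options).mode("append").save()
```

If the probe query fails in psql for the user that Spark connects as, fixing that (granting SELECT, or pointing dbtable at the right schema) is probably the first step before retrying the append.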

tetrapack
Maybe one of these can help: https://stackoverflow.com/questions/8792912/postgresql-error-relation-already-exists – Equinox Sep 30 '20 at 12:24
