2

I had dynamically created a table through glue job and it is successfully working fine. But as per new requirement, I need to add a new column which generates unique values and should be primary key in redshift.

I had implemented the same using rownum() function and it’s working fine. But the latest requirement is that particular column should be primary key.

When I try to do that, it asks the column to have not null. Do you know how to make the column not null dynamically through glue job ? Or any redshift query to make it not null. I tried all the ways without luck.

w = Window().orderBy(lit('A'))
df = timestampedDf.withColumn("row_num", row_number().over(w))
rowNumDf = DynamicFrame.fromDF(df1, glueContext, "df1")

postStep = "begin; ALTER TABLE TAB_CUSTOMER_DATA ALTER COLUMN row_num INTEGER NOT NULL; ALTER TABLE TAB_CUSTOMER_DATA ADD CONSTRAINT PK_1 PRIMARY KEY (row_num); end;"

## @type: DataSink
## @args: [catalog_connection = "REDSHIFT_CONNECTION", connection_options = {"dbtable": "tab_customer_data", "database": "randomdb"}, redshift_tmp_dir = TempDir, transformation_ctx = "datasink4"]
## @return: datasink4
## @inputs: [frame = rowNumDf]
datasink4 = glueContext.write_dynamic_frame.from_jdbc_conf(frame = rowNumDf, catalog_connection = "REDSHIFT_CONNECTION", connection_options = {"dbtable": "TAB_CUSTOMER_DATA", "database": "randomdb", "postactions": postStep}, redshift_tmp_dir = args["TempDir"], transformation_ctx = "datasink4")
job.commit()
James Z
  • 12,209
  • 10
  • 24
  • 44
Pasha Shaik
  • 93
  • 1
  • 8
  • 1
    you could try a modified version of this approach https://ubiq.co/database-blog/how-to-remove-not-null-constraint-in-redshift/#:~:text=Redshift%20does%20not%20allow%20you,not%20null%20constraint%20in%20Redshift. – Jon Scott Mar 06 '21 at 09:20
  • Have you looked at having an Identity Column? These are designed to be unique while primary keys are not guaranteed to be unique (it is up to the user to ensure uniqueness as it is not enforced by Redshift). – Bill Weiner Mar 06 '21 at 15:23
  • Thank you so much Jon. It is working with this approach. – Pasha Shaik Mar 07 '21 at 10:07

1 Answers1

0

I had solved this using below link approach:

  1. add a new Column with default and not null.
  2. update the old column values to new column.
  3. drop an old column.
  4. make this new column primary.

https://ubiq.co/database-blog/how-to-remove-not-null-constraint-in-redshift/

Pasha Shaik
  • 93
  • 1
  • 8