awswrangler redshift to_sql upserting specific columns

Question

Suggest we have a table with a row like below.

tag value before the update operation

I want to update just col_id and slug columns with a new values.

this is the code line I use to update this row.

df = pd.DataFrame([[datas.get("collection_id"), datas.get("slug")]], columns=["col_id", "slug"])

wr.redshift.to_sql(
        df=df, table="test_db", schema="offchain", con=con, use_column_names=True, mode="upsert")

after this operation the row updated like below. tag value becomes NULL after the update operation

but I just want that update col_id and slug. keep the tag column the same now as it was before.

I try to find method to update just specific columns of rows. When I try tı use use_column_names it should be update given columns. But in this example it is updating tag column by NULL.

did you manage to find the answer? I just ran into the same problem with wr.postgresql.to_sql() — symbiotech, Sep 16 '22 at 12:31
I changed my method for upsert. You can follow this. https://docs.aws.amazon.com/redshift/latest/dg/merge-specify-a-column-list.html — Beşir Kassab, Sep 19 '22 at 09:50

score 0 · Answer 1 · answered Mar 30 '23 at 13:03

So the issue with your above statement is you have not used the primary_keys parameter which is required when you are performing upserts for this to work correctly. By using the statement below, it will find the record based on this unique primary key and perform so it should update the col_id and slug values. The assumption here is the "tag" column is unique and you don't have existing tags with this value.

wr.redshift.to_sql(
        df=df, table="test_db", schema="offchain", con=con, use_column_names=True, mode="upsert", primary_keys=['tag'])

awswrangler redshift to_sql upserting specific columns

1 Answers1