0

I have a table A with the following columns:

  • id UUID
  • str_identifier TEXT
  • num FLOAT

and a table B with similar columns:

  • str_identifier TEXT
  • num FLOAT
  • entry_date TIMESTAMP

I want to construct a sqlalchemy query that does the following:

  • finds entries in table B that either do not exist yet in table A, and inserts them
  • finds entries in table B that do exist in table A but have a different value for the num column

The catch is that table B has the entry_date column, and as a result can have multiple entries with the same str_identifier but different entry dates. So I always want to perform this insert/update query using the latest entry for a given str_identifier (if it has multiple entries in table B).

For example, if before the query runs tables A and B are:

[A]
|  id | str_identifier  |  num  |
|-----|-----------------|-------|
| 1   | str_id_1        |   25  |


[B]
| str_identifier | num | entry_date |
|----------------|-----|------------|
| str_id_1       | 89  | 2020-07-20 |
| str_id_1       | 25  | 2020-06-20 |
| str_id_1       | 50  | 2020-05-20 |
| str_id_2       | 45  | 2020-05-20 |

After the update query, table A should look like:

[A]
|  id | str_identifier  | num |
|-----|-----------------|-----|
|  1  | str_id_1        | 89  |
|  2  | str_id_2        | 45  |

The query I've constructed so far should detect difference, but will adding order_by(B.entry_date.desc()) ensure I only do the exist comparisons with the latest str_identifier values?

My Current Query

query = (
        select([B.str_identifier, B.value])
        .select_from(
            join(B, A, onclause=B.str_identifier == A.str_identifier, isouter=True)
        )
        .where(
            and_(
                ~exists().where(
                    and_(
                        B.str_identifier == A.str_identifier,
                        B.value == A.value,
                        ~B.value.in_([None]),
                    )
                )
            )
        )
    )
Gord Thompson
  • 116,920
  • 32
  • 215
  • 418
Megan
  • 1,000
  • 1
  • 14
  • 44
  • Hi Megan. You have described your requirements quite well, but you haven't given any indication of the research you have done or the things you may have tried. Your question in its current form may well attract downvotes because "Stack Overflow is not a code-writing service". What you describe is known as an "upsert". Please research that and then [edit] your question to improve it. – Gord Thompson Jul 21 '20 at 20:41
  • @GordThompson Thank you for the response. I'm not super well-versed with `sqlalchemy` and didn't want to post my obviously wrong attempts, but I can add those in. – Megan Jul 21 '20 at 20:47
  • You may be interested in examples of upserts for [PostgreSQL](https://stackoverflow.com/q/61366664/2144390) and [SQL Server](https://stackoverflow.com/q/62388767/2144390). – Gord Thompson Jul 21 '20 at 21:51
  • 1
    @GordThompson Thank you, I'll take a look. – Megan Jul 21 '20 at 22:24

0 Answers0