0

I have a BQ table with duplicate (x2 times) rows of the same ad_id.

I want to delete old rows with ts > 120 min where there is a newer one with the same ad_id (Schema contains timestamp, ad_id, value. But there is not rowId).

This is my try, is there a nicer way to do so?

DELETE TOP FROM {table_full_name} o
       WHERE timestamp < TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 120 MINUTE) AND timestamp in ( 
                                            SELECT MIN(timestamp) 
                                            FROM  {table_full_name}  i 
                                            WHERE i.ad_id=o.ad_id 
                                            GROUP  BY ad_id) 
Elad Benda
  • 35,076
  • 87
  • 265
  • 471
  • Did you try 'window' functions - https://cloud.google.com/bigquery/docs/reference/standard-sql/analytic-function-concepts – al-dann Sep 20 '21 at 16:09
  • I couldn't integrate delete with window or partition. Can you please give me a general way? Will my query in the post work as well? – Elad Benda Sep 20 '21 at 17:19
  • Does this answer your question? [Remove old duplicate rows in BQ based on timestamp](https://stackoverflow.com/questions/69276932/remove-old-duplicate-rows-in-bq-based-on-timestamp) – Sakshi Gatyan Sep 24 '21 at 10:14

0 Answers0