PROBLEM: We are setting up a CDC pipeline using Datastream on Google Cloud Platform. When a DELETE query is run against the source table, the deletion is also reflected in the destination table, which we need to prevent.
SOLUTION NEEDED: How can we keep data that has already been written to the destination table from being deleted, even when the corresponding rows are deleted from the source table, while using the Datastream service on Google Cloud Platform?
P.S. We are open to all types of solutions.
Note: We are also considering writing a trigger on the BigQuery tables that restores deleted data, but only as a last resort, if this behaviour cannot be controlled through Datastream and we have exhausted all other options.
DEBUGGING DONE: We tried to prevent deletions in the destination table by adding a filter to the Datastream stream so that it only streams rows whose Datastream metadata field "is_deleted__" (a boolean) is false. The idea was that any row deleted in the source would be marked true, and Datastream would then not pick up that row to update the destination table. However, this approach has a problem.
ISSUE WITH THE DEBUGGING: The is_deleted field is only visible on tables that do not have a primary key; it is not available on tables that have one. As a result, this solution can only be applied to tables without a primary key.
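To make the intended filter concrete, here is a minimal Python sketch of the logic we were aiming for. It is only an illustration of the idea, not Datastream configuration syntax: the event shape (plain dicts) and the function name events_to_apply are hypothetical, and the only name taken from our setup is the "is_deleted__" flag.

```python
from typing import Any

# Hypothetical change-event shape: each event is a plain dict holding the row
# columns plus the "is_deleted__" metadata flag described above. This is an
# assumption for illustration, not the actual Datastream event format.
def events_to_apply(events: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Keep only the events that are not flagged as source-side deletes."""
    # A missing flag is treated as "not deleted", because nothing else can be
    # inferred from the event when the field is absent.
    return [e for e in events if not e.get("is_deleted__", False)]


events = [
    {"id": 1, "name": "a", "is_deleted__": False},  # normal insert/update
    {"id": 2, "name": "b", "is_deleted__": True},   # delete on a table without a primary key
    {"id": 3, "name": "c"},                         # flag absent, as on tables with a primary key
]

kept = events_to_apply(events)
print([e["id"] for e in kept])
```

The third event shows why this breaks down for us: when the flag is absent, as on tables with a primary key, the filter has nothing to test, so a delete on such a table cannot be told apart from any other change.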