0

source_tbl: email_addr, role

target_tbl: email_addr, roles

source_tbl can have many email addresses. target_tbl can have one distinct email address (the role in source_tbl is consolidated into roles in target_tbl - just fyi)

My concern is not duplicating email address in target_tbl (no constraints are in place by dba setup)

I have a suspicion that the the sub-query inside a Insert into Select is done once, and that result set is then used by the insert. If this is the case then a query like this won't work:

insert into target_tbl 
 ( email_addr, roles)
select 
 src.email_addr,
 src.role
from source_tbl src
where src.email_address not in ( select email_addr from target_tbl)

can't find anything that confirms my suspicion, checking to see if someone can point me to documentation that gives me this confirmation

Martin Smith
  • 438,706
  • 87
  • 741
  • 845
mmv_sat
  • 458
  • 8
  • 15
  • 1
    Could you provide a little info about the roles. I think you might be able to pivot and merge and there a handful of other solutions that would take a many rows down to a 1 row relationship. Could you provide some sample data in your question so we can target an answer for you. Hint: You can also look at your execution plan to see if its looping per row or if its doing a single lookup from target_tbl. I suspect this query is not RBAR. – Matt Jul 12 '16 at 16:49
  • Why do you need documentation of something you can easily confirm by testing yourself? – Tab Alleman Jul 12 '16 at 17:18
  • Thank you for the pivot lead - that was another problem I was going to come across after my initial question @MattA. – mmv_sat Jul 12 '16 at 17:33
  • I've been seeing the results of my query in my data. I want to know the details of why the query was running the way it is for understanding. Better to know why a think does what it does rather than just using a thing. @TabAlleman – mmv_sat Jul 12 '16 at 17:35
  • So ask a new question about how to achieve your desired result. The original question you had here is about the static-ness or otherwise of the sub query result. – Martin Smith Jul 12 '16 at 17:45

1 Answers1

1

Are you asking whether the following sub query is re-evaluated for each row, and so takes into account email addresses previously added by the same statement?

 where src.email_address not in ( select email_addr from target_tbl)

If so the answer is "no".

It will be evaluated at the start and spooled somewhere as a result of Halloween Protection (unless SQL Server manages to come up with a plan that guarantees these semantics without a spool but the answer is still the same).

So if the SELECT query returns, say, 3 rows for abc@dfg.com and that email address doesn't already exist in the target table all three rows will be inserted.

You can either change the SELECT to only pick one row per email or add a unique constraint to the table with ignore_dup_key on to just preserve the first (arbitrary) one encountered. Likely the first option is more useful as it gives you control over which one is kept.

Martin Smith
  • 438,706
  • 87
  • 741
  • 845