4

SAS allows creation of proc sql create table statement where the table to be created recursively references itself in the select statement e.g.:

proc sql;
     create table t1 as 
     select 
         t1.id
         ,t2.val1 
     from 
         t1
         inner join t2 on t1.id=t2.id
;
quit;

When a statement like this is executed a warning message is written to the log.

WARNING: This CREATE TABLE statement recursively references the target table. A consequence of this is a possible data integrity problem.

This warning message could be suppressed by using undo_policy=none option. (see SAS Usage Note 12062)

Question:

  • Can creating a table in such a recursive manner potentially return some unexpected results? Is it possible that it would create different results that spiting the same operation into 2 steps:

     proc sql;
       create table _data_ as 
         select 
           t1.id
           ,t2.val1 
         from 
           t1
           inner join t2 on t1.id=t2.id;
    
         create table t1 as
           select * from &syslast;
    
    quit;
    
  • Is the two step approach better/safer to use?

kristof
  • 52,923
  • 24
  • 87
  • 110
  • Having no idea what results you would "expect" from this invalid query, it is hard to say whether it could return "unexpected" results. SAS proc sql doesn't support recursive queries. – Gordon Linoff Feb 15 '16 at 15:24
  • @GordonLinoff - from what I have seen so far it produces the same output as the two step approach, I am not sure if the results are deterministic or not. I have inherited some sas code and I am trying to make sense of it. – kristof Feb 15 '16 at 15:33
  • 1
    I THINK this depends on the engine used. The BASE engine (what you are using in the WORK library) creates a temporary data set behind the scenes and then copies it over at the end. Other engines may not do that, hence the WARNING. – DomPazz Feb 15 '16 at 15:51
  • I have gotten this warning as well, but I have yet to see any anomalous results. The developers must know of very specific cases that bad results could be returned, and place that disclaimer in just in case. I tried making some of the worst recursive sql statements possible, but couldn't get wrong results. – Stu Sztukowski Feb 15 '16 at 15:58
  • @StuSztukowski Is this an acceptable type of query in sas or should it really be avoided? – kristof Feb 15 '16 at 16:18
  • 3
    Personally, I would avoid it. It might work 99.9999% of the time, but if you're working with critical data, you want to be absolutely certain your read/write operations will work 100% of the time. If you would like to do this, I would recommend a data step. Data steps do support these recursive update operations and such. If something does go wrong in the future, even if the sql statement is not the exact reason for a fault, it's easier to explain to your boss than "I ignored this warning here, but that's probably unrelated" ;) – Stu Sztukowski Feb 15 '16 at 16:48
  • thanks @StuSztukowski, avoiding it would probably be the best practice, the problem is with the existing code, should it be changed or left as it is. – kristof Feb 16 '16 at 09:25

2 Answers2

4

This should work fine if the tables being queried are SAS datasets. It is no worse than this simple data step.

data t1;
 merge t1 t2;
 by id;
run;

When SAS runs that type of step it will first create a new physical file with the results and only after the step has finished it will delete the old t1.sas7bdat and rename the temporary file to t1.sas7bdat. If you do with a PROC SQL statement SAS will follow the same basic steps.

I believe that the warning is there because if the tables being referenced were from a external database system (such as Oracle) then SAS might push the query into the database and there it could cause trouble.

Tom
  • 47,574
  • 2
  • 16
  • 29
  • Thank you @tom for your answer, that is what I was suspecting. The thing that worries me is the part of the warning message that states 'A consequence of this is a possible data integrity problem' and the fact that the warning message can be suppressed with 'undo_policy=none' option. SAS documentation is not very clear - at least from what I have seen - when this type of statement is safe to use and in which scenarios it can cause problems. If it can cause data integrity issues why allow it in the first place. – kristof Feb 16 '16 at 09:20
2

I have found that using the same table name as input and output for SAS proc sql can produce incorrect results. It works OK most of the time, but definitely not 100% of the time. Rather than suppress the warning, use a different output table name.

SAS has confessed to this: http://support.sas.com/kb/12/062.html

Geoff
  • 21
  • 1