SAS/ACCESS and data step on external DB

Question

I have the following concern regarding SAS/ACCESS facility.

Let's imagine that we have an external DB (i.e. Oracle), which we have assigned to a certain libname.

Next, we do a simple operation on one of the tables within this DB, i.e.

data db.table_new;
  set db.table_old(keep=var1 var2 var3);
  if var1>0 then new_var1=5;
run;

My question is the following:

Will the whole table table_old be pulled from external DB to SAS Server in order to process the data?
Will SAS/ACCESS transform the data step into DBMS operation or SQL so the whole processing will be performed outside SAS?

The documentation is unclear about it . See page 62.

You may be able to ask your DBA to search through the server logs and find what your statement was translated into. — Robert Penridge, Nov 19 '15 at 00:41
I guess, with `options dbidirectexec` SAS would always try to convert the code to the DB native query and pull the data. Any option on the `SET` statement would get directly translated and passed to the DB. — samkart, Feb 14 '18 at 16:59

score 2 · Answer 1 · answered Nov 18 '15 at 21:29

2

Usually the rule of thumb is: if the SAS functions that are used in DATA step can be converted to native db sql functions, then SAS will let the DB server do the data processing. In your case, this seems to be the situation.

answered Nov 18 '15 at 21:29

Victor

16,609
71
229
409

Stu Sztukowski · Answer 2 · 2015-11-19T18:55:15.820

You can answer this question on any piece of code through a set of non-syntax-highlighted options that need to be simplified:

options sastrace=',,,d' sastraceloc=saslog nostsuffix;

When you run the data step, check the log. You will see information about whether SAS is able to successfully translate the code or not. If it was unsuccessful, you will see:

ACCESS ENGINE: SQL statement was not passed to the DBMS, SAS will do the processing.

If this occurs, SAS will usually send out a select * to the server and pull everything before filtering. When you see that error, try doing explicit passthrough, or redesign your query so that it can do everything on the server. It is possible to bring down the SAS server, or severely degreade performance on the Oracle server, if the table is large enough.

Some common functions you'll want to avoid using directly in the query, especially with Oracle:

datepart()
intnx()
intck()
today()
put()
input()

If I have to use any of those functions, I usually play it safe and create a macro variable of static ones beforehand (e.g. today()), filter the raw data at the lowest level first to get it into the SAS server, or use explicit SQL passthrough.

score 1 · Answer 3 · answered Nov 19 '15 at 21:07

In summary, I would say it depends on your method. On the second page of Chapter 1 of the SAS/Access 9.2 document in your above link, there are two methods (among the older DBLOAD procedure) of the SAS/ACCESS facility:

LIBNAME reference - assign SAS librefs to DBMS objects such as schemas and databases; you can then work with the table or view as you would with a SAS data set...You can use such SAS procedures as PROC SQL or DATA step programming on any libref that references DBMS data.
SQL Pass-through facility - to interact with a data source using its native SQL syntax without leaving your SAS session. SQL statements are passed directly to the data source for processing...The DBMS optimizer can take advantage of indexes on DBMS columns to process a query more quickly and efficiently

Hence, for the first method SAS handles processing and second method DBMS handles processing. Like most clients (Java, C#, Python script or PHP webpage) that connect to external RDMS sources, unless a direct ODBC/OLEDB or other API connection is explicitly employed and request sent, processing is handled in the frontend (i.e., calculating parameters) and the end result is updated to the backend via transactions. All SAS's libraries would live in memory (or temporary hard disk) during the appointed session and depending on the code handles data itself and passes results to external source or passes data handling entirely to another source.

Comparative Example: Microsoft Access

One good comparative example would be Microsoft Access which like SAS too provides a linked table connection and pass-through query for any ODBC-compliant RDMS including SQL Server, Oracle, MySQL, etc. It is often a misnomer to tag Access as a database when actually it is a GUI program and collection of objects, one of which is the default Windows JET/ACE engine (a .dll file) not at all restricted to Access but available to all Office programs. Notice the world default as this can be switched out to any ODBC database source.

Linked tables are essentially Access GUI objects (specifically special tabledefs) not unlike SAS's libname refs that are loaded into a JET/ACE table container with data pointing externally. One can then use a linked table like any other Access local table and use anything of the ACE SQL dialect. This special linked table (much like SAS's libname refs are established by ODBC or other connection type) points to the external source and the driver translates query command for the migration action. Therefore, an exact same Access linked table query may perform differently than same RDMS query.

Analogy

I imagine SAS behaves the same way and exists as a front-end with libname ref as local objects with pointers to the backend. All data step handling is processed locally and simply the resultset are imported or extracted by the engine. To use an analogy. A database would be the home and SAS is the garbage man, home decorator, or move-in helper. SAS (like Java's JDBC, PHP's PDO, Python's cursors, R's libraries) knocks on the door which the database answers (annoyed by so many requests). "Hey buddy, we need to take out the garbage and here are the exact items...or we need to remodel the basement and here are the exact specs...or we have new furniture to add in the truck ready for drop off...with credentials signed please carry out immediately." And like in both, pass-through methods are requests carried out on the backend engine. So SAS leaves instructions, maybe a note on the door (without exactness) for homeowner to carry out.

MS Access example can be removed if it strays from the main point. — Parfait, Nov 19 '15 at 21:08

SAS/ACCESS and data step on external DB

3 Answers3