1

Good time of a day to everyone. I face with a huge problem during my work on previous week. Here ia the deal:

I need to download exel file (blob) from ORACLE database through SAS. I am using:

  1. First step i need to get data from oracle. I used the construction (blob file is nearly 100kb):

    proc sql;
    connect to oracle;
    create table SASTBL as 
    select * from connection to oracle (
    select dbms_lob.substr(myblobfield,1,32767) as blob_1,
    dbms_lob.substr(myblobfield,32768,32767) as blob_2,
    dbms_lob.substr(myblobfield,65535,32767) as blob_3,
    dbms_lob.substr(myblobfield,97302,32767) as blob_4
    
      from my_tbl;
    );
    quit;
    

And the result is:

    blob_1 = 70020202020202...02
    blob_2 = 02020202020...02
    blob_3 = 02020202...02

I do not understand why the field consists from "02"(the whole file)

And the length of any variable in sas is 1024 (instead of 37767) $HEX2024 format. If I ll take:

dbms_lob.substr(my_blob_field,2000,900) from the same object the result will mush more similar to the truth: blob = "A234ABC4536AE7...."

The question is: 1. how can i get binary data from blob field correctly trough SAS? What is my mistake?

Thank you.

EDIT 1:

I get the information but max string is 2000 kb.

3 Answers3

0

PROC SQL uses SQL to interact with SAS datasets (create tables, query tables, aggregate data, connect externally, etc.). The procedure mostly follows the ANSI standard with a few SAS specific extensions. Each RDMS extends ANSI including Oracle with its XML handling such as saving content in a blob column. Possibly, SAS cannot properly read the Oracle-specific (non-ANSI) binary large object type. Typically SAS processes string, numeric, datetime, and few other types.

As an alternative, consider saving XML content from Oracle externally as an .xml file and use SAS's XML engine to read content into SAS dataset:

** STORING XML CONTENT; 
libname tempdata xml 'C:\Path\To\XML\File.xml';

** APPEND CONTENT TO SAS DATASET;
data Work.XMLData;
   set tempdata.NodeName;        /* CHANGE TO REPEAT PARENT NODE OF XML. */
run;
Parfait
  • 104,375
  • 17
  • 94
  • 125
0

Use the DBMAX_TEXT option on the CONNECT statement (or a LIBNAME statement) to get up to 32,767 characters. The default is probably 1024.

Tom
  • 47,574
  • 2
  • 16
  • 29
0

Adding as another answer as I can't comment yet... the issue you experienced is that the return of dbms_lob.substr is actually a varchar so SAS limits it to 2,000. To avoid this, you could wrap it in to_clob( ... ) AND set the DBMAX_TEXT option as previously answered.

Another alternative is below...

The code below is an effective method for retrieving a single record with a large CLOB. Instead of calculating how many fields to split the clob into resulting in a very wide record, it instead splits it into multiple rows. See expected output at bottom.

Disclaimer: Although effective it may not be efficient ie may not scale well to multiple rows, the generally accepted approach then is row pipelining PLSQL. That being said, the below got me out of a pinch if you can't make a procedure...

PROC SQL;
connect to oracle (authdomain=YOUR_Auth path=devdb DBMAX_TEXT=32767 );
create table clob_chunks (compress=yes) as
select *
from connection to Oracle (
    SELECT    id
        , key   
        , level clob_order
        , regexp_substr(clob_value, '.{1,32767}', 1, level, 'n') clob_chunk
    FROM (
        SELECT id, key, clob_value
        FROM schema.table
        WHERE id = 123
    )
    CONNECT BY LEVEL <= regexp_count(clob_value, '.{1,32767}',1,'n')
)
order by id, key, clob_order;

disconnect from oracle;

QUIT;   

Expected output:

ID  KEY CHUNK   CLOB
1   1   1   short_clob
2   2   1   long clob chunk1of3
2   2   2   long clob chunk2of3
2   2   3   long clob chunk3of3
3   3   1   another_short_one

Explanation:

  1. DBMAX_TEXT tells SAS to adjust the default of 1024 for a clob field.
  2. The regex .{1,32767} tells Oracle to match at least once but no more than 32767 times. This splits the input and captures the last chunk which is likely to be under 32767 in length.
  3. The regexp_substr is pulling a chunk from the clob (param1) starting from the start of the clob (param2), skipping to the 'level'th occurance (param3) and treating the clob as one large string (param4 'n').
  4. The connect by re-runs the regex to count the chunks to stop the level incrementing beyond end of the clob.

References:

  • You don't have to post a comment if you can answer the question. Your post is perfectly fine. If however, you cannot answer the question you must not post it as an aswer. So no Need to explain yourself :) – Noel Widmer Jul 31 '17 at 06:45