0

I'm trying to import data from Oracle database to HDFS by using sqoop 1.4.6. I have no problem doing simple imports to HDFS or hive tables by using --table table_name or --query SELECT <...> statements.

However, my question: is there a way to import the table with executing a certain PL/SQL procedure first?

For example, assume there exists a procedure set_date(date IN DATE), specified in the Oracle database that is used to set the report date for the data. With usual JDBC tools (i.e., ROracle package if working from R) two queries are required for the same connection to pull the data:

1) BEGIN set_date(#some_date#); END;

2) SELECT * FROM table_name;

Is it possible to achieve the same result using sqoop? I've tried mixing the two queries into one by the following approach:

sqoop import --connect jdbc:oracle:thin@LINK:PORT:SID \
--username user -P \
--target-dir /some/directory \
--query "BEGIN set_date(#some_date#); END; SELECT * FROM table_name"

But apparently the whole query is assumed to be in PL/SQL format, therefore the 2) part of the query is not formulated as expected. Also, I've tried eval'ing only the set_date part of the query, which seems to work fine, so the only problem is to correctly execute the 2) SQL query.

Is it possible to call a second query using the same sqoop connection? Is it possible to mix 1) and 2) statements into one in order to get the same result (for now a simple select table would be enough) using sqoop?

Note: I don't have much experience working with SQL and PL/SQL, so sorry if I didn't specify some parts clearly enough.

runr
  • 1,142
  • 1
  • 9
  • 25
  • I don't know sqoop, but if you are on Oracle 12.1 you might try combining the two statements into a single anonymous block and returning an implicit result set, along the lines of http://stackoverflow.com/a/43451151/230471 – William Robertson May 21 '17 at 08:54

1 Answers1

0

A bit of a wild guess as I don't know Sqoop, but from Oracle 12.1 an anonymous PL/SQL block cab return result sets implicitly, so if Sqoop supports this it's just possible that the block could be:

declare
    rc sys_refcursor;
begin
    set_date(#some_date#);

    open rc for select * from table_name;
    dbms_sql.return_result(rc);
end;
William Robertson
  • 15,273
  • 4
  • 38
  • 44
  • Thanks, will give it a look, though I'm currently on Oracle 11g. – runr May 22 '17 at 07:23
  • After running it I get the error: ``PLS-00201: identifier 'DMBS_SQL.RETURN_RESULT' must be declared``, probably due to the Oracle version used? Additionally, tried openning ``:rc`` and printing it afterwards, but so far without luck. Are there any other known workarounds? – runr May 22 '17 at 08:11
  • Implicit result sets are new in 12c. – William Robertson May 22 '17 at 09:21