
I have a procedure that receives an XML document:

CREATE OR REPLACE PROCEDURE PROCESS_XML(xml IN CLOB) AS
BEGIN
  DBMS_OUTPUT.PUT_LINE('XML processing started');
END;
/

Now I'm writing a bash script that will download some XML files from a server and, for each one, call the above procedure using SQL*Plus.

#!/bin/bash
file=$(curl -s "http://example.com/someFile.xml");
sqlplus myuser/mypass@myhost:1521/myscheme <<< "EXECUTE PROCESS_XML('$file')";

It works fine for small files, but for big ones I'm getting the following error:

SQL*Plus: Release 12.1.0.2.0 Production on Thu Jun 8 18:28:19 2017
Copyright (c) 1982, 2016, Oracle.  All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP
and Data Mining options
SQL> Input truncated to 7499 characters
SP2-0027: Input is too long (> 2499 characters) - line ignored

Is there anything I could do in order to send those big XML files?

Thanks

qxlab
  • Are you able to put the files somewhere you could read them from inside the database, with `utl_file` or as an external table via an Oracle directory object)? Or even get the files from the remote web site from inside the database (which needs ACLs etc.)? – Alex Poole Jun 08 '17 at 22:49
  • @AlexPoole Yeah, that was my first approach, but I don't have, and will not be allowed to change, ACL permissions. So if possible, I'd like to do it through SQL*Plus... Thanks – qxlab Jun 08 '17 at 22:53

3 Answers


You can split the file contents into chunks that SQL*Plus will accept, and then recombine them inside an anonymous PL/SQL block; this also allows values that are longer than a single string literal can be. For instance:

#!/bin/bash

file=$(curl -s "http://example.com/someFile.xml" | sed -r "s/(.{1,2000})/l_clob := l_clob || '\1';\n/g")

sqlplus -s -l myuser/mypass@myhost:1521/myscheme <<!EOF
set serveroutput on
declare
 l_clob clob := '';
begin
${file}
 PROCESS_XML(l_clob);
end;
/
exit
!EOF

EXECUTE is a wrapper around a simple anonymous block anyway, so using a heredoc instead of a herestring just lets you expand that to do more. The block declares an empty CLOB and then appends the chunks from the file - which are each converted to look like:

 l_clob := l_clob || '<up to 2000 chars>';

When SQL*Plus sees it, the constructed heredoc ends up as:

set serveroutput on
declare
 l_clob clob := '';
begin
 l_clob := l_clob || '<first 2000 chars>';
 l_clob := l_clob || '<next 2000 chars>';
 l_clob := l_clob || '<next 2000 chars>';
 ...
 l_clob := l_clob || '<last 2000 chars>';
 PROCESS_XML(l_clob);
end;
/
exit

Modifying your procedure slightly, partly to verify the length that is passed in, and partly to check the XML hasn't been corrupted in the process:

CREATE OR REPLACE PROCEDURE PROCESS_XML(xml IN CLOB) AS
BEGIN
  DBMS_OUTPUT.PUT_LINE('XML processing started; CLOB length: '
    || length(xml));
  DBMS_OUTPUT.PUT_LINE('XML processing started; converted XML length: '
    || length(xmltype(xml).getclobval()));
END;
/

Using that script to handle a large file gives this output:

XML processing started; CLOB length: 368104
XML processing started; converted XML length: 368104

PL/SQL procedure successfully completed.

Of course, this slows things down a bit; that ~360k file took about 13 seconds on my system. There may well be faster mechanisms than sed, but the principle still applies.
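One possibly faster mechanism (a hypothetical sketch, not benchmarked here) is to pre-split the stream with `fold` so that `sed` only has to wrap each line, instead of applying a bounded repetition itself; in this illustration, `printf` with a tiny sample stands in for the `curl` download:

```shell
# Sketch: fold pre-splits the stream into fixed-width lines, so sed
# only wraps each line in a PL/SQL append statement.
# printf with a tiny sample stands in for the curl download here.
file=$(printf '%s' '<root><a>1</a><b>2</b></root>' \
  | fold -w 2000 \
  | sed "s/.*/l_clob := l_clob || '&';/")
printf '%s\n' "$file"
```

The `fold` step does the splitting at a fixed width, so the `sed` expression stays trivial and avoids the repetition-count limits discussed below.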


The version of sed on macOS (which needs -E instead of GNU's -r flag) seems to be limited to 255 repetitions of a pattern (via RE_DUP_MAX, set in limits.h and not, as far as I know, modifiable at runtime).

You can just use a lower limit:

file=$(curl -s "http://example.com/someFile.xml" | sed -E "s/(.{1,255})/l_clob := l_clob || '\1';\n/g")

which is actually much faster under Linux too, so not a bad option anyway.


After further experimentation on macOS (El Cap, but probably the same for Sierra) and trying to get escaped newlines to work without including literal n or \n in the output, which causes PLS-00103, it seems easier to put an actual newline in:

file=$(curl -s "http://example.com/someFile.xml" | sed -E "s/(.{1,255})/ l_clob := l_clob || '\1';\
/g")
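An alternative sketch (not from the original answer) is to do the chunking in awk instead, which sidesteps the GNU/BSD sed newline-escaping differences entirely; this assumes a POSIX awk, with `printf` and a tiny sample standing in for the `curl` download:

```shell
# Sketch: awk emits one append statement per chunk, printing its own
# newlines, so no escaped-newline tricks are needed in the replacement.
# printf with a tiny sample stands in for the curl download here.
file=$(printf '%s' '<root><a>1</a></root>' | awk '{
  while (length($0) > 0) {
    print "l_clob := l_clob || '\''" substr($0, 1, 255) "'\'';"
    $0 = substr($0, 256)
  }
}')
printf '%s\n' "$file"
```

Because awk prints complete lines itself, the same command should behave identically on Linux and macOS.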
Alex Poole
  • Thanks Alex! It's a good approach. Only thing is that for the `sed` command, I'm getting `RE error: invalid repetition count(s)` when the XML is small... is there anything we could do? – qxlab Jun 09 '17 at 18:13
  • @qxlab - I'm not seeing that, with even a tiny file. So, which operating system and version are using, and which bash version? (I'm doing this on OEL/RedHat 5u6, bash 3.2.25...). And do you have a sample of a small file that causes this problem? – Alex Poole Jun 09 '17 at 18:23
  • That may be the problem, to test, I'm using MacOS Sierra, bash version is `GNU bash, version 3.2.57(1)-release (x86_64-apple-darwin16)`. I can reproduce easily with a string like "3" – qxlab Jun 09 '17 at 18:59
  • OK, I guess you've changed `-r` to `-E` then, as that isn't the GNU `sed`. It also has a limit of 255 for the repetition, so change 2000 to 255. (That's actually much quicker in GNU sed, interestingly...) – Alex Poole Jun 09 '17 at 19:16
  • Thanks Alex, you are right! One more thing, I'm getting the error `PLS-00103: Encountered the symbol "PROCESS_XML" when expecting one of the following: := . ( @ % ;`. Any idea what that might be? Probably another bash difference.. thanks – qxlab Jun 09 '17 at 20:17
  • @qxlab - clearly I haven't used `sed` enough on my Mac... got it working, though there may be a more elegant way. (Installing GNU `sed` is also an option, I suppose!) – Alex Poole Jun 09 '17 at 21:09
  • Ok, actually to make it work I had to use an escaped line break `\\\n` like this: `file=$(curl -s "http://example.com/someFile.xml" | sed -E "s/(.{1,255})/l_clob := l_clob || '\1';\\\n/g")` and then, inside PL/SQL, I had to use `$(echo -e "$xmlSplitted")`. That's the only way I could make the line breaks really work. Thanks – qxlab Jun 10 '17 at 02:55

You can try looping through the input file, appending 2.4K chunks at a time; probably something like:

variable l_var clob;
exec :l_var := '';

-- loop here 
exec :l_var := :l_var || '$chunk';
---

exec process_xml(:l_var);

And instead of shell scripting, you could also build the CLOB in Java, for instance reading the XML line by line; Java doesn't have this limit on the size of a variable.
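The chunk loop above could be sketched in bash by generating the whole SQL*Plus script up front (hypothetical: the fold width, file names, and sample content are illustrative, and `process_xml` is the question's procedure):

```shell
# Sketch: generate a SQL*Plus script that appends the file to a bind
# variable one chunk at a time, then calls the procedure.
# A small sample file stands in for the downloaded XML here.
printf '%s' '<root><a>1</a></root>' > someFile.xml
{
  echo "variable l_var clob;"
  echo "exec :l_var := '';"
  fold -w 10 someFile.xml | sed "s/.*/exec :l_var := :l_var || '&';/"
  echo "exec process_xml(:l_var);"
  echo "exit"
} > load_xml.sql
cat load_xml.sql
# then, for example: sqlplus -s myuser/mypass@myhost:1521/myscheme @load_xml.sql
```

Writing the statements to a script file also avoids any herestring/heredoc size concerns in the shell itself.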

access_granted

Could you try zipping the file data in bash, then unzipping in PLSQL using utl_compress?

Something like:

#!/bin/bash
file=$(curl -s "http://example.com/someFile.xml" | gzip -f);
sqlplus myuser/mypass@myhost:1521/myscheme <<< "EXECUTE PROCESS_XML('$file')";

In PL/SQL:

CREATE OR REPLACE PROCEDURE PROCESS_XML(xml IN CLOB) AS
   uncomp CLOB;
BEGIN
  UTL_COMPRESS.lz_uncompress(src => xml, dst => uncomp);
  DBMS_OUTPUT.PUT_LINE('XML processing started');
END;
/
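One caveat, borne out in the comments below: gzip output is binary, so it gets mangled when interpolated into the shell command. A hedged workaround sketch is to base64-encode the compressed bytes so only printable ASCII reaches SQL*Plus; on the database side you would then need to base64-decode (e.g. with `utl_encode.base64_decode`, which works on RAW) before uncompressing. The bash half might look like this, with `printf` and a tiny sample standing in for the `curl` download:

```shell
# Sketch: compress, then base64-encode so the payload is printable
# ASCII that survives interpolation into the SQL*Plus command line.
# printf with a tiny sample stands in for the curl download here.
# (GNU base64 shown; macOS uses -D rather than -d to decode.)
encoded=$(printf '%s' '<root/>' | gzip -c | base64 | tr -d '\n')
# Round-trip check: decoding and decompressing recovers the original.
decoded=$(printf '%s' "$encoded" | base64 -d | gzip -dc)
printf '%s\n' "$decoded"
```

Note that base64 expands the compressed data by about a third, so this only pays off when the XML compresses well.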
TenG
  • Apparently `lz_uncompress` only accepts `BLOB` type. So I tried with `BLOB`, but I'm getting these errors: `SQL> SP2-0552: Bind variable "?G" not declared. SQL> SP2-0734: unknown command beginning "?? Z?..." - rest of line ignored.` – qxlab Jun 09 '17 at 00:21
  • Oh. Try this - pass in as CLOB, then convert to BLOB, then decompress. https://stackoverflow.com/questions/40526132/how-convert-clob-to-blob-in-oracle – TenG Jun 09 '17 at 00:40