How do I get the size in bytes of a CLOB
column in Oracle?
LENGTH()
and DBMS_LOB.getLength()
both return number of characters used in the CLOB
but I need to know how many bytes are used (I'm dealing with multibyte charactersets).
How do I get the size in bytes of a CLOB
column in Oracle?
LENGTH()
and DBMS_LOB.getLength()
both return number of characters used in the CLOB
but I need to know how many bytes are used (I'm dealing with multibyte charactersets).
After some thinking i came up with this solution:
LENGTHB(TO_CHAR(SUBSTR(<CLOB-Column>,1,4000)))
SUBSTR
returns only the first 4000 characters (max string size)
TO_CHAR
converts from CLOB
to VARCHAR2
LENGTHB
returns the length in Bytes used by the string.
I'm adding my comment as an answer because it solves the original problem for a wider range of cases than the accepted answer. Note: you must still know the maximum length and the approximate proportion of multi-byte characters that your data will have.
If you have a CLOB greater than 4000 bytes, you need to use DBMS_LOB.SUBSTR rather than SUBSTR. Note that the amount and offset parameters are reversed in DBMS_LOB.SUBSTR.
Next, you may need to substring an amount less than 4000, because this parameter is the number of characters, and if you have multi-byte characters then 4000 characters will be more than 4000 bytes long, and you'll get ORA-06502: PL/SQL: numeric or value error: character string buffer too small
because the substring result needs to fit in a VARCHAR2 which has a 4000 byte limit. Exactly how many characters you can retrieve depends on the average number of bytes per character in your data.
So my answer is:
LENGTHB(TO_CHAR(DBMS_LOB.SUBSTR(<CLOB-Column>,3000,1)))
+NVL(LENGTHB(TO_CHAR(DBMS_LOB.SUBSTR(<CLOB-Column>,3000,3001))),0)
+NVL(LENGTHB(TO_CHAR(DBMS_LOB.SUBSTR(<CLOB-Column>,6000,6001))),0)
+...
where you add as many chunks as you need to cover your longest CLOB, and adjust the chunk size according to average bytes-per-character of your data.
Try this one for CLOB sizes bigger than VARCHAR2:
We have to split the CLOB in parts of "VARCHAR2 compatible" sizes, run lengthb through every part of the CLOB data, and summarize all results.
declare
my_sum int;
begin
for x in ( select COLUMN, ceil(DBMS_LOB.getlength(COLUMN) / 2000) steps from TABLE )
loop
my_sum := 0;
for y in 1 .. x.steps
loop
my_sum := my_sum + lengthb(dbms_lob.substr( x.COLUMN, 2000, (y-1)*2000+1 ));
-- some additional output
dbms_output.put_line('step:' || y );
dbms_output.put_line('char length:' || DBMS_LOB.getlength(dbms_lob.substr( x.COLUMN, 2000 , (y-1)*2000+1 )));
dbms_output.put_line('byte length:' || lengthb(dbms_lob.substr( x.COLUMN, 2000, (y-1)*2000+1 )));
continue;
end loop;
dbms_output.put_line('char summary:' || DBMS_LOB.getlength(x.COLUMN));
dbms_output.put_line('byte summary:' || my_sum);
continue;
end loop;
end;
/
The simple solution is to cast CLOB to BLOB and then request length of BLOB !
The problem is that Oracle doesn't have a function that cast CLOB to BLOB, but we can simply define a function to do that
create or replace
FUNCTION clob2blob (p_in clob) RETURN blob IS
v_blob blob;
v_desc_offset PLS_INTEGER := 1;
v_src_offset PLS_INTEGER := 1;
v_lang PLS_INTEGER := 0;
v_warning PLS_INTEGER := 0;
BEGIN
dbms_lob.createtemporary(v_blob,TRUE);
dbms_lob.converttoblob
( v_blob
, p_in
, dbms_lob.getlength(p_in)
, v_desc_offset
, v_src_offset
, dbms_lob.default_csid
, v_lang
, v_warning
);
RETURN v_blob;
END;
The SQL command to use to obtain number of bytes is
SELECT length(clob2blob(fieldname)) as nr_bytes
or
SELECT dbms_lob.getlength(clob2blob(fieldname)) as nr_bytes
I have tested this on Oracle 10g without using Unicode(UTF-8). But I think that this solution must be correct using Unicode(UTF-8) Oracle instance :-)
I want render thanks to Nashev that has posted a solution to convert clob to blob How convert CLOB to BLOB in Oracle? and to this post written in german (the code is in PL/SQL) 13ter.info.blog that give additionally a function to convert blob to clob !
Can somebody test the 2 commands in Unicode(UTF-8) CLOB so I'm sure that this works with Unicode ?
Check the LOB segment name from dba_lobs using the table name.
select TABLE_NAME,OWNER,COLUMN_NAME,SEGMENT_NAME from dba_lobs where TABLE_NAME='<<TABLE NAME>>';
Now use the segment name to find the bytes used in dba_segments.
select s.segment_name, s.partition_name, bytes/1048576 "Size (MB)"
from dba_segments s, dba_lobs l
where s.segment_name = l.segment_name
and s.owner = '<< OWNER >> ' order by s.segment_name, s.partition_name;
It only works till 4000 byte, What if the clob is bigger than 4000 bytes then we use this
declare
v_clob_size clob;
begin
v_clob_size:= (DBMS_LOB.getlength(v_clob)) / 1024 / 1024;
DBMS_OUTPUT.put_line('CLOB Size ' || v_clob_size);
end;
or
select (DBMS_LOB.getlength(your_column_name))/1024/1024 from your_table