I have to insert about 1 million rows into an Oracle database table like:

create table ID_NO_APPROVAL_TEST
(
    ID          NUMBER(10)           not null
        primary key,
    REASON_ID   NUMBER(10)           not null,
    TAXCODE     VARCHAR2(30)         not null,
    STATUS      NUMBER(1),
    CREATE_DATE DATE default SYSDATE not null,
    USER_CREATE VARCHAR2(100)        not null
)

with values like

values(ID_SEQ.NEXTVAL,1,'xxxxxxxxxxxx2',1,sysdate,'abcxyz');
values(ID_SEQ.NEXTVAL,5,'xxxxxxxxxxxx3',1,sysdate,'abcxyz');
values(ID_SEQ.NEXTVAL,3,'xxxxxxxxxxxx3',1,sysdate,'abcxyz');
values(ID_SEQ.NEXTVAL,4,'xxxxxxxxxxxx5',1,sysdate,'abcxyz');
.....

What is the best way to solve this problem?

I tried the statement below with 10 thousand rows:

insert /*+ APPEND*/
into ID_NO_APPROVAL_TEST(ID, reason_id, TAXCODE, status, create_date, USER_CREATE)
select GET_ID_NO_APPROVAL_ID_TEST_SEQ, 275978, '100100939', 1,sysdate,'DUYNQ7' from dual
union select GET_ID_NO_APPROVAL_ID_TEST_SEQ, 1,'xxxxxxxxxxxx2',1,sysdate,'abcxyz' from dual
union select GET_ID_NO_APPROVAL_ID_TEST_SEQ, 4,'xxxxxxxxxxxx3',1,sysdate,'abcxyz' from dual
union select GET_ID_NO_APPROVAL_ID_TEST_SEQ, 7,'xxxxxxxxxxxx3',1,sysdate,'abcxyz' from dual
....

but it took about 2 minutes to run this statement.

  • It depends on where your source data is. If it's in the database, an INSERT SELECT is appropriate. If it's in a flat file, SQL*Loader would be the tool to use. If it's in Excel, you can use Access to pull it in and push it to Oracle, or export it to CSV and use SQL*Loader. – Paul W May 24 '23 at 03:10
  • It is in the database. But my query with 10k rows took 2 minutes, so I can extrapolate that 1 million rows would take about 200 minutes. That's too long for me. Can you suggest some way to optimize the statement? – Hieu Pham JR May 24 '23 at 03:20
  • Provided an answer. By the way, don't chain UNIONs together like this; it wasn't designed for that kind of use. Not only is it highly inefficient, it will bog down in parsing and shred your shared pool once it gets sufficiently large. Keep data in tables, keep data out of SQL. – Paul W May 24 '23 at 03:29

2 Answers

INSERT ... VALUES is for single rows. You want a rowset with lots of rows, where the source is another table in the database. For that, use INSERT ... SELECT:

INSERT INTO ID_NO_APPROVAL_TEST 
SELECT ID_SEQ.NEXTVAL, reason_id,taxcode,status,SYSDATE,user_create 
  FROM sourcetable

This can be sped up even more by increasing the CACHE setting of the ID_SEQ sequence and enabling parallel DML (assuming you are running on a multi-core database server with CPU capacity for this):

ALTER SEQUENCE ID_SEQ CACHE 10000
/
INSERT /*+ enable_parallel_dml parallel(8) append */ INTO ID_NO_APPROVAL_TEST 
SELECT ID_SEQ.NEXTVAL, reason_id,taxcode,status,SYSDATE,user_create 
  FROM sourcetable
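
Two caveats worth adding here (my note, not part of the original answer): the ENABLE_PARALLEL_DML hint only exists in Oracle 12c and later, so on older versions parallel DML has to be enabled at the session level instead; and after any direct-path (APPEND) insert, the session must commit before it can read or modify the table again (ORA-12838). A sketch:

-- Pre-12c alternative to the enable_parallel_dml hint
ALTER SESSION ENABLE PARALLEL DML;

INSERT /*+ parallel(8) append */ INTO ID_NO_APPROVAL_TEST
SELECT ID_SEQ.NEXTVAL, reason_id, taxcode, status, SYSDATE, user_create
  FROM sourcetable;

-- A direct-path insert leaves the table unreadable in this session until commit
COMMIT;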
Paul W
  • I don't have a source in another table in the database, so if I create the source table, the problem returns to the original one. – Hieu Pham JR May 24 '23 at 03:35
  • You just said in response to my earlier question about your source that "it is in the database". Now you're saying it's not. Where exactly is your source data? – Paul W May 24 '23 at 03:38
  • Sorry, I misunderstood your question earlier. The data is not stored anywhere. The only thing I have is a list of tax codes. – Hieu Pham JR May 24 '23 at 03:44
  • You're typing it in by hand from a paper copy? – Paul W May 24 '23 at 03:51
  • I have a list of tax codes in a .txt file, each separated by a ';', and I wrote a Java program to generate the values to insert into a .txt file. – Hieu Pham JR May 24 '23 at 04:03
  • Where do the reason_id, status and user_create fields come from? Are they also in the txt file? – Paul W May 24 '23 at 04:13
  • If all the data fields you need are in the file, then use SQL*Loader to load the file into Oracle (it comes with the full Oracle client). I'm sure you can use Java too, but you'd have to ask for Java help on how to do that with bulk binds. – Paul W May 24 '23 at 04:21
  • I have 11 reason IDs and divided them among the 1M values; the status and user_create are constant. – Hieu Pham JR May 24 '23 at 04:21
  • How about the index (ID) column: how do I fill it in the .csv file? – Hieu Pham JR May 24 '23 at 04:42
  • You need to use SQL*Loader, or you can write your own data-loading program in any number of languages, including Java, but you need to use binds (not literals), and they would need to be bulk/array binds. As there are so many different programming environments out there, you'll have to pursue documentation and help for the language and database adapter component that you're using to find out how to do this. If you cannot figure out how to do bulk binds, then at least loop on a single-row insert, but with bind variables, not literals. Literals really slow things down by adding parse time. – Paul W May 24 '23 at 11:17
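
For reference, a minimal SQL*Loader control file along the lines Paul W suggests might look like the sketch below. The file name, layout, and constant values are assumptions pieced together from this thread (one tax code per line, with ';' as the terminator; the constants are placeholders). Note that the ID does not need to be in the file at all, because SQL*Loader can evaluate a SQL expression such as the sequence, and fixed columns can be supplied with CONSTANT:

-- load_taxcodes.ctl (hypothetical name)
LOAD DATA
INFILE 'taxcodes.txt'
APPEND
INTO TABLE ID_NO_APPROVAL_TEST
FIELDS TERMINATED BY ';'
(
    TAXCODE     CHAR(30),
    ID          EXPRESSION "ID_SEQ.NEXTVAL",
    REASON_ID   CONSTANT 1,
    STATUS      CONSTANT 1,
    CREATE_DATE SYSDATE,
    USER_CREATE CONSTANT 'abcxyz'
)

It would be run with something like sqlldr <user>/<password> control=load_taxcodes.ctl.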

Reduce the number of rows in each SQL statement and performance will vastly improve.

Building large SQL statements works well, but building enormous SQL statements can cause problems. Specifically, almost all of the time spent on your INSERT statement goes to parsing the SQL. To verify this, run the exact same statement twice; since the statement is already parsed and cached in memory, the run time will drop to almost zero. Or you can trace the statements and look at the parse time, like I've done in this answer.
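
A minimal sketch of that tracing approach (my addition, not part of the original answer; it assumes you have the privileges to enable SQL trace and access to the resulting trace file):

ALTER SESSION SET TRACEFILE_IDENTIFIER = 'big_insert';
ALTER SESSION SET SQL_TRACE = TRUE;

-- run the large INSERT ... UNION ... statement here

ALTER SESSION SET SQL_TRACE = FALSE;

-- Then format the raw trace file with tkprof and compare the elapsed time
-- reported on the Parse line against the Execute line:
--   tkprof <trace_file>.trc parse_check.txt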

Running 1 million independent statements will take too long, but running 1 million statements union-ed together will also take too long. The sweet spot depends on your version of Oracle. In 10g or lower, I would aim for 100. In 19c, with your table and sample data, 1000 works well. See the table below for the run times on my database using your table and sample data. Notice that the values do not increase linearly after a certain point.

# Rows   Seconds
------   -------
   500       0.5
  1000       1
  2000       4
  5000      22
 10000      95

Ideally, you would insert this data directly from another table, or use a tool like SQL*Loader, JDBC batching, etc. But that's not always possible, and inserting 1 million rows with a single SQL script is fine as long as you're careful about the size of each statement. With only 1 million rows, you don't need to worry about parallelism, UNION ALL instead of UNION, direct-path writes, sequence caching, etc. Properly sizing your INSERT batches will fix 99% of your problem; don't worry about the remaining 1%.
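
To make the batch sizing concrete, here is a sketch of what one generated statement of roughly 1,000 rows could look like (my illustration, reusing the question's sample values; note UNION ALL rather than UNION, and the sequence applied once in the outer query, since NEXTVAL is not allowed inside a set operation):

INSERT INTO ID_NO_APPROVAL_TEST (ID, REASON_ID, TAXCODE, STATUS, CREATE_DATE, USER_CREATE)
SELECT ID_SEQ.NEXTVAL, t.reason_id, t.taxcode, 1, SYSDATE, 'abcxyz'
FROM (
    SELECT 1 AS reason_id, 'xxxxxxxxxxxx2' AS taxcode FROM dual UNION ALL
    SELECT 5, 'xxxxxxxxxxxx3' FROM dual UNION ALL
    SELECT 3, 'xxxxxxxxxxxx3' FROM dual
    -- ... continue to roughly 1,000 branches, then start a new INSERT
) t;
COMMIT;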

Jon Heller