I am trying to insert some words to database and return newly inserted id or existing id if the word is already in the database.
I found that I can do this using PreparedStatement
and including Statement.RETURN_GENERATED_KEYS
. But PreparedStatement
is terribly slow. I need to insert like 5000 words at once. Another way I could achieve it by running individual query in for loop:
public ArrayList<Integer> addWords(ArrayList<String[]> allTermsForTag) {
ArrayList ids = new ArrayList<Integer>();
ResultSet rs = null;
try{
Statement st = connection.createStatement();
for (String[] articleTerms: allTermsForTag) {
for(String term: articleTerms) {
String query = "WITH a AS (INSERT INTO tag (name) SELECT '"+term+"' WHERE NOT EXISTS (SELECT name FROM tag WHERE name = '"+term+"') " +
"RETURNING id) SELECT id FROM a UNION SELECT id FROM tag WHERE name = '"+term+"'";
rs = st.executeQuery(query);
while (rs.next())
{
int id = rs.getInt(1);
ids.add(id);
System.out.printf("id: "+id);
}
}
}
rs.close();
st.close();
}catch(SQLException e){
System.out.println("SQL exception was raised while performing SELECT: "+e);
}
return ids;
}
This does what I need nicely, but this is too slow as well.
Another method that I wrote uses executeBatch()
, however, it does not return ids:
public ArrayList<Integer> addWords(ArrayList<String[]> allTermsForTag){
ResultSet rs = null;
ArrayList ids = new ArrayList<Integer>();
try{
Statement st = connection.createStatement();
for (String[] articleTerms: allTermsForTag) {
for(String term: articleTerms) {
String query = "WITH a AS (INSERT INTO tag (name) SELECT '"+term+"' WHERE NOT EXISTS (SELECT name FROM tag WHERE name = '"+term+"') " +
"RETURNING id) SELECT id FROM a UNION SELECT id FROM tag WHERE name = '"+term+"'";
st.addBatch(query);
}
st.executeBatch();
rs = st.getGeneratedKeys();
while (rs.next()) {
int id = rs.getInt(1);
ids.add(id);
}
}
st.close();
return ids;
}catch (SQLException e){
System.out.println("SQL exception was raised while performing batch INSERT: "+e.getNextException());
System.out.println("dub");
}
return null;
}
So the question is - how to get ids when using executeBatch()
or if this is not possible, how to approach this problem? I need it to work as fast as possible, because there will be a lot of INSERT operations with large amount of data.
Thank you!