3

I am trying to insert some words to database and return newly inserted id or existing id if the word is already in the database.

I found that I can do this using PreparedStatement and including Statement.RETURN_GENERATED_KEYS. But PreparedStatement is terribly slow. I need to insert like 5000 words at once. Another way I could achieve it by running individual query in for loop:

public ArrayList<Integer> addWords(ArrayList<String[]> allTermsForTag) {
    ArrayList ids = new ArrayList<Integer>();
    ResultSet rs = null;
    try{
        Statement st = connection.createStatement();
        for (String[] articleTerms: allTermsForTag) {
            for(String term: articleTerms) {
                String query = "WITH a AS (INSERT INTO tag (name) SELECT '"+term+"' WHERE NOT EXISTS (SELECT name FROM tag WHERE name = '"+term+"') " +
                        "RETURNING id) SELECT id FROM a UNION SELECT id FROM tag WHERE name = '"+term+"'";
                rs = st.executeQuery(query);
                while (rs.next())
                {
                    int id = rs.getInt(1);
                    ids.add(id);
                    System.out.printf("id: "+id);

                }
            }
        }
        rs.close();
        st.close();
    }catch(SQLException e){
        System.out.println("SQL exception was raised while performing SELECT: "+e);
    }
    return ids;


}

This does what I need nicely, but this is too slow as well.

Another method that I wrote uses executeBatch(), however, it does not return ids:

public ArrayList<Integer> addWords(ArrayList<String[]> allTermsForTag){
        ResultSet rs = null;
        ArrayList ids = new ArrayList<Integer>();

        try{
            Statement st = connection.createStatement();
            for (String[] articleTerms: allTermsForTag) {
                for(String term: articleTerms) {
                    String query = "WITH a AS (INSERT INTO tag (name) SELECT '"+term+"' WHERE NOT EXISTS (SELECT name FROM tag WHERE name = '"+term+"') " +
                            "RETURNING id) SELECT id FROM a UNION SELECT id FROM tag WHERE name = '"+term+"'";
                    st.addBatch(query);
                }
                st.executeBatch();
                rs = st.getGeneratedKeys();
                while (rs.next()) {
                    int id = rs.getInt(1);
                    ids.add(id);
                }
            }
            st.close();
            return ids;
        }catch (SQLException e){
            System.out.println("SQL exception was raised while performing batch INSERT: "+e.getNextException());
            System.out.println("dub");
        }
        return null;
    }

So the question is - how to get ids when using executeBatch() or if this is not possible, how to approach this problem? I need it to work as fast as possible, because there will be a lot of INSERT operations with large amount of data. Thank you!

Sermilion
  • 41
  • 1
  • 5

2 Answers2

5
Set set = new HashSet();
    try {
        PreparedStatement ps = cn.prepareStatement("delete  from myTable where... ", 
        Statement.RETURN_GENERATED_KEYS);
        ps.setInt(1,200);
        ps.setInt(2,262);
        ps.setString(3, "108gf99");
        ps.addBatch();
        ps.setInt(1,200);
        ps.setInt(2,250);
        ps.setString(3, "hgfha");
        ps.addBatch();
        ps.executeBatch();
        ResultSet rs = ps.getGeneratedKeys();
        while (rs.next()){

            set.addAll(Collections.singleton(rs.getLong(1)));
        }

        System.out.println(set);

    } catch (SQLException e) {
        e.printStackTrace();
    }
  • 2
    Usually it's better to explain a solution instead of just posting some rows of anonymous code. You can read [How do I write a good answer](https://stackoverflow.com/help/how-to-answer), and also [Explaining entirely code-based answers](https://meta.stackexchange.com/questions/114762/explaining-entirely-%E2%80%8C%E2%80%8Bcode-based-answers) – Anh Pham Oct 28 '18 at 12:52
1

executeBatch can return generated keys in the latest PgJDBC versions. See issue 195 and pull 204. You must use the prepareStatement variant that takes a String[] of returned column names.

However... take a step back here. The solution isn't loops. The solution is almost never loops.

In this case, you should almost certainly use COPY via the PgJDBC CopyManager API to COPY data into a TEMPORARY table. Then do an INSERT INTO ... SELECT ... RETURNING ... to insert the temp table's contents into the final table and return any generated fields. You can also do a SELECT to join on the temp table to return any that already exist. This is basically a bulk upsert or closely related bulk insert-if-not-exists.

If for some reason you can't do that, the next-best option is probably multi-valued INSERTs with large VALUES lists, but this requires some ugly dynamic SQL. Since you need existing values if the row already exists you'll probably need a writeable CTE too. So really, just use COPY and a query to do the table merge.

Community
  • 1
  • 1
Craig Ringer
  • 307,061
  • 76
  • 688
  • 778
  • Thank you for the answer. That is some advanced stuff you talking about) Not familiar with most of those, but will try to figure out. So, if I have a table with my data, I COPY it to temp table? Then do - INSERT INTO ... SELECT ... RETURNING ... form the temp table, right? And I suppose, it all should be done in transaction? – Sermilion Jul 27 '15 at 10:22
  • @Sermilion Yep, pretty much. You can just try a very recent jdbc driver and batch insert with generated keys otherwise. It will be quite a bit slower but probably fast enough. – Craig Ringer Jul 27 '15 at 14:41
  • Well, I ve added driver version 9.4-1201-jdbc4. It gives me the same error: org.postgresql.util.PSQLException: A result was returned when none was expected. Could you look at the code for executeBatch() and see it everything is ok. Or maybe I added wrong version of jdbc? Thank you . – Sermilion Jul 27 '15 at 16:32
  • @Sermilion You can't use RETTURNING directly. You have to pass the desired colunns to the createPrepared call. – Craig Ringer Jul 27 '15 at 23:20
  • preparedStatement works) But very slow. I ll try to work with COPY. I accept your answer. Thank you a lot) – Sermilion Jul 28 '15 at 11:08