5

I've recently switched from MySQL to PostgreSQL for the back end of a project and discovered some of my database proxy methods needed reviewing. To insert linked objects I use a transaction to make sure everything is stored. I do this using jdbc methods such as setAutoCommit(false) and commit(). I've written a utility method that inserts a record into a table and returns the generated key. Basically I've followed technique 2 as described here:

http://www.selikoff.net/2008/09/03/database-key-generation-in-java-applications/

This has worked since the start of the project, but after migrating from MySQL to PostgreSQL getGeneratedKeys returns all the columns of the newly inserted record (see console output below).

Code:

final ResultSet keys = ps.getGeneratedKeys();
final ResultSetMetaData metaData = keys.getMetaData();
for (int j = 0; j < metaData.getColumnCount(); j++) {
    System.out.println("Col name: "+metaData.getColumnName(j+1));
}

Output:

Col name: pathstart
Col name: fk_id_c
Col name: xpathid
Col name: firstnodeisroot

Database signature for the table (auto generated SQL from pgAdmin III):

CREATE TABLE configuration.configuration_xpath
(
  pathstart integer NOT NULL,
  fk_id_c integer NOT NULL,
  xpathid integer NOT NULL DEFAULT nextval('configuration.configuration_xpath_id_seq'::regclass),
  firstnodeisroot boolean NOT NULL DEFAULT false,
  CONSTRAINT configuration_xpath_pkey PRIMARY KEY (xpathid),
  CONSTRAINT configuration_fk FOREIGN KEY (fk_id_c)
      REFERENCES configuration.configuration (id_c) MATCH SIMPLE
      ON UPDATE CASCADE ON DELETE CASCADE
)

Database signature for the sequence behind the PK:

CREATE SEQUENCE configuration.configuration_xpath_id_seq
  INCREMENT 1
  MINVALUE 1
  MAXVALUE 9223372036854775807
  START 242
  CACHE 1
  OWNED BY configuration.configuration_xpath.xpathid;

So the question is, why is getGeneratedKeys returning all the columns instead of just the generated key? I've searched and found someone else with a similar problem here:

http://www.postgresql.org/message-id/004801cb7518$cbc632e0$635298a0$@pravdin@disi.unitn.it

But their question has not been answered, only a suggested workaround is offered.

Mark Rotteveel
  • 100,966
  • 191
  • 140
  • 197
RDM
  • 4,986
  • 4
  • 34
  • 43

2 Answers2

12

Most drivers support getGeneratedKeys() by tacking on a RETURNING-clause at the end of the query with the columns that are auto-generated. PostgreSQL returns all fields because it has RETURNING * which simply returns all columns. That means that to return the generated key it doesn't have to query the system table to determine which column(s) to return, and this saves network roundtrips (and query time).

This is implicitly allowed by the JDBC specification, because it says:

Note:If the columns which represent the auto-generated keys were not specified, the JDBC driver implementation will determine the columns which best represent the auto-generated keys.

Reading between the lines you can say that this allows for saying 'I don't know, or it is too much work, so all columns best represent the auto-generated keys'.

An additional reason might be that it is very hard to determine which columns are auto-generated and which aren't (I am not sure if that is true for PostgreSQL). For example in Jaybird (the JDBC driver for Firebird that I maintain) we also return all columns because in Firebird it is impossible to determine which columns are auto-generated (but we do need to query the system tables for the column names because Firebird 3 and earlier do not have RETURNING *).

Therefor it is always advisable to explicitly query the generated keys ResultSet by column name and not by position.

Other solutions are explicitly specifying the column names or the column positions you want returned using the alternate methods accepting a String[] or int[] (although I am not 100% sure how the PostgreSQL driver handles that).

BTW: Oracle is (was?) even worse: by default it returns the ROW_ID of the row, and you need to use a separate query to get the (generated) values from that row.

Mark Rotteveel
  • 100,966
  • 191
  • 140
  • 197
  • Thanks for your reply. While what you say is true, it is not the reason for the problem. I've created a test environment with a different postgresql db and there everything works fine. I've been able to localize and solve the problem meanwhile and will post the solution shortly. Thanks anyway :) – RDM Nov 04 '13 at 12:43
  • 1
    See my comment to your answer. It looks like you assume the first column is always the primary key. – Mark Rotteveel Nov 04 '13 at 13:13
  • you are right. I will accept your answer as the correct one. I guess my findings were based on luck, because indeed the PK was just the first column in the `ResultSet`. I thought the driver would automatically find the key and only return that but you've seen the source and apparently it returns everything. Therefor, your answer is the correct one. – RDM Nov 04 '13 at 13:35
  • I've been stuck trying to get an automatic UUID PK column on insert, and your answer has solved my problem. My problem was related with RETURNING. I was getting a bigint instead of a UUID with resultSet.getObject(1). Since I wasn't defining the RETURNING column and my id column wasn't (wrongly) the first column, I was getting another column of type bignt. Changing the order of the columns on database or defining RETUNING column for "RETURNING " solved the problem. thanks! – rmpt Jan 15 '18 at 16:31
  • @rmpt Or retrieve the column by name instead of position – Mark Rotteveel Jan 15 '18 at 16:50
1

UPDATE - The accepted answer (by Mark) correctly explains what the problem is. My solution also works, but that's only because I added the PK column first when recreating the tables. Either way, all columns are returned by getGeneratedKeys().

After some research I've managed to find a possible cause of the problem. As I said before, I've changed from MySQL to PostgreSQL during the development of a software project. For this migration, I've taken an SQL dump which I loaded into PostgreSQL. Aside from the migrated tables, I've also created some new ones (using the GUI wizards in pgAdmin III). After a close investigation of the differences between two tables (one imported, one created), I've established 2 things:

  1. CREATE TABLE statements from the MySQL dump convert PKs to BIGINT NOT NULL, not to SERIAL. This lead to the fact auto generated PKs no longer worked properly, though I fixed this before I asked this question.

  2. The tables that I 'fixed' by adding a new sequence and linking it up work perfectly fine, but the SQL generation code (auto-generated by pgAdmin III, as shown in the original question) is different that that of a table that is made in PostgreSQL natively.

Note that my fixed tables work(ed) perfectly: I can insert records, update records and perform joins... basically do anything. The primary keys get auto generated and the sequence gets updated. However, the JDBC driver (psotgresql-9.2-1003.jdbc4.jar to be precise) fails to return my generated keys (though the tables are fully functional).

To illustrate the difference between a migrated and created table, here is an example of generation code for a table that I added after the migration:

CREATE TABLE configuration.configuration_xpathitem
(
  xpathitemid serial NOT NULL,
  xpathid integer,
  fk_id_c integer,
  itemname text,
  index integer,
  CONSTRAINT pk_configuration_xpathitem PRIMARY KEY (xpathitemid),
  CONSTRAINT fk_configuration_xpathitem_configuration FOREIGN KEY (fk_id_c)
      REFERENCES configuration.configuration (id_c) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE NO ACTION,
  CONSTRAINT fk_configuration_xpathitem_configuration_xpath FOREIGN KEY (xpathid)
      REFERENCES configuration.configuration_xpath (xpathid) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE NO ACTION
)

You can clearly see here my PK has the serial keyword, where it is integer not null default ... for the migrated (and fixed) table.

Because of this, I figured maybe the JDBC driver for PostgreSQL was unable to find the PK. I had already read the specification that @Mark highlighted in his reply and this lead me to think that that was the cause for the driver to return all columns. This lead me to believe the driver could not find the PK because I think it is looking for the serial keyword.

So to solve the problem, I dumped my data, deleted my faulty tables and added them again, this time from scratch rather than with the SQL statements from the MySQL dump, and reloaded my data. This has solved the problem for me. I hope this can help anyone that is also stuck.

RDM
  • 4,986
  • 4
  • 34
  • 43
  • 1
    It looks more like you always assume that the primary key is the first column in `getGeneratedKeys()`, because I know (from looking at the [source](https://github.com/pgjdbc/pgjdbc/blob/master/org/postgresql/jdbc3/AbstractJdbc3Statement.java), around line 137), that PostgreSQL JDBC will always return all columns (BTW: The table in this answer is not the same as in your question). – Mark Rotteveel Nov 04 '13 at 13:13
  • 1
    And note that in the table in your question the primary key is the third column, which is also the reason that in my answer I suggested to always use the column name to retrieve the value from the `ResultSet`. – Mark Rotteveel Nov 04 '13 at 13:20
  • Indeed, the table is different. I have not looked at the source of the driver but I believe you if you say that it always returns all fields. I guess in that case I was just lucky I put the PK at the top of my columns during the making of my new tables. On the topic of retrieving the key by name, I thought it was "not done" to get the column by name because e.g. the MySQL driver gives a new name to the column ("GENERATED_KEY" or something), I thought the driver was supposed to just give back one column, namely the one with the key. Either way, I will test this. – RDM Nov 04 '13 at 13:31
  • Here is where I got the idea it was "not done": http://stackoverflow.com/questions/8867521/why-does-getgeneratedkeys-return-generated-key-as-column-name – RDM Nov 04 '13 at 13:41
  • That answer is specifically for MySQL, the problem is that the JDBC specification is intentionally(?) vague at this point (and others as well) so that most (all?) database vendors can shoehorn it to their definition of 'auto-generated' and the facilities their database system has for retrieving or returning those values. Even for MySQL retrieving by name would work (as long as you would use `GENERATED_KEY`) – Mark Rotteveel Nov 04 '13 at 13:44
  • I see. Though for the current project I'm sure we will deploy with PostgreSQL, the idea of the software platform is to expand it and release it without being bound to a specific database type. Therefor, I want my classes to be independent of the db type, and apparently the "retrieve by name" strategy is not portable between MySQL and PostgreSQL, unless you can tell me how I could do it anyway? :D – RDM Nov 04 '13 at 13:48
  • Remember that `getGeneratedKeys()` is not just for retrieving primary keys, but can also be used for computed columns, default values of columns not specified etc (see section 13.6 of the JDBC 4.1 specification). Although not all databases support that. – Mark Rotteveel Nov 04 '13 at 13:51