4

I'm using Scriptella for ETL operations, working with many tables referenced with autogenerated ids. I want to re-use these ids without using subqueries, this is my script fragment of the etl file:

<script connection-id="out" if="rownum>1">

SELECT nextval('SQC_CLASE') AS claseId;
INSERT INTO zoologia.clase VALUES( ?claseId, ?phylumId, ?clase, ?subclase, ?infraclase, true );

SELECT nextval('SQC_ORDEN') AS ordenId;
INSERT INTO zoologia.orden VALUES( ?ordenId, ?claseId, ?orden, ?suborden, true );

SELECT nextval('SQC_SUPERFAMILIA') AS superfamiliaId;
INSERT INTO zoologia.superfamilia VALUES( ?superfamiliaId, ?ordenId, ?superfamilia, true );

SELECT nextval('SQC_FAMILIA') AS familiaId;
INSERT INTO zoologia.familia VALUES( ?familiaId, ?superfamiliaId, ?familia, ?subfamilia, ?tribu, true );

SELECT nextval('SQC_GENERO') AS generoId;
INSERT INTO zoologia.genero VALUES( ?generoId, ?familiaId, ?genero, true );

SELECT nextval('SQC_ESPECIE') AS especieId;
INSERT INTO zoologia.especie VALUES( ?especieId, ?generoId, ?especie, ?subespecie, ?variedad, ?genero, true );

</script>

This is obviously wrong because SELECT can't be performed inside script, right? I'm not really sure how to do it whitout subqueries. I'm using PostgreSQL.

EDIT: What I want to achieve is, for example, get the value of the autogenerated id used in the insertions for the first table, for using it in the insertions of the second table, because the records should be referenced

Nelson
  • 81
  • 1
  • 7

2 Answers2

3

Thanks to a user from Scriptella forums, this is the solution, a single query with all the values of the sequences:

<query connection-id="external">
    <query connection-id="sizoo">
        SELECT nextval('SQC_PHYLUM') AS phylumId
        , nextval('SQC_CLASE') AS claseId
        , nextval('SQC_ORDEN') AS ordenId
        , nextval('SQC_SUPERFAMILIA') AS superfamiliaId
        , nextval('SQC_FAMILIA') AS familiaId
        , nextval('SQC_GENERO') AS generoId
        , nextval('SQC_ESPECIE') AS especieId;

        <script connection-id="sizoo" if="rownum>1">
            INSERT INTO zoologia.phylum VALUES( ?phylumId, ?phylum, true );
            INSERT INTO zoologia.clase VALUES( ?claseId, ?phylumId, ?clase, ?subclase, ?infraclase, true );
            INSERT INTO zoologia.orden VALUES( ?ordenId, ?claseId, ?orden, ?suborden, true );
            INSERT INTO zoologia.superfamilia VALUES( ?superfamiliaId, ?ordenId, ?superfamilia, true );
            INSERT INTO zoologia.familia VALUES( ?familiaId, ?superfamiliaId, ?familia, ?subfamilia, ?tribu, true );
            INSERT INTO zoologia.genero VALUES( ?generoId, ?familiaId, ?genero, true );
            INSERT INTO zoologia.especie VALUES( ?especieId, ?generoId, ?especie, ?subespecie, ?variedad, ?genero, true );
        </script>
    </query>
</query>
Nelson
  • 81
  • 1
  • 7
  • This only works for Postgres. Seems like it'd be a better solution if Scriptella made a variable available for the last insert using stmt.getGeneratedKeys() under the hood we could do it in a DB neutral way. – chubbsondubs Apr 18 '13 at 03:13
1

Your code initally looked totally bogus, as nothing seems to connect the SELECT to the following INSERT so even if it ran you'd just be generating an ID and throwing it away. It looks like your scripting tool might be automatically defining the results of SELECT column-aliases as variables that can be referenced in later queries, though; see "copy to another database" in the Scriptella tutorial. A quick glance at that suggest that what you want to do might work, but you'd have to use nested <query/> and <script/> blocks to do it.

The correct way to use a sequence-generated ID is one of:

INSERT INTO zoologia.especie VALUES( nextval('SQC_ESPECIE'), ?generoId, ?especie, ?subespecie, ?variedad, ?genero, true );

INSERT INTO zoologia.especie VALUES( DEFAULT, ?generoId, ?especie, ?subespecie, ?variedad, ?genero, true );

INSERT INTO zoologia.especie(generoId, especie, subespecie, variedad, genero, someothercol) 
VALUES( ?generoId, ?especie, ?subespecie, ?variedad, ?genero, true );

Your question initially appeared to be about re-using deleted IDs (ie gapless sequences) but it looks like you've clarified it to remove that.


Update after edits + comment changed the meaning of the question:

If you're trying to use the generated ID from one row in subsequent INSERTs, you must either:

  • Capture the ID using INSERT ... RETURNING or by calling currval('the_id_sequence') after the INSERT, store the ID in a client-side variable in your scripting language, and pass that to subsequent INSERTs; or

  • Use currval('the_id_sequence') in the VALUES list of subsequent inserts.

I've never even heard of Scriptella let alone used it, so I can't help with the 1st option of using client side variables. I'd be very surprised if it didn't have a way to store the results of a SELECT or INSERT ... RETURNING for later use, though. A really quick glance suggests it's done with nested <query/> and <script/> blocks, but that's just a 30-second glance at the tutorial.

The 2nd option is simple. Say you've just inserted:

INSERT INTO zoologia.genero VALUES( DEFAULT, ?familiaId, ?genero, true );

and want to insert a new row in especie that has the generoId of the just-inserted genero. Presuming that the ID sequence for genero follows the standard naming PostgreSQL uses, tablename_columnname_id_seq, you'd use:

INSERT INTO zoologia.especie VALUES( DEFAULT, currval('genero_generoId_seq'), ...);

See:

Craig Ringer
  • 307,061
  • 76
  • 688
  • 778
  • Just for clarify, this is fragment of a script what I'm using for an ETL operation, I'm using Scriptella. The reason is, there are 7 tables referenced each other by autogenerated ids, and I need to insert records in all these tables, but using Scriptella I can't get the value of the autogenerated id of the first table, for example, in order to use it for insert in the second table, because the records should be referenced – Nelson Oct 10 '12 at 06:30
  • @Neotaku OK, that makes more sense than your original question. It was quite hard to figure out what you wanted. Answer updated. – Craig Ringer Oct 11 '12 at 02:28