1

Together with my team, I am working on a functionality to generate invoice numbers. The requirements says that:

  • there should be no gaps between invoice numbers
  • the numbers should start from 0 every year (the together with the year we will have a unique key)
  • the invoice numbers should grow accordinlgy to the time of the creation of the invoices

We are using php and postgres. We tought to implement this in the following way:

  • each time a new invoice is persisted on the database we use a BEFORE INSERT trigger
  • the trigger executes a function that retrieves a new value from a postgres sequence and writes it on the invoice as its number

Considering that multiple invoices could be created during the same transaction, my question is: is this a sufficiently safe approach? What are its flaws? How would you suggest to improve it?

Gabriel's Messanger
  • 3,213
  • 17
  • 31
marcosh
  • 8,780
  • 5
  • 44
  • 74
  • Why don't you use an auto_increment field for that? You could create a new table each year... The table would only have the auto_invrement PK and the id of the associated invoice. Each time you persist and invoice, you add it's id to their yearly table... QUery the table with your invoice id and you get your invoice number – Julio Soares Oct 03 '15 at 16:33
  • @JulioSoares what you're suggesting seems almost the same thing that I'm doing. In postgres an auto_increment PK is using a sequence, so you're just proposing to store that sequence in a table – marcosh Oct 03 '15 at 16:38
  • Do you need a trigger for that ? A sequence should do the job. Assign the nextval(⋅) as the field’s default value and you are done. Set up a batch that initialize the sequence every year. – greg Oct 03 '15 at 16:42
  • @greg Actually I simplified the situation a bit. I have multiple sequences that do the job because I have multiple invoice groups and every group needs to have its autonomous gap free numeration. Moreover, using simply a sequence ca create problems, because sequences do not play very well with transactions (i.e. I can generate a new number, the transaction then fails, but the sequence does not go back so a number is lost) – marcosh Oct 03 '15 at 16:46
  • @marcosh Yes. The only thing is you don't need to worry about maintaining the numerization consistently among concurrent sessions, etc... postgress would handle that for you as it already does (probably?) for your invoice ids now... whatever invoice id you get, you insert it on this other table and it will generate your invoice number. – Julio Soares Oct 03 '15 at 17:25
  • 1
    See http://stackoverflow.com/q/9984196/398670 for a detailed writeup – Craig Ringer Oct 04 '15 at 07:22

1 Answers1

3

Introduction

I believe the most crucial point here is:

  • there should be no gaps between invoice numbers

In this case you cannot use a squence and an auto-increment field (as others propose in the comments). Auto-increment field use sequence under the hood and nextval(regclass) function increments sequence's counter no matter if transaction succeeded or failed (you point that out by yourself).


Update:

What I mean is you shouldn't use sequences at all, especially solution proposed by you doesn't eliminates gap possibility. Your trigger gets new sequence value but INSERT could still failed.


Sequences works this way because they mainly meant to be used for PRIMARY KEYs and OIDs values generation where uniqueness and non-blocking mechanism is ultimate goal and gaps between values are really no big deal.

In your case however the priorities may be different, but there are couple things to consider.

Simple solution

First possible solution to your problem could be returning new number as maximum value of currently existing ones. It can be done in your trigger:

NEW.invoice_number =
        (SELECT foo.invoice_number
         FROM invoices foo
         WHERE foo._year = NEW._year
         ORDER BY foo.invoice_number DESC NULLS LAST LIMIT 1
        ); /*query 1*/

This query could use your composite UNIQUE INDEX if it was created with "proper" syntax and columns order which would be the "year" column in the first place ex.:

CREATE UNIQUE INDEX invoice_number_unique
ON invoices (_year, invoice_number DESC NULLS LAST);

In PostgreSQL UNIQUE CONSTRAINTs are implemented simply as UNIQUE INDEXes so most of the times there no difference which command you will use. However using that particular syntax presented above, makes possible to define order on that index. It's really nice trick which makes /*query 1*/ quicker than simple SELECT max(invoice_number) FROM invoices WHERE _year = NEW.year if the invoice table gets bigger.

This is simple solution but has one big drawback. There is possibility of race condition when two transactions try to insert invoice at the same time. Both could acquire the same max value and the UNIQUE CONSTRAINT will prevent the second one from committing. Despite that it could be sufficient in some small system with special insert policy.

Better solution

You may create table

CREATE TABLE invoice_numbers(
   _year INTEGER NOT NULL PRIMARY KEY,
   next_number_within_year INTEGER
);

to store next possible number for certain year. Then, in AFTER INSERT trigger you could:

  1. Lock invoice_numbers that no other transaction could even read the number LOCK TABLE invoice_numbers IN ACCESS EXCLUSIVE;
  2. Get new invoice number new_invoice_number = (SELECT foo.next_number_within_year FROM invoice_numbers foo where foo._year = NEW.year);
  3. Update number value of new added invoice row
  4. Increment UPDATE invoice_numbers SET next_number_within_year = next_number_within_year + 1 WHERE _year = NEW._year;

Because table lock is hold by the transaction to its commit, this probably should be the last trigger fired (read more about trigger execution order here)


Update:

Instead of locking whole table with LOCK command check link provided by Craig Ringer


The drawback in this case is INSERT operation performance drop down --- only one transaction at the time can perform insert.

Community
  • 1
  • 1
Gabriel's Messanger
  • 3,213
  • 17
  • 31
  • Use `update ... returning` rather than a separate `select` then `update`. Or at least use `select ... for update`. In fact, most of your issues above are addressed by `select ... for update`. – Craig Ringer Oct 04 '15 at 07:15