0

I need to delete all data from all tables each and every time a Unit Test is fired. Right now I'm using TRUNCATE TABLE "table_name" CASCADE for each table that is included in the test suite.

The problem is that those queries take a lot of time - about 9-11 seconds!

I would like to be able to replace those TRUNCATE...CASCADE with just TRUNCATE statements, but to do that I need a list of tables ordered by the master-detail (parent-child) relationship (children first). Manually I could create such a list once or twice, but the database changes everyday. Is there a way to generate a list of tables in that order?

I'm using Postgresql.

kubal5003
  • 7,186
  • 8
  • 52
  • 90
  • You might want to read http://stackoverflow.com/a/11423886/398670 and try updating to PostgreSQL 9.2 if you're not on it already. Just how many tables are being `TRUNCATE`d here, anyway? – Craig Ringer Oct 11 '12 at 12:05
  • Just a comment, but you should really consider trying to better isolate your unit tests. IMHO, they should never actually touch the db. It's fine and good practice to have integration (aka functional) tests which write to the db. But, these usually run as a part of a CI server and then it doesn't matter if they take a few extra seconds to run (as it is not impacting the developers work). Again, not an *answer* where. Just a comment and a suggestion and maybe something to think about. Best of luck. – David S Oct 11 '12 at 14:13
  • Those tests run on a CI server. Maybe I shouldn't call them Unit Tests. They should actually test the db. I've solved the problem by writing a custom application that queries the db for tables and foreign keys and creates the TRUNCATE statements. I've reduced the time needed to run the whole suite from 20 to 11 minutes. – kubal5003 Oct 12 '12 at 11:15
  • Can you please write up a brief answer and post it as an answer so it can be upvoted or accepted? – Chris Travers Apr 10 '13 at 14:29
  • Unfortunately I can't because I no longer have this code (I've changed work place). The solution can be coded in 15 minutes - query the tables and foreign keys, then sort the tables according to information from foreign keys. This does not work if the schema has cycles (eg. table A points to B and B points back to A, of course longer cycles will break things too). – kubal5003 Apr 11 '13 at 08:03
  • In that case I will probably go over my thoughts on solving those problems in an answer along with a general framework for solving the rest. – Chris Travers Apr 11 '13 at 12:51

1 Answers1

1

The way I would look at resolving this would be as follows:

  1. Defer all deferrable constraints and ignore them. You can't effectively have cycles in non-deferrable foreign keys because there is no order of insert that will satisfy them

  2. I would build a stored procedure which would look up deferrable foreign keys and defer them all. This is relatively trivial.

  3. I would build a CTE which would build a tree of foreign key constraints and then search through them reversing the order by length of the path to reach them, so that the first tables would be the ones which had the most tables depending on their keys.

However this may be overkill. It may be easier to just have two functions:

  1. Function 1 looks through every relation in a class and for every relation, issues:

    EXECUTE $e$ ALTER TABLE $e$ || quote_ident(relname) || $e$ disable trigger all $e$;

  2. Function 2 does the reverse:

    EXECUTE $e$ ALTER TABLE $e$ || quote_ident(relname) || $e$ enable trigger all $e$;

Call function 1 at the beginning of the clear process. Call function 2 just before committing.

Chris Travers
  • 25,424
  • 6
  • 65
  • 182
  • Your way is definitely a good way to go if the schema can be modified. I had a requirement that schema must not be modified (this was a banking project, so my PM did not intend to take the risk of something going wrong with this solution). Another thing is: have you tested this for speed? – kubal5003 Apr 11 '13 at 13:07
  • Disabling triggers would be faster. Note this doesn't really modify the schema so much as it disables triggers temporarily. The triggers are still there, but they aren't run between being disabled and being enabled. – Chris Travers Apr 11 '13 at 14:56