
I have a method `boolean addIntegerColumnToDatabases(String tableName, String columnName, Connection... conns)` that receives a collection of SQL connections.

For every SQL connection I execute the schema-update query

BEGIN
  ALTER TABLE tableName ADD COLUMN columnName int4
COMMIT

Since this method must be ACID, I would like to do this in parallel.

For example, with two connections (C1, C2):

C1: BEGIN

C2: BEGIN

C1: ALTER TABLE tableName ADD COLUMN columnName int4

C2: ALTER TABLE tableName ADD COLUMN columnName int4

C1: COMMIT

C2: COMMIT

This is the code:

Statement[] s = new Statement[conns.length];
int i = 0;
// Phase 1: start a transaction on every connection
for (Connection c : conns) {
  c.setTransactionIsolation(Connection.TRANSACTION_READ_UNCOMMITTED);
  c.setAutoCommit(false);
  s[i++] = c.createStatement();
}
// Phase 2: execute the schema change on every connection
for (Statement st : s) {
  st.executeUpdate("ALTER TABLE " + tableName + " ADD COLUMN " + columnName + " int4;");
}
// Phase 3: commit everywhere only after every ALTER succeeded
for (Connection c : conns) {
  c.commit();
}
return true;

This works as long as the connections are on different databases.

Problem

If C1 and C2 are connected to the same database, C2's ALTER TABLE waits on the Postgres side for C1 to commit (it blocks on the table lock held by C1), while C1's commit waits on the Java side for C2's executeUpdate to return. Voilà, we have a deadlock.

I cannot reliably detect whether two connections point to the same database system, because of issues like clustering/load balancing, IPv4/IPv6 differences, and DNS.

Question

How can I make sure to return false; when two of the given connections point to the same database, without executing any schema change?

  • Unrelated, but: `c.setTransactionIsolation(Connection.TRANSACTION_READ_UNCOMMITTED);` is useless in Postgres because it does not support dirty reads. –  Mar 28 '20 at 08:04
  • Are you aware that schema migration tools like Liquibase have already solved this problem? (and many more as well) –  Mar 28 '20 at 08:18
  • @a_horse_with_no_name I don't like Liquibase; its architecture is dirty. – Grim Mar 29 '20 at 07:05
  • It has been working just fine for me for over 10 years in about 20 projects. –  Mar 29 '20 at 07:08
  • @a_horse_with_no_name I am sure it has also worked well for more than 10 years in more than 20 projects for many more people. That doesn't change the fact that its architecture is dirty. I mean, who actually does the version control? Who is the authority: the concurrent version control (git, svn, cvs), the changelog XMLs, the database, or all of them? Who wins in case of a difference, the database or the changelog? – Grim Mar 29 '20 at 07:12
  • The architecture is most definitely not dirty. The "authority" is the same version control where your application code is stored, because that's where the migrations (XML, SQL) are stored. –  Mar 29 '20 at 07:14
  • @a_horse_with_no_name Keep in mind that a changelog can remove columns that were created by previous changelogs. So setting up a new database will add a column and remove it again some changelogs later. But when I check out a specific past version of the project, Java classes that were added and later removed again are simply not part of the current checkout. So why must the database replay its whole history while the VCS does not? You effectively have the whole database history checked out, which means redundant DB information. – Grim Mar 29 '20 at 07:21

3 Answers


Use a helper table `database_instance` filled with a GUID.

Before the schema-update query, fetch the GUID and lock on it in a local dictionary; two connections returning the same GUID point at the same database.
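
A minimal sketch of that idea in Java, assuming every database already contains a one-row table database_instance(guid); the table and the helper method allDistinct are hypothetical names, not part of the question:

import java.sql.*;
import java.util.*;

// Sketch: detect duplicate connections by comparing instance GUIDs.
// Assumes a one-row table database_instance(guid) exists everywhere.
static boolean allDistinct(Connection... conns) throws SQLException {
  Set<String> seen = new HashSet<>();
  for (Connection c : conns) {
    try (Statement st = c.createStatement();
         ResultSet rs = st.executeQuery("SELECT guid FROM database_instance")) {
      rs.next();
      if (!seen.add(rs.getString(1))) {
        return false; // same GUID twice: two connections to one database
      }
    }
  }
  return true;
}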

Or simply lock the table locally, no matter which database it is located on:

// the lock target must be shared; a fresh object per call synchronizes nothing
private static readonly object tableLock = new object();

lock (tableLock)
{
   <your query>
}
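
In Java (the language of the question), the same JVM-local locking idea could look like this sketch; note that it only serializes threads inside one JVM and cannot detect a second process or machine:

// tableName.intern() yields one canonical String instance per JVM,
// so every thread locking the same table name contends on one monitor.
synchronized (tableName.intern()) {
    // run the schema-update query here
}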
draz
  • Good idea. Unfortunately useless for my approach because I cannot be sure that this table exists. If the table does not exist I must create it, and that violates the `without executing any schema-change` requirement. – Grim Mar 29 '20 at 07:08

You could use advisory locks for that.

An advisory lock uses an arbitrary number provided by the caller and exists until it's explicitly released or the connection is closed. You can convert the table name to its oid and use that number as the lock identifier.

Something like this (without error handling or cleanup!):

ResultSet rs = st.executeQuery("select pg_try_advisory_lock('" + tableName + "'::regclass::oid::bigint)");
rs.next();
boolean locked = rs.getBoolean(1);
if (locked) {
  // do your migration
  .....

  // once you are done with the table, release the lock immediately
  // alternatively keep it open and it will be released automatically 
  // when the connection is closed
  st.execute("select pg_advisory_unlock('" + tableName + "'::regclass::oid::bigint)");
} else {
  // handle the conflict
}

Maybe you want to move that functionality into its own method.


Alternatively, you could just try to acquire one lock with some hard-coded number at the start of the migration (independent of the table to be modified).
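
A minimal sketch of that variant (the lock id 4711 is an arbitrary example, not anything prescribed):

// Sketch: one fixed advisory lock for the whole migration run. If a
// second connection reaches the same database, its try-lock fails.
ResultSet rs = st.executeQuery("select pg_try_advisory_lock(4711)");
rs.next();
if (!rs.getBoolean(1)) {
  return false; // another connection to the same database holds the lock
}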

  • This is a good idea, but even though the question is about Postgres, the solution must work on other database systems, even exotic ones and future systems. I try to use only methods that are standard SQL. – Grim Mar 31 '20 at 11:20
  • You are experienced, could you please check my latest suggestion? Thanks. – Grim Mar 31 '20 at 11:22
  • @PeterRader: so how will you handle databases that don't support transactional DDL? –  Mar 31 '20 at 11:32
  • Well, they need to handle transactions. That is another restriction. – Grim Mar 31 '20 at 12:08

Hm, I have another idea.

I could first measure the expected time consumption for every single database by simply executing ROLLBACK instead of COMMIT.

Let's say

  1. t is the time consumed between BEGIN and COMMIT of one database,
  2. s is the sum of all time consumptions, and
  3. m is the maximum, the highest value of all measured time consumptions, max(t).

Then I will set the watchdog timeout to

2 * s

If the watchdog balks, we have duplicate database connections and clearly must return false.
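
A minimal sketch of the measuring pass, reusing tableName, columnName and conns from the question; the ROLLBACK ensures no schema change survives the probe:

// Sketch: probe each database with a rolled-back ALTER and time it.
long sum = 0, max = 0;
for (Connection c : conns) {
  c.setAutoCommit(false);
  long start = System.nanoTime();
  try (Statement st = c.createStatement()) {
    st.executeUpdate("ALTER TABLE " + tableName + " ADD COLUMN " + columnName + " int4");
  }
  c.rollback(); // undo the probe: the schema stays unchanged
  long elapsed = System.nanoTime() - start;
  sum += elapsed;               // s in the list above
  max = Math.max(max, elapsed); // m in the list above
}
long watchdogTimeoutNanos = 2 * sum; // 2 * s, the proposed timeout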

Grim