3

I'm writing a Java API, and I am using MySQL. What is the best practice for thread safe reading and writing to the database?

For instance, take the following createUser method:

// Create a user
// If the account does not have any users, make this
// user the primary account user
public void createUser(int accountId) {
    String sql =
            "INSERT INTO users " +
            "(user_id, is_primary) " +
            "VALUES " +
            "(0, ?) ";
    try (
            Connection conn = JDBCUtility.getConnection();
            PreparedStatement ps = conn.prepareStatement(sql)) {

        int numberOfAccountsUsers = getNumberOfAccountsUsers(conn, accountId);
        if (numberOfAccountsUsers > 0) {
            ps.setString(1, null);
        } else {
            ps.setString(1, "y");
        }

        ps.executeUpdate();
    } catch (SQLException e) {
        e.printStackTrace();
    }
}

// Get the number of users in the specified account
public int getNumberOfAccountsUsers(Connection conn, int accountId) throws SQLException {
    String sql =
            "SELECT COUNT(*) AS count " +
            "FROM users ";
    try (
            PreparedStatement ps = conn.prepareStatement(sql);
            ResultSet rs = ps.executeQuery()) {

        return rs.getInt("count");
    }
}

Say source A calls createUser, and has just read that account ID 100 has 0 users. Then, source B calls createUser, and before source A has executed its update, source B has also read that account ID has 0 users. As a result, both source A and source B have created primary users.

How can this situation be safely implemented?

Mark Rotteveel
  • 100,966
  • 191
  • 140
  • 197
plsplox
  • 179
  • 2
  • 11
  • 1
    So this question actually has nothing to do with thread safety. Is this your actual question or are you using it just as a way to illustrate a question you have? Also, what database system are you using (MySQL, SQL Server, etc)? – Chris Maggiulli Dec 27 '17 at 19:49
  • He/She has mentioned MySQL – Ele Dec 27 '17 at 19:51
  • By the way, how are you using `accountId` in your second method? – O.Badr Dec 27 '17 at 20:35
  • Why dont you make the `is_primary` field of user derivable. For example, any user that has oldest create time for particular account can be a primary user. Otherwise you most likely will need to have table level lock. – tsolakp Dec 27 '17 at 20:38
  • Your code permits creation of many users with `user_id` 1 and `is_primary` flag `NULL`. Did you mean to insert `numberOfAccountsUsers + 1` as the `user_id`? I don't think this would be a problem with typical unique key constraints on the underlying table and a suitable transaction setting on the connection. – erickson Dec 27 '17 at 21:23

3 Answers3

2

This kind of thing is what transactions are for, they allow you to enter multiple statements with some kind of guarantees about consistency. Technically atomicity, consistency, isolation, and durability, see https://stackoverflow/q/974596.

You can lock rows with

select for update 

And the lock will be held until the end of the transaction.

For this case you might lock the whole table, run the select to count the rows, then perform the insert. With the table locked you know the row count won’t change between the select and the insert.

Removing the need for 2 statements by putting the select within the insert is a good idea. But in general if you’re using databases you need to be aware of transactions.

For local jdbc transactions (as opposed to xa ones) you need to use the same jdbc connection for all the statements that participate in the same transaction. Once all the statements have run, call commit on the connection.

Ease of using transactions is a selling point of the spring framework.

Nathan Hughes
  • 94,330
  • 19
  • 181
  • 276
0

Question Overview

To start, your question doesn't have anything to do with thread safety, it has to do with transactions and code that could be optimized better.

If I am reading this correctly you are trying to do is set the first user as the primary user. I probably wouldn't take this approach at all (that is, having a is_primary row for the first user) but if I did I would not include that condition in your Java application at all. Every time you create a user you are going to be not only evaluation a conditional but you are also making an unnecessary call to the database. Instead I would refactor like this.

Code

First, make sure your table is something like this.

CREATE TABLE `users` (
     user_id INT NOT NULL AUTO_INCREMENT,
     is_primary CHAR(1),
     name VARCHAR(30),
     PRIMARY KEY (user_id)
);

In other words your user_id should be auto_increment and not null. I included the name column because it helps illustrate my point below (but you can substitute that for the other columns on your user table other than user_id and is_primary). I also made user_id primary because it probably is.

If you want to just modify your current table it would be something like this.

ALTER TABLE `users` MODIFY COLUMN `user_id` INT not null auto_increment;

Then create a trigger such that every time you insert a row it checks to see if it's the first row and if so it updates accordingly.

delimiter |
    CREATE TRIGGER primary_user_trigger 
    AFTER INSERT ON users
      FOR EACH ROW BEGIN

        IF NEW.user_id = 1 THEN
           UPDATE users SET is_primary = 'y' where user_id = 1;
        END IF;
      END;
| delimiter ;

At that point you can just have a createUser method that inserts a new record into the database and you do not need to specify either the user_id or the primary field at all. So let's say your table is something like this:

Your createUser method would essentially just look like the following

// Create a user
// If the account does not have any users, make this
// user the primary account user
public void createUser(String name) {
    String sql =
            "INSERT INTO users (name) VALUES(?)"
    try (
            Connection conn = JDBCUtility.getConnection();
            PreparedStatement ps = conn.prepareStatement(sql));

            ps.setString(1, name);
            ps.executeUpdate();
    } catch (SQLException e) {
        e.printStackTrace();
    }
}

Ran twice the users table would look something like

|    user_id   |   is_primary   |   name   |
+--------------+----------------+----------+
|      1       |       y        |   Jimmy  |
---------------+----------------+----------+
|      2       |       null     |   Cindy  |

However...

However, even though I think the aforementioned solution is better than what was proposed in the original question, I still don't think it's the optimal way to handle this. Before suggesting anything I would need to know more about the project though. What is the first user primary? What is a primary user? Why can't the primary user just be set manually if there is only one? Is there an admin console? Etc.

If your question was an example....

So if your question was simply an example used to illustrate a larger question about transaction management you can use auto commit on the connection to false and manage your transactions manually. You can read more about autocommit on the Oracle JDBC Java documentation. Additionally as mentioned above you can change the table/row level locking depending on your specific actions but I believe that's a very unrealistic solution for this issue. You can also include the select as a subquery but again you are just putting a band-aid on bad practice.

All in all unless you want to completely change your primary user paradigm I think this is the most efficient and well organized way to do it.

Chris Maggiulli
  • 3,375
  • 26
  • 39
0

I think it's better to include your condition to set is_primary value in one insert query:

public void createUser(int accountId) {
    String sql = "INSERT INTO users (user_id, is_primary) " +
                 "SELECT 0, if(COUNT(*)=0, 'y', null) " +
                 "FROM users";
   //....
}

So it will be safe to run your code in a multithreading environment, and you can get rid of getNumberOfAccountsUsers method.

And in order to get rid of any doubt for the visibility of a concurrent insert (thanks to @tsolakp comment), as stated by the documentation:

The results of a concurrent INSERT may not be visible immediately.

You may either use the same Connection object for your insert statements so the MySql server will run them sequentially,

or

Use unique index for is_primary column, (I assume that y and null are the possible value. Note that multiple null values are permitted by all MySql engine), so in case of the unique constraint violation, you need just to re-run your insert query. This unique index solution will also work with your existing code.

O.Badr
  • 2,853
  • 2
  • 27
  • 36
  • 1
    You still need table lock to make sure two inserts wont get count as 0. – tsolakp Dec 27 '17 at 20:41
  • @tsolakp, I don't think so, check this [answer](https://stackoverflow.com/a/32288484/2114786) – O.Badr Dec 27 '17 at 21:12
  • That does not mean that insert2 will see data of insert1 and can still get 0 for count. From MySQL docs: "The results of a concurrent INSERT may not be visible immediately". – tsolakp Dec 27 '17 at 21:17
  • I may suggest to use the same connection, as it guarantees a sequential execution and a visibility as well, even It's not clear for me how visible data will be or won't be with concurrent INSERT (different connection). – O.Badr Dec 27 '17 at 22:04