
Added by Edit (18-Jun-2023):

I have run more tests and think it's possible the bug lies in MySQL. I'm using version 8.0.19 on macOS.

I resumed testing this morning by transforming one table from MyISAM to InnoDB and repeating the test with pool.query(). It worked for the first time.

I transformed the table back to MyISAM and repeated the test. It still worked. The table still ended up in the SHARED_NO_READ_WRITE state, but that didn't keep it from being written. This behavior is consistent with connections being cleaned out not when they're released but just before they're reallocated.

As to why it is now working, one possibility is that transforming a single table to InnoDB altered MySQL's behavior, perhaps by changing some internal state variable. Even though the table was later transformed back to MyISAM, it continued working.

Added by Edit (17-Jun-2023):

This is a long question. The comments that didn't realize I was using MyISAM (where explicit table locking is required) rather than InnoDB, and the negative votes at one point, made me think I had misinterpreted what was happening. But further testing showed that something was amiss with table locking under pool.query(): it locks the tables referenced in a query, and it doesn't free those locks when it releases the connection.

Switching from pool.query() to pool.getConnection() -> conn.query("lock tables...") -> conn.query(query) -> <more conn.query's> -> conn.query("unlock tables") -> conn.release() works.
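
For concreteness, here's a minimal sketch of that pattern. It assumes the mysql2/promise package and made-up connection settings and table names; my own class wraps a different API, but the shape is the same:

const mysql = require('mysql2/promise')

const pool = mysql.createPool({ host: 'localhost', user: 'app', database: 'mydb' })

async function lockedSequence() {
  // LOCK TABLES is per-session, so every statement in the locked
  // sequence must run on this one connection.
  const conn = await pool.getConnection()
  try {
    await conn.query('LOCK TABLES Table1 WRITE')
    const [rows] = await conn.query('SELECT id FROM Table1')
    // ...more conn.query() calls...
    await conn.query('UNLOCK TABLES')
    return rows
  } finally {
    conn.release() // return the connection to the pool
  }
}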

It would be very informative to test further and track down whether the misbehavior of pool.query() is in some way due to my own code, perhaps the way I use promises and useEffect, but I have already spent too much time on this.

Original question:

I'm writing this to verify that what I think I've learned about locking tables in nodejs/MySQL is actually correct.

I've been using MySQL with PHP for many years and have recently begun implementing a new website using nodejs/Express/MySQL as a backend (ReactJS/MUI is the frontend). In PHP I'm accustomed to doing this:

$Result1 = doQuery($query1);
$Result2 = doQuery($query2);
$Result3 = doQuery($query3);

Because nodejs is asynchronous, nodejs/MySQL is implemented using callbacks, so the equivalent in nodejs/MySQL would be something like this:

conn.query(query1, (error, results1, fields) => {
  if (error) throw error
  conn.query(query2, (error, results2, fields) => {
    if (error) throw error
    conn.query(query3, (error, results3, fields) => {
      if (error) throw error
      // ...use results1, results2 and results3 here
    })
  })
})

The nested callback structure looked both cumbersome and inelegant, so I created a database class that lets me code in a way very similar to PHP. I used a connection pool because it allows parallel queries. Here's an example of what my approach looks like; error checking happens inside the class methods:

const result1 = await db.doQuery(query1)
const result2 = await db.doQuery(query2)
const result3 = await db.doQuery(query3)
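
For reference, such a wrapper can be as simple as promisifying pool.query(). This is a simplified sketch, not my actual class, assuming the callback-style mysql package and made-up connection settings:

const mysql = require('mysql')

class Database {
  constructor(config) {
    this.pool = mysql.createPool(config)
  }

  // Wrap pool.query() in a promise so callers can use await
  doQuery(queryString, params = []) {
    return new Promise((resolve, reject) => {
      this.pool.query(queryString, params, (error, results) => {
        if (error) reject(error)
        else resolve(results)
      })
    })
  }
}

const db = new Database({ host: 'localhost', user: 'app', database: 'mydb' })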

This worked perfectly on a test database I'd created. I was able to respond to requests from the front end, fetch all the data I needed, then return that data to the front end for presentation.

Then I reached the point where I was ready to begin writing to the database (replace, insert, update), and that's when things fell apart. I tried to lock the tables I was writing to and found my app hanging, with one table locked in the SHARED_NO_READ_WRITE state and essentially inaccessible, even from the MySQL command line client.

Further investigation revealed that with a connection pool each query runs on its own independent connection to the database, because pool.query() carries out the sequence pool.getConnection() -> conn.query() -> conn.release().

I'd read this when I originally implemented my database class but hadn't realized the implications. Take this, for example:

db.query('lock tables myTable write')

All this does is fetch a connection from the pool, execute the query that locks the table, then release the connection, which unlocks the table. When the next query is executed it gets its own connection, and no tables are locked.
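
In other words, the pooled call amounts to something like this (a simplified sketch of the idea, not the library's actual source):

pool.getConnection((error, conn) => {
  if (error) throw error
  conn.query('lock tables myTable write', (error, results) => {
    conn.release() // back to the pool; the lock belongs to this session,
                   // and the next pool.query() may use a different connection
    if (error) throw error
  })
})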

But I suspect that releasing the connection does not actually unlock the tables, because if it did, then by the time my app finished handling that request no tables would have been locked; yet the tables were still locked. As I mentioned earlier, one of the tables was in the SHARED_NO_READ_WRITE state, which implies a deadly embrace (two sessions each waiting on a lock the other holds).

I was unable to find anything explicit online about the tables being automatically unlocked when the connection was released, but I did find several sites that implied this is the case.

But I'm not sure what to believe, because what I observed is consistent with the tables not unlocking when the connection is released. Instead, the nodejs process hung and the database became wedged, with one table stuck in that unreadable/unwritable state until I killed the nodejs process. Again, I couldn't even access this table from the MySQL command line tool. Here is some pseudocode representative of the nodejs that I ran:

await db.doQuery('lock tables Table1 read, Table2 read, Table3 read')
const Result1 = await db.getQueryRow(readQuery1)
const Result2 = await db.getQueryRow(readQuery2)
await db.doQuery('lock tables Table2 write')
await db.doQuery(writeQuery1)
const ID = await db.getQueryDatum('select last_insert_id()')
await db.doQuery('lock tables Table1 write')
await db.doQuery(writeQuery2)
await db.doQuery('unlock tables')

I then ran a testcase where instead of using pool connections I just used a single database connection, something like this:

const conn = mysql.createConnection(...)
conn.query(queryString, (error, results, fields) => {
  if (error) throw error
  // ...handle results
})

This code received a MySQL error saying that I had not locked the table:

Error: ER_TABLE_NOT_LOCKED: Table 'myTable' was not locked with LOCK TABLES

I had never received this error when using pool connections, so this implies that pool.query automatically locks tables. After parsing the query it knows which tables are being read and written and must issue locks for them. When it releases the connection it must also unlock the tables.

But given the behavior I've observed, if pool.query was actually issuing "lock tables" requests, it wasn't unlocking the tables it locked, and eventually you encounter a table-locking conflict. When that happens, the only way I've found to unlock the tables is to kill the nodejs process.

To solve the problem I added a getConnection method to my database class, and using that the above code now looks like this:

const conn = await db.getConnection() // Fetch a connection from the pool
await conn.doQuery('lock tables Table1 read, Table2 read, Table3 read')
const Result1 = await conn.getQueryRow(readQuery1)
const Result2 = await conn.getQueryRow(readQuery2)
await conn.doQuery('lock tables Table2 write')
await conn.doQuery(writeQuery1)
const ID = await conn.getQueryDatum('select last_insert_id()')
await conn.doQuery('lock tables Table1 write')
await conn.doQuery(writeQuery2)
await conn.doQuery('unlock tables')
conn.release()

This works perfectly. For parallelism I can fetch more than one connection and postpone my await statements strategically.
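
For example (a sketch only; getConnection() and getQueryRow() are methods of my own class):

// Two connections let the read queries run in parallel; await only
// when both results are needed.
const connA = await db.getConnection()
const connB = await db.getConnection()
const promiseA = connA.getQueryRow(readQuery1)
const promiseB = connB.getQueryRow(readQuery2)
const [Result1, Result2] = await Promise.all([promiseA, promiseB])
connA.release()
connB.release()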

I should mention that had I not used a connection pool but instead just fetched a single connection at the beginning of a session, locked tables as necessary, then unlocked them before ending the connection, I would not have run into these problems. But I wanted the option of parallelism.

I've made a number of surmises and guesses. Corrections and additional information would be much appreciated.

Percy
  • No, you got this all wrong. 1) Why do you need explicit table locks in the first place? MySQL's InnoDB tables use row-level locks to handle concurrency issues. 2) If you need to execute multiple SQL statements interacting with each other, then use a single connection. If the SQL statements make multiple data changes, then wrap them in a transaction. – Shadow Jun 17 '23 at 02:04
  • Please see my comments to the answer from Bill Karwin. – Percy Jun 17 '23 at 15:04

1 Answer

You're right that you need to lock tables on the same connection where you rely on the table being locked. In other words, the lock is held by the session associated with one connection.

I agree with the comment from Shadow that it is rare to need to lock tables at all (and you probably don't), though the comment "you got this all wrong" is an exaggeration.

Also, your code example, where you "upgrade" the locks from read to write on Table2 and then Table1, makes me think you didn't notice this sentence in the manual:

LOCK TABLES implicitly releases any table locks held by the current session before acquiring new locks.

So each time you try to "amend" your locks, you give up all the other locks you thought you held.
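
To illustrate with your own pseudocode (abbreviated; conn.doQuery() here is your wrapper's method):

await conn.doQuery('lock tables Table1 read, Table2 read, Table3 read')
// the session holds read locks on Table1, Table2, Table3

await conn.doQuery('lock tables Table2 write')
// the first LOCK TABLES is implicitly released before the new lock is
// acquired: the session now holds only the write lock on Table2, and
// the read locks on Table1 and Table3 are gone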

There are some circumstances where locking tables is appropriate, but they are uncommon. You would probably be better off using transactions instead of table locks, assuming your tables are InnoDB (which they really should be).

Using transactions also requires that you use the same connection. In languages that use connection pools, it is therefore a requirement that you acquire the connection and use it for all the queries in your transaction, instead of using a new connection from the pool for each query.
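
For example, the pattern with a promise-based client looks roughly like this (a sketch assuming the mysql2/promise package and made-up table and column names; adapt it to your own wrapper):

const mysql = require('mysql2/promise')

const pool = mysql.createPool({ host: 'localhost', user: 'app', database: 'mydb' })

async function writeDerivedRow() {
  // One connection for the whole transaction: the queries and the
  // COMMIT/ROLLBACK must all happen in the same session.
  const conn = await pool.getConnection()
  try {
    await conn.beginTransaction()
    const [rows] = await conn.query('SELECT total FROM Table1 WHERE id = ?', [1])
    await conn.query('INSERT INTO Table4 (total) VALUES (?)', [rows[0].total])
    await conn.commit()
  } catch (err) {
    await conn.rollback() // undo all changes on any failure
    throw err
  } finally {
    conn.release() // always return the connection to the pool
  }
}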

Bill Karwin
  • Answering both you and Shadow, my understanding is that locking tables is essential. I'll explain by example. You gather data from tables 1, 2 and 3, process the data, then write the result to table 4. If another process writes data to tables 1, 2 or 3 while you're processing then the information you write to table 4 will be inconsistent. Locking the tables prevents you from putting the tables into an inconsistent state, and it prevents others from accessing the tables while you're still in the middle of updating them and they're in an inconsistent state. – Percy Jun 17 '23 at 13:12
  • Also, I'm using MyISAM, not InnoDB. It was a few months ago that I made the decision to use MyISAM because it has better performance when activity is heavily weighted on the read side. So I have no transaction support. – Percy Jun 17 '23 at 13:19
  • I feel like discussions would be more helpful than this Q/A format. Where do people go to just discuss nodejs? I just started up a thread at https://www.evcforum.net/dm.php?control=msg&t=20427. – Percy Jun 17 '23 at 13:31
  • Try Reddit. Stack Overflow discourages the discussion format. They want each post to be a single discrete question with a clear answer. – Bill Karwin Jun 17 '23 at 15:36
  • Since you're using MyISAM, which has no support for transactions, you may have to use table locking. I think you'll find the tradeoff to be a poor one. Even if it's true that MyISAM has better performance for some queries, you will experience a bottleneck of performance because you have made access to your tables serial. I.e. there is no concurrency possible for this operation, so throughput will be limited. – Bill Karwin Jun 17 '23 at 15:39
  • Looking up transactions, I think that you and Shadow were saying not that table locking isn't necessary, but that with InnoDB and transactions the locking is done for you, and even better, it is done at the row level rather than the entire table level. Thanks for the pointer to Reddit. – Percy Jun 17 '23 at 16:44
  • Just remembered. After I'd gotten things to work, I only decided to file a question because of the bug I hit where "pool.query('lock tables Table1 write')" causes Table1 to completely lock up and made figuring out what was going on very difficult. Doesn't that seem like a bug? – Percy Jun 17 '23 at 16:58
  • @BillKarwin SO was deliberately designed to disadvantage long discussions. Discussions usually only help the asker. SO's Q/A format, on the other hand, is a lot better for finding answers. – Shadow Jun 17 '23 at 17:00
  • @Percy No, it does not seem to be a bug: the table is locked in a connection, the connection is kept open, and the lock is never released in that same connection. – Shadow Jun 17 '23 at 17:01
  • @Shadow Finally, lightbulb goes on. Thank you. Please confirm this understanding. When you create the pool, all its connections are open. A pool.query will use one of the open connections. If that query locks a table then when the connection is released back to the pool then that table is still locked. A subsequent pool.query might use a different connection, and for that connection there is no locked table, and any locks this query issues might find themselves in contention with other table locks still active in other connections. Eventually you can end up in the SHARED_NO_READ_WRITE state. – Percy Jun 17 '23 at 17:37
  • @Percy When you release a connection back to the pool, it should be reset, so locks are released, transactions are rolled back, session variables are unset, temporary tables are dropped, etc. The next user of that connection should not find work in progress from the previous user of that connection. That would be a security leak. For example, suppose your bank stores some of your transactions in a temp table, and then subsequently another customer's session happens to read that data from the temp table? That would be bad, and that's not the way connection pools work. – Bill Karwin Jun 17 '23 at 17:47
  • @BillKarwin So the connection isn't closed when it's released, but it *is* cleared out, including locked tables. That puts me right back to my original idea, that "pool.query('lock tables Table1 write')" is a no-op. When I issued queries like that in order to lock tables, I wasn't actually ending up with any locked tables at all. And doesn't that mean that when I issued the "pool.query('lock tables Table1 write')" statement and Table1 ended up in the SHARED_NO_READ_WRITE state that it is a bug? Sorry for persistence, but maybe will reduce future stupid questions. – Percy Jun 17 '23 at 18:00
  • I don't use node.js, so I can't test whether a bug exists. The `pool.query()` example is not a no-op; it acquires a table lock, waiting if necessary because other sessions might be running queries. But it should then release the table lock as soon as it returns the connection to the pool, so the table lock is short-lived. – Bill Karwin Jun 17 '23 at 18:15
  • I've run focused tests in nodejs. A single use of "pool.query('select * from TableX')" leaves TableX in the SHARED_READ_ONLY state. A sequence of "pool.query" calls leaves many tables in that state. A single "pool.query('insert into TableX...')" leaves TableX in the SHARED_WRITE state, and after that nothing can write to it. The connection isn't being cleared after it is released. – Percy Jun 17 '23 at 20:18
  • Sounds like node.js has a serious bug, if it's not resetting connection state after returning the connection to the pool. It's another reason I'm glad I don't use node.js. – Bill Karwin Jun 17 '23 at 20:20