Problem:
I'm working on a project that consists of multiple studies and a group of users that each participates in one of the studies. Every study divides the participants in two groups based on a list that is generated using some randomization algorithm. Upon signing up each user is assigned to a study and their group is determined by the order of signup and the corresponding index in the groups list. For example if study A
has total seats of 4
and the groups list is [0, 1, 1, 0]
the first user is assigned to group 0
, the second one to 1
and so on until the study is full.
There are other user roles defined in the project which are the admins and can be assigned to multiple studies without occupying a position in the study. That means the relation of users to studies is n:m
.
The problem that occurs in the current implementation is a race condition when assigning users to studies and study groups. The code is provided below and the way it works is that it overrides the addUser
of Study
model and whenever a user is added to a study it checks how many users are already in the study and gives the user the current index of the group list which is the seatsTaken
number. This works so long as the users are added to the study in intervals. But whenever multiple users are added at the same time the asynchronous queries cause a race condition and the seatsTaken
count is affected by other users signing up at the same time.
In the example below the users which are assigned to study A
in intervals have the correct groups assigned but study B
with simultaneous queries has incorrect groups assignment.
const Sequelize = require('sequelize');
const assert = require('assert');
const sequelize = new Sequelize({
database: 'database',
username: 'username',
password: 'password',
dialect: process.env.DB_DIALECT || 'sqlite',
storage: 'db.sqlite',
logging: false
});
const User = sequelize.define('user', {
id: {
type: Sequelize.INTEGER,
autoIncrement: true,
primaryKey: true,
},
group: {
type: Sequelize.INTEGER,
allowNull: true,
defaultValue: null
}
});
// Groups list for studies 'A' and 'B'
const groupLists = {
a: [0, 1, 1, 0],
b: [1, 0, 1, 0]
}
const Study = sequelize.define('study', {
id: {
type: Sequelize.INTEGER,
autoIncrement: true,
primaryKey: true,
},
name: {
type: Sequelize.STRING,
allowNull: false
},
seatsTotal: {
type: Sequelize.INTEGER,
defaultValue: 0
}
});
// n:m relation between users and studies
User.belongsToMany(Study, {through: 'UserStudy'});
Study.belongsToMany(User, {through: 'UserStudy'});
// Overridden 'addUser' method for groups assignment
Study.prototype.addUser = async function(user) {
// Count already occupied seats
const seatsTaken = await User.count({
include: [{
model: Study,
where: {
name: this.name
}
}]
});
// Add the user to study
await Study.associations.users.add(this, user);
// Assign the group of the user based on the seatsTaken
await user.update({ group: groupLists[this.name][seatsTaken] });
}
sequelize.sync({force: true}).then(async () => {
// Studies 'A' and 'B' with 4 seats
await Study.bulkCreate([{name: 'a', seatsTotal: 4}, {name: 'b', seatsTotal: 4}]);
// 8 users
await User.bulkCreate(new Array(8).fill(0).map(() => ({})));
const studies = await Study.findAll();
const users = await User.findAll();
// Assign half of the users to study 'A' in intervals
users.filter((_, idx) => idx % 2 === 0).forEach((user, idx) => {
setTimeout(() => {
studies[0].addUser(user);
}, 100*idx);
});
// Assign the other half to study 'B' at the same time
await Promise.all(users.filter((_, idx) => idx % 2 === 1).map(user => {
return studies[1].addUser(user);
}));
setTimeout(async () => {
// Wait for all queries to finish and assert the results
const userStudies = await User.findAll({
include: [Study]
});
const studyUsersA = userStudies.filter(u => u.studies.some(s => s.name === 'a'));
const studyUsersB = userStudies.filter(u => u.studies.some(s => s.name === 'b'));
try {
console.log('Group list A actual:', studyUsersA.map(u => u.group), 'expected:', groupLists['a']);
assert.deepEqual(studyUsersA.map(u => u.group).sort((a, b) => a-b), groupLists['a'].sort((a, b) => a-b), 'Group list A is not assigned correctly');
console.log('Group list B actual:', studyUsersB.map(u => u.group), 'expected:', groupLists['b']);
assert.deepEqual(studyUsersB.map(u => u.group).sort((a, b) => a-b), groupLists['b'].sort((a, b) => a-b), 'Group list B is not assigned correctly');
console.log(`Passed: Group lists are assigned correctly.`);
} catch (e) {
console.log(`Failed: ${e.message}`);
}
}, 500);
});
Related questions that I could find are either about incrementing one value in one table or they just mention transactions and locks without providing example code:
Avoiding race condition with Nodejs Sequelize
How to lock table in sequelize, wait until another request to be complete
Addition and Subtraction Assignment Operator With Sequelize
Database race conditions
Limitations:
- The project stack is
nodejs
,expressjs
andsequelize
withmysql
database for production andsqlite
for development and tests. - The solution should work for both
sqlite
andmysql
. - It is preferred that the groups lists are not stored in the database. The lists are generated by an algorithm and randomization seeds but in the example code they're hardcoded.
- The solution should be a sequelize solution and not throttling or queuing user requests in express server.
- In the case of simultaneous requests it is not strictly required that the exact order of users signup is preserved since it's not really verifiable which user is added to the study first, but the final result must have the correct number of
0
s and1
s which are the assigned groups. - I've tried sequelize transactions but I had lots of issues with sqlite compatibilities and also express requests failing because of database locks but that might have been because of my lack of knowledge on how to do it properly. The limitation here is that requests should not fail because of database locks.
The provided code is a minimal example reproducing the issue. Please use it as a base.
To run the code
npm install sequelize sqlite3 mysql2
sqlite:
node index.js
mysql (using docker):
docker run -d --env MYSQL_DATABASE=database --env MYSQL_USER=username --env MYSQL_PASSWORD=password --env MYSQL_RANDOM_ROOT_PASSWORD=yes -p 3306:3306 mysql:5.7
DB_DIALECT=mysql node index.js
Note:
- The example code is only for demonstration of the issue in the current implementation and the intervals and timeouts are there to simulate the users interaction with the server. Please do not focus on the patterns in the example being wrong and rather focus on the problem itself and how it can be approached in a better way while meeting the requirements mentioned in the limitations section.
- This is part of a fairly big project and I might update the requirements based on the actual project requirements and the feedbacks that I receive here.
Please let me know if I should provide any other information. Thank you in advance.